The dataset contains (almost) the entire OpenSubtitles database for Japanese.
File contents:
OpenSubtitles.parquet: The text and the time data.
OpenSubtitles_meta.parquet: The existing metadata for each title.
OpenSubtitles-OA.parquet: The dataset coded with two columns, SOURCE (the name of the movie or TV show) and TEXT (the subtitles), following the Open Assistant rules.
OpenSubtitles.parquet and OpenSubtitles_meta.parquet can be joined on the ID column. (The value can be NULL in the meta table.)
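
The join on ID could be done with pandas roughly as follows. This is a minimal sketch, not an official loader. It assumes the files sit in a local directory, that a parquet engine such as pyarrow is installed, and that ID is the only column name known for the first two files. The function names `join_subtitles` and `load_and_join` are made up for illustration.

```python
import pandas as pd


def join_subtitles(subs: pd.DataFrame, meta: pd.DataFrame) -> pd.DataFrame:
    """Left-join subtitle rows to title metadata on the shared ID column."""
    # Meta rows with a NULL ID cannot be matched to any title, and pandas
    # would otherwise pair null keys with each other, so drop them first.
    meta_with_id = meta.dropna(subset=["ID"])
    # how="left" keeps every subtitle row, even when no metadata is found.
    # The "_meta" suffix is only applied if non-key column names collide.
    return subs.merge(meta_with_id, on="ID", how="left", suffixes=("", "_meta"))


def load_and_join(data_dir: str = ".") -> pd.DataFrame:
    """Read both parquet files from data_dir (an assumed location) and join them."""
    subs = pd.read_parquet(f"{data_dir}/OpenSubtitles.parquet")
    meta = pd.read_parquet(f"{data_dir}/OpenSubtitles_meta.parquet")
    return join_subtitles(subs, meta)
```

Calling `load_and_join("path/to/files")` would return one row per subtitle row, with the metadata columns left empty for titles that have no usable entry in the meta table.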