Datasets: hebrew_this_world
Languages: he
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotation creators: expert-generated
Source datasets: original
License: agpl-3.0

HebrewThisWorld is a dataset consisting of 2028 issues of the newspaper 'This World', edited by Uri Avnery and published between 1950 and 1989. It is released under the AGPLv3 license.
Data Annotation:
Language modeling
Hebrew
CSV file with "," as the delimiter
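As a rough illustration of reading a comma-delimited file of this kind, the sketch below parses a small in-memory stand-in with Python's standard `csv` module. The column names are taken from the sample record in this card; the helper name `read_issues`, the shortened values, and the assumption that the file has a header row are illustrative only and not confirmed by the dataset authors.

```python
import csv
import io

# Tiny in-memory stand-in for the real file. Column names follow the sample
# record in this card; the content value is shortened for illustration.
SAMPLE_CSV = (
    "issue_num,page_count,date,year,content\n"
    '637,16,1950-01-01,1950,"לפיד\nהנוער ־ בירושלים"\n'
)


def read_issues(handle):
    """Yield one dict per issue from a comma-delimited CSV handle.

    Hypothetical helper: assumes a header row naming the columns.
    """
    # The text of a whole issue may be longer than the csv module's default
    # per-field limit (131072 characters), so raise the limit first.
    csv.field_size_limit(2**31 - 1)
    reader = csv.DictReader(handle, delimiter=",")
    for row in reader:
        # Numeric columns arrive as strings; convert the ones used here.
        row["issue_num"] = int(row["issue_num"])
        row["page_count"] = int(row["page_count"])
        yield row


issues = list(read_issues(io.StringIO(SAMPLE_CSV)))
print(issues[0]["issue_num"], issues[0]["page_count"])
```

Quoted fields may span several lines, which matters here because the `content` column holds multi-line text. For a file on disk, opening it with `newline=""` and `encoding="utf-8"` before passing the handle to `read_issues` is the usual choice.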
Sample:
{ "issue_num": 637, "page_count": 16, "date": "1950-01-01", "date_he": "1 בינואר 1950", "year": "1950", "href": "https://thisworld.online/1950/637", "pdf": "https://olam.eu-central-1.linodeobjects.com/pdfs/B-I0637-D010150.pdf", "coverpage": "https://olam.eu-central-1.linodeobjects.com/pages/637/t-1.png", "backpage": "https://olam.eu-central-1.linodeobjects.com/pages/637/t-16.png", "content": "\nלפיד\nהנוער ־ בירושלים צילומים :\n\nב. רותנברג\n\nוזהו הלפיד\n...", "url": "https://thisworld.online/api/1950/637" }
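A record shaped like the sample above can be wrapped in a small typed structure. The following is a minimal sketch, not part of the dataset's tooling: the `Issue` class and its `from_record` method are hypothetical names, only a subset of the fields is kept, and the check that `year` agrees with `date` is an assumption based on this single sample.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Issue:
    """Hypothetical typed view of one record (subset of the fields)."""

    issue_num: int
    page_count: int
    published: date
    href: str
    pdf: str
    content: str

    @classmethod
    def from_record(cls, record: dict) -> "Issue":
        # The "date" field in the sample is ISO formatted (YYYY-MM-DD).
        published = date.fromisoformat(record["date"])
        # Assumption: the separate "year" field repeats the year of "date".
        if str(published.year) != str(record["year"]):
            raise ValueError("year field does not match date field")
        return cls(
            issue_num=int(record["issue_num"]),
            page_count=int(record["page_count"]),
            published=published,
            href=record["href"],
            pdf=record["pdf"],
            content=record["content"],
        )


# Values copied from the sample record; content is cut to its first lines.
record = {
    "issue_num": 637,
    "page_count": 16,
    "date": "1950-01-01",
    "year": "1950",
    "href": "https://thisworld.online/1950/637",
    "pdf": "https://olam.eu-central-1.linodeobjects.com/pdfs/B-I0637-D010150.pdf",
    "content": "\nלפיד\nהנוער ־ בירושלים צילומים :\n",
}

issue = Issue.from_record(record)
print(issue.issue_num, issue.published.isoformat(), issue.page_count)
```

Parsing `date` into a `datetime.date` makes it straightforward to filter or sort issues by publication date, while the Hebrew `date_he` string can be kept as-is for display.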
| | train |
|---|---|
| corpus | 2028 |
Who are the source language producers? [More Information Needed]
Who are the annotators? Researchers
GNU AGPLv3+
This is free software, and you are welcome to redistribute it under certain conditions.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see http://www.gnu.org/licenses/.
Thanks to @lhoestq and @imvladikon for adding this dataset.