数据集:
hebrew_this_world
语言:
计算机处理:
monolingual大小:
1K<n<10K语言创建人:
found批注创建人:
expert-generated源数据集:
original许可:
HebrewThisWorld is a data set consists of 2028 issues of the newspaper 'This World' edited by Uri Avnery and were published between 1950 and 1989. Released under the AGPLv3 license.
Data Annotation:
Language modeling
Hebrew
csv file with "," delimeter
Sample:
{
"issue_num": 637,
"page_count": 16,
"date": "1950-01-01",
"date_he": "1 בינואר 1950",
"year": "1950",
"href": "https://thisworld.online/1950/637",
"pdf": "https://olam.eu-central-1.linodeobjects.com/pdfs/B-I0637-D010150.pdf",
"coverpage": "https://olam.eu-central-1.linodeobjects.com/pages/637/t-1.png",
"backpage": "https://olam.eu-central-1.linodeobjects.com/pages/637/t-16.png",
"content": "\nלפיד\nהנוער ־ בירושלים צילומים :\n\nב. רותנברג\n\nוזהו הלפיד\n...",
"url": "https://thisworld.online/api/1950/637"
}
| train | |
|---|---|
| corpus | 2028 |
[More Information Needed]
[More Information Needed]
Who are the source language producers?[More Information Needed]
[More Information Needed]
Who are the annotators?Researchers
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
GNU AGPLv3+
This is free software, and you are welcome to redistribute it under certain conditions.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see http://www.gnu.org/licenses/ .
Thanks to @lhoestq , @imvladikon for adding this dataset.