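Dataset: persiannlp/parsinlu_entailment
Language: Persian (fa)
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: expert-generated
Annotation creators: expert-generated
Source datasets: extended|translated|mnli
Preprint: arxiv:2012.06154
License: CC BY-NC-SA 4.0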
A Persian textual entailment task (deciding whether sent1 entails sent2). The questions are partially translated from the SNLI dataset and partially generated by expert annotators.
The text in the dataset is in Persian (fa).
Here is an example from the dataset:
{
  "sent1": "سالها است که کنگره در تلاش است تا اثربخشی مدیریت اطلاعات و فناوری را در دولت فدرال افزایش دهد.",
  "sent2": "کنگره بودجه ویژه ای برای مدیریت اطلاعات و فناوری در دولت فدرال دارد.",
  "label": "n",
  "category": "translation-train"
}
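To make the record layout concrete, the following minimal Python sketch parses the example above and expands its single-letter label. The mapping of "e", "c", and "n" to entailment, contradiction, and neutral is an assumption based on common NLI conventions, not something this card states; LABEL_NAMES and describe are hypothetical names used only for illustration.

```python
import json

# Assumed expansion of the single-letter labels (not stated in this card).
LABEL_NAMES = {"e": "entailment", "c": "contradiction", "n": "neutral"}

# The example record from this card, kept as a JSON string.
RAW_RECORD = """
{
  "sent1": "سالها است که کنگره در تلاش است تا اثربخشی مدیریت اطلاعات و فناوری را در دولت فدرال افزایش دهد.",
  "sent2": "کنگره بودجه ویژه ای برای مدیریت اطلاعات و فناوری در دولت فدرال دارد.",
  "label": "n",
  "category": "translation-train"
}
"""


def describe(record: dict) -> str:
    """Return a one-line summary of a record's label and category."""
    label = LABEL_NAMES.get(record["label"], "unknown")
    return f"{label} ({record['category']})"


record = json.loads(RAW_RECORD)
summary = describe(record)
print(summary)
```

Under the assumed mapping, the example pair is read as neutral: sent1 neither entails nor contradicts sent2.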
The train/dev/test splits contain 756/271/1751 samples, respectively.
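As a quick sanity check on these counts, the short sketch below totals the splits and computes each split's share. SPLIT_SIZES is a hypothetical name; the numbers are the ones reported in this card.

```python
# Split sizes as reported in this card.
SPLIT_SIZES = {"train": 756, "dev": 271, "test": 1751}

# Total number of samples across all splits.
total = sum(SPLIT_SIZES.values())

# Fraction of the whole dataset that falls in each split.
shares = {name: size / total for name, size in SPLIT_SIZES.items()}

print(f"total: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The total is 2778 samples, which is consistent with the 1K<n<10K size category, and the test split is the largest of the three.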
For details, see the corresponding draft (arXiv:2012.06154).
Who are the source language producers?
[More Information Needed]
Who are the annotators?
[More Information Needed]
CC BY-NC-SA 4.0 License
@article{huggingface:dataset,
    title = {ParsiNLU: A Suite of Language Understanding Challenges for Persian},
    author = {Khashabi, Daniel and Cohan, Arman and Shakeri, Siamak and Hosseini, Pedram and Pezeshkpour, Pouya and Alikhani, Malihe and Aminnaseri, Moin and Bitaab, Marzieh and Brahman, Faeze and Ghazarian, Sarik and others},
    year = {2020},
    journal = {arXiv e-prints},
    eprint = {2012.06154},
}
 Thanks to @danyaljj for adding this dataset.