Dataset: anli
Tasks:
Languages:
Multilinguality: monolingual
Size: 100K<n<1M
Language creators: found
Preprint: arxiv:1910.14599
License:
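Adversarial Natural Language Inference (ANLI) is a new large-scale NLI benchmark dataset. It was collected through an iterative, adversarial human-and-model-in-the-loop procedure. ANLI is much more difficult than its predecessors, including SNLI and MNLI. It contains three rounds, and each round has its own train, dev, and test sets.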
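To make the round structure concrete, here is a minimal sketch that enumerates the nine split names. The `<part>_r<round>` naming pattern is taken from the `train_r2` example and the split table in this card; the helper name `anli_split_names` is hypothetical and not part of any library.

```python
from itertools import product

# Three collection rounds, each with its own train/dev/test sets.
ROUNDS = (1, 2, 3)
PARTS = ("train", "dev", "test")


def anli_split_names():
    """Return split names such as 'train_r1', grouped by round.

    The naming pattern is inferred from this card, not from an official spec.
    """
    return [f"{part}_r{rnd}" for rnd, part in product(ROUNDS, PARTS)]


names = anli_split_names()
print(names)
```

If the dataset is loaded with the Hugging Face `datasets` library, these strings would presumably be the values to pass as the `split` argument of `load_dataset`; that usage is an assumption and is not shown in this card.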
English
An example of 'train_r2' looks as follows.
This example was too long and was cropped:
{
    "hypothesis": "Idris Sultan was born in the first month of the year preceding 1994.",
    "label": 0,
    "premise": "\"Idris Sultan (born January 1993) is a Tanzanian Actor and comedian, actor and radio host who won the Big Brother Africa-Hotshot...",
    "reason": "",
    "uid": "ed5c37ab-77c5-4dbc-ba75-8fd617b19712"
}
The data fields are the same among all splits.
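Since every split shares one schema, a small check can confirm that a record carries the expected fields before it is used. The sketch below takes the field names from the cropped `train_r2` example in this card. The integer-to-name label mapping (0 = entailment, 1 = neutral, 2 = contradiction) is an assumption that this card does not state, and `missing_fields` and `label_name` are hypothetical helpers.

```python
# Field names as they appear in the example record in this card.
EXPECTED_FIELDS = {"uid", "premise", "hypothesis", "label", "reason"}

# ASSUMPTION: label ids follow the common NLI ordering; verify against the
# dataset's own feature definition before relying on it.
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def missing_fields(record):
    """Return a sorted list of expected fields that the record lacks."""
    return sorted(EXPECTED_FIELDS - set(record))


def label_name(record):
    """Translate the integer label into a readable class name."""
    return LABEL_NAMES[record["label"]]


# The cropped example from this card, kept cropped on purpose.
example = {
    "hypothesis": "Idris Sultan was born in the first month of the year preceding 1994.",
    "label": 0,
    "premise": "\"Idris Sultan (born January 1993) is a Tanzanian Actor and comedian, actor and radio host who won the Big Brother Africa-Hotshot...",
    "reason": "",
    "uid": "ed5c37ab-77c5-4dbc-ba75-8fd617b19712",
}

print(missing_fields(example), label_name(example))
```

Under the assumed mapping, the example decodes to entailment, which agrees with its content: a January 1993 birth date is in the first month of the year before 1994.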
| name | train_r1 | dev_r1 | train_r2 | dev_r2 | train_r3 | dev_r3 | test_r1 | test_r2 | test_r3 |
|---|---|---|---|---|---|---|---|---|---|
| plain_text | 16946 | 1000 | 45460 | 1000 | 100459 | 1200 | 1000 | 1000 | 1200 | 
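The split sizes can be summed to see how the examples are distributed across rounds and across train, dev, and test. This is a small arithmetic sketch: the counts are copied from the table above, and the helper `totals` is hypothetical.

```python
from collections import defaultdict

# Example counts copied from the split table in this card.
SPLIT_SIZES = {
    "train_r1": 16946, "dev_r1": 1000, "test_r1": 1000,
    "train_r2": 45460, "dev_r2": 1000, "test_r2": 1000,
    "train_r3": 100459, "dev_r3": 1200, "test_r3": 1200,
}


def totals(sizes):
    """Sum example counts per round ('r1', ...) and per part ('train', ...)."""
    by_round = defaultdict(int)
    by_part = defaultdict(int)
    for name, count in sizes.items():
        part, rnd = name.split("_")  # e.g. 'train_r2' -> ('train', 'r2')
        by_round[rnd] += count
        by_part[part] += count
    return dict(by_round), dict(by_part)


by_round, by_part = totals(SPLIT_SIZES)
grand_total = sum(SPLIT_SIZES.values())

print(by_round)
print(by_part)
print(grand_total)
```

The grand total is 169,265 examples, which falls inside the 100K<n<1M size category listed in the metadata. Round 3 alone contributes 102,859 of them, so the rounds are far from balanced.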
Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)
@InProceedings{nie2019adversarial,
    title={Adversarial NLI: A New Benchmark for Natural Language Understanding},
    author={Nie, Yixin
                and Williams, Adina
                and Dinan, Emily
                and Bansal, Mohit
                and Weston, Jason
                and Kiela, Douwe},
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    year = "2020",
    publisher = "Association for Computational Linguistics",
}
Thanks to @thomwolf, @easonnie, @lhoestq, @patrickvonplaten for adding this dataset.