Dataset: xcsr
Tasks:
Sub-tasks: multiple-choice-qa
Multilinguality: multilingual
Size: 1K<n<10K
Annotations creators: crowdsourced
Preprint: arxiv:2106.06937
License:
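To evaluate multilingual language models (ML-LMs) for commonsense reasoning in a cross-lingual zero-shot transfer setting (X-CSR), that is, training on English and testing on other languages, we created two benchmark datasets, X-CSQA and X-CODAH. Specifically, we automatically translated the original CSQA and CODAH datasets, which exist only in English, into 15 other languages to form development and test sets for studying X-CSR. Because our goal is to evaluate different ML-LMs under a unified evaluation protocol, we argue that these translated examples, although they may contain noise, can serve as a starting point for meaningful analysis until more human-translated datasets become available.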
https://inklab.usc.edu//XCSR/leaderboard
The 16 languages in X-CSR: {English, Chinese, German, Spanish, French, Italian, Japanese, Dutch, Polish, Portuguese, Russian, Arabic, Vietnamese, Hindi, Swahili, Urdu}.
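Because models are trained on English and tested on the other languages, results are naturally reported per language. The following is a minimal sketch of that bookkeeping, not the official leaderboard scorer. The function names, the `(id, lang)` prediction key, and the choice to exclude English from the average are all assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_language(examples, predictions):
    """Return {lang: accuracy} over records that carry a gold answerKey.

    `examples` is a list of dicts shaped like the records shown in this card.
    `predictions` maps (id, lang) -> predicted label (a hypothetical format).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        gold = ex.get("answerKey")
        if not gold:  # test records hide the answer, so they cannot be scored
            continue
        total[ex["lang"]] += 1
        if predictions.get((ex["id"], ex["lang"])) == gold:
            correct[ex["lang"]] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


def zero_shot_average(per_lang, source_lang="en"):
    """Mean accuracy over target languages, leaving out the training language."""
    targets = [acc for lang, acc in per_lang.items() if lang != source_lang]
    return sum(targets) / len(targets) if targets else float("nan")
```

Whether English belongs in the reported average is a reporting choice; pass a `source_lang` that does not occur in the data to average over every language instead.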
An example from the X-CSQA dataset:
{
  "id": "be1920f7ba5454ad", # an id shared by all languages
  "lang": "en", # one of the 16 language codes.
  "question": {
    "stem": "What will happen to your knowledge with more learning?", # question text
    "choices": [
      {"label": "A", "text": "headaches" },
      {"label": "B", "text": "bigger brain" },
      {"label": "C", "text": "education" },
      {"label": "D", "text": "growth" },
      {"label": "E", "text": "knowing more" }
    ]
  },
  "answerKey": "D" # hidden for test data.
}
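A record like the one above is nested, so it is often convenient to flatten it before feeding it to a model. The sketch below is one way to do that; `flatten_example` is a hypothetical helper name, and it assumes a hidden answer appears as a missing or empty `answerKey`.

```python
def flatten_example(record):
    """Split a record into (stem, labels, texts, gold_index).

    gold_index is None when answerKey is missing or empty (test data).
    """
    question = record["question"]
    labels = [choice["label"] for choice in question["choices"]]
    texts = [choice["text"] for choice in question["choices"]]
    key = record.get("answerKey")
    gold_index = labels.index(key) if key in labels else None
    return question["stem"], labels, texts, gold_index


example = {
    "id": "be1920f7ba5454ad",
    "lang": "en",
    "question": {
        "stem": "What will happen to your knowledge with more learning?",
        "choices": [
            {"label": "A", "text": "headaches"},
            {"label": "B", "text": "bigger brain"},
            {"label": "C", "text": "education"},
            {"label": "D", "text": "growth"},
            {"label": "E", "text": "knowing more"},
        ],
    },
    "answerKey": "D",
}

stem, labels, texts, gold = flatten_example(example)
print(labels[gold], texts[gold])  # D growth
```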
An example from the X-CODAH dataset:
{
  "id": "b8eeef4a823fcd4b", # an id shared by all languages
  "lang": "en", # one of the 16 language codes.
  "question_tag": "o", # one of 6 question types
  "question": {
    "stem": " ", # always a blank as a dummy question
    "choices": [
      {"label": "A",
       "text": "Jennifer loves her school very much, she plans to drop every courses."},
      {"label": "B",
       "text": "Jennifer loves her school very much, she is never absent even when she's sick."},
      {"label": "C",
       "text": "Jennifer loves her school very much, she wants to get a part-time job."},
      {"label": "D",
       "text": "Jennifer loves her school very much, she quits school happily."}
    ]
  },
  "answerKey": "B" # hidden for test data.
}
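Since the X-CODAH stem is always blank, each choice is already a complete sentence, and a model only has to rank the candidates. The sketch below shows one way to treat both datasets uniformly: join stem and choice, strip the result, and take the highest-scoring candidate. `pick_choice` and `toy_score` are hypothetical names; `toy_score` is a placeholder and not a real model.

```python
def pick_choice(record, score_fn):
    """Return the label of the highest-scoring choice.

    The stem is joined with each choice and stripped, so a blank X-CODAH stem
    leaves just the candidate sentence, while an X-CSQA stem is prepended.
    Ties are resolved in favor of the earliest choice.
    """
    stem = record["question"]["stem"]
    best_label, best_score = None, float("-inf")
    for choice in record["question"]["choices"]:
        text = f"{stem} {choice['text']}".strip()
        score = score_fn(text)
        if score > best_score:
            best_label, best_score = choice["label"], score
    return best_label


def toy_score(text):
    """Placeholder scorer for illustration only: favors one fixed phrase."""
    return 1.0 if "never absent" in text else 0.0
```

In practice `score_fn` would wrap whatever plausibility score the multilingual model produces for a sentence; the selection logic stays the same.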
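To evaluate multilingual language models (ML-LMs) for commonsense reasoning in a cross-lingual zero-shot transfer setting (X-CSR), that is, training on English and testing on other languages, we created two benchmark datasets, X-CSQA and X-CODAH.
Details of the dataset construction, especially the translation procedure, can be found in Section A of the appendix of the paper.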
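[More Information Needed]
Who are the source language producers? [More Information Needed]
[More Information Needed]
Who are the annotators? [More Information Needed]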
# X-CSR
@inproceedings{lin-etal-2021-common,
title = "Common Sense Beyond {E}nglish: Evaluating and Improving Multilingual Language Models for Commonsense Reasoning",
author = "Lin, Bill Yuchen and
Lee, Seyeon and
Qiao, Xiaoyang and
Ren, Xiang",
booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
month = aug,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.acl-long.102",
doi = "10.18653/v1/2021.acl-long.102",
pages = "1274--1287",
abstract = "Commonsense reasoning research has so far been limited to English. We aim to evaluate and improve popular multilingual language models (ML-LMs) to help advance commonsense reasoning (CSR) beyond English. We collect the Mickey corpus, consisting of 561k sentences in 11 different languages, which can be used for analyzing and improving ML-LMs. We propose Mickey Probe, a language-general probing task for fairly evaluating the common sense of popular ML-LMs across different languages. In addition, we also create two new datasets, X-CSQA and X-CODAH, by translating their English versions to 14 other languages, so that we can evaluate popular ML-LMs for cross-lingual commonsense reasoning. To improve the performance beyond English, we propose a simple yet effective method {---} multilingual contrastive pretraining (MCP). It significantly enhances sentence representations, yielding a large performance gain on both benchmarks (e.g., +2.7{\%} accuracy for X-CSQA over XLM-R{\_}L).",
}
# CSQA
@inproceedings{Talmor2019commonsenseqaaq,
address = {Minneapolis, Minnesota},
author = {Talmor, Alon and Herzig, Jonathan and Lourie, Nicholas and Berant, Jonathan},
booktitle = {Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)},
doi = {10.18653/v1/N19-1421},
pages = {4149--4158},
publisher = {Association for Computational Linguistics},
title = {CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge},
url = {https://www.aclweb.org/anthology/N19-1421},
year = {2019}
}
# CODAH
@inproceedings{Chen2019CODAHAA,
address = {Minneapolis, USA},
author = {Chen, Michael and D{'}Arcy, Mike and Liu, Alisa and Fernandez, Jared and Downey, Doug},
booktitle = {Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for {NLP}},
doi = {10.18653/v1/W19-2008},
pages = {63--69},
publisher = {Association for Computational Linguistics},
title = {CODAH: An Adversarially-Authored Question Answering Dataset for Common Sense},
url = {https://www.aclweb.org/anthology/W19-2008},
year = {2019}
}
Thanks to Bill Yuchen Lin, Seyeon Lee, Xiaoyang Qiao, and Xiang Ren for adding this dataset.