Fig-QA Dataset Card

Dataset Summary

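This is the dataset for the paper Testing the Ability of Language Models to Interpret Figurative Language. Fig-QA consists of 10,256 human-written creative metaphors, paired as Winograd schemas. It can be used to evaluate a model's commonsense reasoning. The metaphors themselves can also serve as training data for other tasks, such as metaphor detection or generation.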
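To make the paired, Winograd-style setup concrete, here is a minimal sketch of how such a pair could be represented and scored. The field names (`start_phrase`, `endings`, `label`) and the two example metaphors are assumptions invented for illustration; they are not taken from the dataset or its schema.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FigQAItem:
    # Hypothetical field names, chosen for this sketch only.
    start_phrase: str    # the metaphor
    endings: List[str]   # two candidate interpretations
    label: int           # index of the correct interpretation


def make_pair() -> List[FigQAItem]:
    # Invented illustration (not from the dataset): two metaphors that
    # share the same candidate interpretations but have opposite answers.
    endings = ["Her words were comforting", "Her words were unkind"]
    return [
        FigQAItem("Her words were as warm as a campfire", endings, 0),
        FigQAItem("Her words were as warm as an icicle", endings, 1),
    ]


def accuracy(items: List[FigQAItem], predict: Callable[[FigQAItem], int]) -> float:
    # Fraction of items for which the predicted index matches the label.
    correct = sum(1 for item in items if predict(item) == item.label)
    return correct / len(items)


def always_first(item: FigQAItem) -> int:
    # A trivial baseline that ignores the metaphor entirely.
    return 0


if __name__ == "__main__":
    print(accuracy(make_pair(), always_first))
```

Because the two items in a pair have opposite correct answers, a predictor that ignores the metaphor, such as `always_first`, gets exactly one of the two right.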

Supported Tasks and Leaderboards

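You can evaluate your model on the test set by submitting your results to Explainaboard. Click "New" and select qa-multiple-choice in the task field. Select accuracy as the evaluation metric. Upload your results as a system output file in JSON or JSONL format.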
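As a rough sketch of preparing a JSONL system output file, the snippet below serializes one prediction per line. The key name `predicted_label` and the file name are hypothetical; this card does not specify the schema the leaderboard expects, so check the Explainaboard documentation for the actual field names before uploading.

```python
import json
from typing import Iterable


def to_jsonl(predictions: Iterable[int]) -> str:
    # One JSON object per line. "predicted_label" is a hypothetical key,
    # not a confirmed Explainaboard field name.
    return "".join(json.dumps({"predicted_label": p}) + "\n" for p in predictions)


def write_jsonl(predictions: Iterable[int], path: str) -> None:
    # Write the serialized predictions to disk as UTF-8 text.
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(predictions))


if __name__ == "__main__":
    # Hypothetical output file name; predictions are indices of the chosen ending.
    write_jsonl([0, 1, 1, 0], "system_output.jsonl")
```

Keeping serialization (`to_jsonl`) separate from file writing makes it easy to inspect the output before it is written.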

Languages

This is the English version. For the multilingual version, see here .

Dataset Splits

Train-{S, M (no suffix), XL}: different training set sizes
Dev
Test (labels are not provided for the test set)

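Considerations for Using the Data

Discussion of Biases

These metaphors are human-written and may contain insults or other explicit content. The authors of the paper manually removed offensive content, but users should be aware that some potentially offensive content may remain in the dataset.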

Additional Information

Licensing Information

MIT License

Citation Information

If you find this dataset useful, please cite this paper:

@misc{https://doi.org/10.48550/arxiv.2204.12632,
  doi = {10.48550/ARXIV.2204.12632},
  url = {https://arxiv.org/abs/2204.12632},
  author = {Liu, Emmy and Cui, Chen and Zheng, Kenneth and Neubig, Graham},
  keywords = {Computation and Language (cs.CL), Artificial Intelligence (cs.AI), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Testing the Ability of Language Models to Interpret Figurative Language},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution Share Alike 4.0 International}
}

Author:

nightingal3

Dataset size:

1.22 MB