Datasets:

movie_rationales

Tasks:

Text Classification

Sub-tasks:

sentiment-classification

Languages:

Multilinguality:

monolingual

Size Categories:

1K<n<10K

Language Creators:

found

Annotations Creators:

crowdsourced

Source Datasets:

original

License:

license:unknown


Dataset Card for "movie_rationales"

Dataset Summary

The movie rationale dataset contains human-annotated rationales for movie reviews.

Supported Tasks and Leaderboards

More Information Needed

Languages

More Information Needed

Dataset Structure

Data Instances

default

Size of downloaded dataset files: 3.90 MB
Size of the generated dataset: 8.73 MB
Total amount of disk used: 12.62 MB

An example of 'validation' looks as follows.

{
    "evidences": ["Fun movie"],
    "label": 1,
    "review": "Fun movie\n"
}
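As a rough illustration of how a record of this shape might be used, the sketch below locates each evidence string inside the review text and returns its character offsets. The helper name `evidence_spans` is hypothetical, and the code assumes that every evidence string appears verbatim in the review, which the card does not state explicitly.

```python
from typing import Any, Dict, List, Tuple


def evidence_spans(record: Dict[str, Any]) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of each evidence in the review.

    Assumption: evidences are verbatim substrings of the review.
    Evidence strings that cannot be found are skipped.
    """
    review = record["review"]
    spans = []
    for evidence in record["evidences"]:
        start = review.find(evidence)  # first occurrence only
        if start != -1:
            spans.append((start, start + len(evidence)))
    return spans


# The 'validation' example shown above.
example = {
    "evidences": ["Fun movie"],
    "label": 1,
    "review": "Fun movie\n",
}

print(evidence_spans(example))  # [(0, 9)]
```

Only the first occurrence of each evidence string is reported; a review that repeats a phrase would need a different matching strategy.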

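Data Fields

The data fields are the same among all splits.

default

review: a string feature.
label: a classification label, with possible values including NEG (0), POS (1).
evidences: a list of string features.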
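The field descriptions above can be turned into a small sanity check. The following is a minimal sketch, not part of the dataset's tooling: `LABEL_NAMES`, `label_name`, and `check_record` are hypothetical names, and the mapping simply mirrors the NEG (0) / POS (1) convention listed in the card.

```python
from typing import Any, Dict

# Index in this tuple = integer label, following NEG (0), POS (1).
LABEL_NAMES = ("NEG", "POS")


def label_name(label: int) -> str:
    """Map an integer label to its class name."""
    if label not in range(len(LABEL_NAMES)):
        raise ValueError(f"unexpected label: {label!r}")
    return LABEL_NAMES[label]


def check_record(record: Dict[str, Any]) -> bool:
    """Check that a record has the three fields with the described types."""
    return (
        isinstance(record.get("review"), str)
        and record.get("label") in (0, 1)
        and isinstance(record.get("evidences"), list)
        and all(isinstance(e, str) for e in record["evidences"])
    )


record = {"evidences": ["Fun movie"], "label": 1, "review": "Fun movie\n"}
print(check_record(record), label_name(record["label"]))  # True POS
```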

Data Splits

name	train	validation	test
default	1600	200	199
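To put the split sizes in proportion, the short sketch below computes each split's share of the total. The counts are copied from the table; the variable names are illustrative only.

```python
# Split sizes as listed in the table for the "default" configuration.
SPLIT_SIZES = {"train": 1600, "validation": 200, "test": 199}

total = sum(SPLIT_SIZES.values())  # 1999 examples overall
fractions = {name: count / total for name, count in SPLIT_SIZES.items()}

for name, fraction in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} examples ({fraction:.1%})")
```

The total of 1999 examples is consistent with the 1K<n<10K size category given in the card metadata.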

Dataset Creation

Curation Rationale

More Information Needed

Source Data

Initial Data Collection and Normalization

More Information Needed

Who are the source language producers?

More Information Needed

Annotations

Annotation process

More Information Needed

Who are the annotators?

More Information Needed

Personal and Sensitive Information

More Information Needed

Considerations for Using the Data

Additional Information

Dataset Curators

More Information Needed

Licensing Information

More Information Needed

Citation Information

@inproceedings{deyoung-etal-2020-eraser,
    title = "{ERASER}: {A} Benchmark to Evaluate Rationalized {NLP} Models",
    author = "DeYoung, Jay  and
      Jain, Sarthak  and
      Rajani, Nazneen Fatema  and
      Lehman, Eric  and
      Xiong, Caiming  and
      Socher, Richard  and
      Wallace, Byron C.",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.acl-main.408",
    doi = "10.18653/v1/2020.acl-main.408",
    pages = "4443--4458",
}
@InProceedings{zaidan-eisner-piatko-2008:nips,
  author    =  {Omar F. Zaidan  and  Jason Eisner  and  Christine Piatko},
  title     =  {Machine Learning with Annotator Rationales to Reduce Annotation Cost},
  booktitle =  {Proceedings of the NIPS*2008 Workshop on Cost Sensitive Learning},
  month     =  {December},
  year      =  {2008}
}

Contributions

Thanks to @thomwolf, @patrickvonplaten, @lewtun for adding this dataset.

Author:

Anonymous

Dataset size:

13.84 KB