
Dataset Card for New Yorker Caption Contest Benchmarks

Dataset Summary

See capcon.dev for more!

Data from: Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest

@article{hessel2022androids,
  title={Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest},
  author={Hessel, Jack and Marasovi{\'c}, Ana and Hwang, Jena D and Lee, Lillian and Da, Jeff and Zellers, Rowan and Mankoff, Robert and Choi, Yejin},
  journal={arXiv preprint arXiv:2209.06293},
  year={2022}
}

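If you use this dataset, we would appreciate it if you cited our work, along with the several papers on which we built this corpus. See Citation Information.

We challenge AI models to "demonstrate understanding" of the sophisticated multimodal humor of The New Yorker Caption Contest. Concretely, we develop three carefully circumscribed tasks for which it suffices (but is not necessary) to grasp potentially complex and unexpected relationships between image and caption, and similarly complex and unexpected allusions to the wide variety of human experience.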

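Supported Tasks and Leaderboards

Three tasks are supported:

  • "Matching": a model must recognize the caption that was written about a cartoon (as opposed to options that were not);
  • "Quality ranking": a model must evaluate the quality of a caption by scoring it higher than a lower-quality option from the same contest;
  • "Explanation": a model must explain why a given joke is funny.

There is no official leaderboard yet.

Languages

English

Dataset Structure

Here is an example instance from "Matching":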

{'caption_choices': ['Tell me about your childhood very quickly.',
                     "Believe me . . . it's what's UNDER the ground that's "
                     'most interesting.',
                     "Stop me if you've heard this one.",
                     'I have trouble saying no.',
                     'Yes, I see the train but I think we can beat it.'],
 'contest_number': 49,
 'entities': ['https://en.wikipedia.org/wiki/Rule_of_three_(writing)',
              'https://en.wikipedia.org/wiki/Bar_joke',
              'https://en.wikipedia.org/wiki/Religious_institute'],
 'from_description': 'scene: a bar description: Two priests and a rabbi are '
                     'walking into a bar, as the bartender and another patron '
                     'look on. The bartender talks on the phone while looking '
                     'skeptically at the incoming crew. uncanny: The scene '
                     'depicts a very stereotypical "bar joke" that would be '
                     'unlikely to be encountered in real life; the skepticism '
                     'of the bartender suggests that he is aware he is seeing '
                     'this trope, and is explaining it to someone on the '
                     'phone. entities: Rule_of_three_(writing), Bar_joke, '
                     'Religious_institute. choices A: Tell me about your '
                     "childhood very quickly. B: Believe me . . . it's what's "
                     "UNDER the ground that's most interesting. C: Stop me if "
                     "you've heard this one. D: I have trouble saying no. E: "
                     'Yes, I see the train but I think we can beat it.',
 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=L size=323x231 at 0x7F34F283E9D0>,
 'image_description': 'Two priests and a rabbi are walking into a bar, as the '
                      'bartender and another patron look on. The bartender '
                      'talks on the phone while looking skeptically at the '
                      'incoming crew.',
 'image_location': 'a bar',
 'image_uncanny_description': 'The scene depicts a very stereotypical "bar '
                              'joke" that would be unlikely to be encountered '
                              'in real life; the skepticism of the bartender '
                              'suggests that he is aware he is seeing this '
                              'trope, and is explaining it to someone on the '
                              'phone.',
 'instance_id': '21125bb8787b4e7e82aa3b0a1cba1571',
 'label': 'C',
 'n_tokens_label': 1,
 'questions': ['What is the bartender saying on the phone in response to the '
               'living, breathing, stereotypical bar joke that is unfolding?']}

The label "C" indicates that the 3rd choice in caption_choices is the correct one.
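The letter-to-position mapping can be sketched as follows. This is a minimal illustration, not part of the dataset's tooling: `label_to_index` is a hypothetical helper name, and it assumes labels are single uppercase letters starting at "A", as in the instance above.

```python
def label_to_index(label: str) -> int:
    """Convert a letter label such as "C" into a 0-based list index."""
    # Assumption: labels are single letters, with "A" meaning the first choice.
    return ord(label.upper()) - ord("A")


# The five choices from the Matching instance shown above.
caption_choices = [
    "Tell me about your childhood very quickly.",
    "Believe me . . . it's what's UNDER the ground that's most interesting.",
    "Stop me if you've heard this one.",
    "I have trouble saying no.",
    "Yes, I see the train but I think we can beat it.",
]

correct_caption = caption_choices[label_to_index("C")]
print(correct_caption)  # Stop me if you've heard this one.
```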

Here is an example instance from "Ranking" (in the from-pixels setting; the from-description setting is also available):

{'caption_choices': ['I guess I misunderstood when you said long bike ride.',
                     'Does your divorce lawyer have any other cool ideas?'],
 'contest_number': 582,
 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=L size=600x414 at 0x7F8FF9F96610>,
 'instance_id': 'dd1c214a1ca3404aa4e582c9ce50795a',
 'label': 'A',
 'n_tokens_label': 1,
 'winner_source': 'official_winner'}

The label indicates that the first choice in the caption_choices list ("A", here) was rated more highly.
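One plausible way to score a model on such pairs is plain accuracy against the label. The sketch below is an assumption about how one might evaluate, not the authors' evaluation script; `pairwise_accuracy` and `always_first` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Dict, List


def pairwise_accuracy(
    instances: List[Dict],
    predict: Callable[[List[str]], str],
) -> float:
    """Fraction of instances where the predicted letter equals the label."""
    if not instances:
        raise ValueError("need at least one instance")
    correct = sum(
        1 for inst in instances if predict(inst["caption_choices"]) == inst["label"]
    )
    return correct / len(instances)


def always_first(caption_choices: List[str]) -> str:
    """Trivial baseline (hypothetical): always pick the first caption."""
    return "A"


# The Ranking instance shown above, reduced to the fields used here.
example = {
    "caption_choices": [
        "I guess I misunderstood when you said long bike ride.",
        "Does your divorce lawyer have any other cool ideas?",
    ],
    "label": "A",
}

print(pairwise_accuracy([example], always_first))  # 1.0
```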

Here is an example instance from "Explanation":

{'caption_choices': 'The classics can be so intimidating.',
 'contest_number': 752,
 'entities': ['https://en.wikipedia.org/wiki/Literature',
              'https://en.wikipedia.org/wiki/Solicitor'],
 'from_description': 'scene: a road description: Two people are walking down a '
                     'path. A number of giant books have surrounded them. '
                     'uncanny: There are book people in this world. entities: '
                     'Literature, Solicitor. caption: The classics can be so '
                     'intimidating.',
 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=L size=800x706 at 0x7F90003D0BB0>,
 'image_description': 'Two people are walking down a path. A number of giant '
                      'books have surrounded them.',
 'image_location': 'a road',
 'image_uncanny_description': 'There are book people in this world.',
 'instance_id': 'eef9baf450e2fab19b96facc128adf80',
 'label': 'A play on the word intimidating --- usually if the classics (i.e., '
          'classic novels) were to be intimidating, this would mean that they '
          'are intimidating to read due to their length, complexity, etc. But '
          'here, they are surrounded by anthropomorphic books which look '
          'physically intimidating, i.e., they are intimidating because they '
          'may try to beat up these people.',
 'n_tokens_label': 59,
 'questions': ['What do the books want?']}

The label is an explanation of the joke, which serves as the autoregressive target.
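To make the "autoregressive target" idea concrete, here is a minimal sketch of turning an Explanation instance into a (prompt, target) pair for a text-only model. The helper `to_prompt_and_target` and the " explanation:" suffix are assumptions made for illustration; the paper may format its inputs differently.

```python
from typing import Dict, Tuple


def to_prompt_and_target(instance: Dict) -> Tuple[str, str]:
    """Build a text prompt from the description and use the label as target."""
    # Assumption: the prompt ends with an "explanation:" cue; this suffix is
    # hypothetical and not taken from the dataset itself.
    prompt = instance["from_description"] + " explanation:"
    target = instance["label"]
    return prompt, target


# The Explanation instance shown above, reduced to the fields used here.
instance = {
    "from_description": (
        "scene: a road description: Two people are walking down a "
        "path. A number of giant books have surrounded them. "
        "uncanny: There are book people in this world. entities: "
        "Literature, Solicitor. caption: The classics can be so "
        "intimidating."
    ),
    "label": (
        "A play on the word intimidating --- usually if the classics (i.e., "
        "classic novels) were to be intimidating, this would mean that they "
        "are intimidating to read due to their length, complexity, etc. But "
        "here, they are surrounded by anthropomorphic books which look "
        "physically intimidating, i.e., they are intimidating because they "
        "may try to beat up these people."
    ),
}

prompt, target = to_prompt_and_target(instance)
print(prompt)
print(target)
```

A model would then be trained or prompted to generate `target` token by token given `prompt`.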

Data Instances

See above.

Data Fields

See above.

Data Splits

The data splits can be accessed as follows:

from datasets import load_dataset
dset = load_dataset("jmhessel/newyorker_caption_contest", "matching")
dset = load_dataset("jmhessel/newyorker_caption_contest", "ranking")
dset = load_dataset("jmhessel/newyorker_caption_contest", "explanation")

Or, in the from-pixels setting, e.g.:

from datasets import load_dataset
dset = load_dataset("jmhessel/newyorker_caption_contest", "ranking_from_pixels")

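Because the dataset is small, we initially reported results in a 5-fold cross-validation setting. The default splits are split 0. You can access the other splits, e.g.: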

from datasets import load_dataset

# cross-validation split 4 (the default config is split 0)
dset = load_dataset("jmhessel/newyorker_caption_contest", "explanation_4")
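
The configuration names used above appear to follow a simple pattern, which the sketch below spells out. `config_name` is a hypothetical helper, and the pattern is inferred from the examples in this card ("ranking_from_pixels", "explanation_4"); combined names such as "ranking_from_pixels_3" are an assumption and should be checked against the dataset's actual configuration list.

```python
TASKS = ("matching", "ranking", "explanation")


def config_name(task: str, split: int = 0, from_pixels: bool = False) -> str:
    """Build a configuration name for a task and cross-validation split."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if not 0 <= split <= 4:
        raise ValueError("split must be between 0 and 4 (5-fold setup)")
    name = task + ("_from_pixels" if from_pixels else "")
    # Split 0 is the default and carries no numeric suffix.
    return name if split == 0 else f"{name}_{split}"


print(config_name("explanation", split=4))        # explanation_4
print(config_name("ranking", from_pixels=True))   # ranking_from_pixels

# Usage (requires network access and the `datasets` package):
# from datasets import load_dataset
# dset = load_dataset("jmhessel/newyorker_caption_contest", config_name("matching", split=2))
```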

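Dataset Creation

Full details are in the paper.

Curation Rationale

See the paper for the rationale/motivation.

Source Data

See the citations below. We combined three sources of data and added significant annotations of our own.

Initial Data Collection and Normalization

Full details are in the paper.

Who are the source language producers?

We paid crowdworkers $15/hour to annotate the corpus. In addition, the authors of this work carried out significant annotation efforts.

Annotations

Full details are in the paper.

Annotation process

Full details are in the paper.

Who are the annotators?

A mix of crowdworkers and authors of this paper.

Personal and Sensitive Information

Personal and sensitive information has been redacted from the dataset. The images have already been published in The New Yorker.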

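Considerations for Using the Data

Social Impact of Dataset

Humor can perpetuate negative stereotypes. The jokes in this corpus are a mix of highly rated crowdsourced entries and captions published in The New Yorker.

Discussion of Biases

Humor is subjective, and some of the jokes may be considered offensive. The images may contain adult themes and minor cartoon nudity.

Other Known Limitations

More details are in the paper.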

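Additional Information

Dataset Curators

The dataset was curated by researchers at AI2.

Licensing Information

The annotations we provide are licensed under CC-BY-4.0. See www.capcon.dev for more information.

Citation Information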

@article{hessel2022androids,
  title={Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest},
  author={Hessel, Jack and Marasovi{\'c}, Ana and Hwang, Jena D and Lee, Lillian and Da, Jeff and Zellers, Rowan and Mankoff, Robert and Choi, Yejin},
  journal={arXiv preprint arXiv:2209.06293},
  year={2022}
}

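Our data contributions are:

  • the cartoon-level annotations;
  • the joke explanations;
  • and the framing of the tasks.

We release the data we contribute under CC-BY (see DATASET_LICENSE). If you find this data useful in your work, in addition to citing our contributions, please also cite the following sources, from which the cartoons/captions in our corpus are derived: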

@misc{newyorkernextmldataset,
  author={Jain, Lalit  and Jamieson, Kevin and Mankoff, Robert and Nowak, Robert and Sievert, Scott},
  title={The {N}ew {Y}orker Cartoon Caption Contest Dataset},
  year={2020},
  url={https://nextml.github.io/caption-contest-data/}
}

@inproceedings{radev-etal-2016-humor,
  title = "Humor in Collective Discourse: Unsupervised Funniness Detection in The {New Yorker} Cartoon Caption Contest",
  author = "Radev, Dragomir  and
      Stent, Amanda  and
      Tetreault, Joel  and
      Pappu, Aasish  and
      Iliakopoulou, Aikaterini  and
      Chanfreau, Agustin  and
      de Juan, Paloma  and
      Vallmitjana, Jordi  and
      Jaimes, Alejandro  and
      Jha, Rahul  and
      Mankoff, Robert",
  booktitle = "LREC",
  year = "2016",
}

@inproceedings{shahaf2015inside,
  title={Inside jokes: Identifying humorous cartoon captions},
  author={Shahaf, Dafna and Horvitz, Eric and Mankoff, Robert},
  booktitle={KDD},
  year={2015},
}