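Dataset Card for KILT

Dataset Summary

KILT has been built from 11 datasets representing 5 types of tasks:

  • Fact-checking
  • Entity linking
  • Slot filling
  • Open-domain QA
  • Dialog generation

All of these datasets are grounded in one pre-processed Wikipedia dump, which allows fairer and more consistent evaluation and supports task setups such as multitask and transfer learning. KILT also provides tools to analyze and understand model predictions and the evidence the models provide for them.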

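Loading the KILT knowledge source and task data

The original KILT release only provides question IDs for the TriviaQA task. Using the full dataset requires mapping those IDs back to the TriviaQA questions, which can be done as follows: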

from datasets import load_dataset

# Get the pre-processed Wikipedia knowledge source for KILT
kilt_wiki = load_dataset("kilt_wikipedia")

# Get the KILT task datasets
kilt_triviaqa = load_dataset("kilt_tasks", name="triviaqa_support_only")

# Most tasks in KILT already have all required data, but KILT-TriviaQA
# only provides the question IDs, not the questions themselves.
# Thankfully, we can get the original TriviaQA data with:
trivia_qa = load_dataset('trivia_qa', 'unfiltered.nocontext')

# The KILT IDs can then be mapped to the TriviaQA questions with:
triviaqa_map = {}

def add_missing_data(x, trivia_qa_subset, triviaqa_map):
    i = triviaqa_map[x['id']]
    x['input'] = trivia_qa_subset[i]['question']
    x['output']['original_answer'] = trivia_qa_subset[i]['answer']['value']
    return x
    
for k in ['train', 'validation', 'test']:
    triviaqa_map = dict([(q_id, i) for i, q_id in enumerate(trivia_qa[k]['question_id'])])
    kilt_triviaqa[k] = kilt_triviaqa[k].filter(lambda x: x['id'] in triviaqa_map)
    kilt_triviaqa[k] = kilt_triviaqa[k].map(add_missing_data, fn_kwargs=dict(trivia_qa_subset=trivia_qa[k], triviaqa_map=triviaqa_map))
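
The ID-to-index join used above can be checked offline on toy data before downloading anything. The following is a minimal sketch, not part of the KILT release: the rows, IDs, and the helper name `join_questions` are hypothetical, and only the `input` field is filled in to keep the illustration short.

```python
# Toy stand-ins for one split of the KILT-TriviaQA and TriviaQA data.
# All IDs and questions here are made up for illustration.
kilt_rows = [
    {"id": "tc_1", "input": ""},
    {"id": "tc_9", "input": ""},  # has no counterpart in trivia_rows
]
trivia_rows = [
    {"question_id": "tc_0", "question": "Question zero?"},
    {"question_id": "tc_1", "question": "Question one?"},
]


def join_questions(kilt_rows, trivia_rows):
    """Keep KILT rows whose ID exists in TriviaQA and fill in the question text."""
    # Map each TriviaQA question ID to its row index, as triviaqa_map does above.
    index_by_id = {row["question_id"]: i for i, row in enumerate(trivia_rows)}
    joined = []
    for row in kilt_rows:
        i = index_by_id.get(row["id"])
        if i is None:
            continue  # plays the role of the .filter(...) step
        joined.append({**row, "input": trivia_rows[i]["question"]})
    return joined


joined = join_questions(kilt_rows, trivia_rows)
print(joined)
```

Building the dictionary once per split keeps each lookup constant-time, which matters for the larger splits.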

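Supported Tasks and Leaderboards

The dataset can be used to rank models both by task-specific metrics such as F1 or EM and by their ability to retrieve supporting information from Wikipedia.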
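
To make these metrics concrete, here is a minimal sketch of exact match (EM), token-level F1, and a page-level R-precision for retrieval. It is not the official KILT scoring script: the normalization follows the common SQuAD-style recipe (lowercasing, stripping punctuation and English articles), and all function names are assumptions made for this illustration.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """1.0 if the normalized prediction equals any normalized gold answer."""
    return float(any(normalize(prediction) == normalize(g) for g in gold_answers))


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall against one gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction, gold_answers):
    """Score against the most favorable gold answer; 0.0 if there are none."""
    return max((token_f1(prediction, g) for g in gold_answers), default=0.0)


def r_precision(ranked_page_ids, gold_page_ids):
    """Fraction of the R gold pages found in the top R predictions."""
    gold = set(gold_page_ids)
    if not gold:
        return 0.0
    top_r = ranked_page_ids[:len(gold)]
    return len(gold.intersection(top_r)) / len(gold)


golds = ["Coldplay", "Beyoncé", "Bruno Mars"]
print(exact_match("coldplay.", golds))           # 1.0
print(best_f1("Bruno Mars and Coldplay", golds))
print(r_precision(["45267196", "123"], ["45267196"]))  # 1.0
```

Taking the maximum over gold answers reflects that a record may list several acceptable answers.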

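The current best-performing models can be found here

Languages

All tasks are in English ( en ).

Dataset Structure

Data Instances

An example of open-domain QA from the Natural Questions nq configuration looks as follows: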

{'id': '-5004457603684974952',
 'input': 'who is playing the halftime show at super bowl 2016',
 'meta': {'left_context': '',
  'mention': '',
  'obj_surface': [],
  'partial_evidence': [],
  'right_context': '',
  'sub_surface': [],
  'subj_aliases': [],
  'template_questions': []},
 'output': [{'answer': 'Coldplay',
   'meta': {'score': 0},
   'provenance': [{'bleu_score': 1.0,
     'end_character': 186,
     'end_paragraph_id': 1,
     'meta': {'annotation_id': '-1',
      'evidence_span': [],
      'fever_page_id': '',
      'fever_sentence_id': -1,
      'yes_no_answer': ''},
     'section': 'Section::::Abstract.',
     'start_character': 178,
     'start_paragraph_id': 1,
     'title': 'Super Bowl 50 halftime show',
     'wikipedia_id': '45267196'}]},
  {'answer': 'Beyoncé',
   'meta': {'score': 0},
   'provenance': [{'bleu_score': 1.0,
     'end_character': 224,
     'end_paragraph_id': 1,
     'meta': {'annotation_id': '-1',
      'evidence_span': [],
      'fever_page_id': '',
      'fever_sentence_id': -1,
      'yes_no_answer': ''},
     'section': 'Section::::Abstract.',
     'start_character': 217,
     'start_paragraph_id': 1,
     'title': 'Super Bowl 50 halftime show',
     'wikipedia_id': '45267196'}]},
  {'answer': 'Bruno Mars',
   'meta': {'score': 0},
   'provenance': [{'bleu_score': 1.0,
     'end_character': 239,
     'end_paragraph_id': 1,
     'meta': {'annotation_id': '-1',
      'evidence_span': [],
      'fever_page_id': '',
      'fever_sentence_id': -1,
      'yes_no_answer': ''},
     'section': 'Section::::Abstract.',
     'start_character': 229,
     'start_paragraph_id': 1,
     'title': 'Super Bowl 50 halftime show',
     'wikipedia_id': '45267196'}]},
  {'answer': 'Coldplay with special guest performers Beyoncé and Bruno Mars',
   'meta': {'score': 0},
   'provenance': []},
  {'answer': 'British rock group Coldplay with special guest performers Beyoncé and Bruno Mars',
   'meta': {'score': 0},
   'provenance': []},
  {'answer': '',
   'meta': {'score': 0},
   'provenance': [{'bleu_score': 0.9657992720603943,
     'end_character': 341,
     'end_paragraph_id': 1,
     'meta': {'annotation_id': '2430977867500315580',
      'evidence_span': [],
      'fever_page_id': '',
      'fever_sentence_id': -1,
      'yes_no_answer': 'NONE'},
     'section': 'Section::::Abstract.',
     'start_character': 0,
     'start_paragraph_id': 1,
     'title': 'Super Bowl 50 halftime show',
     'wikipedia_id': '45267196'}]},
  {'answer': '',
   'meta': {'score': 0},
   'provenance': [{'bleu_score': -1.0,
     'end_character': -1,
     'end_paragraph_id': 1,
     'meta': {'annotation_id': '-1',
      'evidence_span': ['It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars',
       'It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars, who previously had headlined the Super Bowl XLVII and Super Bowl XLVIII halftime shows, respectively.',
       "The Super Bowl 50 Halftime Show took place on February 7, 2016, at Levi's Stadium in Santa Clara, California as part of Super Bowl 50. It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars",
       "The Super Bowl 50 Halftime Show took place on February 7, 2016, at Levi's Stadium in Santa Clara, California as part of Super Bowl 50. It was headlined by the British rock group Coldplay with special guest performers Beyoncé and Bruno Mars,"],
      'fever_page_id': '',
      'fever_sentence_id': -1,
      'yes_no_answer': ''},
     'section': 'Section::::Abstract.',
     'start_character': -1,
     'start_paragraph_id': 1,
     'title': 'Super Bowl 50 halftime show',
     'wikipedia_id': '45267196'}]}]}
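
A record like this mixes answer-only entries, provenance-only entries (empty `answer`), and entries with both. The sketch below shows one way to pull out the distinct answers and the cited Wikipedia pages. It assumes the record is a plain Python dict shaped like the example above; the helper names and the trimmed sample record are illustrative only.

```python
def gold_answers(record):
    """Return the non-empty answer strings, in order, without duplicates."""
    seen, answers = set(), []
    for out in record["output"]:
        answer = out["answer"]
        if answer and answer not in seen:
            seen.add(answer)
            answers.append(answer)
    return answers


def provenance_page_ids(record):
    """Return the set of Wikipedia page IDs cited by any output entry."""
    return {
        prov["wikipedia_id"]
        for out in record["output"]
        for prov in out["provenance"]
    }


# A trimmed version of the record shown above (most fields omitted).
record = {
    "id": "-5004457603684974952",
    "input": "who is playing the halftime show at super bowl 2016",
    "output": [
        {"answer": "Coldplay", "provenance": [{"wikipedia_id": "45267196"}]},
        {"answer": "Bruno Mars", "provenance": [{"wikipedia_id": "45267196"}]},
        {"answer": "", "provenance": [{"wikipedia_id": "45267196"}]},
    ],
}

print(gold_answers(record))
print(provenance_page_ids(record))
```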

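Data Fields

Examples from all configurations have the following features:

  • input: a string feature representing the query.
  • output: a list of features, each containing information for one answer, made up of:
    • answer: a string feature representing a possible answer.
    • provenance: a list of features representing the Wikipedia passages that support the answer, made up of:
      • title: a string feature, the title of the Wikipedia article the passage was retrieved from.
      • section: a string feature, the title of the section within the Wikipedia article.
      • wikipedia_id: a string feature, a unique identifier for the Wikipedia article.
      • start_character: an int32 feature.
      • start_paragraph_id: an int32 feature.
      • end_character: an int32 feature.
      • end_paragraph_id: an int32 feature.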
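
The four offset fields locate a span inside a Wikipedia page. The following sketch shows how they might be used, under assumptions that are not stated in this card: that a page is available as a list of paragraph strings indexed by paragraph ID, that character offsets are 0-based with an exclusive end (consistent with the instance above, where 186 - 178 equals the length of "Coldplay"), and that a value of -1 means no span is given. The function name and the title paragraph at index 0 are hypothetical.

```python
def extract_span(paragraphs, prov):
    """Return the text a provenance entry points at, or None if it has no offsets."""
    start_p, end_p = prov["start_paragraph_id"], prov["end_paragraph_id"]
    start_c, end_c = prov["start_character"], prov["end_character"]
    if min(start_p, end_p, start_c, end_c) < 0:
        return None  # assumed: -1 marks an entry without character offsets
    if start_p == end_p:
        return paragraphs[start_p][start_c:end_c]
    # Span crosses paragraphs: tail of the first, whole middle ones, head of the last.
    pieces = [paragraphs[start_p][start_c:]]
    pieces.extend(paragraphs[start_p + 1:end_p])
    pieces.append(paragraphs[end_p][:end_c])
    return "".join(pieces)


# Paragraph 1 is taken from the evidence_span text in the instance above;
# paragraph 0 (the title) is a hypothetical placeholder.
paragraphs = [
    "Super Bowl 50 halftime show\n",
    "The Super Bowl 50 Halftime Show took place on February 7, 2016, "
    "at Levi's Stadium in Santa Clara, California as part of Super Bowl 50. "
    "It was headlined by the British rock group Coldplay with special guest "
    "performers",
]
prov = {
    "start_paragraph_id": 1,
    "start_character": 178,
    "end_paragraph_id": 1,
    "end_character": 186,
}
print(extract_span(paragraphs, prov))  # Coldplay
```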

Data Splits

The configurations have the following splits:

                     Train    Validation  Test
triviaqa             61844    5359        6586
fever                104966   10444       10100
aidayago2            18395    4784        4463
wned                 -        3396        3376
cweb                 -        5599        5543
trex                 2284168  5000        5000
structured_zeroshot  147909   3724        4966
nq                   87372    2837        1444
hotpotqa             88869    5600        5569
eli5                 272634   1507        600
wow                  94577    3058        2944
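
Because not every configuration lists three split sizes, code that loops over splits may want to check what is available first. The sketch below encodes the table as a dictionary; it assumes that the two numbers given for wned and cweb are the validation and test sizes (that is, these configurations have no train split). The names `SPLIT_SIZES` and `configs_without` are illustrative.

```python
# Split sizes copied from the table above.
# Assumption: wned and cweb list only validation and test sizes.
SPLIT_SIZES = {
    "triviaqa": {"train": 61844, "validation": 5359, "test": 6586},
    "fever": {"train": 104966, "validation": 10444, "test": 10100},
    "aidayago2": {"train": 18395, "validation": 4784, "test": 4463},
    "wned": {"validation": 3396, "test": 3376},
    "cweb": {"validation": 5599, "test": 5543},
    "trex": {"train": 2284168, "validation": 5000, "test": 5000},
    "structured_zeroshot": {"train": 147909, "validation": 3724, "test": 4966},
    "nq": {"train": 87372, "validation": 2837, "test": 1444},
    "hotpotqa": {"train": 88869, "validation": 5600, "test": 5569},
    "eli5": {"train": 272634, "validation": 1507, "test": 600},
    "wow": {"train": 94577, "validation": 3058, "test": 2944},
}


def configs_without(split):
    """List the configurations that do not provide the given split."""
    return [name for name, sizes in SPLIT_SIZES.items() if split not in sizes]


def total_examples(config):
    """Sum the sizes of all splits available for a configuration."""
    return sum(SPLIT_SIZES[config].values())


print(configs_without("train"))
print(total_examples("nq"))
```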

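Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]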

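Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]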

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

[More Information Needed]

Citation Information

Cite as:

@inproceedings{kilt_tasks,
  author    = {Fabio Petroni and
               Aleksandra Piktus and
               Angela Fan and
               Patrick S. H. Lewis and
               Majid Yazdani and
               Nicola De Cao and
               James Thorne and
               Yacine Jernite and
               Vladimir Karpukhin and
               Jean Maillard and
               Vassilis Plachouras and
               Tim Rockt{\"{a}}schel and
               Sebastian Riedel},
  editor    = {Kristina Toutanova and
               Anna Rumshisky and
               Luke Zettlemoyer and
               Dilek Hakkani{-}T{\"{u}}r and
               Iz Beltagy and
               Steven Bethard and
               Ryan Cotterell and
               Tanmoy Chakraborty and
               Yichao Zhou},
  title     = {{KILT:} a Benchmark for Knowledge Intensive Language Tasks},
  booktitle = {Proceedings of the 2021 Conference of the North American Chapter of
               the Association for Computational Linguistics: Human Language Technologies,
               {NAACL-HLT} 2021, Online, June 6-11, 2021},
  pages     = {2523--2544},
  publisher = {Association for Computational Linguistics},
  year      = {2021},
  url       = {https://www.aclweb.org/anthology/2021.naacl-main.200/}
}

Contributions

Thanks to @thomwolf and @yjernite for adding this dataset.