Dataset: asnq
License:
Preprint: arxiv:1911.04118
Annotation creators: crowdsourced
Language creators: found
Size: 10M<n<100M
Multilinguality: monolingual
Language:
Subtasks: multiple-choice-qa
Tasks:
ASNQ is a dataset for answer sentence selection derived from Google's Natural Questions (NQ) dataset (Kwiatkowski et al. 2019).
Each example contains a question, a candidate sentence, a label indicating whether the sentence answers the question, and two additional features: sentence_in_long_answer, which indicates whether the candidate sentence is contained in the long_answer, and short_answer_in_sentence, which indicates whether the short_answer is in the candidate sentence.
For more details, see https://arxiv.org/abs/1911.04118 and https://research.google/pubs/pub47761/
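Because each row pairs one question with one candidate sentence, answer sentence selection usually starts by collecting all candidates for a question. The following is a minimal sketch of that grouping step, not code from the dataset authors. The helper names group_candidates and answer_sentences are hypothetical, the toy rows are invented, and the sketch assumes that label 1 marks a sentence that answers the question.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_candidates(rows: Iterable[Mapping]) -> Dict[str, List[Mapping]]:
    """Group candidate rows by their question text (hypothetical helper)."""
    grouped: Dict[str, List[Mapping]] = defaultdict(list)
    for row in rows:
        grouped[row["question"]].append(row)
    return dict(grouped)


def answer_sentences(candidates: Iterable[Mapping]) -> List[str]:
    """Return sentences whose label is 1.

    Assumption: label 1 means "this sentence answers the question".
    """
    return [c["sentence"] for c in candidates if c["label"] == 1]


# Toy rows invented for illustration; they are not taken from ASNQ.
toy_rows = [
    {"question": "q1", "sentence": "s1", "label": 0},
    {"question": "q1", "sentence": "s2", "label": 1},
    {"question": "q2", "sentence": "s3", "label": 0},
]
grouped = group_candidates(toy_rows)
```

A question may end up with no positive candidate at all, in which case answer_sentences returns an empty list.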
An example of 'validation' looks as follows.
{
"label": 0,
"question": "when did somewhere over the rainbow come out",
"sentence": "In films and TV shows ( edit ) In the film Third Finger , Left Hand ( 1940 ) with Myrna Loy , Melvyn Douglas , and Raymond Walburn , the tune played throughout the film in short sequences .",
"sentence_in_long_answer": false,
"short_answer_in_sentence": false
}
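The record above can be read into a typed structure before further processing. This is a sketch under stated assumptions: the class name AsnqExample and the method from_json are hypothetical, and the check that label is 0 or 1 is an assumption based on the description of the label field, not something the card specifies. The sentence in the sample string is shortened for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AsnqExample:
    """One question/candidate-sentence pair (hypothetical container)."""

    question: str
    sentence: str
    label: int
    sentence_in_long_answer: bool
    short_answer_in_sentence: bool

    @classmethod
    def from_json(cls, text: str) -> "AsnqExample":
        raw = json.loads(text)
        # Pick exactly the documented fields; a missing key raises KeyError.
        example = cls(**{f.name: raw[f.name] for f in fields(cls)})
        # Assumption: the label is binary (0 or 1).
        if example.label not in (0, 1):
            raise ValueError(f"unexpected label: {example.label!r}")
        return example


# Sample record modeled on the one above; the sentence is shortened.
sample = """{
  "label": 0,
  "question": "when did somewhere over the rainbow come out",
  "sentence": "In the film Third Finger , Left Hand ( 1940 ) ...",
  "sentence_in_long_answer": false,
  "short_answer_in_sentence": false
}"""
example = AsnqExample.from_json(sample)
```

Freezing the dataclass keeps parsed rows immutable, which makes them safe to share across filtering or batching steps.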
The data fields are the same among all splits.
| name | train | validation |
|---|---|---|
| default | 20377568 | 930062 |
The data is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License: https://github.com/alexa/wqa_tanda/blob/master/LICENSE
@article{Garg_2020,
title={TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection},
volume={34},
ISSN={2159-5399},
url={http://dx.doi.org/10.1609/AAAI.V34I05.6282},
DOI={10.1609/aaai.v34i05.6282},
number={05},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
publisher={Association for the Advancement of Artificial Intelligence (AAAI)},
author={Garg, Siddhant and Vu, Thuy and Moschitti, Alessandro},
year={2020},
month={Apr},
pages={7780–7788}
}
Thanks to @mkserge for adding this dataset.