Dataset: squad_v1_pt
Tasks:
Languages:
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: crowdsourced
Annotation creators: crowdsourced
Source datasets: original
ArXiv: arxiv:1606.05250
License:
A Portuguese translation of the SQuAD dataset. The translation was performed automatically using the Google Cloud API.
An example of 'train' looks as follows.
This example was too long and was cropped:
{
    "answers": {
        "answer_start": [0],
        "text": ["Saint Bernadette Soubirous"]
    },
    "context": "\"Arquitetonicamente, a escola tem um caráter católico. No topo da cúpula de ouro do edifício principal é uma estátua de ouro da ...",
    "id": "5733be284776f41900661182",
    "question": "A quem a Virgem Maria supostamente apareceu em 1858 em Lourdes, na França?",
    "title": "University_of_Notre_Dame"
}
The data fields are the same among all splits.
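As a rough illustration of how the fields in the record above relate to each other, the sketch below pairs each entry of `answers.answer_start` with the matching entry of `answers.text` and checks whether the offset points at that text inside `context`. The helper names `answer_spans` and `spans_match_context` are hypothetical, and the sample record reuses the cropped example from this card, so the trailing `...` is part of the cropped string rather than real data.

```python
from typing import Any


def answer_spans(record: dict[str, Any]) -> list[tuple[int, int, str]]:
    """Return (start, end, text) for every answer in a SQuAD-style record."""
    starts = record["answers"]["answer_start"]
    texts = record["answers"]["text"]
    if len(starts) != len(texts):
        raise ValueError("answer_start and text must have the same length")
    return [(start, start + len(text), text) for start, text in zip(starts, texts)]


def spans_match_context(record: dict[str, Any]) -> list[bool]:
    """Check, per answer, whether context[start:end] equals the answer text."""
    context = record["context"]
    return [context[start:end] == text for start, end, text in answer_spans(record)]


# Cropped example record copied from this card; the context is truncated.
example = {
    "answers": {"answer_start": [0], "text": ["Saint Bernadette Soubirous"]},
    "context": "\"Arquitetonicamente, a escola tem um caráter católico. No topo da cúpula de ouro do edifício principal é uma estátua de ouro da ...",
    "id": "5733be284776f41900661182",
    "question": "A quem a Virgem Maria supostamente apareceu em 1858 em Lourdes, na França?",
    "title": "University_of_Notre_Dame",
}

print(answer_spans(example))         # [(0, 26, 'Saint Bernadette Soubirous')]
print(spans_match_context(example))  # [False]
```

For this cropped example the check reports a mismatch, because the context begins with a quotation mark and Portuguese text rather than the answer string. A check of this kind may be worth running before relying on the offsets.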
default

| name | train | validation |
|---|---|---|
| default | 87599 | 10570 |
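The split sizes in the table can be compared against a loaded copy of the dataset. The sketch below is one way to do that, assuming the Hugging Face `datasets` library is installed and that the identifier `squad_v1_pt` from this card still resolves on the Hub; both are assumptions. The names `EXPECTED_SIZES`, `size_mismatches`, and `load_split_sizes` are hypothetical, and the import is deferred so the comparison helper works without the library or network access.

```python
from typing import Optional

# Split sizes copied from the table in this card.
EXPECTED_SIZES = {"train": 87599, "validation": 10570}


def size_mismatches(actual: dict[str, int]) -> dict[str, tuple[int, Optional[int]]]:
    """Map each split whose size differs to (expected, actual or None if missing)."""
    return {
        name: (expected, actual.get(name))
        for name, expected in EXPECTED_SIZES.items()
        if actual.get(name) != expected
    }


def load_split_sizes(dataset_id: str = "squad_v1_pt") -> dict[str, int]:
    """Load the dataset and return the number of rows per split (needs network)."""
    from datasets import load_dataset  # assumption: `datasets` is installed

    dataset = load_dataset(dataset_id)
    return {name: split.num_rows for name, split in dataset.items()}


# Offline sanity check on the card's own numbers.
print(sum(EXPECTED_SIZES.values()))  # 98169, within the 10K<n<100K size category
```

With network access, `size_mismatches(load_split_sizes())` should return an empty dictionary if the loaded splits agree with the table.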
@article{2016arXiv160605250R,
       author = {{Rajpurkar}, Pranav and {Zhang}, Jian and {Lopyrev},
                 Konstantin and {Liang}, Percy},
        title = "{SQuAD: 100,000+ Questions for Machine Comprehension of Text}",
      journal = {arXiv e-prints},
         year = 2016,
          eid = {arXiv:1606.05250},
        pages = {arXiv:1606.05250},
archivePrefix = {arXiv},
       eprint = {1606.05250},
}
Thanks to @thomwolf, @albertvillanova, @lewtun, @patrickvonplaten for adding this dataset.