Dataset: allenai/scicite
Multilinguality: monolingual
Size: 10K<n<100K
Language creators: found
Source datasets: original
ArXiv: arxiv:1904.01608
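This is a dataset for classifying citation intents in academic papers. The main citation intent label for each JSON object is specified with the label key, while the citation context is specified with a context key (the examples in this card store that text under string).

Example: {'string': 'In chacma baboons, male-infant relationships can be linked to both formation of friendships and paternity success [30,31].', 'sectionName': 'Introduction', 'label': 'background', 'citingPaperId': '7a6b2d4b405439', 'citedPaperId': '9d1abadc55b5e0', ...}

You can obtain full information about a paper by passing the provided paper IDs to the Semantic Scholar API (https://api.semanticscholar.org/). The labels are: Method, Background, Result.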
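As a rough illustration of the paper lookup mentioned above, the sketch below builds a request URL for one paper ID without sending anything over the network. The card only names the API root; the `/graph/v1/paper/` path and the `fields` parameter are assumptions about the Semantic Scholar Graph API, and `paper_url` is a hypothetical helper. The ID used is the citedPaperId of the 'validation' sample in this card.

```python
from urllib.parse import quote, urlencode

# Assumption: the Graph API "paper" endpoint. The dataset card itself
# only gives the API root, https://api.semanticscholar.org/.
API_ROOT = "https://api.semanticscholar.org/graph/v1/paper/"


def paper_url(paper_id, fields=("title", "year")):
    """Build a lookup URL for one paper ID (hypothetical helper; no request is sent)."""
    query = urlencode({"fields": ",".join(fields)})
    return f"{API_ROOT}{quote(paper_id, safe='')}?{query}"


# citedPaperId taken from the 'validation' sample in this card.
url = paper_url("5e413c7872f5df231bf4a4f694504384560e98ca")
print(url)
```

The resulting URL could then be fetched with any HTTP client; rate limits and the set of available fields should be checked against the API documentation.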
An example from the 'validation' split looks as follows.
{
"citeEnd": 68,
"citeStart": 64,
"citedPaperId": "5e413c7872f5df231bf4a4f694504384560e98ca",
"citingPaperId": "8f1fbe460a901d994e9b81d69f77bfbe32719f4c",
"excerpt_index": 0,
"id": "8f1fbe460a901d994e9b81d69f77bfbe32719f4c>5e413c7872f5df231bf4a4f694504384560e98ca",
"isKeyCitation": false,
"label": 2,
"label2": 0,
"label2_confidence": 0.0,
"label_confidence": 0.0,
"sectionName": "Discussion",
"source": 4,
"string": "These results are in contrast with the findings of Santos et al.(16), who reported a significant association between low sedentary time and healthy CVF among Portuguese"
}
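The citeStart and citeEnd fields appear to be character offsets into string that delimit the citation marker. A minimal sketch, using only the fields of the sample above, slices that marker out. The mapping from integer labels to names is an assumption: it simply follows the order in which this card lists the labels (Method, Background, Result), and `citation_span` is a hypothetical helper.

```python
# Subset of the 'validation' sample shown above.
record = {
    "citeStart": 64,
    "citeEnd": 68,
    "label": 2,
    "string": (
        "These results are in contrast with the findings of Santos et al.(16), "
        "who reported a significant association between low sedentary time "
        "and healthy CVF among Portuguese"
    ),
}

# Assumption: integer labels follow the order in which the card lists them.
LABEL_NAMES = ["method", "background", "result"]


def citation_span(rec):
    """Return the substring between citeStart (inclusive) and citeEnd (exclusive)."""
    return rec["string"][rec["citeStart"]:rec["citeEnd"]]


marker = citation_span(record)
print(marker)                        # (16)
print(LABEL_NAMES[record["label"]])  # "result", under the assumed ordering
```

For this sample the slice yields the marker "(16)", which suggests citeEnd is an exclusive offset.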
The data fields are the same across all splits.
| name | train | validation | test |
|---|---|---|---|
| default | 8194 | 916 | 1859 |
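As a quick sanity check on the table, the sketch below sums the split sizes and prints each split's share of the total. The numbers are copied from the table; nothing else is assumed.

```python
# Split sizes copied from the table above.
split_sizes = {"train": 8194, "validation": 916, "test": 1859}

total = sum(split_sizes.values())
print(f"total: {total}")

# Share of each split, as a percentage of all examples.
for name, count in split_sizes.items():
    print(f"{name}: {count} ({count / total:.1%})")
```

The total of 10,969 examples is consistent with the 10K<n<100K size category.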
@inproceedings{cohan-etal-2019-structural,
title = "Structural Scaffolds for Citation Intent Classification in Scientific Publications",
author = "Cohan, Arman and
Ammar, Waleed and
van Zuylen, Madeleine and
Cady, Field",
booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
month = jun,
year = "2019",
address = "Minneapolis, Minnesota",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/N19-1361",
doi = "10.18653/v1/N19-1361",
pages = "3586--3596",
}
Thanks to @lewtun, @patrickvonplaten, @mariamabarham, @thomwolf for adding this dataset.