Dataset:
tner/mit_movie_trivia
This is the MIT Movie NER dataset, formatted as part of the TNER project.
An example from the train split looks as follows.
{
'tags': [0, 13, 14, 0, 0, 0, 3, 4, 4, 4, 4, 4, 4, 4, 4],
'tokens': ['a', 'steven', 'spielberg', 'film', 'featuring', 'a', 'bluff', 'called', 'devil', 's', 'tower', 'and', 'a', 'spectacular', 'mothership']
}
The label2id dictionary is as follows.
{
"O": 0,
"B-Actor": 1,
"I-Actor": 2,
"B-Plot": 3,
"I-Plot": 4,
"B-Opinion": 5,
"I-Opinion": 6,
"B-Award": 7,
"I-Award": 8,
"B-Year": 9,
"B-Genre": 10,
"B-Origin": 11,
"I-Origin": 12,
"B-Director": 13,
"I-Director": 14,
"I-Genre": 15,
"I-Year": 16,
"B-Soundtrack": 17,
"I-Soundtrack": 18,
"B-Relationship": 19,
"I-Relationship": 20,
"B-Character_Name": 21,
"I-Character_Name": 22,
"B-Quote": 23,
"I-Quote": 24
}
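To read the integer tags in the train example, the label2id dictionary can be inverted and the B-/I- tags grouped into entity spans. The following is a minimal sketch, not part of the TNER tooling: the helper name `extract_entities` is hypothetical, and it assumes a lenient convention in which a stray `I-` tag without a matching `B-` starts a new span.

```python
# label2id copied from the dictionary above.
label2id = {
    "O": 0, "B-Actor": 1, "I-Actor": 2, "B-Plot": 3, "I-Plot": 4,
    "B-Opinion": 5, "I-Opinion": 6, "B-Award": 7, "I-Award": 8,
    "B-Year": 9, "B-Genre": 10, "B-Origin": 11, "I-Origin": 12,
    "B-Director": 13, "I-Director": 14, "I-Genre": 15, "I-Year": 16,
    "B-Soundtrack": 17, "I-Soundtrack": 18, "B-Relationship": 19,
    "I-Relationship": 20, "B-Character_Name": 21, "I-Character_Name": 22,
    "B-Quote": 23, "I-Quote": 24,
}
id2label = {idx: label for label, idx in label2id.items()}


def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans (hypothetical helper)."""
    entities = []
    current_type, current_tokens = None, []
    for token, tag_id in zip(tokens, tags):
        label = id2label[tag_id]
        # An I- tag of the same type continues the open span.
        if label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)
            continue
        # Any other label closes the open span, if there is one.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        # B- starts a new span; a stray I- is also treated as a start (assumption).
        if label != "O":
            current_type, current_tokens = label[2:], [token]
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# The train example shown earlier in this card.
example = {
    "tags": [0, 13, 14, 0, 0, 0, 3, 4, 4, 4, 4, 4, 4, 4, 4],
    "tokens": ["a", "steven", "spielberg", "film", "featuring", "a", "bluff",
               "called", "devil", "s", "tower", "and", "a", "spectacular",
               "mothership"],
}
entities = extract_entities(example["tokens"], example["tags"])
for entity_type, text in entities:
    print(f"{entity_type}: {text}")
```

For this example the sketch yields one Director span ("steven spielberg") and one Plot span covering the tokens from "bluff" to "mothership".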
| name | train | validation | test |
|---|---|---|---|
| mit_movie_trivia | 6816 | 1000 | 1953 |
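
The split sizes in the table can be turned into proportions with a short calculation. The sketch below uses only the three counts from the table; the variable names are illustrative.

```python
# Split sizes taken from the table above.
split_sizes = {"train": 6816, "validation": 1000, "test": 1953}

total = sum(split_sizes.values())
fractions = {name: size / total for name, size in split_sizes.items()}

print(f"total: {total} examples")
for name, fraction in fractions.items():
    print(f"{name}: {split_sizes[name]} examples ({fraction:.1%})")
```

The three splits add up to 9769 examples, which works out to roughly a 70/10/20 division between train, validation, and test.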