Dataset:
jfleg
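JFLEG (JHU FLuency-Extended GUG) is an English grammatical error correction (GEC) corpus. It is a gold-standard benchmark for developing and evaluating GEC systems with respect to both fluency (the extent to which a text sounds native) and grammaticality. Each source sentence comes with four human-written corrections.
Grammatical error correction.
English (native speakers as well as L2 learners)
Each instance contains a source sentence and four corrections. For example: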
{
'sentence': "They are moved by solar energy .",
'corrections': [
"They are moving by solar energy .",
"They are moved by solar energy .",
"They are moved by solar energy .",
"They are propelled by solar energy ."
]
}
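To make the instance structure concrete, the following minimal Python sketch inspects one record shaped like the example above. The helper name `summarize_instance` and its output keys are hypothetical and chosen only for illustration; they are not part of the dataset or of any library. The record is copied by hand rather than loaded from the corpus.

```python
# Instance copied by hand from the example above (not loaded from the corpus).
instance = {
    "sentence": "They are moved by solar energy .",
    "corrections": [
        "They are moving by solar energy .",
        "They are moved by solar energy .",
        "They are moved by solar energy .",
        "They are propelled by solar energy .",
    ],
}


def summarize_instance(instance):
    """Hypothetical helper: basic statistics for one source sentence and its references."""
    source = instance["sentence"]
    corrections = instance["corrections"]
    # Drop duplicate references while preserving first-seen order.
    unique = list(dict.fromkeys(corrections))
    # A reference identical to the source means that annotator made no change.
    unchanged = sum(1 for c in corrections if c == source)
    return {
        "num_corrections": len(corrections),
        "unique_corrections": unique,
        "num_unchanged": unchanged,
    }


summary = summarize_instance(instance)
print(summary["num_corrections"], len(summary["unique_corrections"]), summary["num_unchanged"])
# prints: 4 3 2
```

In this example two of the four references leave the source untouched, which shows that the references for a sentence may repeat one another or match the source exactly.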
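[More Information Needed]
[More Information Needed]
Who are the source language producers? [More Information Needed]
[More Information Needed]
Who are the annotators? [More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]
[More Information Needed]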
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This benchmark was proposed by Napoles et al. (2017).
@InProceedings{napoles-sakaguchi-tetreault:2017:EACLshort,
author = {Napoles, Courtney and Sakaguchi, Keisuke and Tetreault, Joel},
title = {JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction},
booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers},
month = {April},
year = {2017},
address = {Valencia, Spain},
publisher = {Association for Computational Linguistics},
pages = {229--234},
url = {http://www.aclweb.org/anthology/E17-2037}
}
@InProceedings{heilman-EtAl:2014:P14-2,
author = {Heilman, Michael and Cahill, Aoife and Madnani, Nitin and Lopez, Melissa and Mulholland, Matthew and Tetreault, Joel},
title = {Predicting Grammaticality on an Ordinal Scale},
booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
month = {June},
year = {2014},
address = {Baltimore, Maryland},
publisher = {Association for Computational Linguistics},
pages = {174--180},
url = {http://www.aclweb.org/anthology/P14-2029}
}
Thanks to @j-chim for adding this dataset.