Model:
T-Systems-onsite/mt5-small-sum-de-en-v2
This is a bilingual summarization model for English and German. It is based on the multilingual T5 model google/mt5-small.
Training was carried out with the following hyperparameters:
The datasets were preprocessed as follows:
The summaries were tokenized with the google/mt5-small tokenizer. Only records whose summary has no more than 94 tokens were then kept.
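The length filter described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing script: the record layout (`text` and `summary` keys), the function names, and the whitespace tokenizer are assumptions made so the example runs without downloading anything. In practice the `tokenize` argument would presumably be the `tokenize` method of `AutoTokenizer.from_pretrained("google/mt5-small")`. Whether special tokens were counted towards the limit of 94 is not stated in the source.

```python
from typing import Callable, Dict, List

MAX_SUMMARY_TOKENS = 94  # threshold stated in the model card


def filter_by_summary_length(
    records: List[Dict[str, str]],
    tokenize: Callable[[str], List[str]],
    max_tokens: int = MAX_SUMMARY_TOKENS,
) -> List[Dict[str, str]]:
    """Keep only records whose summary has at most `max_tokens` tokens."""
    return [r for r in records if len(tokenize(r["summary"])) <= max_tokens]


def whitespace_tokenize(text: str) -> List[str]:
    # Stand-in tokenizer (assumption): splits on whitespace only.
    # The real pipeline used the google/mt5-small tokenizer instead.
    return text.split()


# Hypothetical records: one short summary, one exactly at the limit, one too long.
records = [
    {"text": "First article.", "summary": "short summary"},
    {"text": "Second article.", "summary": " ".join(["word"] * 94)},
    {"text": "Third article.", "summary": " ".join(["word"] * 95)},
]

kept = filter_by_summary_length(records, whitespace_tokenize)
```

Passing the tokenizer in as a callable keeps the filter independent of any particular tokenizer library.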
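The MLSUM dataset has a special characteristic: the summary is often contained verbatim in the article text, as one or more complete sentences. These sentences were removed from the texts, because we do not want to train a model that ultimately only extracts sentences as a summary.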
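One way to remove verbatim summary sentences from an article is sketched below. The authors' actual procedure is not given in the source, so the naive regex-based sentence splitter, the exact-match comparison, and the function names are all assumptions chosen for illustration.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def remove_summary_sentences(text: str, summary: str) -> str:
    """Drop every sentence of `text` that also appears verbatim in `summary`."""
    summary_sentences = set(split_sentences(summary))
    kept = [s for s in split_sentences(text) if s not in summary_sentences]
    return " ".join(kept)


# Hypothetical example in the style of a German news article.
text = (
    "Der Zug fiel aus. Viele Pendler warteten stundenlang. "
    "Die Bahn entschuldigte sich."
)
summary = "Der Zug fiel aus."

cleaned = remove_summary_sentences(text, summary)
```

Exact matching misses summaries that differ from the article text only in whitespace or punctuation, so a real pipeline might normalize both sides before comparing.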
This model was trained on the following datasets:
| Name | Language | License | 
|---|---|---|
| 1233321 | en | The license is unclear. The data comes from CNN and Daily Mail. We assume that it may only be used for research purposes and not commercially. | 
| 1234321 | en | The license is unclear. The data comes from BBC. We assume that it may only be used for research purposes and not commercially. | 
| 1235321 | de | Usage of dataset is restricted to non-commercial research purposes only. Copyright belongs to the original copyright holders (see 1236321 ). | 
| 1237321 | de | The license is unclear. The data was published in the 1238321 . We assume that they may be used for research purposes and not commercially. | 

| Language | Size | 
|---|---|
| German | 302,607 | 
| English | 422,228 | 
| Total | 724,835 | 

| Model | rouge1 | rouge2 | rougeL | rougeLsum | 
|---|---|---|---|---|
| 1239321 | 18.3607 | 5.3604 | 14.5456 | 16.1946 | 
| 12310321 | 21.7336 | 7.2614 | 17.1323 | 19.3977 | 
| T-Systems-onsite/mt5-small-sum-de-en-v2 (this) | 21.7756 | 7.2662 | 17.1444 | 19.4242 | 

| Model | rouge1 | rouge2 | rougeL | rougeLsum | 
|---|---|---|---|---|
| 12311321 | 26.7664 | 8.8243 | 18.3703 | 23.2614 | 
| 12312321 | 28.5374 | 9.8565 | 19.4829 | 24.7364 | 
| 12313321 | 37.576 | 14.7389 | 24.0254 | 34.4634 | 
| 12310321 | 37.6339 | 16.5317 | 27.1418 | 34.9951 | 
| T-Systems-onsite/mt5-small-sum-de-en-v2 (this) | 37.8096 | 16.6646 | 27.2239 | 35.1916 | 

| Model | rouge1 | rouge2 | rougeL | rougeLsum | 
|---|---|---|---|---|
| 12313321 | 18.6204 | 3.535 | 12.3997 | 15.2111 | 
| 12312321 | 28.5374 | 9.8565 | 19.4829 | 24.7364 | 
| 12310321 | 32.3416 | 10.6191 | 25.3799 | 25.3908 | 
| T-Systems-onsite/mt5-small-sum-de-en-v2 (this) | 32.4828 | 10.7004 | 25.5238 | 25.5369 | 
| 12311321 | 44.2553 ♣ | 21.4289 ♣ | 36.2639 ♣ | 36.2696 ♣ | 
♣: These values seem unusually high. The test set may have been part of that model's training data.
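For readers unfamiliar with the rouge1 and rouge2 columns, the following sketch shows how an n-gram overlap F1 score can be computed. It is a simplified illustration that assumes lowercased whitespace tokens. The evaluation tool behind the tables is not named in the source, so this code is not expected to reproduce the reported numbers, which also appear to be scaled to a 0 to 100 range.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1 on lowercased whitespace tokens (assumption)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"

rouge1 = rouge_n_f1(candidate, reference, n=1)  # 5 of 6 unigrams overlap
rouge2 = rouge_n_f1(candidate, reference, n=2)  # 3 of 5 bigrams overlap
```

rougeL and rougeLsum are based on the longest common subsequence rather than fixed-length n-grams and are not covered by this sketch.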
Copyright (c) 2021 Philip May, T-Systems on site services GmbH
This work is licensed under the Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) license.