Dataset: lexlms/legal_lama
Language:
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotation creators: no-annotation
Source datasets: extended
ArXiv: arxiv:2305.07507
License:
LegalLAMA is a diverse probing benchmark suite of 8 sub-tasks, designed to assess how much legal knowledge pre-trained language models (PLMs) have acquired.
| Corpus | Corpus alias | Examples | Avg. Tokens | Labels |
|---|---|---|---|---|
| Criminal Code Sections (Canada) | canadian_sections | 321 | 72 | 144 |
| Legal Terminology (EU) | cjeu_term | 2,127 | 164 | 23 |
| Contractual Section Titles (US) | contract_sections | 1,527 | 85 | 20 |
| Contract Types (US) | contract_types | 1,089 | 150 | 15 |
| ECHR Articles (CoE) | ecthr_articles | 5,072 | 69 | 13 |
| Legal Terminology (CoE) | ecthr_terms | 6,803 | 97 | 250 |
| Crime Charges (US) | us_crimes | 4,518 | 118 | 59 |
| Legal Terminology (US) | us_terms | 5,829 | 308 | 7 |
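
The sub-tasks in the table can be handled programmatically. The following is a minimal sketch, not an official loader: it copies the table's statistics into a dictionary and wraps `datasets.load_dataset` in a hypothetical helper, `load_subtask`. It assumes that each corpus alias is also the configuration name on the Hub and that a `test` split exists; neither is confirmed by this card, so check the dataset viewer before relying on it.

```python
# Sub-task statistics copied from the table above:
# alias -> (examples, average tokens, labels).
SUBTASKS = {
    "canadian_sections": (321, 72, 144),
    "cjeu_term": (2127, 164, 23),
    "contract_sections": (1527, 85, 20),
    "contract_types": (1089, 150, 15),
    "ecthr_articles": (5072, 69, 13),
    "ecthr_terms": (6803, 97, 250),
    "us_crimes": (4518, 118, 59),
    "us_terms": (5829, 308, 7),
}


def load_subtask(alias, split="test"):
    """Load one sub-task from the Hub (hypothetical helper).

    Requires network access and `pip install datasets`.
    """
    if alias not in SUBTASKS:
        raise ValueError(f"unknown sub-task alias: {alias!r}")
    # Imported lazily so the statistics above remain usable offline.
    from datasets import load_dataset

    # Assumption: the alias doubles as the configuration name,
    # and the requested split exists for that configuration.
    return load_dataset("lexlms/legal_lama", alias, split=split)


def total_examples():
    """Sum the example counts over all sub-tasks in the table."""
    return sum(examples for examples, _, _ in SUBTASKS.values())


if __name__ == "__main__":
    print(f"{len(SUBTASKS)} sub-tasks, {total_examples()} examples in total")
```

Calling `load_subtask("us_crimes")` would then return the corresponding split, provided the assumptions above hold.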
@inproceedings{chalkidis-garneau-etal-2023-lexlms,
title = {{LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development}},
author = "Chalkidis*, Ilias and
Garneau*, Nicolas and
Goanta, Catalina and
Katz, Daniel Martin and
Søgaard, Anders",
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/2305.07507",
}