Dataset: lexlms/legal_lama
Languages:
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotation creators: no-annotation
Source datasets: extended
Preprint: arxiv:2305.07507
License:

LegalLAMA is a diverse probing benchmark suite of 8 sub-tasks, designed to assess how much legal knowledge pre-trained language models (PLMs) have acquired.
| Corpus | Corpus alias | Examples | Avg. Tokens | Labels | 
|---|---|---|---|---|
| Criminal Code Sections (Canada) | canadian_sections | 321 | 72 | 144 | 
| Legal Terminology (EU) | cjeu_term | 2,127 | 164 | 23 | 
| Contractual Section Titles (US) | contract_sections | 1,527 | 85 | 20 | 
| Contract Types (US) | contract_types | 1,089 | 150 | 15 | 
| ECHR Articles (CoE) | ecthr_articles | 5,072 | 69 | 13 | 
| Legal Terminology (CoE) | ecthr_terms | 6,803 | 97 | 250 | 
| Crime Charges (US) | us_crimes | 4,518 | 118 | 59 | 
| Legal Terminology (US) | us_terms | 5,829 | 308 | 7 | 
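The corpus aliases in the table can be used to pick a sub-task programmatically. The following is a minimal sketch, not an official loader: it assumes that the Hub configuration names match the aliases listed above and that the installed version of the `datasets` library can load this repository. `SUBTASKS` and `load_subtask` are hypothetical helper names introduced here for illustration.

```python
# Sub-task aliases and example counts, copied from the table above.
SUBTASKS = {
    "canadian_sections": 321,
    "cjeu_term": 2127,
    "contract_sections": 1527,
    "contract_types": 1089,
    "ecthr_articles": 5072,
    "ecthr_terms": 6803,
    "us_crimes": 4518,
    "us_terms": 5829,
}


def load_subtask(alias, split=None):
    """Load one LegalLAMA sub-task from the Hugging Face Hub.

    Assumption: each alias in the table is also the configuration name
    on the Hub. With split=None, every available split is returned.
    Requires network access and the `datasets` package.
    """
    if alias not in SUBTASKS:
        known = ", ".join(sorted(SUBTASKS))
        raise ValueError(f"Unknown sub-task {alias!r}; expected one of: {known}")
    # Imported lazily so the alias table can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("lexlms/legal_lama", alias, split=split)


if __name__ == "__main__":
    # List the sub-tasks from largest to smallest without touching the network.
    for name, count in sorted(SUBTASKS.items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {count:>6,d} examples")
```

Calling `load_subtask("us_crimes")` would then download that sub-task, provided the assumptions above hold. An alias that is not in the table raises a `ValueError` before any download is attempted.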
@inproceedings{chalkidis-garneau-etal-2023-lexlms,
    title = {{LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development}},
    author = "Chalkidis*, Ilias and 
              Garneau*, Nicolas and
              Goanta, Catalina and 
              Katz, Daniel Martin and 
              Søgaard, Anders",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
    month = jun,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2305.07507",
}