Model:
lexlms/legal-longformer-base
This is a derivative model based on the LexLM (base) RoBERTa model. All model parameters were cloned from the original model, while the positional embeddings were extended by cloning the original embeddings multiple times, following Beltagy et al. (2020), using a Python script similar to this one ( https://github.com/allenai/longformer/blob/master/scripts/convert_model_to_long.ipynb ).
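The embedding-cloning step can be illustrated with a minimal, library-free sketch. It is not the script used for this model: the function name `extend_position_embeddings`, the toy sizes, and the choice to keep two reserved leading rows (RoBERTa-style position indexing) are assumptions made for illustration only.

```python
def extend_position_embeddings(pos_embed, new_max_pos, reserved=2):
    """Tile learned position embeddings to cover a longer sequence.

    pos_embed:   list of rows, one embedding vector per position index.
    new_max_pos: number of usable positions wanted after extension.
    reserved:    leading rows kept as-is (assumed to be 2, as in RoBERTa).
    """
    learned = pos_embed[reserved:]  # rows for real token positions
    step = len(learned)
    if step == 0:
        raise ValueError("no learned positions to copy")

    # Start from the reserved rows, then append copies of the learned rows
    # until the requested number of positions is reached.
    extended = [list(row) for row in pos_embed[:reserved]]
    k = 0
    while k < new_max_pos:
        take = min(step, new_max_pos - k)
        extended.extend(list(row) for row in learned[:take])
        k += take
    return extended


# Toy example: 2 reserved rows + 4 learned positions, embedding size 3.
toy = [[float(i)] * 3 for i in range(6)]
longer = extend_position_embeddings(toy, new_max_pos=10)
```

In a real conversion the same tiling would be applied to the model's position-embedding weight tensor, and the attention layers would be replaced with Longformer attention; neither step is shown here.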
LexLM (Base/Large) are our newly released RoBERTa models. We follow a series of best practices in language model development:
@inproceedings{chalkidis-garneau-etal-2023-lexlms,
title = {{LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development}},
author = "Chalkidis*, Ilias and
Garneau*, Nicolas and
Goanta, Catalina and
Katz, Daniel Martin and
Søgaard, Anders",
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/2305.07507",
}