Dataset: lexlms/legal_lama
Languages: en
Multilinguality: monolingual
Size categories: 1K<n<10K
Language creators: found
Annotation creators: no-annotation
Source datasets: extended
ArXiv: arxiv:2305.07507
License: cc-by-nc-sa-4.0

LegalLAMA is a diverse probing benchmark suite comprising 8 sub-tasks that aims to assess the legal knowledge that pre-trained language models (PLMs) acquire during pre-training.

Corpus | Corpus alias | Examples | Avg. Tokens | Labels |
---|---|---|---|---|
Criminal Code Sections (Canada) | canadian_sections | 321 | 72 | 144 |
Legal Terminology (EU) | cjeu_term | 2,127 | 164 | 23 |
Contractual Section Titles (US) | contract_sections | 1,527 | 85 | 20 |
Contract Types (US) | contract_types | 1,089 | 150 | 15 |
ECHR Articles (CoE) | ecthr_articles | 5,072 | 69 | 13 |
Legal Terminology (CoE) | ecthr_terms | 6,803 | 97 | 250 |
Crime Charges (US) | us_crimes | 4,518 | 118 | 59 |
Legal Terminology (US) | us_terms | 5,829 | 308 | 7 |
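
As a rough illustration of how the sub-tasks in the table might be accessed, the sketch below records the corpus aliases and example counts from the table and wraps `datasets.load_dataset` in a small helper. It assumes that each corpus alias is also the configuration name on the Hugging Face Hub and that the `datasets` library is installed; `LEGAL_LAMA_EXAMPLES` and `load_subtask` are hypothetical names introduced here, not part of the dataset.

```python
from typing import Dict

# Corpus aliases and example counts, copied from the table above.
LEGAL_LAMA_EXAMPLES: Dict[str, int] = {
    "canadian_sections": 321,
    "cjeu_term": 2127,
    "contract_sections": 1527,
    "contract_types": 1089,
    "ecthr_articles": 5072,
    "ecthr_terms": 6803,
    "us_crimes": 4518,
    "us_terms": 5829,
}


def load_subtask(alias: str):
    """Load one LegalLAMA sub-task by its corpus alias.

    Assumption: the alias doubles as the configuration name of
    `lexlms/legal_lama` on the Hugging Face Hub.
    """
    if alias not in LEGAL_LAMA_EXAMPLES:
        raise ValueError(
            f"Unknown sub-task {alias!r}; expected one of {sorted(LEGAL_LAMA_EXAMPLES)}"
        )
    # Imported lazily so the alias table is usable without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("lexlms/legal_lama", alias)


if __name__ == "__main__":
    # Summarize the benchmark size without touching the network.
    total = sum(LEGAL_LAMA_EXAMPLES.values())
    print(f"{len(LEGAL_LAMA_EXAMPLES)} sub-tasks, {total} examples in total")
```

Calling `load_subtask("ecthr_terms")` would then download that sub-task, provided the assumption about configuration names holds.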

@inproceedings{chalkidis-garneau-etal-2023-lexlms,
    title = {{LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development}},
    author = "Chalkidis*, Ilias and
              Garneau*, Nicolas and
              Goanta, Catalina and
              Katz, Daniel Martin and
              Søgaard, Anders",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
    month = june,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2305.07507",
}