Dataset:

rcds/swiss_court_view_generation

Dataset Card for Swiss Court View Generation

Dataset Summary

Swiss Court View Generation is a multilingual, diachronic dataset of 404K Swiss Federal Supreme Court (FSCS) cases, built for a challenging text generation task. It contains court views across languages and court chambers, along with information such as decision id, language, chamber, file name, url, and the number of tokens in the facts and considerations sections. Main (L1) contains all the data; Origin (L2) contains only cases with complete origin facts and origin considerations.
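A minimal sketch of loading one of the two subsets with the Hugging Face `datasets` library. The repository id is the one given on this card; the configuration names `main` and `origin` and the split name `train` are assumptions inferred from the summary, not confirmed here, so check the dataset viewer before relying on them. The helper name `load_court_views` is hypothetical.

```python
REPO_ID = "rcds/swiss_court_view_generation"  # dataset id as given on this card
CONFIGS = ("main", "origin")  # ASSUMPTION: config names guessed from "Main" / "Origin"


def load_court_views(config: str = "origin", split: str = "train", streaming: bool = True):
    """Load one configuration of the dataset from the Hugging Face Hub.

    The default split name "train" is an assumption; this card does not list splits.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # streaming=True avoids downloading the full Main subset up front.
    return load_dataset(REPO_ID, config, split=split, streaming=streaming)
```

Calling `load_court_views("origin")` would then return an iterable of records, assuming the configuration and split names above are right.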

Supported Tasks and Leaderboards

Languages

Switzerland has four official languages, three of which (German, French and Italian) are represented in the dataset. The decisions are written by the judges and clerks in the language of the proceedings.

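| Language | Subset | Number of Documents (Main) | Number of Documents (Origin) |
|----------|--------|----------------------------|------------------------------|
| German   | de     | 197K                       | 49                           |
| French   | fr     | 163K                       | 221                          |
| Italian  | it     | 44K                        | 0                            |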
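As a quick consistency check, the Main counts above can be summed and turned into per-language shares. The sketch below uses only the rounded figures from the table, so the shares are approximate.

```python
# Counts in thousands of documents, copied from the Main column above.
MAIN_COUNTS_K = {"de": 197, "fr": 163, "it": 44}

# The total should match the 404K cases stated in the dataset summary.
total_k = sum(MAIN_COUNTS_K.values())

# Approximate share of each language in the Main subset.
shares = {lang: count / total_k for lang, count in MAIN_COUNTS_K.items()}

for lang, share in shares.items():
    print(f"{lang}: {share:.1%}")
```

German and French together make up most of the Main subset, which is worth keeping in mind when comparing per-language results.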

Dataset Structure

Data Fields

decision_id (string)
facts (string)
considerations (string)
origin_facts (string)
origin_considerations (string)
law_area (string)
language (string)
year (int32)
court (string)
chamber (string)
canton (string)
region (string)
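The field list above can be written down as a small schema and used to sanity-check records. This is only an illustrative sketch: the helper `schema_errors` is hypothetical, `int32` is mapped to Python `int`, and the example record holds placeholder values rather than real data.

```python
from typing import Any, Mapping

# Field names and types as listed above (int32 is represented as int).
SCHEMA: dict[str, type] = {
    "decision_id": str,
    "facts": str,
    "considerations": str,
    "origin_facts": str,
    "origin_considerations": str,
    "law_area": str,
    "language": str,
    "year": int,
    "court": str,
    "chamber": str,
    "canton": str,
    "region": str,
}


def schema_errors(record: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    errors = []
    for name, expected in SCHEMA.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            got = type(record[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {got}")
    return errors


# Placeholder record: empty strings everywhere, plus an arbitrary year.
example = dict.fromkeys(SCHEMA, "")
example["year"] = 2020
print(schema_errors(example))
```

A check like this is mainly useful after filtering or re-exporting the data, where a column can be dropped or silently change type.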

Data Instances

[More Information Needed]

Data Splits

Dataset Creation

Curation Rationale

Source Data

Initial Data Collection and Normalization

The original data are published by the Swiss Federal Supreme Court ( https://www.bger.ch ) in unprocessed formats (HTML). The documents were downloaded from the Entscheidsuche portal ( https://entscheidsuche.ch ) as HTML.

Who are the source language producers?

The decisions are written by the judges and clerks in the language of the proceedings.

Annotations

Annotation process

Who are the annotators?

Metadata is published by the Swiss Federal Supreme Court ( https://www.bger.ch ).

Personal and Sensitive Information

The dataset contains publicly available court decisions from the Swiss Federal Supreme Court. Personal or sensitive information has been anonymized by the court before publication according to the following guidelines: https://www.bger.ch/home/juridiction/anonymisierungsregeln.html .

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

We release the data under CC-BY-4.0, which complies with the court licensing ( https://www.bger.ch/files/live/sites/bger/files/pdf/de/urteilsveroeffentlichung_d.pdf ). © Swiss Federal Supreme Court, 2002-2022

The copyright for the editorial content of this website and the consolidated texts, which is owned by the Swiss Federal Supreme Court, is licensed under the Creative Commons Attribution 4.0 International licence. This means that you can re-use the content provided you acknowledge the source and indicate any changes you have made. Source: https://www.bger.ch/files/live/sites/bger/files/pdf/de/urteilsveroeffentlichung_d.pdf

Citation Information

Please cite our arXiv preprint:

@misc{rasiah2023scale,
      title={SCALE: Scaling up the Complexity for Advanced Language Model Evaluation}, 
      author={Vishvaksenan Rasiah and Ronja Stern and Veton Matoshi and Matthias Stürmer and Ilias Chalkidis and Daniel E. Ho and Joel Niklaus},
      year={2023},
      eprint={2306.09237},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contributions