数据集:

opus_euconst

任务:

翻译

语言:

计算机处理:

multilingual

大小:

10K<n<100K

语言创建人:

found

批注创建人:

found

源数据集:

original

许可:

license:unknown

数据集介绍文件清单

中文

Dataset Card for [Dataset Name]

Dataset Summary

A parallel corpus collected from the European Constitution. 21 languages, 210 bitexts

Supported Tasks and Leaderboards

The underlying task is machine translation.

Languages

[More Information Needed]

Dataset Structure

Data Instances

[More Information Needed]

Data Fields

[More Information Needed]

Data Splits

[More Information Needed]

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

[More Information Needed]

Citation Information

J. Tiedemann, 2012, Parallel Data, Tools and Interfaces in OPUS. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012)

Contributions

Thanks to @patil-suraj for adding this dataset.

作者:

佚名

数据集大小:

277.07 KB