Datasets:

yoruba_wordsim353

Languages:

en yo

Multilinguality:

multilingual

Size Categories:

n<1K

Language Creators:

expert-generated

Annotations Creators:

crowdsourced

Source Datasets:

original

Dataset Card for wordsim-353 in Yorùbá (yoruba_wordsim353)

Dataset Summary

A translation of the word-pair similarity dataset wordsim-353 into Yorùbá.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

Yorùbá (ISO 639-1: yo)

Dataset Structure

Data Instances

An instance consists of a pair of words and their similarity rating. The dataset contains both the original English words (from wordsim-353) and their Yorùbá translations.

Data Fields

  • english1 : the first word of the pair; the original English word
  • english2 : the second word of the pair; the original English word
  • yoruba1 : the first word of the pair; translation to Yorùbá
  • yoruba2 : the second word of the pair; translation to Yorùbá
  • similarity : similarity rating according to the English dataset
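
As a rough illustration of the fields above, one instance could be modelled as a typed record. This is only a sketch: the class name `WordPair`, the helper `is_valid_instance`, and the choice of `float` for `similarity` are assumptions, and the example values are placeholders rather than rows from the dataset.

```python
from typing import TypedDict


class WordPair(TypedDict):
    """One instance, using the field names from the Data Fields list."""
    english1: str      # first word of the pair, original English
    english2: str      # second word of the pair, original English
    yoruba1: str       # first word of the pair, Yorùbá translation
    yoruba2: str       # second word of the pair, Yorùbá translation
    similarity: float  # rating from the English dataset (type assumed)


FIELDS = ("english1", "english2", "yoruba1", "yoruba2", "similarity")


def is_valid_instance(row: dict) -> bool:
    """Return True if `row` has exactly the five documented fields,
    non-empty strings for the four words, and a numeric similarity."""
    if set(row) != set(FIELDS):
        return False
    words_ok = all(isinstance(row[k], str) and row[k] for k in FIELDS[:4])
    score = row["similarity"]
    # bool is a subclass of int in Python, so reject it explicitly.
    score_ok = isinstance(score, (int, float)) and not isinstance(score, bool)
    return words_ok and score_ok


# Placeholder values for illustration only; not taken from the dataset.
example: WordPair = {
    "english1": "english_word_a",
    "english2": "english_word_b",
    "yoruba1": "yoruba_word_a",
    "yoruba2": "yoruba_word_b",
    "similarity": 5.0,
}
```

A check like this can be run over loaded rows before they are used, so that a missing column or an unexpected type is caught early.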

Data Splits

[More Information Needed]

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

[More Information Needed]

Citation Information

[More Information Needed]

Contributions

Thanks to @michael-aloys for adding this dataset.