数据集:

gsarti/mt_geneval

预印本库:

arxiv:2211.01355

源数据集:

original

批注创建人:

expert-generated

语言创建人:

expert-generated

大小:

1K<n<10K

计算机处理:

translation

任务:

翻译
中文

Dataset Card for MT-GenEval

Dataset Summary

The MT-GenEval benchmark evaluates gender translation accuracy on English -> {Arabic, French, German, Hindi, Italian, Portuguese, Russian, Spanish}. The dataset contains individual sentences with annotations on the gendered target words, and contrastive original-invertend translations with additional preceding context.

Disclaimer : The MT-GenEval benchmark was released in the EMNLP 2022 paper MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation by Anna Currey, Maria Nadejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, and Georgiana Dinu and is hosted through Github by the Amazon Science organization. The dataset is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License .

Supported Tasks and Leaderboards

Machine Translation

Refer to the original paper for additional details on gender accuracy evaluation with MT-GenEval.

Languages

The dataset contains source English sentences extracted from Wikipedia translated into the following languages: Arabic ( ar ), French ( fr ), German ( de ), Hindi ( hi ), Italian ( it ), Portuguese ( pt ), Russian ( ru ), and Spanish ( es ).

Dataset Structure

Data Instances

The dataset contains two configuration types, sentences and context , mirroring the original repository structure, with source and target language specified in the configuration name (e.g. sentences_en_ar , context_en_it ) The sentences configurations contains masculine and feminine versions of individual sentences with gendered word annotations. Here is an example entry of the sentences_en_it split (all sentences_en_XX splits have the same structure):

{
{
  "orig_id": 0,
  "source_feminine": "Pagratidis quickly recanted her confession, claiming she was psychologically pressured and beaten, and until the moment of her execution, she remained firm in her innocence.",
  "reference_feminine": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era stata picchiata, e fino al momento della sua esecuzione, rimase ferma sulla sua innocenza.",
  "source_masculine": "Pagratidis quickly recanted his confession, claiming he was psychologically pressured and beaten, and until the moment of his execution, he remained firm in his innocence.",
  "reference_masculine": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era stato picchiato, e fino al momento della sua esecuzione, rimase fermo sulla sua innocenza.",
  "source_feminine_annotated": "Pagratidis quickly recanted <F>her</F> confession, claiming <F>she</F> was psychologically pressured and beaten, and until the moment of <F>her</F> execution, <F>she</F> remained firm in <F>her</F> innocence.",
  "reference_feminine_annotated": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era <F>stata picchiata</F>, e fino al momento della sua esecuzione, rimase <F>ferma</F> sulla sua innocenza.",
  "source_masculine_annotated": "Pagratidis quickly recanted <M>his</M> confession, claiming <M>he</M> was psychologically pressured and beaten, and until the moment of <M>his</M> execution, <M>he</M> remained firm in <M>his</M> innocence.",
  "reference_masculine_annotated": "Pagratidis subito ritrattò la sua confessione, affermando che era aveva subito pressioni psicologiche e era <M>stato picchiato</M>, e fino al momento della sua esecuzione, rimase <M>fermo</M> sulla sua innocenza.",
  "source_feminine_keywords": "her;she;her;she;her",
  "reference_feminine_keywords": "stata picchiata;ferma",
  "source_masculine_keywords": "his;he;his;he;his",
  "reference_masculine_keywords": "stato picchiato;fermo",
}
}

The context configuration contains instead different English sources related to stereotypical professional roles with additional preceding context and contrastive original-inverted translations. Here is an example entry of the context_en_it split (all context_en_XX splits have the same structure):

{
  "orig_id": 0,
  "context": "Pierpont told of entering and holding up the bank and then fleeing to Fort Wayne, where the loot was divided between him and three others.",
  "source": "However, Pierpont stated that Skeer was the planner of the robbery.",
  "reference_original": "Comunque, Pierpont disse che Skeer era il pianificatore della rapina.",
  "reference_flipped": "Comunque, Pierpont disse che Skeer era la pianificatrice della rapina."
}

Data Splits

All sentences_en_XX configurations have 1200 examples in the train split and 300 in the test split. For the context_en_XX configurations, the number of example depends on the language pair:

Configuration # Train # Test
context_en_ar 792 1100
context_en_fr 477 1099
context_en_de 598 1100
context_en_hi 397 1098
context_en_it 465 1904
context_en_pt 574 1089
context_en_ru 583 1100
context_en_es 534 1096

Dataset Creation

From the original paper:

In developing MT-GenEval, our goal was to create a realistic, gender-balanced dataset that naturally incorporates a diverse range of gender phenomena. To this end, we extracted English source sentences from Wikipedia as the basis for our dataset. We automatically pre-selected relevant sentences using EN gender-referring words based on the list provided by Zhao et al. (2018) .

Please refer to the original article MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation for additional information on dataset creation.

Additional Information

Dataset Curators

The original authors of MT-GenEval are the curators of the original dataset. For problems or updates on this ? Datasets version, please contact gabriele.sarti996@gmail.com .

Licensing Information

The dataset is licensed under the Creative Commons Attribution-ShareAlike 3.0 International License .

Citation Information

Please cite the authors if you use these corpora in your work.

@inproceedings{currey-etal-2022-mtgeneval,
    title = "{MT-GenEval}: {A} Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation",
    author = "Currey, Anna  and
      Nadejde, Maria  and
      Pappagari, Raghavendra  and
      Mayer, Mia  and
      Lauly, Stanislas,  and
      Niu, Xing  and
      Hsu, Benjamin  and
      Dinu, Georgiana",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
    month = dec,
    year = "2022",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2211.01355",
}