Dataset Card for BabelCode HumanEval

How To Use This Dataset

To use this dataset, you can use either the original BabelCode Repo or the bc_eval Metric.

Dataset Summary

The BabelCode-HumanEval (BC-HumanEval) dataset translates the HumanEval dataset released by OpenAI into 16 programming languages.

Supported Tasks and Leaderboards

Languages

BC-HumanEval supports:

  • C++
  • C#
  • Dart
  • Go
  • Haskell
  • Java
  • JavaScript
  • Julia
  • Kotlin
  • Lua
  • PHP
  • Python
  • R
  • Rust
  • Scala
  • TypeScript

Dataset Structure

>>> from datasets import load_dataset
>>> load_dataset("gabeorlanski/bc-humaneval")
DatasetDict({
    test: Dataset({
        features: ['qid', 'title', 'language', 'text', 'signature_with_docstring', 'signature', 'arguments', 'solution', 'question_info'],
        num_rows: 2576
    })
})
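
Because the single test split mixes all languages, a common first step is to select the rows for one language. The sketch below assumes each row is a dict carrying the `language` field shown in the feature list; `rows_for_language` is a hypothetical helper name, the rows in `sample_rows` are made-up stand-ins so the snippet runs without a download, and the exact strings stored in `language` are an assumption that should be checked against the loaded data.

```python
from collections import Counter


def rows_for_language(rows, language):
    """Return the rows whose 'language' field equals `language`."""
    return [row for row in rows if row["language"] == language]


# Made-up stand-in rows (assumption); real rows would come from
# load_dataset("gabeorlanski/bc-humaneval")["test"].
sample_rows = [
    {"qid": "0", "language": "Python", "title": "example problem"},
    {"qid": "0", "language": "Go", "title": "example problem"},
    {"qid": "1", "language": "Python", "title": "another problem"},
]

python_rows = rows_for_language(sample_rows, "Python")

# Number of rows per language in the sample.
counts = Counter(row["language"] for row in sample_rows)
```

On the loaded dataset itself, `ds.filter(lambda row: row["language"] == ...)` should give the same selection without materializing a Python list.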

Data Fields

  • qid : The question ID used for running tests.
  • title : The title of the question.
  • language : The programming language of the example.
  • text : The description of the problem.
  • signature : The signature for the problem.
  • signature_with_docstring : The signature with the adequately formatted docstring for the given problem.
  • arguments : The arguments of the problem.
  • solution : The solution in Python.
  • question_info : The dict of information used for executing predictions. It has the keys:
    • test_code : The raw testing script used in the language. If you want to use this, replace PLACEHOLDER_FN_NAME (and PLACEHOLDER_CLS_NAME if needed) with the corresponding entry points. Next, replace PLACEHOLDER_CODE_BODY with the postprocessed prediction.
    • test_list : The raw JSON line containing the list of tests for the problem. To load them, use json.loads.
    • test_case_ids : The list of test case ids for the problem. These are used to determine if a prediction passes or not.
    • entry_fn_name : The name of the function to use as the entry point.
    • entry_cls_name : The name of the class to use as the entry point.
    • commands : The commands used to execute the prediction. Includes a __FILENAME__ hole that is replaced with the filename.
    • timeouts : The default timeouts for each command.
    • extension : The extension for the prediction file.
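
The placeholder substitution described for test_code and commands can be sketched as follows. This is a minimal illustration, not the BabelCode implementation: `render_test_script` and `render_commands` are hypothetical helper names, the contents of `sample_info` are made up, and the sketch assumes that commands is a list of argument lists, that test_list is a single JSON string, and that extension has no leading dot. Check the actual rows before relying on any of these shapes.

```python
import json


def render_test_script(question_info, prediction):
    """Fill the test_code placeholders with the entry points, then the prediction."""
    script = question_info["test_code"]
    script = script.replace("PLACEHOLDER_FN_NAME", question_info["entry_fn_name"])
    script = script.replace("PLACEHOLDER_CLS_NAME", question_info["entry_cls_name"])
    # The code body goes in last, as described in the test_code field.
    script = script.replace("PLACEHOLDER_CODE_BODY", prediction)
    return script


def render_commands(question_info, filename):
    """Substitute the __FILENAME__ hole in every argument of every command."""
    return [
        [arg.replace("__FILENAME__", filename) for arg in command]
        for command in question_info["commands"]
    ]


# Made-up question_info (assumption); real values come from the dataset rows.
sample_info = {
    "test_code": "PLACEHOLDER_CODE_BODY\n\nprint(PLACEHOLDER_FN_NAME(1, 2))\n",
    "test_list": '[{"idx": 0, "inputs": {"a": 1, "b": 2}, "outputs": 3}]',
    "entry_fn_name": "add",
    "entry_cls_name": "Solution",
    "commands": [["python", "__FILENAME__"]],
    "timeouts": [10],
    "extension": "py",
}

prediction = "def add(a, b):\n    return a + b"
script = render_test_script(sample_info, prediction)

filename = "prediction." + sample_info["extension"]
commands = render_commands(sample_info, filename)

# test_list is stored as raw JSON, so it has to be decoded before use.
tests = json.loads(sample_info["test_list"])
```

The rendered script would then be written to `filename` and each entry of `commands` run with the matching entry of timeouts, for example through `subprocess.run(command, timeout=...)`.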

NOTE: If you want to use a different function name (or class name, for languages that require one) for the prediction, you must update entry_fn_name and entry_cls_name accordingly. For example, if the original question has an entry_fn_name of add but you want to change it to f, you must set question_info["entry_fn_name"] to f:

>>> from datasets import load_dataset
>>> ds = load_dataset("gabeorlanski/bc-humaneval")['test']
>>> # Indexing a Dataset builds a new dict each time, so keep a reference to the row
>>> example = ds[0]
>>> # The original entry_fn_name
>>> example['question_info']['entry_fn_name']
'hasCloseElements'
>>> # You MUST update the corresponding entry_fn_name
>>> example['question_info']['entry_fn_name'] = 'f'
>>> example['question_info']['entry_fn_name']
'f'
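
One way to apply such a rename to whole rows is to build an updated copy of each row instead of editing it in place. The sketch below is only an illustration: `with_entry_fn_name` is a hypothetical helper, and `sample_row` is a made-up stand-in for a dataset row.

```python
import copy


def with_entry_fn_name(example, new_name):
    """Return a copy of `example` whose question_info uses `new_name` as the entry point."""
    updated = copy.deepcopy(example)
    updated["question_info"]["entry_fn_name"] = new_name
    return updated


# Made-up stand-in row (assumption); real rows come from the dataset.
sample_row = {
    "qid": "0",
    "question_info": {"entry_fn_name": "add", "entry_cls_name": "Solution"},
}

renamed = with_entry_fn_name(sample_row, "f")
```

With the loaded dataset, the same function could be passed to `Dataset.map`, for example `ds.map(lambda row: with_entry_fn_name(row, "f"))`, which returns a new dataset with the change applied to every row.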

Dataset Creation

See Section 2 of the BabelCode Paper to learn more about how the datasets were translated.

For information on how the original HumanEval was curated, please see the Evaluating Large Language Models Trained on Code paper.

Dataset Curators

Google Research

Licensing Information

CC-BY-4.0

Citation Information

@article{orlanski2023measuring,
  title={Measuring The Impact Of Programming Language Distribution},
  author={Orlanski, Gabriel and Xiao, Kefan and Garcia, Xavier and Hui, Jeffrey and Howland, Joshua and Malmaud, Jonathan and Austin, Jacob and Singh, Rishabh and Catasta, Michele},
  journal={arXiv preprint arXiv:2302.01973},
  year={2023}
}
@article{chen2021codex,
  title={Evaluating Large Language Models Trained on Code},
  author={Mark Chen and Jerry Tworek and Heewoo Jun and Qiming Yuan and Henrique Ponde de Oliveira Pinto and Jared Kaplan and Harri Edwards and Yuri Burda and Nicholas Joseph and Greg Brockman and Alex Ray and Raul Puri and Gretchen Krueger and Michael Petrov and Heidy Khlaaf and Girish Sastry and Pamela Mishkin and Brooke Chan and Scott Gray and Nick Ryder and Mikhail Pavlov and Alethea Power and Lukasz Kaiser and Mohammad Bavarian and Clemens Winter and Philippe Tillet and Felipe Petroski Such and Dave Cummings and Matthias Plappert and Fotios Chantzis and Elizabeth Barnes and Ariel Herbert-Voss and William Hebgen Guss and Alex Nichol and Alex Paino and Nikolas Tezak and Jie Tang and Igor Babuschkin and Suchir Balaji and Shantanu Jain and William Saunders and Christopher Hesse and Andrew N. Carr and Jan Leike and Josh Achiam and Vedant Misra and Evan Morikawa and Alec Radford and Matthew Knight and Miles Brundage and Mira Murati and Katie Mayer and Peter Welinder and Bob McGrew and Dario Amodei and Sam McCandlish and Ilya Sutskever and Wojciech Zaremba},
  year={2021},
  eprint={2107.03374},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}