This is a dataset of partial solutions to the HumanEval and MBPP code generation benchmarks, translated into 18+ programming languages. The original benchmark problems were in Python, and we build the dataset as follows:
This notebook carried out the steps described above.
Note that the dataset does not have solutions for every problem-language pair, since code-davinci-002 cannot produce a correct solution to every problem.
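To see which problem-language pairs are missing, one option is to compare the pairs present in the data against the full cross product of problems and languages. The sketch below is only illustrative: the field names `problem` and `language`, the helper `missing_pairs`, and the toy records are assumptions and do not reflect the dataset's actual schema.

```python
def missing_pairs(records, problems, languages):
    """Return the (problem, language) pairs that have no record.

    `records` is assumed to be an iterable of dicts with hypothetical
    "problem" and "language" keys; adapt these to the real column names.
    """
    covered = {(r["problem"], r["language"]) for r in records}
    return sorted(
        (p, lang)
        for p in problems
        for lang in languages
        if (p, lang) not in covered
    )


# Toy, made-up records used only to demonstrate the helper.
records = [
    {"problem": "HumanEval_0", "language": "lua"},
    {"problem": "HumanEval_0", "language": "rkt"},
    {"problem": "HumanEval_1", "language": "lua"},
]
problems = ["HumanEval_0", "HumanEval_1"]
languages = ["lua", "rkt"]

gaps = missing_pairs(records, problems, languages)
print(gaps)
```

With real data, `problems` and `languages` would be the full lists of benchmark problems and target languages, and any pair returned is one for which no correct solution was generated.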