Dataset: id_panl_bppt
Tasks: Translation
Multilinguality: translation
Size: 10K<n<100K
Language creators: expert-generated
Annotation creators: expert-generated
Source datasets: original
License: unknown

Parallel Text Corpora for Multi-Domain Translation System, created by BPPT (Indonesian Agency for the Assessment and Application of Technology) for the PAN Localization Project (A Regional Initiative to Develop Local Language Computing Capacity in Asia). The dataset contains around 24K sentences divided into 4 topics (Economic, International, Science and Technology, and Sport).
[More Information Needed]
Indonesian
[More Information Needed]
An example of the dataset:
{
  'id': '0',
  'topic': 0,
  'translation': {
    'en': 'Minister of Finance Sri Mulyani Indrawati said that a sharp correction of the composite inde x by up to 4 pct in Wedenesday?s trading was a mere temporary effect of regional factors like decline in plantation commodity prices and the financial crisis in Thailand.',
    'id': 'Menteri Keuangan Sri Mulyani mengatakan koreksi tajam pada Indeks Harga Saham Gabungan IHSG hingga sekitar 4 persen dalam perdagangan Rabu 10/1 hanya efek sesaat dari faktor-faktor regional seperti penurunan harga komoditi perkebunan dan krisis finansial di Thailand.'
  }
}
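The record layout above can be unpacked with a few lines of Python. The following is a minimal sketch, not part of the original card: the helper name `to_pair` and the placeholder sentences are hypothetical, and the mapping from the integer `topic` to a topic name is an assumption based on the order in which the topics are listed in the description.

```python
from typing import Any, Dict, Tuple

# ASSUMPTION: topic indices follow the order of the topics named in the
# dataset description. The card itself does not state this mapping.
TOPIC_NAMES = ["Economic", "International", "Science and Technology", "Sport"]


def to_pair(record: Dict[str, Any]) -> Tuple[str, str, str]:
    """Return (English sentence, Indonesian sentence, topic name) for one record."""
    translation = record["translation"]
    # Inside 'translation', the key 'id' is the language code for Indonesian,
    # not the record identifier stored at the top level.
    return translation["en"], translation["id"], TOPIC_NAMES[record["topic"]]


# Hypothetical record with placeholder sentences, shaped like the example above.
sample = {
    "id": "0",
    "topic": 0,
    "translation": {
        "en": "English sentence",
        "id": "Kalimat bahasa Indonesia",
    },
}

english, indonesian, topic = to_pair(sample)
print(topic, "|", english, "|", indonesian)
```

Note that `id` appears twice with different meanings: once as the record identifier and once as the Indonesian language code inside `translation`.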
The dataset is split into train, validation, and test sets.
[More Information Needed]
[More Information Needed]
Who are the source language producers?
[More Information Needed]
[More Information Needed]
Who are the annotators?
[More Information Needed]
[More Information Needed]
@inproceedings{id_panl_bppt,
  author = {PAN Localization - BPPT},
  title = {Parallel Text Corpora, English Indonesian},
  year = {2009},
  url = {http://digilib.bppt.go.id/sampul/p92-budiono.pdf},
}
Thanks to @cahya-wirawan for adding this dataset.