Dataset:

nadhifikbarw/id_ohsuhmed

Language:

id

Machine-translated Ohsumed collection (English to Indonesian).

Original corpora: http://disi.unitn.it/moschitti/corpora.htm

Translated using: https://huggingface.co/Helsinki-NLP/opus-mt-en-id

Compatible with the Hugging Face Transformers text-classification example script (tested with v4.17.0): https://github.com/huggingface/transformers/tree/v4.17.0/examples/pytorch/text-classification
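As a rough illustration of that compatibility, the sketch below assembles a command line for `run_glue.py`, the script in that examples directory. It only builds and prints the command; it does not run training. The checkpoint name, output directory, sequence length, and epoch count are assumptions chosen for illustration, not values taken from this card. The sketch also assumes the dataset can be loaded directly through the script's `--dataset_name` option.

```python
import shlex

# Dataset identifier exactly as listed on this card.
DATASET_ID = "nadhifikbarw/id_ohsuhmed"


def build_run_glue_command(
    model_name="bert-base-multilingual-cased",  # assumption: any sequence-classification checkpoint
    output_dir="./id_ohsumed_out",              # hypothetical output path
    epochs=3,                                   # assumption: illustrative value
):
    """Return an argv list for the Transformers run_glue.py example script."""
    return [
        "python", "run_glue.py",
        "--model_name_or_path", model_name,
        "--dataset_name", DATASET_ID,
        "--do_train",
        "--max_seq_length", "256",              # assumption: illustrative value
        "--num_train_epochs", str(epochs),
        "--output_dir", output_dir,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable shell command; nothing is executed here.
    print(shlex.join(build_run_glue_command()))
```

If the dataset provides a validation split, `--do_eval` can be appended to the list. Run the printed command from a checkout of the v4.17.0 `examples/pytorch/text-classification` directory.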

[Moschitti, 2003a] Alessandro Moschitti. Natural Language Processing and Text Categorization: a study on the reciprocal beneficial interactions. PhD thesis, University of Rome Tor Vergata, Rome, Italy, May 2003.