Dataset: id_puisi
Language: id
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotation creators: no-annotation
Source datasets: original
Other: poem-generation
License: mit
Puisi (poem) is an Indonesian poetic form. The dataset contains 7223 Indonesian puisi, each with its title and author.
[More Information Needed]
Indonesian
{
  'puisi_with_header': 'TEPERANGKAP Oleh Mangku Langit Jingga Mungkin kau membiarkan aku Membiarkan perasaan ini larut Memberi ruang jiwaku hampa Agar tetap terbiasa nikmati Perangkap yang kau buat Perisai yang kau banggakan Takkan jadi tameng bagimu Aku mengerti betapa hebatnya Perangkap mu hei sang dewi Ku akan terus merasa terbiasa Dengan pesona indahmu Ku masih akan nikmati hadirmu Berjalanlah pada hati yang sama Satu hati denganku Walau ku terperangkap Namunku nikmati dan jalani',
  'title': 'TEPERANGKAP',
  'author': 'Oleh Mangku Langit Jingga',
  'puisi': 'Mungkin kau membiarkan aku Membiarkan perasaan ini larut Memberi ruang jiwaku hampa Agar tetap terbiasa nikmati Perangkap yang kau buat Perisai yang kau banggakan Takkan jadi tameng bagimu Aku mengerti betapa hebatnya Perangkap mu hei sang dewi Ku akan terus merasa terbiasa Dengan pesona indahmu Ku masih akan nikmati hadirmu Berjalanlah pada hati yang sama Satu hati denganku Walau ku terperangkap Namunku nikmati dan jalani'
}
The dataset contains only a train set.
The dataset was initially collected for an experiment in generating Indonesian poems with GPT-2.
The dataset was scraped from lokerpuisi.web.id using BeautifulSoup (the data no longer exists on the original blog). The title and author columns were produced by regex matching against the puisi_with_header column.
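The extraction step described above can be illustrated with a minimal sketch. The card does not publish the regex that was actually used, so the pattern below is hypothetical. It assumes the raw scraped text still has line breaks, with the title on the first line and an author line starting with "Oleh" ("by") on the second, as the sample instance suggests. The names `HEADER_RE` and `split_header` are illustrative only.

```python
import re

# Hypothetical pattern, NOT the dataset author's original regex.
# Assumed layout: title line, then an author line starting with "Oleh",
# then the body of the poem.
HEADER_RE = re.compile(
    r"^\s*(?P<title>[^\n]+?)\s*\n\s*(?P<author>Oleh[^\n]*?)\s*\n(?P<puisi>.*)$",
    re.DOTALL,
)


def split_header(raw):
    """Split raw poem text into title, author, and body.

    Returns None when the text does not follow the assumed layout.
    """
    match = HEADER_RE.match(raw)
    if match is None:
        return None
    return {
        "title": match.group("title"),
        "author": match.group("author"),
        "puisi": match.group("puisi").strip(),
    }
```

Returning `None` for text that does not fit the layout, rather than guessing, keeps failed extractions easy to count afterwards.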
Who are the source language producers?
The poems were written by humans. Users of the original blog voluntarily submitted their original poems for publication on the blog.
[N/A]
Who are the annotators?
[N/A]
[More Information Needed]
[More Information Needed]
[More Information Needed]
The regex match used to extract the title and author from the raw text is not perfect. For some rows, the title and text were not extracted correctly.
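One way to find rows affected by this limitation is a simple consistency check. The sketch below is a heuristic, not part of the dataset. It rests on two assumptions taken from the single sample instance: the author field starts with "Oleh", and joining title, author, and puisi with spaces reproduces puisi_with_header. The function name `looks_unextracted` is hypothetical.

```python
def looks_unextracted(row):
    """Heuristically flag a row whose title/author extraction likely failed.

    Assumptions (from the one sample instance, not guaranteed to hold):
    the author starts with "Oleh", and title + author + puisi rebuilds
    puisi_with_header once whitespace is normalized.
    """
    title = (row.get("title") or "").strip()
    author = (row.get("author") or "").strip()
    puisi = (row.get("puisi") or "").strip()
    header = (row.get("puisi_with_header") or "").strip()

    # A missing field is the clearest sign of a failed match.
    if not title or not author or not puisi:
        return True
    # Assumed convention: author lines begin with "Oleh" ("by").
    if not author.lower().startswith("oleh"):
        return True
    # The three parts should reassemble into the original header text.
    rebuilt = " ".join(f"{title} {author} {puisi}".split())
    return rebuilt != " ".join(header.split())
```

Rows flagged this way are candidates for manual review rather than confirmed errors, since a poem can break the assumed conventions and still be extracted correctly.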
Ilham Firdausi Putra
MIT License
[N/A]
Thanks to @ilhamfp for adding this dataset.