Dataset: id_puisi
Languages:
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: found
Annotation creators: no-annotation
Source datasets: original
Other: poem-generation
License:
Puisi (poem) is an Indonesian poetic form. The dataset contains 7223 Indonesian puisi, each with its title and author.
[More Information Needed]
Indonesian
{
'puisi_with_header': 'TEPERANGKAP
Oleh Mangku Langit Jingga
Mungkin kau membiarkan aku
Membiarkan perasaan ini larut
Memberi ruang jiwaku hampa
Agar tetap terbiasa nikmati
Perangkap yang kau buat
Perisai yang kau banggakan
Takkan jadi tameng bagimu
Aku mengerti betapa hebatnya
Perangkap mu hei sang dewi
Ku akan terus merasa terbiasa
Dengan pesona indahmu
Ku masih akan nikmati hadirmu
Berjalanlah pada hati yang sama
Satu hati denganku
Walau ku terperangkap
Namunku nikmati dan jalani',
'title': 'TEPERANGKAP',
'author': 'Oleh Mangku Langit Jingga',
'puisi': 'Mungkin kau membiarkan aku
Membiarkan perasaan ini larut
Memberi ruang jiwaku hampa
Agar tetap terbiasa nikmati
Perangkap yang kau buat
Perisai yang kau banggakan
Takkan jadi tameng bagimu
Aku mengerti betapa hebatnya
Perangkap mu hei sang dewi
Ku akan terus merasa terbiasa
Dengan pesona indahmu
Ku masih akan nikmati hadirmu
Berjalanlah pada hati yang sama
Satu hati denganku
Walau ku terperangkap
Namunku nikmati dan jalani',
}
The dataset contains only a train set.
The dataset was initially collected as an experiment to generate an Indonesian poem using GPT-2.
The dataset was scraped using BeautifulSoup from lokerpuisi.web.id (the data no longer exists on the original blog). The title and author columns were produced by regex matching on the puisi_with_header column.
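The regex actually used to build the title and author columns is not given here. As a rough illustration only, the sketch below assumes the layout seen in the example instance: the first line is the title, the second line is an author line starting with "Oleh", and everything after that is the poem. The names `HEADER_RE` and `split_header` are hypothetical and not part of the dataset.

```python
import re

# Assumed header layout, inferred only from the example instance:
# line 1 = title, line 2 = author line beginning with "Oleh", rest = poem body.
# This is NOT the original extraction regex, which is not documented here.
HEADER_RE = re.compile(
    r"\A\s*(?P<title>[^\n]+)\n\s*(?P<author>Oleh[^\n]*)\n(?P<puisi>.*)\Z",
    re.DOTALL,
)


def split_header(puisi_with_header):
    """Return a dict with title, author and puisi, or None if the layout differs."""
    match = HEADER_RE.match(puisi_with_header)
    if match is None:
        return None
    return {
        "title": match.group("title").strip(),
        "author": match.group("author").strip(),
        "puisi": match.group("puisi").strip(),
    }


sample = (
    "TEPERANGKAP\n"
    "Oleh Mangku Langit Jingga\n"
    "Mungkin kau membiarkan aku\n"
    "Membiarkan perasaan ini larut"
)
print(split_header(sample))
```

As in the example instance, the author value keeps the leading "Oleh". A poem whose second line does not start with "Oleh" returns None under this sketch, which is one plausible way an extraction could miss.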
Who are the source language producers?
The poems were written by humans. Users of the original blog voluntarily submitted their original poems for publication on the blog.
[N/A]
Who are the annotators?
[N/A]
[More Information Needed]
[More Information Needed]
[More Information Needed]
The regex match used to extract the title and author from the raw text is not perfect. For some poems, the title and text still failed to be extracted.
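One way to spot rows affected by this limitation is a simple consistency check between the extracted fields and the raw text. The sketch below is a heuristic under stated assumptions: it treats a row as suspect when any of title, author or puisi is empty or does not appear in puisi_with_header. How failed rows actually look in the dataset is not documented, so the second record is a made-up illustration and `extraction_looks_ok` is a hypothetical helper.

```python
def extraction_looks_ok(record):
    """Heuristic check that title, author and puisi agree with puisi_with_header."""
    header = (record.get("puisi_with_header") or "").strip()
    title = (record.get("title") or "").strip()
    author = (record.get("author") or "").strip()
    puisi = (record.get("puisi") or "").strip()
    # Assumption: a failed extraction leaves at least one field empty.
    if not (title and author and puisi):
        return False
    # Each extracted piece should appear verbatim in the raw text,
    # and the poem body should be the tail of it.
    return title in header and author in header and header.endswith(puisi)


records = [
    {
        "puisi_with_header": "TEPERANGKAP\nOleh Mangku Langit Jingga\nMungkin kau membiarkan aku",
        "title": "TEPERANGKAP",
        "author": "Oleh Mangku Langit Jingga",
        "puisi": "Mungkin kau membiarkan aku",
    },
    {
        # Hypothetical failed row: no recognizable title or author line.
        "puisi_with_header": "Baris tanpa judul\nBaris kedua",
        "title": "",
        "author": "",
        "puisi": "Baris tanpa judul\nBaris kedua",
    },
]

suspect = [r for r in records if not extraction_looks_ok(r)]
print(len(suspect))
```

Rows flagged this way can be reviewed by hand or dropped before training, depending on how much noise the downstream task tolerates.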
Ilham Firdausi Putra
MIT License
[N/A]
Thanks to @ilhamfp for adding this dataset.