Datasets:

poem_sentiment

Languages:

en

Multilinguality:

monolingual

Size Categories:

1K<n<10K

Language Creators:

found

Annotations Creators:

expert-generated

Source Datasets:

original

ArXiv:

arxiv:2011.02686

License:

cc-by-4.0

Dataset Card for Gutenberg Poem Dataset

Dataset Summary

Poem Sentiment is a sentiment dataset of poem verses from Project Gutenberg. This dataset can be used for tasks such as sentiment classification or style transfer for poems.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

The text in the dataset is in English (en).

Dataset Structure

Data Instances

An example instance from the dataset:

{'id': 0, 'label': 2, 'verse_text': 'with pale blue berries. in these peaceful shades--'}

Data Fields

  • id: the index of the example
  • verse_text: the text of the poem verse
  • label: the sentiment label, encoded as follows:
    • 0 = negative
    • 1 = positive
    • 2 = no impact
    • 3 = mixed (both negative and positive)

      Note: The original dataset uses different label indices (negative = -1, no impact = 0, positive = 1).
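The label encoding above can be written down as a small lookup table. The following is a minimal sketch, not part of the dataset's tooling: the names `ID2LABEL`, `ORIGINAL_TO_CARD`, and `describe` are hypothetical and introduced only for illustration. The note gives original indices for three classes only, so the conversion table deliberately leaves out "mixed".

```python
# Label ids as listed in this card.
ID2LABEL = {0: "negative", 1: "positive", 2: "no impact", 3: "mixed"}

# Original-dataset index -> index used in this card, for the three values
# the note states. The original index for "mixed" is not given in the note,
# so it is intentionally omitted here rather than guessed.
ORIGINAL_TO_CARD = {-1: 0, 0: 2, 1: 1}


def describe(example: dict) -> str:
    """Return a readable 'verse -> label name' string for one instance."""
    return f"{example['verse_text']!r} -> {ID2LABEL[example['label']]}"


# The instance shown under Data Instances.
example = {
    "id": 0,
    "label": 2,
    "verse_text": "with pale blue berries. in these peaceful shades--",
}
print(describe(example))
```

Keeping the two mappings separate makes it harder to mix up the indexing of the original release with the one used here, which matters because the value 0 means "no impact" in one and "negative" in the other.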

Data Splits

The dataset is divided into train, validation, and test splits with the following sizes:

                    train  validation  test
Number of examples    892         105   104
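As a quick check on the split sizes above, the short sketch below computes the total number of examples and each split's share of it. Only the three counts come from this card; the variable names are illustrative.

```python
# Split sizes as reported in the table above.
SPLIT_SIZES = {"train": 892, "validation": 105, "test": 104}

total = sum(SPLIT_SIZES.values())
fractions = {name: size / total for name, size in SPLIT_SIZES.items()}

print(f"total examples: {total}")
for name, frac in fractions.items():
    # Left-align the split name, right-align the count, show share as a percentage.
    print(f"{name:<10} {SPLIT_SIZES[name]:>4}  {frac:.1%}")
```

The total comes to 1,101 examples, which is consistent with the 1K<n<10K size category listed in the metadata, and the train split accounts for roughly 81% of the data.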

[More Information Needed]

Dataset Creation

Curation Rationale

[More Information Needed]

Source Data

[More Information Needed]

Initial Data Collection and Normalization

[More Information Needed]

Who are the source language producers?

[More Information Needed]

Annotations

[More Information Needed]

Annotation process

[More Information Needed]

Who are the annotators?

[More Information Needed]

Personal and Sensitive Information

[More Information Needed]

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

[More Information Needed]

Additional Information

Dataset Curators

[More Information Needed]

Licensing Information

This work is licensed under a Creative Commons Attribution 4.0 International License.

Citation Information

@misc{sheng2020investigating,
      title={Investigating Societal Biases in a Poetry Composition System},
      author={Emily Sheng and David Uthus},
      year={2020},
      eprint={2011.02686},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Contributions

Thanks to @patil-suraj for adding this dataset.