Dataset: blended_skill_talk
Tasks: conversational
Sub-tasks: dialogue-generation
Languages: en
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: crowdsourced
Annotation creators: crowdsourced
Source datasets: original
Preprint: arxiv:2004.08449
License: unknown

A dataset of 7k conversations explicitly designed to exhibit multiple conversation modes: displaying personality, having empathy, and demonstrating knowledge.
An example of 'train' looks as follows.
{
    'personas': ['my parents don t really speak english , but i speak italian and english.', 'i have three children.'],
    'additional_context': 'Backstreet Boys',
    'previous_utterance': ['Oh, I am a BIG fan of the Backstreet Boys! Have you ever seen them performing live?', "No,I listen to their music a lot, mainly the unbreakable which is the Backstreet Boys' sixth studio album. "],
    'context': 'wizard_of_wikipedia',
    'free_messages': ['you are very knowledgeable, do you prefer nsync or bsb?', "haha kids of this days don't know them, i'm 46 and i still enjoying them, my kids only listen k-pop", "italian?haha that's strange, i only talk english and a little spanish "],
    'guided_messages': ["i don't have a preference, they are both great. All 3 of my kids get annoyed when I listen to them though.", 'Sometimes I sing their songs in Italian, that really annoys them lol.', 'My parents barely speak English, so I was taught both. By the way, what is k-pop?'],
    'suggestions': {
        'convai2': ["i don't have a preference , both are pretty . do you have any hobbies ?", "do they the backstreet boys ? that's my favorite group .", 'are your kids interested in music ?'],
        'empathetic_dialogues': ['I actually just discovered Imagine Dragons. I love them!', "Hahaha that just goes to show ya, age is just a umber!'", 'That would be hard! Do you now Spanish well?'],
        'wizard_of_wikipedia': ['NSYNC Also had Lance Bass and Joey Fatone, sometimes called the Fat One.', 'Yes, there are a few K-Pop songs that I have heard good big in the USA. It is the most popular in South Korea and has Western elements of pop.', 'English, beleive it or not.']
    },
    'guided_chosen_suggestions': ['convai2', '', ''],
    'label_candidates': []
}
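To make the turn structure of the sample record easier to see, here is a minimal Python sketch that rebuilds the conversation in order. It assumes, based only on the example above, that `previous_utterance` comes first and that `free_messages` and `guided_messages` then alternate, with the free speaker going first. The helper names `build_transcript` and `chosen_sources` are hypothetical, and the record is an abridged copy of the example (only the fields the helpers read).

```python
from typing import Any, Dict, List, Tuple

def build_transcript(record: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (speaker, text) pairs in conversational order.

    Assumption: seed utterances come first, then free and guided
    messages alternate, starting with the free speaker.
    """
    turns = [("context", text) for text in record["previous_utterance"]]
    for free, guided in zip(record["free_messages"], record["guided_messages"]):
        turns.append(("free", free))
        turns.append(("guided", guided))
    return turns

def chosen_sources(record: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Pair each guided message with its entry in guided_chosen_suggestions.

    In the sample, an empty string appears where no suggestion was chosen.
    """
    return list(zip(record["guided_messages"], record["guided_chosen_suggestions"]))

# Abridged copy of the 'train' example shown above.
record = {
    "previous_utterance": [
        "Oh, I am a BIG fan of the Backstreet Boys! Have you ever seen them performing live?",
        "No,I listen to their music a lot, mainly the unbreakable which is the Backstreet Boys' sixth studio album. ",
    ],
    "free_messages": [
        "you are very knowledgeable, do you prefer nsync or bsb?",
        "haha kids of this days don't know them, i'm 46 and i still enjoying them, my kids only listen k-pop",
        "italian?haha that's strange, i only talk english and a little spanish ",
    ],
    "guided_messages": [
        "i don't have a preference, they are both great. All 3 of my kids get annoyed when I listen to them though.",
        "Sometimes I sing their songs in Italian, that really annoys them lol.",
        "My parents barely speak English, so I was taught both. By the way, what is k-pop?",
    ],
    "guided_chosen_suggestions": ["convai2", "", ""],
}

transcript = build_transcript(record)
for speaker, text in transcript:
    print(f"{speaker:>7}: {text}")
```

For this record the transcript has eight turns: two seed utterances followed by three free/guided pairs. Only the first guided reply is tagged with a suggestion source (`convai2`).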
The data fields are the same among all splits.
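As a rough guide to those fields, the sketch below writes down a schema inferred from the single 'train' example above. The names come from that example; the types are assumptions read off its values, not an official specification, and the element type of `label_candidates` is left open because the sample list is empty. `BSTExample` and `missing_fields` are hypothetical names.

```python
from typing import Any, Dict, List, TypedDict

class BSTExample(TypedDict):
    """Field names taken from the sample record; types are inferred."""
    personas: List[str]
    additional_context: str
    previous_utterance: List[str]
    context: str
    free_messages: List[str]
    guided_messages: List[str]
    suggestions: Dict[str, List[str]]   # keyed by source dataset name
    guided_chosen_suggestions: List[str]
    label_candidates: List[Any]         # empty in the sample; element type unknown

EXPECTED_FIELDS = set(BSTExample.__annotations__)

def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the expected field names absent from a record, sorted."""
    return sorted(EXPECTED_FIELDS - set(record))
```

A check such as `missing_fields(record) == []` can serve as a quick sanity test before processing records from any split.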
name | train | validation | test |
---|---|---|---|
default | 4819 | 1009 | 980 |
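The split sizes in the table can be checked against the "7k conversations" figure with a few lines of arithmetic. This is only a worked calculation on the numbers shown; the variable names are illustrative.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"train": 4819, "validation": 1009, "test": 980}

total = sum(SPLIT_SIZES.values())

# Share of each split as a percentage of all conversations.
percentages = {name: round(100 * n / total, 1) for name, n in SPLIT_SIZES.items()}

print(total)        # 6808
print(percentages)  # {'train': 70.8, 'validation': 14.8, 'test': 14.4}
```

The total of 6808 conversations is consistent with the rounded "7k" in the description, with roughly 71% of the data in the training split.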
@misc{smith2020evaluating,
    title={Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills},
    author={Eric Michael Smith and Mary Williamson and Kurt Shuster and Jason Weston and Y-Lan Boureau},
    year={2020},
    eprint={2004.08449},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Thanks to @lewtun, @thomwolf, @lhoestq, @patrickvonplaten, @mariamabarham for adding this dataset.