Datasets: casino
Sub-tasks: dialogue-modeling
Languages: en
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: crowdsourced
Annotations creators: expert-generated
Source datasets: original
License: cc-by-4.0

We provide a novel dataset (referred to as CaSiNo) of 1030 negotiation dialogues. Two participants take the role of campsite neighbors and negotiate for Food, Water, and Firewood packages, based on their individual preferences and requirements. This design keeps the task tractable while still facilitating linguistically rich and personal conversations, which helps to overcome the limitations of prior negotiation datasets such as Deal or No Deal and Craigslist Bargain. Each dialogue comes with rich metadata, including participant demographics, personality, and the participants' subjective evaluation of the negotiation in terms of satisfaction and opponent likeness.
Train end-to-end models for negotiation
English
{
  "chat_logs": [
    {
      "text": "Hello! \ud83d\ude42 Let's work together on a deal for these packages, shall we? What are you most interested in?",
      "task_data": {},
      "id": "mturk_agent_1"
    },
    ...
  ],
  "participant_info": {
    "mturk_agent_1": {
      "value2issue": ...
      "value2reason": ...
      "outcomes": ...
      "demographics": ...
      "personality": ...
    },
    "mturk_agent_2": ...
  },
  "annotations": [
    ["Hello! \ud83d\ude42 Let's work together on a deal for these packages, shall we? What are you most interested in?", "promote-coordination,elicit-pref"],
    ...
  ]
}
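As a rough illustration of how a record in this layout might be read, the sketch below parses a toy dialogue and tallies its strategy labels. The first utterance is taken from the example above; the second utterance, its label, the empty `participant_info` entries, and the helper names `split_labels`, `count_labels`, and `turns_per_speaker` are assumptions made for this example only. It also assumes that each annotation entry is a `[utterance, labels]` pair whose labels are comma-separated, as the example suggests.

```python
import json
from collections import Counter

# Toy record in the layout shown above. Only the first utterance comes from
# the dataset card; everything else is a made-up placeholder.
RAW = r"""
{
  "chat_logs": [
    {"text": "Hello! \ud83d\ude42 Let's work together on a deal for these packages, shall we? What are you most interested in?",
     "task_data": {}, "id": "mturk_agent_1"},
    {"text": "Hi! I would really like some extra water.",
     "task_data": {}, "id": "mturk_agent_2"}
  ],
  "participant_info": {"mturk_agent_1": {}, "mturk_agent_2": {}},
  "annotations": [
    ["Hello! \ud83d\ude42 Let's work together on a deal for these packages, shall we? What are you most interested in?",
     "promote-coordination,elicit-pref"],
    ["Hi! I would really like some extra water.", "self-need"]
  ]
}
"""


def split_labels(label_string):
    """Split a comma-separated label string into a list of clean labels."""
    return [part.strip() for part in label_string.split(",") if part.strip()]


def count_labels(dialogue):
    """Count how often each strategy label occurs in one dialogue."""
    counts = Counter()
    # Dialogues without annotations are treated as having an empty list.
    for _text, label_string in dialogue.get("annotations", []):
        counts.update(split_labels(label_string))
    return counts


def turns_per_speaker(dialogue):
    """Count utterances per participant id in the chat log."""
    return Counter(turn["id"] for turn in dialogue["chat_logs"])


# json.loads decodes the \ud83d\ude42 surrogate pair into a single emoji.
dialogue = json.loads(RAW)
label_counts = count_labels(dialogue)
speaker_turns = turns_per_speaker(dialogue)
```

Because only a subset of the dialogues carries annotations, `count_labels` falls back to an empty list when the `annotations` key is absent; whether unannotated records omit the key or store an empty list is not stated here, so the fallback covers both cases.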
No default data split has been provided. Hence, all 1030 data points are under the 'train' split.
| | Train |
|---|---|
| total dialogues | 1030 |
| annotated dialogues | 396 |
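Since every dialogue sits in the single 'train' split, anyone who needs held-out data has to carve it out themselves. A minimal sketch of one reproducible way to do that is shown here; the 80/10/10 proportions, the seed, and the function name `make_splits` are arbitrary assumptions for illustration, not a split recommended by the dataset authors.

```python
import random


def make_splits(n_items, percents=(80, 10, 10), seed=13):
    """Shuffle item indices with a fixed seed and cut them into three parts.

    `percents` gives the train/validation/test shares as whole percentages.
    Integer arithmetic avoids floating-point rounding surprises; any
    remainder ends up in the test part.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    indices = list(range(n_items))
    # A private Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(indices)
    n_train = n_items * percents[0] // 100
    n_valid = n_items * percents[1] // 100
    return {
        "train": indices[:n_train],
        "validation": indices[n_train:n_train + n_valid],
        "test": indices[n_train + n_valid:],
    }


# 1030 is the number of dialogues reported in the table above.
splits = make_splits(1030)
split_sizes = {name: len(ids) for name, ids in splits.items()}
```

With 1030 dialogues these proportions yield 824, 103, and 103 indices. Fixing the seed makes the partition repeatable, which matters when results are compared across runs.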
The dataset was collected to address the limitations in prior negotiation datasets from the perspective of downstream applications in pedagogy and conversational AI. Please refer to the original paper published at NAACL 2021 for details about the rationale and data curation steps (source paper).
The dialogues were crowdsourced on Amazon Mechanical Turk. The strategy annotations were performed by expert annotators (first three authors of the paper). Please refer to the original dataset paper published at NAACL 2021 for more details (source paper).
Who are the source language producers?
The primary producers are Turkers on the Amazon Mechanical Turk platform. Two Turkers were randomly paired with each other to engage in a negotiation via a chat interface. Please refer to the original dataset paper published at NAACL 2021 for more details (source paper).
From the source paper for this dataset:
Three expert annotators independently annotated 396 dialogues containing 4615 utterances. The annotation guidelines were iterated over a subset of 5 dialogues, while the reliability scores were computed on a different subset of 10 dialogues. We use the nominal form of Krippendorff’s alpha (Krippendorff, 2018) to measure the inter-annotator agreement. We provide the annotation statistics in Table 2. Although we release all the annotations, we skip Coordination and Empathy for our analysis in this work, due to higher subjectivity resulting in relatively lower reliability scores.
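The quoted passage measures agreement with the nominal form of Krippendorff's alpha. To make that statistic concrete, here is a minimal sketch of the standard coincidence-matrix computation, alpha = 1 - D_o / D_e. The function name `krippendorff_alpha_nominal` and the toy ratings are assumptions for this example; this is not the authors' code, and the paper may have applied the measure differently (for instance, separately for each strategy label).

```python
import math
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Nominal Krippendorff's alpha.

    `units` is a list of lists: the labels that the annotators assigned to
    one item. Items with fewer than two labels carry no agreement
    information and are skipped.
    """
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        # Every ordered pair of ratings within an item adds 1 / (m - 1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals n_c and the overall number of pairable values n.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        return float("nan")

    # Observed and expected disagreement, both scaled by n (the common
    # factor cancels in the ratio).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # only one label in use: alpha is undefined
    return 1.0 - observed / expected


# Toy data: two annotators, three items, one disagreement.
toy_units = [["a", "b"], ["a", "a"], ["b", "b"]]
toy_alpha = krippendorff_alpha_nominal(toy_units)
```

For the toy data the observed disagreement is 2 and the expected disagreement is 3.6 in the scaled units used above, so alpha is 1 - 2/3.6, roughly 0.44. Perfect agreement gives 1, and values near 0 indicate agreement no better than chance.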
Who are the annotators?
Three expert annotators (first three authors of the paper).
All personally identifiable information about the participants such as MTurk Ids or HIT Ids was removed before releasing the data.
Please refer to Section 8.2 in the source paper.
Please refer to Section 8.2 in the source paper.
Please refer to Section 7 in the source paper.
Corresponding Author: Kushal Chawla (kchawla@usc.edu)
Affiliation: University of Southern California
Please refer to the source paper for the complete author list.
The project is licensed under CC BY 4.0.
@inproceedings{chawla2021casino,
  title={CaSiNo: A Corpus of Campsite Negotiation Dialogues for Automatic Negotiation Systems},
  author={Chawla, Kushal and Ramirez, Jaysa and Clever, Rene and Lucas, Gale and May, Jonathan and Gratch, Jonathan},
  booktitle={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  pages={3167--3185},
  year={2021}
}
Thanks to Kushal Chawla for adding this dataset.