Dataset:

TalTechNLP/AMIsum

License:

cc-by-4.0

Source datasets:

original

Annotations creators:

expert-generated

Size categories:

n<1K

Multilinguality:

monolingual

Language:

en

Dataset Card for "AMIsum"

Dataset Summary

AMIsum is a meeting summarization dataset based on the AMI Meeting Corpus ( https://groups.inf.ed.ac.uk/ami/corpus/ ). It uses the meeting transcripts as the source data and the abstractive summaries as the target data.

Supported Tasks and Leaderboards

More Information Needed

Languages

English

Dataset Structure

Data Instances

{'transcript': "<PM> Okay. <PM> Right. <PM> Um well this is the kick-off meeting for our our project. <PM> Um and um this is just what we're gonna be doing over the next twenty five minutes. <ME> Mm-hmm. <PM> Um so first of all, just to kind of make sure that we all know each other, I'm Laura and I'm the project manager. <PM> Do you want to introduce yourself again? <ME> Great. [...]", 'summary': 'The project manager introduced the upcoming project to the team members and then the team members participated in an exercise in which they drew their favorite animal and discussed what they liked about the animal. The project manager talked about the project finances and selling prices. The team then discussed various features to consider in making the remote.', 'id': 'ES2002a'}

Data Fields

transcript: Expert-generated transcript.
summary: Expert-generated summary.
id: Meeting id.
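
In the sample instance, each utterance in the transcript field starts with a speaker tag such as <PM> or <ME>. The following is a minimal sketch of splitting such a string into (speaker, utterance) pairs. The helper name split_turns and the tag pattern (uppercase letters in angle brackets) are assumptions inferred from the sample, not something the dataset documents.

```python
import re

# Assumption: speaker tags are uppercase labels in angle brackets,
# as seen in the sample instance (<PM>, <ME>).
SPEAKER_TAG = re.compile(r"<([A-Z]+)>")


def split_turns(transcript):
    """Return a list of (speaker, utterance) pairs from a tagged transcript."""
    # re.split with one capture group yields
    # [prefix, tag1, text1, tag2, text2, ...]
    parts = SPEAKER_TAG.split(transcript)
    turns = []
    for speaker, text in zip(parts[1::2], parts[2::2]):
        text = text.strip()
        if text:  # skip tags that are not followed by any text
            turns.append((speaker, text))
    return turns


sample = "<PM> Okay. <PM> Right. <ME> Mm-hmm."
turns = split_turns(sample)
```

Consecutive turns by the same speaker are kept separate here, mirroring the tagging in the sample. Merging them would be a separate design choice.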

Data Splits

train: 97
validation: 20
test: 20
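
A short sketch of loading the dataset with the Hugging Face datasets library and checking the split sizes against the numbers in this card. The helper names load_amisum and split_size_mismatches are hypothetical, and the sketch assumes the repository exposes splits named train, validation, and test as listed here.

```python
# Split sizes as listed in this card.
EXPECTED_SPLIT_SIZES = {"train": 97, "validation": 20, "test": 20}


def load_amisum():
    """Download AMIsum from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the checker below works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("TalTechNLP/AMIsum")


def split_size_mismatches(splits, expected=EXPECTED_SPLIT_SIZES):
    """Return {split: (expected, actual)} for every split whose size differs.

    `actual` is None when the split is missing entirely.
    """
    mismatches = {}
    for name, want in expected.items():
        got = len(splits[name]) if name in splits else None
        if got != want:
            mismatches[name] = (want, got)
    return mismatches
```

Calling split_size_mismatches(load_amisum()) should return an empty dict if the hosted data matches the sizes above.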