Dataset Card for "xsum_eng2thai"
- This dataset is based on XSum.
- The summaries were translated from English (as in the original XSum) to Thai using Meta's NLLB-200-3.3B.
- The dataset is intended for Cross-Lingual Summarization (English Document -> Thai Summary).
Data Fields
- id: BBC ID of the article.
- document: a string containing the body of the news article.
- summary: a string containing a translated summary of the article.
Data Structure
{
  "id": "29750031",
  "document": "news article in English",
  "summary": "summary in Thai"
}
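As a minimal sketch of working with this layout, the snippet below checks that a record carries the three documented fields as strings. The field names come from this card. The `validate_record` helper and its rules are hypothetical and not part of the dataset or any library, and the record text is the placeholder from the example above.

```python
import json

# Field names are taken from this card; the checks themselves are assumptions.
REQUIRED_FIELDS = ("id", "document", "summary")


def validate_record(record: dict) -> dict:
    """Return the record unchanged if it matches the documented layout."""
    for field in REQUIRED_FIELDS:
        if field not in record:
            raise KeyError(f"missing field: {field}")
        if not isinstance(record[field], str):
            raise TypeError(f"field {field!r} must be a string")
    return record


# Placeholder record mirroring the example in this card.
raw = (
    '{"id": "29750031", '
    '"document": "news article in English", '
    '"summary": "summary in Thai"}'
)
example = validate_record(json.loads(raw))
print(sorted(example))
```

A check like this can be run once after loading, before the English documents and Thai summaries are passed to a tokenizer.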
Data Splits
train/validation/test = 204045/11332/11334
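To put the split sizes in proportion, the short sketch below computes each split's share of the total. The counts are the ones listed above. The `split_shares` helper is a hypothetical name used only for illustration.

```python
# Split sizes as listed in this card.
SPLITS = {"train": 204045, "validation": 11332, "test": 11334}


def split_shares(splits: dict) -> dict:
    """Map each split name to its fraction of all examples."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


total = sum(SPLITS.values())
shares = split_shares(SPLITS)

print(total)  # 226711
for name, share in shares.items():
    # Rounded to one decimal place: roughly 90% / 5% / 5%.
    print(f"{name}: {share * 100:.1f}%")
```

The result is close to a 90/5/5 split across train, validation, and test.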