The Swedish CNN/DailyMail dataset is a machine-translated version of the English CNN/DailyMail dataset; the translation is machine-only. It is intended to improve downstream fine-tuning on Swedish summarization tasks.
For full details, see the original English version: https://huggingface.co/datasets/cnn_dailymail
The Swedish CNN/DailyMail dataset uses the same three splits as the original English version: train, validation, and test.
Dataset Split | Number of Instances in Split |
---|---|
Train | 287,113 |
Validation | 13,368 |
Test | 11,490 |
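
The split sizes in the table above can serve as a quick sanity check after downloading. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `check_split_sizes` is hypothetical, and it assumes only that the loaded object behaves like a mapping from split name to a sized collection (as a `datasets.DatasetDict` does).

```python
# Expected number of instances per split, taken from the table above.
EXPECTED_SPLIT_SIZES = {
    "train": 287_113,
    "validation": 13_368,
    "test": 11_490,
}


def check_split_sizes(splits):
    """Compare a mapping of split name -> sized collection against the table.

    Hypothetical helper for illustration. Returns a dict of
    split name -> (expected, actual) for every split that is missing
    (actual is None) or has a different number of instances.
    """
    mismatches = {}
    for name, expected in EXPECTED_SPLIT_SIZES.items():
        actual = len(splits[name]) if name in splits else None
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


if __name__ == "__main__":
    # Stand-in data so the sketch runs offline; replace with the real dataset,
    # e.g. (assumption: the repository ID must be filled in by the reader):
    #   from datasets import load_dataset
    #   splits = load_dataset("<swedish-dataset-repo-id>")
    splits = {name: range(size) for name, size in EXPECTED_SPLIT_SIZES.items()}
    print(check_split_sizes(splits) or "all split sizes match")
```

An empty result means all three splits are present with the expected number of instances.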