Dataset:

Someman/hindi-summarization

Language:

hi

Size:

10K<n<100K

License:

mit

Dataset Card for Hindi Text Short and Large Summarization Corpus

Dataset Summary

The Hindi Text Short and Large Summarization Corpus is a collection of ~180k articles, with their headlines and summaries, collected from Hindi news websites.

This is a first-of-its-kind dataset in Hindi that can be used to benchmark models for Hindi text summarization. It does not include the articles in the Hindi Text Short Summarization Corpus, which is being released in parallel with this dataset.

The dataset retains the original punctuation, numbers, etc. in the articles.
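
As a rough illustration of how records with an article, a headline, and a summary might be read and inspected, here is a minimal sketch using only the Python standard library. The column names (`article`, `headline`, `summary`) are assumptions, not confirmed by this card, and the sample row is invented text rather than real data; check the header of the actual CSV before relying on any of it.

```python
import csv
import io

# Hypothetical column names: verify them against the header of the real CSV.
ARTICLE_COL, HEADLINE_COL, SUMMARY_COL = "article", "headline", "summary"


def read_records(handle):
    """Yield (article, headline, summary) tuples from an open CSV file handle."""
    reader = csv.DictReader(handle)
    for row in reader:
        yield row[ARTICLE_COL], row[HEADLINE_COL], row[SUMMARY_COL]


def compression_ratio(article, summary):
    """Summary length divided by article length, in whitespace-separated tokens."""
    article_tokens = article.split()
    if not article_tokens:
        return 0.0
    return len(summary.split()) / len(article_tokens)


# Tiny in-memory stand-in for a CSV file (invented sample text, not real data).
sample = io.StringIO(
    "article,headline,summary\n"
    '"यह एक छोटा सा उदाहरण लेख है, जिसमें 2 वाक्य हैं। दूसरा वाक्य यहाँ है।",'
    '"उदाहरण शीर्षक","छोटा उदाहरण लेख"\n'
)

records = list(read_records(sample))
article, headline, summary = records[0]
ratio = compression_ratio(article, summary)
print(headline, round(ratio, 2))
```

Because the articles keep their punctuation and numbers, whitespace splitting leaves marks such as the danda attached to the neighbouring word; a Hindi-aware tokenizer would likely be a better choice for real length statistics.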

Languages

The language is Hindi.

Licensing Information

MIT

Citation Information

https://www.kaggle.com/datasets/disisbig/hindi-text-short-and-large-summarization-corpus?select=test.csv

Contributions