Dataset:
PNLPhub/FarsTail
Persian (Farsi) is a pluricentric language spoken by around 110 million people in countries such as Iran, Afghanistan, and Tajikistan. Here, we present FarsTail, the first relatively large-scale Persian dataset for the natural language inference (NLI) task. A total of 10,367 samples are generated from a collection of 3,539 multiple-choice questions. The train, validation, and test portions include 7,266, 1,537, and 1,564 instances, respectively.
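As a quick sanity check on the split sizes quoted above, the following minimal sketch recomputes the total and the share of each portion. The counts are taken from the description; the names `SPLIT_SIZES`, `split_shares`, and `load_farstail` are hypothetical and introduced only for illustration. The optional loader assumes the Hugging Face `datasets` library is installed and that the repository ID listed above loads with default arguments, which has not been verified here.

```python
# Split sizes as stated in the dataset description.
SPLIT_SIZES = {"train": 7266, "validation": 1537, "test": 1564}
REPORTED_TOTAL = 10367


def split_shares(sizes):
    """Return each split's share of the total, as a percentage rounded to 0.1."""
    total = sum(sizes.values())
    return {name: round(100 * count / total, 1) for name, count in sizes.items()}


def load_farstail():
    """Optional loader (assumption: `datasets` is installed and the repo ID
    loads with default arguments; split and column names are not assumed)."""
    from datasets import load_dataset  # imported lazily so the check above runs without it

    return load_dataset("PNLPhub/FarsTail")


if __name__ == "__main__":
    # The three portions should add up to the reported number of samples.
    assert sum(SPLIT_SIZES.values()) == REPORTED_TOTAL
    for name, share in split_shares(SPLIT_SIZES).items():
        print(f"{name}: {SPLIT_SIZES[name]} instances ({share}%)")
```

The portions add up to the reported 10,367 samples and correspond to roughly a 70/15/15 split.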
@article{amirkhani2020farstail, title={FarsTail: A Persian Natural Language Inference Dataset}, author={Hossein Amirkhani and Mohammad Azari Jafari and Azadeh Amirak and Zohreh Pourjafari and Soroush Faridan Jahromi and Zeinab Kouhkan}, journal={arXiv preprint arXiv:2009.08820}, year={2020} }