Dataset:
KBLab/overlim
The OverLim dataset contains some of the GLUE and SuperGLUE tasks automatically translated to Swedish, Danish, and Norwegian (bokmål), using the OpusMT models for MarianMT.
The translation quality was not manually checked, so the translations may contain errors. Results on these datasets should therefore be interpreted with care.
If you want an easy script to train and evaluate your models, have a look here
The data contains the following tasks from GLUE and SuperGLUE:
Every task has its own set of features, but all tasks share an idx and a label field.
To provide a test split, we repurpose the original validation split as the test split, and divide the original training split into a new training split and a new validation split with an 80-20 distribution.
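The re-splitting described above can be sketched in a few lines of Python. This is only an illustration on toy data, not the script used to build OverLim: the function name `resplit`, the shuffling step, and the seed are assumptions, since the card states only the 80-20 ratio and the reuse of the validation split as the test split.

```python
import random


def resplit(train, validation, val_fraction=0.2, seed=0):
    """Return (new_train, new_validation, new_test) as lists.

    Hypothetical helper: the original validation split becomes the test
    split, and the original training split is divided 80-20.
    """
    rows = list(train)
    # Shuffling with a fixed seed is an assumption; the card does not say
    # whether the training data was shuffled before splitting.
    random.Random(seed).shuffle(rows)
    n_val = int(len(rows) * val_fraction)
    return rows[n_val:], rows[:n_val], list(validation)


# Toy rows that carry only the two fields shared by every task.
original_train = [{"idx": i, "label": i % 2} for i in range(10)]
original_validation = [{"idx": i, "label": 0} for i in range(3)]

new_train, new_validation, new_test = resplit(original_train, original_validation)
print(len(new_train), len(new_validation), len(new_test))  # 8 2 3
```

No labelled test data is needed for this scheme, because every example in the new test split comes from the original, labelled validation split.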
For more information about the individual tasks, see the GLUE (https://gluebenchmark.com) and SuperGLUE (https://super.gluebenchmark.com) websites.
Training non-English models is easy, but there is a lack of evaluation datasets to compare their actual performance.
Who are the source language producers? [More Information Needed]
Who are the annotators? [More Information Needed]
Thanks to @kb-labb for adding this dataset.