Dataset:
aseifert/merlin
Project URL: https://merlin-platform.eu/C_mcorpus.php
Dataset URL: https://clarin.eurac.edu/repository/xmlui/handle/20.500.12124/6
The MERLIN corpus is a written learner corpus for Czech, German, and Italian, designed to illustrate the Common European Framework of Reference for Languages (CEFR) with authentic learner data. The corpus contains learner texts produced in standardized language certification exams covering CEFR levels A1 to C1. The MERLIN annotation scheme covers a wide range of language characteristics, giving researchers concrete examples of learner performance and progress across multiple proficiency levels.