This dataset is composed of the annotations from the AnCora corpus, projected on the Universal Dependencies treebank. We use the POS annotations of this corpus as part of the EvalEs Spanish language benchmark.
POS tagging
The dataset is in Spanish (es-ES).
Three CoNLL-U files.
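Annotations are encoded in plain text files (UTF-8, normalized to NFC, using only the LF character as line break, including an LF character at the end of file) with three types of lines: word lines, which contain the annotation of a word or token in 10 fields separated by single tab characters; blank lines, which mark sentence boundaries; and comment lines, which start with a hash (#).
Word lines contain the following fields: ID (word index), FORM (word form or punctuation symbol), LEMMA (lemma or stem of the word form), UPOS (universal part-of-speech tag), XPOS (language-specific part-of-speech tag), FEATS (morphological features), HEAD (head of the current word), DEPREL (dependency relation to the head), DEPS (enhanced dependency graph), and MISC (any other annotation).
From: https://universaldependencies.org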
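As an illustration of the line types described above, the following is a minimal sketch of reading (FORM, UPOS) pairs for POS tagging. The column names follow the CoNLL-U specification; the function name `read_pos_sentences` and the sample sentence are hypothetical and are not taken from the corpus. The sketch assumes that multiword token ranges (IDs such as `2-3`) and empty nodes (IDs such as `5.1`) should be skipped, since they carry no UPOS tag of their own.

```python
from typing import Iterable, List, Tuple

# The ten tab-separated CoNLL-U columns, in order.
FIELDS = ("ID", "FORM", "LEMMA", "UPOS", "XPOS",
          "FEATS", "HEAD", "DEPREL", "DEPS", "MISC")


def read_pos_sentences(lines: Iterable[str]) -> List[List[Tuple[str, str]]]:
    """Group word lines into sentences of (FORM, UPOS) pairs."""
    sentences: List[List[Tuple[str, str]]] = []
    current: List[Tuple[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            # A blank line marks a sentence boundary.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Comment lines carry sentence-level metadata; ignored here.
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(cols)}")
        token = dict(zip(FIELDS, cols))
        # Assumption: skip multiword token ranges ("2-3") and empty nodes ("5.1").
        if "-" in token["ID"] or "." in token["ID"]:
            continue
        current.append((token["FORM"], token["UPOS"]))
    if current:
        # Tolerate a missing blank line after the last sentence.
        sentences.append(current)
    return sentences


# Hypothetical sample in CoNLL-U layout (not an excerpt from the corpus).
SAMPLE = (
    "# sent_id = example-1\n"
    "# text = Vamos al mar.\n"
    "1\tVamos\tir\tVERB\t_\t_\t0\troot\t_\t_\n"
    "2-3\tal\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "2\ta\ta\tADP\t_\t_\t4\tcase\t_\t_\n"
    "3\tel\tel\tDET\t_\t_\t4\tdet\t_\t_\n"
    "4\tmar\tmar\tNOUN\t_\t_\t1\tobl\t_\t_\n"
    "5\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
)

sentences = read_pos_sentences(SAMPLE.splitlines())
```

To read one of the dataset files, the same function could be given an open file handle, for example `open(path, encoding="utf-8")`, where `path` points to a local copy of a CoNLL-U file.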
The original annotation was done in a constituency framework as a part of the AnCora project at the University of Barcelona. It was converted to dependencies by the Universal Dependencies team and used in the CoNLL 2009 shared task. The CoNLL 2009 version was later converted to HamleDT and to Universal Dependencies.
For more information on the AnCora project, visit the AnCora site.
To learn about Universal Dependencies, visit the webpage https://universaldependencies.org
Who are the source language producers?
For more information on the AnCora corpus and its sources, visit the AnCora site.
For more information on the first AnCora annotation, visit the AnCora site.
Who are the annotators?
For more information on the AnCora annotation team, visit the AnCora site.
No personal or sensitive information is included.
This dataset contributes to the development of language models in Spanish.
This work is licensed under a CC Attribution 4.0 International License.
The following paper must be cited when using this corpus:
Taulé, M., M.A. Martí, M. Recasens (2008) 'AnCora: Multilevel Annotated Corpora for Catalan and Spanish', Proceedings of the 6th International Conference on Language Resources and Evaluation. Marrakesh (Morocco).
To cite the Universal Dependencies project:
Rueter, J. (Creator), Erina, O. (Contributor), Klementeva, J. (Contributor), Ryabov, I. (Contributor), Tyers, F. M. (Contributor), Zeman, D. (Contributor), Nivre, J. (Creator) (15 Nov 2020). Universal Dependencies version 2.7 Erzya JR. Universal Dependencies Consortium.