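Datasets: wkrl/cord
Tasks: token-classification
Sub-tasks: parsing
Languages: en
Multilinguality: monolingual
Size: 1K<n<10K
Language creators: crowdsourced
Annotations creators: crowdsourced
Source datasets: original
License: cc-by-4.0
[More Information Needed]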
[More Information Needed]
[More Information Needed]
{
    "id": datasets.Value("string"),
    "words": datasets.Sequence(datasets.Value("string")),
    "bboxes": datasets.Sequence(datasets.Sequence(datasets.Value("int64"))),
    "labels": datasets.Sequence(datasets.features.ClassLabel(names=_LABELS)),
    "images": datasets.features.Image(),
}
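The schema suggests that `words`, `bboxes`, and `labels` are parallel sequences, one entry per token, with `labels` stored as integer class ids. The following is a minimal sketch of how one record could be unpacked under that reading. It assumes each bounding box holds four integers (the schema only states a sequence of `int64`), and both `EXAMPLE_LABELS` and the sample record are hypothetical stand-ins, since the contents of `_LABELS` are not shown here.

```python
from typing import Dict, List, Tuple

# Hypothetical label names; the real _LABELS list lives in the dataset script.
EXAMPLE_LABELS = ["O", "menu.nm", "menu.price"]


def pair_tokens(record: Dict, label_names: List[str]) -> List[Tuple[str, List[int], str]]:
    """Zip words, boxes, and label ids of one record into (word, box, label) triples."""
    words, bboxes, labels = record["words"], record["bboxes"], record["labels"]
    if not (len(words) == len(bboxes) == len(labels)):
        raise ValueError("words, bboxes and labels must have the same length")
    triples = []
    for word, box, label_id in zip(words, bboxes, labels):
        if len(box) != 4:  # assumption: four coordinates per box
            raise ValueError(f"expected 4 box coordinates, got {len(box)}")
        # ClassLabel values are integers, so map them back to their names.
        triples.append((word, list(box), label_names[label_id]))
    return triples


# Made-up record for illustration only; not taken from the dataset.
sample = {
    "id": "0",
    "words": ["Latte", "4.50"],
    "bboxes": [[10, 20, 80, 40], [200, 20, 250, 40]],
    "labels": [1, 2],
}

triples = pair_tokens(sample, EXAMPLE_LABELS)
for word, box, label in triples:
    print(word, box, label)
```

A record loaded with the Hugging Face `datasets` library is expected to expose the same keys, so the helper could be applied to it by passing the actual label names in place of `EXAMPLE_LABELS`.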
Creative Commons Attribution 4.0 International License
@article{park2019cord,
  title={CORD: A Consolidated Receipt Dataset for Post-OCR Parsing},
  author={Park, Seunghyun and Shin, Seung and Lee, Bado and Lee, Junyeop and Surh, Jaeheung and Seo, Minjoon and Lee, Hwalsuk},
  booktitle={Document Intelligence Workshop at Neural Information Processing Systems},
  year={2019}
}
Thanks to @clovaai for adding this dataset.