Dataset: dali-does/clevr-math
Multilinguality: monolingual
Language creators: machine-generated
Annotation creators: machine-generated
Source datasets: clevr
Preprint: arxiv:2208.05358
Dataset for compositional multimodal mathematical reasoning based on CLEVR.
Loading the data, preprocessing text with CLIP

from transformers import CLIPProcessor
from datasets import load_dataset, DownloadConfig

# force_download=True re-downloads the files even if they are already cached;
# set it to False to reuse the local cache.
dl_config = DownloadConfig(resume_download=True,
                           num_proc=8,
                           force_download=True)

# Load 'general' instance of dataset
dataset = load_dataset('dali-does/clevr-math', download_config=dl_config)

# Load version with only multihop in test data
dataset_multihop = load_dataset('dali-does/clevr-math', 'multihop',
                                download_config=dl_config)

model_path = "openai/clip-vit-base-patch32"
extractor = CLIPProcessor.from_pretrained(model_path)

def transform_tokenize(e):
    # Convert every image to RGB, then tokenize the questions and
    # preprocess the images in a single processor call.
    e['image'] = [image.convert('RGB') for image in e['image']]
    # Padding is a processor argument, not a map() argument; 'max_length'
    # gives every batch the same sequence length.
    return extractor(text=e['question'],
                     images=e['image'],
                     padding='max_length')

dataset = dataset.map(transform_tokenize,
                      batched=True,
                      num_proc=8)

# Keep only the examples generated from subtraction templates
dataset_subtraction = dataset.filter(
    lambda e: e['template'].startswith('subtraction'), num_proc=4)
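The prefix test used in the filter above can also give a per-template breakdown. The following is a minimal sketch that counts examples by template prefix, using plain Python dicts in place of dataset rows. The toy rows, the helper name `count_by_template_prefix`, and every template value other than the `subtraction` prefix are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical rows standing in for dataset examples. Only the 'template'
# field matters here, and the values are illustrative assumptions.
toy_rows = [
    {"id": "0", "template": "subtraction"},
    {"id": "1", "template": "subtraction-multihop"},
    {"id": "2", "template": "addition"},
    {"id": "3", "template": "subtraction"},
]

def count_by_template_prefix(rows, prefixes):
    """Count rows whose template starts with each prefix.

    Prefixes are checked in order and a row is counted for the
    first matching prefix only.
    """
    counts = Counter()
    for row in rows:
        for prefix in prefixes:
            if row["template"].startswith(prefix):
                counts[prefix] += 1
                break
    return counts

counts = count_by_template_prefix(toy_rows, ["subtraction", "addition"])
print(dict(counts))
```

As with `startswith` in the filter, a prefix such as `subtraction` also matches any longer template name that begins with it, so an exact comparison may be preferable when only one template is wanted.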
Leaderboard will be announced at a later date.
The dataset is currently only available in English. To extend the dataset to other languages, the CLEVR templates must be rewritten in the target language.
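To make the idea of rewriting templates concrete, here is a rough sketch of how one question template might be rendered in two languages. It assumes templates are strings with placeholder tokens. The template wording, the `<C>` and `<S>` placeholder names, the Swedish text, and the `render` helper are all hypothetical and are not taken from the CLEVR generation code.

```python
# Hypothetical question templates keyed by language code. The wording and
# the <C>/<S> placeholder names are illustrative assumptions.
TEMPLATES = {
    "en": "Subtract all <C> <S>s. How many objects are left?",
    "sv": "Ta bort alla <C> <S>. Hur många objekt finns kvar?",
}

# Hypothetical per-language vocabularies for the placeholder values.
# Inflected forms are stored directly because plural and agreement rules
# differ between languages.
VOCAB = {
    "en": {"red": "red", "cube": "cube"},
    "sv": {"red": "röda", "cube": "kuber"},
}

def render(lang, color, shape):
    """Fill the color and shape placeholders of the template for `lang`."""
    words = VOCAB[lang]
    text = TEMPLATES[lang]
    return text.replace("<C>", words[color]).replace("<S>", words[shape])

english_question = render("en", "red", "cube")
swedish_question = render("sv", "red", "cube")
print(english_question)
print(swedish_question)
```

The sketch suggests why a translation is more than a string substitution: each target language needs its own template wording and its own inflected vocabulary.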
import datasets

features = datasets.Features(
    {
        "template": datasets.Value("string"),
        "id": datasets.Value("string"),
        "question": datasets.Value("string"),
        "image": datasets.Image(),
        "label": datasets.Value("int64")
    }
)
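As an illustration of the schema above, the sketch below checks a single example against the expected field types using only plain Python. The `schema_problems` helper and the toy example (its question text, id, and label value) are hypothetical. The image is represented by a placeholder object, since the check only verifies that the field is present.

```python
# Expected Python types for the non-image fields, mirroring the schema above.
EXPECTED_TYPES = {
    "template": str,
    "id": str,
    "question": str,
    "label": int,
}

def schema_problems(example):
    """Return a list of problems; an empty list means the example fits."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in example:
            problems.append(f"missing field: {name}")
            continue
        value = example[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    if "image" not in example:
        problems.append("missing field: image")
    return problems

# Hypothetical example; the values are made up for illustration.
toy_example = {
    "template": "subtraction",
    "id": "toy-0",
    "question": "Subtract all red cubes. How many objects are left?",
    "image": object(),  # placeholder for a decoded image
    "label": 3,
}

print(schema_problems(toy_example))
```

A check like this can be run on a few rows after loading to confirm that `label` arrives as an integer count rather than a string.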
The data is divided into three splits: train/val/test.
The data is generated with the code provided with the CLEVR dataset, using Blender and templates constructed by the dataset curators.
Adam Dahlgren Lindström - dali@cs.umu.se
Licensed under Creative Commons Attribution Share Alike 4.0 International (CC BY-SA 4.0).
@misc{https://doi.org/10.48550/arxiv.2208.05358,
  doi = {10.48550/ARXIV.2208.05358},
  url = {https://arxiv.org/abs/2208.05358},
  author = {Lindström, Adam Dahlgren and Abraham, Savitha Sam},
  keywords = {Machine Learning (cs.LG), Computation and Language (cs.CL), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences, I.2.7; I.2.10; I.2.6; I.4.8; I.1.4},
  title = {CLEVR-Math: A Dataset for Compositional Language, Visual, and Mathematical Reasoning},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution Share Alike 4.0 International}
}
Thanks to @dali-does for adding this dataset.