Datasets:

flaviagiammarino/vqa-rad

Languages:

en

Size:

1K<n<10K

Other:

medical

License:

cc0-1.0

Dataset Card for VQA-RAD

Dataset Description

VQA-RAD is a dataset of question-answer pairs on radiology images, intended for training and testing Medical Visual Question Answering (VQA) systems. It includes both open-ended questions and binary "yes/no" questions. The images come from MedPix, a free, open-access online database of medical images. The question-answer pairs were manually generated by a team of clinicians.

  • Homepage: Open Science Framework Homepage
  • Paper: A dataset of clinically generated visual questions and answers about radiology images
  • Leaderboard: Papers with Code Leaderboard

Dataset Summary

The dataset was downloaded from the Open Science Framework Homepage on June 3, 2023. As downloaded, it contains 2,248 question-answer pairs and 315 images. Of the 315 images, 314 are referenced by at least one question-answer pair and 1 is not used. The training set contains 3 duplicate image-question-answer triplets and also shares 1 image-question-answer triplet with the test set. After dropping these 4 triplets from the training set, the dataset contains 2,244 question-answer pairs on 314 images.
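The clean-up described above can be sketched in plain Python. This is a minimal illustration, not the script that was used to build the dataset: it assumes each sample is a hashable `(image_id, question, answer)` tuple, where `image_id` is a hypothetical stand-in for the image, and the helper name `clean_training_set` is invented here.

```python
def clean_training_set(train, test):
    """Drop repeated triplets from `train`, plus any triplet also found in `test`.

    Each sample is assumed to be a hashable (image_id, question, answer) tuple;
    `image_id` is a hypothetical stand-in for the actual image.
    """
    test_triplets = set(test)
    seen = set()
    cleaned = []
    for triplet in train:
        if triplet in seen or triplet in test_triplets:
            continue  # repeated within train, or shared with the test set
        seen.add(triplet)
        cleaned.append(triplet)
    return cleaned


# Toy data for illustration only, not taken from VQA-RAD.
train = [
    ("img1", "is there a fracture?", "no"),
    ("img1", "is there a fracture?", "no"),     # duplicate within train
    ("img2", "what modality is this?", "ct"),
    ("img3", "is the heart enlarged?", "yes"),  # also present in test
]
test = [("img3", "is the heart enlarged?", "yes")]

cleaned = clean_training_set(train, test)
print(len(train) - len(cleaned))  # number of triplets dropped
```

The first occurrence of a repeated triplet is kept, and the test set is left untouched, which matches the description of removing triplets from the training set only.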

Supported Tasks and Leaderboards

This dataset has an active leaderboard on Papers with Code where models are ranked based on three metrics: "Close-ended Accuracy", "Open-ended accuracy" and "Overall accuracy". "Close-ended Accuracy" is the accuracy of a model's generated answers for the subset of binary "yes/no" questions. "Open-ended accuracy" is the accuracy of a model's generated answers for the subset of open-ended questions. "Overall accuracy" is the accuracy of a model's generated answers across all questions.
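The three metrics can be sketched as follows. This is a rough illustration under two assumptions that the card does not specify: a question counts as closed-ended when its reference answer is "yes" or "no", and a prediction counts as correct when it matches the reference exactly after lowercasing and trimming whitespace. Leaderboard submissions may use different matching rules, and the function name `vqa_accuracies` is hypothetical.

```python
def vqa_accuracies(predictions, references):
    """Return closed-ended, open-ended and overall accuracy.

    Assumptions (not specified by the dataset card): a question is closed-ended
    when its reference answer is "yes" or "no", and a prediction is correct
    when it equals the reference after lowercasing and trimming whitespace.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")

    def normalize(text):
        return text.strip().lower()

    counts = {"closed": [0, 0], "open": [0, 0]}  # [correct, total] per subset
    for pred, ref in zip(predictions, references):
        ref_norm = normalize(ref)
        kind = "closed" if ref_norm in {"yes", "no"} else "open"
        counts[kind][0] += int(normalize(pred) == ref_norm)
        counts[kind][1] += 1

    def ratio(correct, total):
        return correct / total if total else float("nan")

    correct_all = counts["closed"][0] + counts["open"][0]
    total_all = counts["closed"][1] + counts["open"][1]
    return {
        "closed": ratio(*counts["closed"]),
        "open": ratio(*counts["open"]),
        "overall": ratio(correct_all, total_all),
    }


# Toy answers for illustration only.
references = ["yes", "no", "axial", "left lung", "mri"]
predictions = ["Yes", "yes", "axial", "right lung", "MRI "]
scores = vqa_accuracies(predictions, references)
print(scores)
```

Exact matching is strict for open-ended answers, so a real evaluation may prefer a more forgiving comparison; that choice is outside what the card describes.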

Languages

The question-answer pairs are in English.

Dataset Structure

Data Instances

Each instance consists of an image-question-answer triplet.

{
  'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=566x555>,
  'question': 'are regions of the brain infarcted?',
  'answer': 'yes'
}

Data Fields

  • 'image': the image referenced by the question-answer pair.
  • 'question': the question about the image.
  • 'answer': the expected answer.
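One way to load the data and inspect these fields is through the Hugging Face `datasets` library, using the repository ID `flaviagiammarino/vqa-rad`. The sketch below assumes that the `datasets` package is installed, that network access is available, and that the splits are named `train` and `test`. Both `summarize_instance` and `load_vqa_rad` are hypothetical helper names.

```python
def summarize_instance(example):
    """Return a short description of one image-question-answer triplet."""
    image = example["image"]
    # PIL images expose `.size` as (width, height); fall back to None otherwise.
    size = getattr(image, "size", None)
    return {
        "image_size": size,
        "question": example["question"],
        "answer": example["answer"],
    }


def load_vqa_rad(split="train"):
    """Download one split from the Hugging Face Hub.

    Requires the `datasets` package and network access; the split names
    "train" and "test" are assumed here.
    """
    from datasets import load_dataset  # lazy import keeps the helper above dependency-free
    return load_dataset("flaviagiammarino/vqa-rad", split=split)


# Usage (downloads data, so it is left commented out):
# dataset = load_vqa_rad("train")
# print(summarize_instance(dataset[0]))
```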

Data Splits

The dataset is split into a training set and a test set. The split is provided directly by the authors.

|        | Training Set | Test Set |
|--------|--------------|----------|
| QAs    | 1,793        | 451      |
| Images | 313          | 203      |
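The counts in this card can be cross-checked with a few lines of arithmetic. The question-answer totals follow directly from the figures quoted here. The image overlap is an inference, not a statement from the authors: it assumes that the per-split image counts are counts of unique images and that each of the 314 used images appears in at least one split.

```python
# Figures quoted in this card.
raw_pairs = 2248
dropped = 3 + 1  # duplicates in the training set + triplet shared with the test set
train_qas, test_qas = 1793, 451
train_images, test_images = 313, 203
used_images = 314

# The cleaned total matches the sum of the two splits.
assert raw_pairs - dropped == train_qas + test_qas == 2244

# Inclusion-exclusion under the assumptions stated above: the number of
# images that would have to appear in both splits.
shared_images = train_images + test_images - used_images
print(shared_images)  # 202
```

Under those assumptions, most test images also appear in the training set with different questions, which is worth keeping in mind when interpreting test accuracy.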

Additional Information

Licensing Information

The authors have released the dataset under the CC0 1.0 Universal License.

Citation Information

@article{lau2018dataset,
    title={A dataset of clinically generated visual questions and answers about radiology images},
    author={Lau, Jason J and Gayen, Soumya and Ben Abacha, Asma and Demner-Fushman, Dina},
    journal={Scientific data},
    volume={5},
    number={1},
    pages={1--10},
    year={2018},
    publisher={Nature Publishing Group}
}