Dataset:

atasoglu/flickr8k-dataset

You must download the dataset files manually. You can visit this page or run download.sh to get the files.

Afterwards, you can load the dataset by passing the directory that holds the downloaded files as data_dir:

import datasets
ds = datasets.load_dataset("atasoglu/flickr8k-dataset", data_dir="data")
print(ds)
DatasetDict({
    train: Dataset({
        features: ['image_id', 'image_path', 'captions'],
        num_rows: 6000
    })
    test: Dataset({
        features: ['image_id', 'image_path', 'captions'],
        num_rows: 1000
    })
    validation: Dataset({
        features: ['image_id', 'image_path', 'captions'],
        num_rows: 1000
    })
})
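
Each row carries an image_id, an image_path, and a captions column. For caption training it is often convenient to have one row per (image, caption) pair instead. The following is a minimal sketch of that flattening, assuming captions is a list of strings for each image (the feature types are not shown in the output above). The helper name flatten_captions and the sample records are hypothetical and only illustrate the shape of the data.

```python
from typing import Iterable, Iterator, Mapping


def flatten_captions(records: Iterable[Mapping]) -> Iterator[dict]:
    """Yield one row per (image, caption) pair.

    Assumes each record has 'image_id', 'image_path', and a list-like
    'captions' field, matching the column names printed above.
    """
    for record in records:
        for caption in record["captions"]:
            yield {
                "image_id": record["image_id"],
                "image_path": record["image_path"],
                "caption": caption,
            }


# Hypothetical records that mimic the three columns of the dataset.
sample = [
    {
        "image_id": "img_001",
        "image_path": "data/images/img_001.jpg",
        "captions": ["A dog runs.", "A dog on grass."],
    },
    {
        "image_id": "img_002",
        "image_path": "data/images/img_002.jpg",
        "captions": ["A child smiles."],
    },
]

pairs = list(flatten_captions(sample))
print(len(pairs))  # one row per caption
```

With the real dataset, the same helper should work on a split such as ds["train"], since iterating over a split yields one dict per row. If captions turns out to be stored differently, the inner loop would need to be adapted.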

I don't own the copyright of the images. Please visit for more.