Dataset: BeIR/hotpotqa-generated-queries
BEIR is a heterogeneous benchmark built from 18 diverse datasets representing 9 information retrieval tasks: fact checking, citation prediction, duplicate question retrieval, argument retrieval, news retrieval, question answering, tweet retrieval, bio-medical IR, and entity retrieval.
All of these datasets have been preprocessed and can be used directly in your experiments.
The dataset supports a leaderboard that evaluates models on task-specific metrics such as F1 or Exact Match (EM), as well as on their ability to retrieve supporting facts from Wikipedia.
The current best-performing models can be found here.
All tasks are in English (`en`).
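As a quick sketch, the query-passage pairs in this repository can be loaded with the Hugging Face `datasets` library; the split and field names below are assumptions, so inspect the printed schema:

```python
# A minimal sketch of loading this dataset with the Hugging Face `datasets`
# library (pip install datasets). Split name and columns are assumptions;
# print the dataset object to inspect the actual schema.
from datasets import load_dataset

pairs = load_dataset("BeIR/hotpotqa-generated-queries", split="train")
print(pairs)     # column names and number of rows
print(pairs[0])  # one synthetic (query, passage) example
```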
All BEIR datasets must contain a corpus, queries, and qrels (a relevance-judgments file), in the following format.
A high-level example of any BEIR dataset:
```python
corpus = {
    "doc1": {
        "title": "Albert Einstein",
        "text": "Albert Einstein was a German-born theoretical physicist who developed the theory of relativity, \
one of the two pillars of modern physics (alongside quantum mechanics). His work is also known for \
its influence on the philosophy of science. He is best known to the general public for his mass–energy \
equivalence formula E = mc2, which has been dubbed 'the world's most famous equation'. He received the 1921 \
Nobel Prize in Physics 'for his services to theoretical physics, and especially for his discovery of the law \
of the photoelectric effect', a pivotal step in the development of quantum theory.",
    },
    "doc2": {
        "title": "",  # Keep title an empty string if not present
        "text": "Wheat beer is a top-fermented beer which is brewed with a large proportion of wheat relative to the amount of \
malted barley. The two main varieties are German Weißbier and Belgian witbier; other types include Lambic (made \
with wild yeast), Berliner Weisse (a cloudy, sour beer), and Gose (a sour, salty beer).",
    },
}

queries = {
    "q1": "Who developed the mass-energy equivalence formula?",
    "q2": "Which beer is brewed with a large proportion of wheat?",
}

qrels = {
    "q1": {"doc1": 1},
    "q2": {"doc2": 1},
}
```
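A hedged sketch of materializing exactly these three dictionaries with the `beir` package's loader, assuming the download URL pattern used in the BEIR repository:

```python
# Sketch: load a BEIR dataset into the corpus/queries/qrels dictionaries
# shown above using the `beir` package (pip install beir). The URL follows
# the pattern from the BEIR repository; "datasets" is a local output dir.
from beir import util
from beir.datasets.data_loader import GenericDataLoader

url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/hotpotqa.zip"
data_path = util.download_and_unzip(url, "datasets")

# Available splits follow the Type column in the table below (train/dev/test).
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")
```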
The BEIR benchmark includes the following datasets (Rel D/Q is the average number of relevant documents per query):
Dataset | Website | BEIR-Name | Type | Queries | Corpus | Rel D/Q | Download | md5 |
---|---|---|---|---|---|---|---|---|
MSMARCO | Homepage | msmarco | train dev test | 6,980 | 8.84M | 1.1 | Link | 444067daf65d982533ea17ebd59501e4 |
TREC-COVID | Homepage | trec-covid | test | 50 | 171K | 493.5 | Link | ce62140cb23feb9becf6270d0d1fe6d1 |
NFCorpus | Homepage | nfcorpus | train dev test | 323 | 3.6K | 38.2 | Link | a89dba18a62ef92f7d323ec890a0d38d |
BioASQ | Homepage | bioasq | train test | 500 | 14.91M | 8.05 | No | How to Reproduce? |
NQ | Homepage | nq | train test | 3,452 | 2.68M | 1.2 | Link | d4d3d2e48787a744b6f6e691ff534307 |
HotpotQA | Homepage | hotpotqa | train dev test | 7,405 | 5.23M | 2.0 | Link | f412724f78b0d91183a0e86805e16114 |
FiQA-2018 | Homepage | fiqa | train dev test | 648 | 57K | 2.6 | Link | 17918ed23cd04fb15047f73e6c3bd9d9 |
Signal-1M(RT) | Homepage | signal1m | test | 97 | 2.86M | 19.6 | No | How to Reproduce? |
TREC-NEWS | Homepage | trec-news | test | 57 | 595K | 19.6 | No | How to Reproduce? |
ArguAna | Homepage | arguana | test | 1,406 | 8.67K | 1.0 | Link | 8ad3e3c2a5867cdced806d6503f29b99 |
Touche-2020 | Homepage | webis-touche2020 | test | 49 | 382K | 19.0 | Link | 46f650ba5a527fc69e0a6521c5a23563 |
CQADupstack | Homepage | cqadupstack | test | 13,145 | 457K | 1.4 | Link | 4e41456d7df8ee7760a7f866133bda78 |
Quora | Homepage | quora | dev test | 10,000 | 523K | 1.6 | Link | 18fb154900ba42a600f84b839c173167 |
DBPedia | Homepage | dbpedia-entity | dev test | 400 | 4.63M | 38.2 | Link | c2a39eb420a3164af735795df012ac2c |
SCIDOCS | Homepage | scidocs | test | 1,000 | 25K | 4.9 | Link | 38121350fc3a4d2f48850f6aff52e4a9 |
FEVER | Homepage | fever | train dev test | 6,666 | 5.42M | 1.2 | Link | 5a818580227bfb4b35bb6fa46d9b6c03 |
Climate-FEVER | Homepage | climate-fever | test | 1,535 | 5.42M | 3.0 | Link | 8b66f0a9126c521bae2bde127b4dc99d |
SciFact | Homepage | scifact | train test | 300 | 5K | 1.1 | Link | 5f7d1de60b170fc8027bb7898e2efca1 |
Robust04 | Homepage | robust04 | test | 249 | 528K | 69.9 | No | How to Reproduce? |
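Given retrieval results for any of these datasets, they can be scored against the qrels with BEIR's evaluation helper; a minimal sketch with toy scores (the `results` values below are made up):

```python
# Sketch: score retriever output against qrels with the `beir` package.
# `results` maps each query ID to {doc_id: similarity score}; the numbers
# here are toy values standing in for a real retriever's output.
from beir.retrieval.evaluation import EvaluateRetrieval

qrels = {"q1": {"doc1": 1}}
results = {"q1": {"doc1": 0.92, "doc2": 0.41}}

# Returns nDCG@k, MAP@k, Recall@k, and Precision@k dictionaries.
ndcg, _map, recall, precision = EvaluateRetrieval.evaluate(qrels, results, k_values=[1, 10])
print(ndcg)
```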
Curation Rationale: [Needs More Information]
Initial Data Collection and Normalization: [Needs More Information]
Who are the source language producers? [Needs More Information]
Annotation process: [Needs More Information]
Who are the annotators? [Needs More Information]
Personal and Sensitive Information: [Needs More Information]
Social Impact of Dataset: [Needs More Information]
Discussion of Biases: [Needs More Information]
Other Known Limitations: [Needs More Information]
Dataset Curators: [Needs More Information]
Licensing Information: [Needs More Information]
Cite as:
```bibtex
@inproceedings{thakur2021beir,
    title     = {{BEIR}: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models},
    author    = {Nandan Thakur and Nils Reimers and Andreas R{\"u}ckl{\'e} and Abhishek Srivastava and Iryna Gurevych},
    booktitle = {Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)},
    year      = {2021},
    url       = {https://openreview.net/forum?id=wCu6T5xFjeJ}
}
```
Thanks to @Nthakur20 for adding this dataset.