Dataset:

mediabiasgroup/mbib-base

Languages:

en

Dataset Card for Media-Bias-Identification-Benchmark

Baseline

| Task | Model (best Micro / best Macro) | Micro F1 | Macro F1 |
|---|---|---|---|
| cognitive-bias | ConvBERT/ConvBERT | 0.7126 | 0.7664 |
| fake-news | Bart/RoBERTa-T | 0.6811 | 0.7533 |
| gender-bias | RoBERTa-T/ELECTRA | 0.8334 | 0.8211 |
| hate-speech | RoBERTa-T/Bart | 0.8897 | 0.7310 |
| linguistic-bias | ConvBERT/Bart | 0.7044 | 0.4995 |
| political-bias | ConvBERT/ConvBERT | 0.7041 | 0.7110 |
| racial-bias | ConvBERT/ELECTRA | 0.8772 | 0.6170 |
| text-level-bias | ConvBERT/ConvBERT | 0.7697 | 0.7532 |
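The table reports both micro- and macro-averaged F1 for each task. As a reminder of how the two averages differ (micro pools true/false positives across classes, while macro averages per-class F1 and so weights rare classes more heavily), here is a minimal sketch of both computations; the function name and interface are illustrative, not part of the benchmark code:

```python
from collections import Counter

def f1_scores(y_true, y_pred, labels=(0, 1)):
    """Micro- and macro-averaged F1 for single-label predictions (illustrative)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # predicted p, but was t
            fn[t] += 1          # missed the true class t
    # Macro F1: unweighted mean of per-class F1 scores
    per_class = []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        per_class.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    macro = sum(per_class) / len(per_class)
    # Micro F1: pool the counts first; for single-label tasks this equals accuracy
    total_tp, total_fp, total_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    denom = 2 * total_tp + total_fp + total_fn
    micro = 2 * total_tp / denom if denom else 0.0
    return micro, macro
```

This is why a task like racial-bias can score 0.8772 micro but only 0.6170 macro: the model does well on the majority class while per-class performance is uneven.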

Languages

All datasets are in English.

Dataset Structure

Data Instances

cognitive-bias

An example of a training instance looks as follows.

{
  "text": "A defense bill includes language that would require military hospitals to provide abortions on demand",
  "label": 1
}

Data Fields

  • text: a sentence from various sources (e.g., news articles, Twitter, other social media).
  • label: binary indicator of bias (0 = unbiased, 1 = biased).
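The schema above can be sketched as a small helper that checks a record and renders the label as text; the `LABEL_NAMES` mapping and `validate_instance` function are illustrative assumptions, not part of the dataset's own tooling:

```python
# Assumed mapping, taken from the field description above (0 = unbiased, 1 = biased)
LABEL_NAMES = {0: "unbiased", 1: "biased"}

def validate_instance(instance):
    """Check a record against the text/label schema and return a readable view."""
    assert isinstance(instance["text"], str) and instance["text"], "text must be non-empty"
    assert instance["label"] in LABEL_NAMES, "label must be 0 or 1"
    return {"text": instance["text"], "label": LABEL_NAMES[instance["label"]]}

example = {
    "text": "A defense bill includes language that would require military hospitals "
            "to provide abortions on demand",
    "label": 1,
}
```

Calling `validate_instance(example)` would map the integer label to the string "biased" while leaving the text unchanged.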

Considerations for Using the Data

Social Impact of Dataset

We believe that MBIB offers a new common ground for research in the domain, especially given the rising amount of (research) attention directed toward media bias.

Citation Information

@inproceedings{
    title = {Introducing MBIB - the first Media Bias Identification Benchmark Task and Dataset Collection},
    author = {Wessel, Martin and Spinde, Timo and Horych, Tomáš and Ruas, Terry and Aizawa, Akiko and Gipp, Bela},
    year = {2023},
    note = {[in review]}
}