Automated Fact-Checking Resources

Overview

This repo contains relevant resources from our survey paper A Survey on Automated Fact-Checking accepted by TACL 2021. In this survey, we present a comprehensive and up-to-date overview of automated fact-checking, unifying various components and definitions developed in previous research into a common framework. As automated fact-checking research is evolving, we will provide timely updates to the survey and this repo.

Task Definition
Datasets
- Claim Detection
- Factual Verification
  - Natural Claims
  - Artificial Claims
Shared Tasks
Models
Relevant Surveys
Related Tasks
- Misinformation and Disinformation
- Detecting Previous Claims
Tutorials

Task Definition

The figure below shows an NLP framework for automated fact-checking consisting of three stages:

Claim detection to identify claims that require verification;
Evidence retrieval to find sources supporting or refuting the claim;
Claim verification to assess the veracity of the claim based on the retrieved evidence.

Evidence retrieval and claim verification are sometimes tackled as a single task referred to as factual verification, while claim detection is often tackled separately. Claim verification can be decomposed into two parts that can be tackled separately or jointly: verdict prediction, where claims are assigned truthfulness labels, and justification production, where explanations for verdicts must be produced.
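The three stages above can be sketched as a small pipeline. This is only an illustrative toy, not a method from the survey: every function name here (`detect_claims`, `retrieve_evidence`, `verify_claim`, `fact_check`) is hypothetical, and the heuristics (digit-based claim detection, token-overlap retrieval, number matching for verdicts) are simple stand-ins for the learned models listed in the sections below.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Verdict:
    claim: str
    label: str          # "supported", "refuted", or "not enough info"
    justification: str  # short explanation for the label


def tokenize(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def detect_claims(sentences: List[str]) -> List[str]:
    """Stage 1 (toy assumption): a sentence is check-worthy if it contains a digit."""
    return [s for s in sentences if any(ch.isdigit() for ch in s)]


def retrieve_evidence(claim: str, corpus: List[str], k: int = 1) -> List[str]:
    """Stage 2 (toy assumption): rank documents by token overlap with the claim."""
    claim_tokens = tokenize(claim)
    scored = [(len(claim_tokens & tokenize(doc)), doc) for doc in corpus]
    relevant = [(score, doc) for score, doc in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:k]]


def verify_claim(claim: str, evidence: List[str]) -> Verdict:
    """Stage 3: verdict prediction plus justification production (toy number matching)."""
    if not evidence:
        return Verdict(claim, "not enough info", "No relevant evidence was retrieved.")
    claim_numbers = set(re.findall(r"\d+", claim))
    evidence_numbers = set(re.findall(r"\d+", " ".join(evidence)))
    label = "supported" if claim_numbers <= evidence_numbers else "refuted"
    return Verdict(claim, label, f"Compared against evidence: {evidence[0]}")


def fact_check(sentences: List[str], corpus: List[str]) -> List[Verdict]:
    """Run claim detection, evidence retrieval, and claim verification in sequence."""
    return [verify_claim(c, retrieve_evidence(c, corpus)) for c in detect_claims(sentences)]


if __name__ == "__main__":
    corpus = ["The Eiffel Tower is 330 metres tall and opened in 1889."]
    sentences = ["What a lovely day.", "The Eiffel Tower opened in 1900."]
    for verdict in fact_check(sentences, corpus):
        print(verdict.label, "|", verdict.claim, "|", verdict.justification)
```

Keeping retrieval and verification as separate functions mirrors the decomposition described above; a system that treats them jointly as factual verification would merge `retrieve_evidence` and `verify_claim` into one component.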

Datasets

Claim Detection Dataset

Towards Automated Factchecking: Developing an Annotation Schema and Benchmark for Consistent Automated Claim Detection (Konstantinovskiy et al., 2021)
[Paper]
The CLEF-2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News (Nakov et al., 2021)
[Paper]
[Dataset]
Mining Dual Emotion for Fake News Detection (Zhang et al., 2021)
[Paper]
[Dataset]
Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media (Barrón-Cedeño et al., 2020)
[Paper]
[Dataset]
Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability (Redi et al., 2019)
[Paper]
[Dataset]
SemEval-2019 Task 7: RumourEval, Determining Rumour Veracity and Support for Rumours (Gorrell et al., 2019).
[Paper]
[Dataset]
Joint Rumour Stance and Veracity (Lillie et al., 2019)
[Paper]
[Dataset]
Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 1: Check-Worthiness (Atanasova et al., 2018)
[Paper]
[Dataset]
Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter (Volkova et al., 2017)
[Paper]
[Dataset]
A Context-Aware Approach for Detecting Worth-Checking Claims in Political Debates (Gencheva et al., 2017)
[Paper]
[Dataset]
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours (Derczynski et al., 2017).
[Paper]
[Dataset]
Detecting Rumors from Microblogs with Recurrent Neural Networks (Ma et al., 2016)
[Paper]
[Dataset]
Analysing How People Orient to and Spread Rumours in Social Media by Looking at Conversational Threads (Zubiaga et al., 2016).
[Paper]
[Dataset]
CREDBANK: A Large-Scale Social Media Corpus with Associated Credibility Annotations (Mitra and Gilbert, 2015).
[Paper]
[Dataset]
Detecting Check-worthy Factual Claims in Presidential Debates (Hassan et al., 2015)
[Paper]

Factual Verification Dataset

Natural Claims

COVID-Fact: Fact Extraction and Verification of Real-World Claims on COVID-19 Pandemic (Saakyan et al., 2021)
[Paper]
[Dataset]
X-FACT: A New Benchmark Dataset for Multilingual Fact Checking (Gupta and Srikumar, 2021)
[Paper]
[Dataset]
Explainable Automated Fact-Checking for Public Health Claims (Kotonya and Toni, 2020b)
[Paper]
[Dataset]
Fact or Fiction: Verifying Scientific Claims (Wadden et al., 2020).
[Paper]
[Dataset]
CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims (Diggelmann et al., 2020)
[Paper]
[Dataset]
A Richly Annotated Corpus for Different Tasks in Automated Fact-Checking (Hanselowski et al., 2019).
[Paper]
[Code]
[Dataset]
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims (Augenstein et al., 2019).
[Paper]
[Dataset]
FakeNewsNet: A Data Repository with News Content, Social Context and Spatialtemporal Information for Studying Fake News on Social Media (Shu et al., 2018).
[Paper]
[Dataset]
Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 2: Factuality (Barrón-Cedeño et al., 2018)
[Paper]
[Dataset]
Integrating Stance Detection and Fact Checking in a Unified Corpus (Baly et al., 2018).
[Paper]
[Dataset]
Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking (Rashkin et al., 2017).
[Paper]
[Dataset]
“Liar, Liar Pants on Fire”: A New Benchmark Dataset for Fake News Detection (Wang, 2017).
[Paper]
[Dataset]
Credibility Assessment of Textual Claims on the Web (Popat et al., 2016)
[Paper]
[Dataset]
Emergent: a novel data-set for stance classification (Ferreira and Vlachos, 2016)
[Paper]
[Dataset]
Identification and Verification of Simple Claims about Statistical Properties (Vlachos and Riedel, 2015)
[Paper]
[Dataset]
Fact Checking: Task definition and dataset construction (Vlachos and Riedel, 2014)
[Paper]
[Dataset]
Verification and Implementation of Language-Based Deception Indicators in Civil and Criminal Narratives (Bachenko et al., 2008)
[Paper]
AnswerFact: Fact Checking in Product Question Answering (Zhang et al., 2020)
[Paper]
[Dataset]
Fact Checking in Community Forums (Mihaylova et al., 2018)
[Paper]
[Dataset]
FakeCovid-- A Multilingual Cross-domain Fact Check News Dataset for COVID-19 (Shahi and Nandini, 2020).
[Paper]
[Dataset]
FakeNewsNet: A Data Repository with News Content, Social Context and Spatialtemporal Information for Studying Fake News on Social Media (Shu et al., 2020).
[Paper]
[Dataset]
FA-KES: A Fake News Dataset around the Syrian War (Salem et al., 2019)
[Paper]
[Dataset]
A News Veracity Dataset with Facebook User Commentary and Egos (Santia and Williams, 2018)
[Paper]
[Dataset]
A Stylometric Inquiry into Hyperpartisan and Fake News (Potthast et al., 2018)
[Paper]
[Dataset]
Sampling the News Producers: A Large News and Feature Data Set for the Study of the Complex Media Landscape (Horne et al., 2018)
[Paper]
[Dataset]
r/Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection (Nakamura et al., 2020).
[Paper]
[Dataset]
Fact-Checking Meets Fauxtography: Verifying Claims About Images (Zlatkova et al., 2019)
[Paper]
[Dataset]

Artificial Claims

FEVEROUS: Fact Extraction and VERification Over Unstructured and Structured information (Aly et al., 2021)
[Paper]
[Dataset]
[Code]
Statement Verification and Evidence Finding with Tables (SEM-TAB-FACT) (Wang et al., 2021)
[Dataset]
ParsFEVER: a Dataset for Farsi Fact Extraction and Verification (Zarharan et al., 2021)
[Paper]
[Dataset]
TabFact: A Large-scale Dataset for Table-based Fact Verification (Chen et al., 2020).
[Paper]
[Dataset]
INFOTABS: Inference on Tables as Semi-structured Data (Gupta et al., 2020)
[Paper]
[Dataset]
Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence (Schuster et al., 2021)
[Paper]
[Dataset]
HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification (Jiang et al., 2020)
[Paper]
[Dataset]
DanFEVER: claim verification dataset for Danish (Nørregaard and Derczynski, 2021)
[Paper]
[Dataset]
Stance Prediction and Claim Verification: An Arabic Perspective (Khouja, 2020)
[Paper]
[Dataset]
Automated Fact-Checking of Claims from Wikipedia (Sathe et al., 2020).
[Paper]
[Dataset]
FEVER: a Large-scale Dataset for Fact Extraction and VERification (Thorne et al., 2018).
[Paper]
[Dataset]
Automatic Detection of Fake News (Pérez-Rosas et al., 2018)
[Paper]
[Dataset]
The Lie Detector: Explorations in the Automatic Recognition of Deceptive Language (Mihalcea and Strapparava, 2009)
[Paper]
Unsupervised Fact Checking by Counter-Weighted Positive and Negative Evidential Paths in A Knowledge Graph (Kim and Choi, 2020)
[Paper]
Finding Streams in Knowledge Graphs to Support Fact Checking (Shiralkar et al., 2017)
[Paper]
[Dataset]
Discriminative predicate path mining for fact checking in knowledge graphs (Shi and Weninger, 2016)
[Paper]
Computational fact checking from knowledge networks (Ciampaglia et al., 2015)
[Paper]

Shared Tasks

The Fact Extraction and VERification (FEVER) Shared Task [4th FEVER Workshop] The shared task is ongoing!
Statement Verification and Evidence Finding with Tables (SEM-TAB-FACT) [Wang et al., 2021]
SciFact Claim Verification [Wadden et al., 2020]
Fakeddit Multimodal Fake News Detection Challenge [Nakamura et al., 2020]
SemEval-2019 Task 7: RumourEval, Determining Rumour Veracity and Support for Rumours [Gorrell et al., 2019]
SemEval-2019 Task 8: Fact Checking in Community Question Answering Forums [Mihaylova et al., 2019]
A Retrospective Analysis of the Fake News Challenge Stance-Detection Task [Hanselowski et al., 2018]
The Fact Extraction and VERification (FEVER) Shared Task [Thorne et al., 2018]
SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours [Derczynski et al., 2017]
The Fake News Challenge (FNC-1) [Pomerleau and Rao, 2017]

Models

Claim Detection

Mining Dual Emotion for Fake News Detection (Zhang et al., 2021).
[Paper]
[Code]
Claim Check-Worthiness Detection as Positive Unlabelled Learning (Wright and Augenstein, 2021)
[Paper]
[Code]
Exploiting Microblog Conversation Structures to Detect Rumors (Li et al., 2020).
[Paper]
Rumor Detection on Social Media with Graph Structured Adversarial Learning (Yang et al., 2020).
[Paper]
Fake News Early Detection: A Theory-driven Model (Zhou et al., 2020).
[Paper]
Fake News Detection on Social Media using Geometric Deep Learning (Monti et al., 2019).
[Paper]
Rumor Detection on Twitter with Tree-structured Recursive Neural Networks (Ma et al., 2018).
[Paper]
[Code]
Rumor Detection with Hierarchical Social Attention Network (Guo et al., 2018).
[Paper]
A Hybrid Recognition System for Check-worthy Claims Using Heuristics and Supervised Learning (Zuo et al., 2018).
[Paper]
Simple Open Stance Classification for Rumour Analysis (Aker et al., 2017).
[Paper]
NileTMRG at SemEval-2017 Task 8: Determining Rumour and Veracity Support for Rumours on Twitter (Enayet and El-Beltagy, 2017).
[Paper]
Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM (Kochkina et al., 2017).
[Paper]
Automatically Identifying Fake News in Popular Twitter Threads (Buntain and Golbeck, 2017).
[Paper]
Detecting Rumors from Microblogs with Recurrent Neural Networks (Ma et al., 2016).
[Paper]
[Dataset]

Factual Verification

Joint Verification and Reranking for Open Fact Checking Over Tables (Schlichtkrull et al., 2021).
[Paper]
[Code]
Topic-Aware Evidence Reasoning and Stance-Aware Aggregation for Fact Verification (Si et al., 2021).
[Paper]
[Code]
A Multi-Level Attention Model for Evidence-Based Fact Checking (Kruengkrai et al., 2021)
[Paper]
[Code]
Multi-Task Retrieval for Knowledge-Intensive Tasks (Maillard et al., 2021).
[Paper]
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (Lewis et al., 2020).
[Paper]
[Code]
Language Models as Fact Checkers? (Lee et al., 2020).
[Paper]
Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification (Subramanian et al., 2020)
[Paper]
[Code]
Fine-grained Fact Verification with Kernel Graph Attention Network (Liu et al., 2020).
[Paper]
[Code]
Reasoning Over Semantic-Level Graph for Fact Checking (Zhong et al., 2020).
[Paper]
LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network (Zhong et al., 2020).
[Paper]
Program Enhanced Fact Verification with Verbalization and Graph Attention Network (Yang et al., 2020).
[Paper]
[Code]
Understanding tables with intermediate pre-training (Eisenschlos et al., 2020).
[Paper]
[Code]
Scrutinizer: A Mixed-Initiative Approach to Large-Scale, Data-Driven Claim Verification (Karagiannis et al., 2020)
[Paper]
[Code]
GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification (Zhou et al., 2019).
[Paper]
[Code]
Sentence-Level Evidence Embedding for Claim Verification with Hierarchical Attention Networks (Ma et al., 2019).
[Paper]
Combining Fact Extraction and Verification with Neural Semantic Matching Networks (Nie et al., 2019).
[Paper]
[Code]
Team DOMLIN: Exploiting Evidence Enhancement for the FEVER Shared Task (Stammbach and Neumann, 2019).
[Paper]
[Code]
TwoWingOS: A Two-Wing Optimization Strategy for Evidential Claim Verification (Yin and Roth, 2018).
[Paper]
[Code]
UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification (Hanselowski et al., 2018).
[Paper]
[Code]
Team Papelo: Transformer Networks at FEVER (Malon, 2018).
[Paper]
[Code]
QED: A fact verification system for the FEVER shared task (Luken et al., 2018).
[Paper]
[Code]
UCL Machine Reading Group: Four Factor Framework For Fact Finding (HexaF) (Yoneda et al., 2018).
[Paper]
[Code]
Can Rumour Stance Alone Predict Veracity? (Dungs et al., 2018).
[Paper]
Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking (Rashkin et al., 2017).
[Paper]

Justification Production

Explainable Automated Fact-Checking for Public Health Claims (Kotonya and Toni, 2020).
[Paper]
[Code]
[Dataset]
Generating Fact Checking Explanations (Atanasova et al., 2020).
[Paper]
GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media (Lu and Li, 2020).
[Paper]
[Code]
DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification (Wu et al., 2020).
[Paper]
ExFaKT: A Framework for Explaining Facts over Knowledge Graphs and Text (Gad-Elrab et al., 2019)
[Paper]
[Code]
dEFEND: Explainable Fake News Detection (Shu et al., 2019).
[Paper]
Explainable Fact Checking with Probabilistic Answer Set Programming (Ahmadi et al., 2019)
[Paper]
[Code]
Where is your Evidence: Improving Fact-checking by Justification Modeling (Alhindi et al., 2018).
[Paper]
[Code]
DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning (Popat et al., 2018).
[Paper]

Misinformation and Disinformation

A Survey on Multimodal Disinformation Detection (Alam et al., 2021)
[Paper]
Misinformation, Disinformation, and Online Propaganda (Guess and Lyons, 2020)
[Paper]
A Survey on Computational Propaganda Detection (Da San Martino et al., 2020).
[Paper]
Social Media, Political Polarization, and Political Disinformation: A Review of the Scientific Literature (Tucker et al., 2018)
[Paper]

Detecting Previous Claims

Article Reranking by Memory-Enhanced Key Sentence Matching for Detecting Previously Fact-Checked Claims (Sheng et al., 2021)
[Paper]
[Code]
Claim Matching Beyond English to Scale Global Fact-Checking (Kazemi et al., 2021)
[Paper]
The CLEF-2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News (Nakov et al., 2021)
[Paper]
That is a Known Lie: Detecting Previously Fact-Checked Claims (Shaar et al., 2020)
[Paper]
[Dataset]
COVIDLies: Detecting COVID-19 Misinformation on Social Media (Hossain et al., 2020)
[Paper]
Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media (Barrón-Cedeño et al., 2020)
[Paper]

Relevant Surveys