LLM Hallucination Detection

This repository contains code and notebooks for detecting and explaining hallucinations produced by large language models (LLMs). It includes data processing, model training, evaluation, explainability and a Wikipedia-based verification component.

Project structure

  • README.md — this file
  • requirements.txt — pip installable dependencies
  • environment.yml — conda environment specification
  • data/
    • raw/ — original/raw CSV datasets (halueval_qa.csv, truthfulqa.csv)
    • processed/ — processed train/val/test CSVs
  • models/
    • best_model.pt — trained model checkpoint
    • metrics.json — evaluation metrics
    • training_history.json — training logs/history
  • notebooks/
    • 01_data_exploration.ipynb — EDA and dataset inspection
    • 02_model_training.ipynb — model training and experiments
    • 03_explainability.ipynb — explainability analyses and visualizations
  • results/ — (placeholder for generated outputs / figures)
  • src/ — core Python modules
    • data_loader.py — data loading utilities
    • data_processor.py — preprocessing and feature engineering
    • dataset.py — dataset classes / PyTorch Dataset wrappers
    • model.py — model definition and helpers
    • trainer.py — training loop and checkpoints
    • predictor.py — inference utilities
    • evaluator.py — evaluation metrics and routines
    • explainer.py — explainability methods
    • wikipedia_verifier.py — verification helpers using Wikipedia
    • __init__.py

Prerequisites

  • Git
  • Python 3.11 (recommended to match environment.yml)
  • Conda or a virtualenv for pip installs
  • A CPU is sufficient; a GPU, if available, can be used for faster training

Setup

Using pip + venv

  1. Create and activate a virtual environment:
    • python -m venv .venv
    • source .venv/bin/activate (Linux/macOS) or .venv\Scripts\activate (Windows)
  2. Install dependencies:
    • pip install --upgrade pip
    • pip install -r requirements.txt

Using conda

  1. Create the conda environment from the provided file:
    • conda env create -f environment.yml
  2. Activate it:
    • conda activate llm-halu-env

Notes

  • If using conda, the pip section of environment.yml installs the additional pip-only packages as part of environment creation.
  • For spaCy, run python -m spacy download en_core_web_sm if the environment setup has not already installed the model.
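
To confirm the setup worked, a quick Python check such as the sketch below can help; it only assumes the torch and spaCy packages from the dependency lists.

```python
# Minimal environment sanity check: verifies PyTorch and the spaCy model load.
import torch
import spacy

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # False is fine; training just runs on CPU

nlp = spacy.load("en_core_web_sm")  # raises OSError if the model has not been downloaded
print("spaCy pipeline:", nlp.pipe_names)
```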

Key dependencies

Important packages used in the project (non-exhaustive):

  • PyTorch
  • transformers, tokenizers
  • datasets (Hugging Face)
  • sentence-transformers
  • scikit-learn, numpy, pandas
  • lime (explainability)
  • spaCy (NLP preprocessing)
  • wikipedia, wikipedia-api (verification utilities)

See requirements.txt and environment.yml for full dependency lists.
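
To show how the wikipedia and sentence-transformers packages fit the verification idea, here is a minimal sketch that retrieves a summary for a claim and scores their semantic similarity. The encoder name ("all-MiniLM-L6-v2") and the 0.5 threshold are illustrative assumptions, not values taken from src/wikipedia_verifier.py.

```python
# Sketch of Wikipedia-based claim checking via semantic similarity.
# The encoder and threshold below are assumptions, not the repository's settings.
import wikipedia
from sentence_transformers import SentenceTransformer, util

claim = "The Eiffel Tower is located in Berlin."

titles = wikipedia.search(claim, results=1)                       # find the closest article
summary = wikipedia.summary(titles[0], sentences=3) if titles else ""

encoder = SentenceTransformer("all-MiniLM-L6-v2")
score = util.cos_sim(encoder.encode(claim), encoder.encode(summary)).item()

verdict = "likely supported" if score > 0.5 else "possibly hallucinated"
print(f"similarity={score:.2f} -> {verdict}")
```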

Project workflow

  1. Data preparation
    • Use src/data_loader.py and src/data_processor.py to load and preprocess raw data.
    • Processed splits are stored under data/processed (train/val/test).
  2. Dataset & dataloaders
    • src/dataset.py provides dataset wrappers for training and evaluation.
  3. Modeling & training
    • src/model.py defines the model architecture.
    • src/trainer.py contains the training loop, checkpointing and logging.
    • notebooks/02_model_training.ipynb demonstrates the experiment steps.
  4. Inference & evaluation
    • src/predictor.py performs inference on new inputs.
    • src/evaluator.py computes metrics and generates evaluation reports.
  5. Explainability
    • src/explainer.py and notebooks/03_explainability.ipynb explore model explanations with LIME (see the sketch after this list).
  6. Verification
    • src/wikipedia_verifier.py provides utilities to check factual claims against Wikipedia (an end-to-end sketch also follows this list).
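
For the explainability step, LIME-style token attribution typically looks like the sketch below. Only the lime API usage is standard; predict_proba is a hypothetical stand-in for the project's real inference call (presumably something in src/predictor.py).

```python
# Sketch of LIME applied to a binary factual/hallucinated text classifier.
# `predict_proba` is a placeholder, not the repository's actual predictor.
import numpy as np
from lime.lime_text import LimeTextExplainer

def predict_proba(texts):
    """Placeholder: return [P(factual), P(hallucinated)] for each input text."""
    return np.tile([0.3, 0.7], (len(texts), 1))

explainer = LimeTextExplainer(class_names=["factual", "hallucinated"])
explanation = explainer.explain_instance(
    "The Great Wall of China is visible from the Moon.",
    predict_proba,
    num_features=6,
)
print(explanation.as_list())  # (token, weight) pairs showing which words drove the prediction
```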

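Putting the workflow together, a run might resemble the sketch below. The class names HallucinationPredictor and WikipediaVerifier and their methods are hypothetical placeholders; check src/predictor.py and src/wikipedia_verifier.py for the actual interfaces.

```python
# Hypothetical end-to-end flow; the names below are illustrative only and may
# not match the classes actually defined in src/.
from src.predictor import HallucinationPredictor      # assumed name
from src.wikipedia_verifier import WikipediaVerifier  # assumed name

question = "Who wrote the novel 1984?"
answer = "1984 was written by Aldous Huxley."

predictor = HallucinationPredictor(checkpoint="models/best_model.pt")
label, confidence = predictor.predict(question, answer)   # assumed signature

verifier = WikipediaVerifier()
evidence = verifier.verify(answer)                        # assumed signature

print(label, confidence, evidence)
```
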
Results

  • Trained model artifacts are under models/ (best_model.pt, metrics.json, training_history.json).
  • The results/ directory is reserved for evaluation outputs, visualizations and exported reports produced by notebooks or scripts.
  • Notebooks under notebooks/ produce reproducible EDA, training runs and explainability outputs — export their figures into results/ as needed.
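
A quick way to inspect these artifacts is sketched below; whether best_model.pt stores a full model object or a state_dict is an assumption to confirm against src/trainer.py.

```python
# Sketch for inspecting the saved artifacts (checkpoint format is an assumption).
import json
import torch

checkpoint = torch.load("models/best_model.pt", map_location="cpu")
print(type(checkpoint))  # typically a dict of tensors if it is a state_dict

with open("models/metrics.json") as f:
    metrics = json.load(f)
print(json.dumps(metrics, indent=2))
```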
