hetvidoshi22/AI-DPR-Evaluation-System
NLP-based system to automate compliance checks and risk analysis of government DPR PDFs using hybrid RAG and rule-based validation.
AI-Based DPR Evaluation System (SIH – MDoNER)
Overview
The Ministry of Development of North Eastern Region (MDoNER) evaluates hundreds of Detailed Project Reports (DPRs) for infrastructure and socio-economic development projects. Manual evaluation is time-consuming, inconsistent, and prone to human error, often delaying project approvals.
This project implements an offline AI-powered DPR evaluation system that automatically extracts, analyzes, and evaluates DPR PDFs using NLP, OCR, and Retrieval-Augmented Generation (RAG) to support faster, consistent, and explainable decision-making.
Key Objectives
- Automate DPR evaluation for completeness, consistency, and risk
- Handle scanned, multilingual, and unstructured PDFs
- Provide explainable compliance validation
- Enable offline deployment for low-connectivity regions
System Workflow
- DPR PDF upload via FastAPI
- Hybrid PDF extraction (PyMuPDF → pdfplumber → OCR fallback)
- Sentence-aware semantic chunking (200–500 words)
- Retrieval-Augmented Generation (RAG) for context-aware analysis
- Rule-based compliance and consistency checks
- Evidence extraction from source text
- Structured JSON output for decision support
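The sentence-aware chunking step in the workflow above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the function names `split_sentences` and `chunk_sentences` are hypothetical, a simple regex stands in for the NLTK sentence tokenizer listed in the tech stack, and only the 500-word upper bound is enforced (the 200-word lower bound is treated as a soft target).

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace.

    Assumption: a regex stands in here for a real tokenizer such as NLTK's.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(sentences: list[str], max_words: int = 500) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words.

    A sentence is never split, so a single sentence longer than max_words
    becomes its own (oversized) chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk if this sentence would push it past the limit.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because chunk boundaries always fall between sentences, each chunk stays readable on its own, which is what makes it usable as retrieval context and as an evidence snippet later in the pipeline.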
Core Features
- Hybrid PDF extraction for digital and scanned DPRs
- OCR fallback using PaddleOCR and Tesseract
- Semantic chunking preserving contextual information
- RAG-based context-aware analysis
- Deterministic rule-based compliance engine
- Offline-first design using local LLMs (Ollama)
- FastAPI backend with a single /process endpoint
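The hybrid extraction feature listed above amounts to a fallback chain: try the cheapest extractor first and move on when it fails or returns too little text. The sketch below shows that pattern only. The name `extract_with_fallback`, the `min_chars` threshold, and the idea of passing extractors as plain callables are assumptions made for illustration; the repository's real implementation may differ.

```python
from typing import Callable, Iterable, Optional, Tuple

# An extractor takes a PDF path and returns the text it could read.
Extractor = Callable[[str], str]


def extract_with_fallback(
    pdf_path: str,
    extractors: Iterable[Tuple[str, Extractor]],
    min_chars: int = 50,  # hypothetical threshold for "enough text was found"
) -> Tuple[Optional[str], str]:
    """Try each extractor in order and return (name, text) from the first
    one that yields at least min_chars of non-whitespace-trimmed text.

    Returns (None, "") if every extractor fails or returns too little.
    """
    for name, extractor in extractors:
        try:
            text = extractor(pdf_path)
        except Exception:
            # A failing extractor should not abort the chain; try the next one.
            continue
        if len(text.strip()) >= min_chars:
            return name, text
    return None, ""
```

In practice the list passed in would hold three thin wrappers in the order the README gives (PyMuPDF, then pdfplumber, then OCR), so a scanned DPR with no text layer falls through to OCR automatically.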
Tech Stack
Main
- Python
- FastAPI
- NLP (RAG)
- Rule-Based Systems
Libraries & Tools
- PyMuPDF, pdfplumber
- PaddleOCR, pytesseract, Pillow
- Sentence Transformers
- Ollama
- NLTK
Output
The system generates:
- Extracted text in JSON format
- Semantic text chunks
- Answers with evidence snippets
- Structured compliance and risk insights
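To make the compliance output with evidence snippets concrete, here is a minimal sketch of a deterministic rule check over text chunks. Everything specific in it is hypothetical: the rule IDs, the regex patterns, the 40-character snippet window, and the result field names are illustrative and are not taken from the repository.

```python
import json
import re

# Hypothetical rules: IDs and patterns are illustrative, not the project's rule set.
RULES = {
    "cost_estimate_present": r"\b(estimated|total)\s+cost\b",
    "timeline_present": r"\b(timeline|completion\s+date|implementation\s+schedule)\b",
}


def check_compliance(chunks: list[str], rules: dict[str, str] = RULES) -> list[dict]:
    """Run each regex rule over the chunks and record the first match as evidence."""
    results = []
    for rule_id, pattern in rules.items():
        regex = re.compile(pattern, re.IGNORECASE)
        evidence = None
        for index, chunk in enumerate(chunks):
            match = regex.search(chunk)
            if match:
                # Keep up to 40 characters of context on each side of the match.
                start = max(match.start() - 40, 0)
                end = min(match.end() + 40, len(chunk))
                evidence = {"chunk_index": index, "snippet": chunk[start:end]}
                break
        results.append(
            {"rule": rule_id, "passed": evidence is not None, "evidence": evidence}
        )
    return results


if __name__ == "__main__":
    sample = ["The total cost of the road project is Rs. 12 crore."]
    print(json.dumps(check_compliance(sample), indent=2))
```

Keeping the rules as plain patterns means the same input always produces the same verdict, and every passed rule points back to the chunk and snippet that justified it.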
Team & Contribution
- Developed as part of the Smart India Hackathon (SIH).
Languages: Python 92.8%, HTML 4.5%, JavaScript 2.7%
Created December 13, 2025
Updated December 14, 2025