SH
Shrutakeerti/Actual_Phenomenon
This repo is like Parseval's theorem
Invoice Reimbursement Analyzer & Chatbot Assistant
This project is an intelligent system for automating the analysis of employee reimbursement invoices against HR policy documents and providing an interactive chatbot interface to query the results. It leverages FastAPI, Gradio, LLMs (via Groq API), and FAISS vector store for robust document processing and semantic search.
Project Overview
The system provides two main functionalities:
-
Invoice Analysis API (
/analyze/):- Accepts an HR Policy PDF and a ZIP file of employee invoices.
- Extracts text, generates invoice-policy compliance prompts, and queries a language model (LLM) to perform analysis.
- Stores the result along with invoice metadata in a vector store for future querying.
-
Chatbot API (
/chat/):- Accepts natural language queries (e.g., "Show invoices for Ramesh with Partially Reimbursed status").
- Retrieves relevant stored content from the vector store.
- Constructs a system prompt and gets a contextual response from the LLM.
A Gradio-based UI is provided to make the system user-friendly and visually intuitive.
Installation Instructions
Prerequisites
- Python 3.8+
git- Groq API key (or replace with OpenAI if needed)
Clone the Repository
git clone https://github.com/Shrutakeerti/Actual_Phenomenon.git
cd Actual_PhenomenonInstall Dependencies
pip install -r requirements.txtDirectory Structure
.
├── main.py # FastAPI backend
├── interface.py # Gradio frontend
├── prompts.py # Prompt engineering
├── utils.py # File parsing utils
├── vector_store.py # FAISS integration
├── groq_llm.py # LLM calling logic
├── policy/ # Uploaded HR policies
├── invoices/ # Uploaded ZIP filesUsage Guide
▶ Run the Application
uvicorn main:app --reloadStart Gradio UI
In a separate terminal, run:
python interface.py📨 API Endpoints
/analyze/ [POST]
Purpose:
Analyze uploaded invoices and store results.
Payload:
policy_file: PDF file (UploadFile)invoice_zip: ZIP file with invoice PDFsemployee_name: Employee’s name (Form field)
Response:
{
"message": "Invoice analysis completed."
}/chat/ [POST]
Purpose:
Query stored invoice analysis results.
Payload:
query: Natural language string (Form field)
Response:
{
"response": "Ramesh has 3 invoices marked as Partially Reimbursed..."
}Technical Details
LLM & Embeddings
LLM Provider: Groq (via LLaMA3 models)
Embedding Model: SentenceTransformers (sentence-transformers/all-MiniLM-L6-v2)
Vector Store
Backend: Faat API
Storage Logic:
Each invoice + LLM summary is saved with metadata (employee name, filename, date).
Stored as vector embeddings with ChromaDB for fast retrieval.
✍️ Prompt Design
Invoice Analysis Prompt
Crafted to:
- Summarize invoice content
- Validate against HR policy rules
- Mark non-compliance if any
Prompt Template:
Based on the HR policy: {policy_text}
Analyze this invoice: {invoice_text}
💬 Code Comments & Docstrings
🗃 vector_store.py
def add_to_vector_store(doc_id, content, metadata):
"""
Converts text to embeddings and stores them with associated metadata in FAISS.
"""🤖 Chatbot Logic
async def chat(query: str = Form(...)):
"""
Retrieves context from FAISS vector store based on query,
constructs a prompt and sends to LLM.
"""🧾 Invoice Analysis
async def analyze(...):
"""
Unzips invoice PDFs, extracts text, runs compliance check against policy,
and stores results in vector DB.
"""