InsightFlow

Transform scattered communications into strategic intelligence

🎯 What is InsightFlow?

InsightFlow automatically ingests emails and meetings, uses ML-powered topic detection to identify patterns, and generates weekly strategic briefs that help leaders stay ahead of organizational drift.

Key Features:

🤖 AI-Optional Design: Fully functional without AI; Gemini enhancement is optional
📊 Smart Topic Detection: HDBSCAN clustering + embeddings find emerging patterns
🎯 Rules-First Intelligence: Deterministic metrics with 28-day baseline comparison
📝 Evidence-Backed: Full audit trail linking insights to original sources
🚀 Production Ready: Docker-containerized, fully tested, zero-config startup

⚡ Quick Start (5 Minutes)

Option 1: Docker (Recommended - No Setup Required)

# 1. Clone the repository
git clone https://github.com/yourusername/InsightFlow.git
cd InsightFlow

# 2. Create .env file (uses defaults, no API keys needed)
copy .env.example .env  # Windows
cp .env.example .env    # Linux/Mac

# 3. Start all services
docker-compose up -d

# 4. Open dashboard
# http://localhost:8000/static/index.html

# That's it! 🎉

What you get:

✅ PostgreSQL with pgvector
✅ FastAPI backend with dashboard
✅ Rules-only mode (no AI required)
✅ Manual analysis trigger
✅ Beautiful web UI

Option 2: Add AI Enhancement (Optional)

# 1. Get free Gemini API key
# Visit: https://makersuite.google.com/app/apikey

# 2. Edit .env file
# Set: LLM_ENABLED=true
# Set: GEMINI_API_KEY=your-key-here

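# 3. Recreate the API container so it picks up the new .env values
# (a plain "restart" does not reload environment changes)
docker-compose up -d --force-recreate api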

# Now you have AI-enhanced strategic insights! ✨

Option 3: Local Development

# 1. Create virtual environment
python -m venv .venv
.venv\Scripts\activate  # Windows
source .venv/bin/activate  # Linux/Mac

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set up PostgreSQL
createdb insightflow_db
psql insightflow_db < sql/schema.sql

# 4. Configure .env
copy .env.example .env  # Windows
cp .env.example .env    # Linux/Mac
# Edit DATABASE_URL if needed

# 5. Start API server
uvicorn api.main:app --reload

# 6. Access dashboard
# http://localhost:8000/static/index.html

📖 Documentation

Setup Guide - Detailed installation instructions
Docker Guide - Container deployment
Project Documentation - Complete technical spec
N8N Testing Guide - Workflow integration

🏗️ Architecture

┌─────────────────┐
│ Data Sources    │  Gmail, Read.ai, etc.
│ (via n8n)       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ FastAPI         │  /ingest/email, /ingest/meeting
│ Ingestion       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ PostgreSQL +    │  Vector storage + relational data
│ pgvector        │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Weekly Analysis │  Embeddings → Clustering → Rules → LLM (optional)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Dashboard +     │  Web UI + Slack delivery
│ Delivery        │
└─────────────────┘

Processing Pipeline:

Ingestion - Validate & store events (emails/meetings)
Embeddings - Generate 384-dim vectors (MiniLM)
Clustering - HDBSCAN topic detection
Metrics - Compute deltas vs 28-day baseline
Rules - Apply 4 deterministic finding rules
LLM - Optional Gemini enhancement
Render - Markdown brief + JSON audit trail
Deliver - Dashboard + Slack notification
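
The Metrics and Rules steps above can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: the function names pct_change_vs_baseline and is_emerging_risk are hypothetical, the 28-day baseline is taken to be the mean of the four preceding weekly counts, and EMERGING_RISK_FREQ_PCT is read as a minimum percent increase. The actual rules in src/insight/rules.py may differ.

```python
from statistics import mean


def pct_change_vs_baseline(current_week: int, prior_weeks: list[int]) -> float:
    """Percent change of this week's count against the baseline.

    Assumption: the baseline is the mean of the prior weekly counts
    (four weeks would correspond to a 28-day window).
    """
    if not prior_weeks:
        raise ValueError("need at least one prior week for a baseline")
    baseline = mean(prior_weeks)
    if baseline == 0:
        # No history for this topic: treat any activity as a 100% rise.
        return 100.0 if current_week > 0 else 0.0
    return (current_week - baseline) * 100.0 / baseline


def is_emerging_risk(
    current_week: int,
    prior_weeks: list[int],
    threshold_pct: float = 30.0,  # mirrors the EMERGING_RISK_FREQ_PCT default
) -> bool:
    """Hypothetical rule: flag a topic whose frequency rose by at least the threshold."""
    return pct_change_vs_baseline(current_week, prior_weeks) >= threshold_pct
```

Because the rule is a pure function of the counts, the same inputs always produce the same finding, which is what makes a rules-first result auditable.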

🎨 Dashboard Features

Access at http://localhost:8000/static/index.html

Overview - Real-time metrics, charts, latest brief preview
Weekly Briefs - Historical briefs with LLM enhancement indicators
Trends - Week-over-week visualizations
Recent Events - Sortable table with source links for traceability
Manual Trigger - "Run Analysis Now" button (no need to wait for the weekly timer)
LLM Status - Visual indicator showing AI enhancement status

🔧 Configuration

Environment Variables

Edit .env file to configure:

Variable                    Default           Description
`LLM_ENABLED`               `false`           Enable/disable AI enhancement
`GEMINI_API_KEY`            -                 API key for Gemini (optional)
`DATABASE_URL`              docker default    PostgreSQL connection string
`SLACK_WEBHOOK_URL`         -                 Slack delivery (optional)
`CLUSTERING_MIN_SIZE`       `3`               Minimum events per topic
`EMERGING_RISK_FREQ_PCT`    `30`              Risk detection threshold

See .env.example for all options.
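
As a rough illustration of how these variables could be consumed, here is a minimal sketch that reads them with the defaults from the table. The Settings class and load_settings function are hypothetical names, and the fallback to rules-only mode when LLM_ENABLED is true but no key is set is an assumption, not confirmed project behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    llm_enabled: bool
    gemini_api_key: Optional[str]
    clustering_min_size: int
    emerging_risk_freq_pct: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from an environment mapping, applying the documented defaults."""
    llm_enabled = env.get("LLM_ENABLED", "false").strip().lower() == "true"
    api_key = env.get("GEMINI_API_KEY") or None
    if llm_enabled and api_key is None:
        # Assumption: without a key, fall back to rules-only mode
        # instead of failing at startup.
        llm_enabled = False
    return Settings(
        llm_enabled=llm_enabled,
        gemini_api_key=api_key,
        clustering_min_size=int(env.get("CLUSTERING_MIN_SIZE", "3")),
        emerging_risk_freq_pct=float(env.get("EMERGING_RISK_FREQ_PCT", "30")),
    )
```

Passing the mapping as an argument keeps the loader easy to test: load_settings({}) yields the defaults without touching the real environment.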

Toggle AI Enhancement

# Disable AI (rules-only mode)
LLM_ENABLED=false

# Enable AI (requires API key)
LLM_ENABLED=true
GEMINI_API_KEY=your-key-here

# Recreate the container to apply (a plain "restart" does not reload environment changes)
docker-compose up -d --force-recreate api

📊 Data Ingestion

Via n8n Workflows (Recommended)

Import workflows from n8n/workflows/:

gmail-ingestion.json - Email automation
readai-ingestion.json - Meeting transcripts

Direct API

# Ingest an email
curl -X POST http://localhost:8000/ingest/email \
  -H "Content-Type: application/json" \
  -d '{
    "id": "email_001",
    "source": "email",
    "timestamp": "2026-01-08T10:00:00Z",
    "actor": "alice@company.com",
    "direction": "inbound",
    "subject": "Q1 Budget Review",
    "text": "Attached is the revised budget...",
    "thread_id": "thread_001",
    "decision": "deferred",
    "action_owner": null,
    "follow_up_required": true,
    "urgency_score": 7,
    "sentiment": "unknown",
    "raw_ref": "https://mail.google.com/..."
  }'
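
The same request can be built from Python using only the standard library. This is a sketch, not an official client: build_email_request is a hypothetical helper, and the payload simply mirrors a subset of the fields from the curl example above.

```python
import json
import urllib.request


def build_email_request(
    event: dict, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a POST request for the /ingest/email endpoint with a JSON body."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ingest/email",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


event = {
    "id": "email_001",
    "source": "email",
    "timestamp": "2026-01-08T10:00:00Z",
    "actor": "alice@company.com",
    "direction": "inbound",
    "subject": "Q1 Budget Review",
    "text": "Attached is the revised budget...",
    "thread_id": "thread_001",
    "follow_up_required": True,
    "urgency_score": 7,
}
request = build_email_request(event)

# To actually send it (requires the API to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```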

🧪 Testing

# Run all tests
pytest

# With coverage
pytest --cov=src --cov-report=html

# Specific test suite
pytest tests/unit/
pytest tests/api/
pytest tests/integration/

📦 Project Structure

InsightFlow/
├── api/                    # FastAPI backend
│   ├── main.py            # Ingestion endpoints
│   ├── dashboard.py       # Dashboard API + settings
│   └── schemas.py         # Pydantic models
├── src/insight/           # Core engine
│   ├── clustering.py      # HDBSCAN topic detection
│   ├── llm_client.py      # Gemini integration
│   ├── rules.py           # Finding generation
│   ├── renderer.py        # Markdown + JSON output
│   └── services.py        # Metrics computation
├── scripts/               # Automation
│   ├── weekly_run.py      # Main orchestration
│   └── export_brief.py    # Export utilities
├── static/                # Web dashboard
│   └── index.html         # Single-page app
├── sql/                   # Database schema
│   └── schema.sql         # PostgreSQL + pgvector
├── n8n/                   # Workflow templates
├── tests/                 # Test suites
└── docker-compose.yml     # Container orchestration

🚀 Deployment

Docker (Recommended)

# Production deployment
docker-compose -f docker-compose.prod.yml up -d

# Set up weekly automation
sudo cp systemd/insightflow-weekly.* /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable insightflow-weekly.timer
sudo systemctl start insightflow-weekly.timer

Manual Deployment

See SETUP_GUIDE.md for Hostinger VPS deployment.

🔐 Security

API keys in .env (gitignored)
Database credentials in environment variables
No sensitive data sent to LLM (only pre-computed metrics)
Slack webhooks for secure delivery
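
One way to enforce the "only pre-computed metrics" guarantee is an explicit allow-list applied before anything reaches the LLM. The sketch below is illustrative only: build_llm_payload, ALLOWED_FIELDS, and the field names are hypothetical and are not taken from src/insight/llm_client.py.

```python
# Hypothetical allow-list of metric fields that are safe to share.
ALLOWED_FIELDS = ("topic", "event_count", "pct_change_vs_baseline", "rule")


def build_llm_payload(findings: list[dict]) -> list[dict]:
    """Keep only allow-listed metric fields; drop everything else,
    including raw message text and sender addresses."""
    return [
        {key: finding[key] for key in ALLOWED_FIELDS if key in finding}
        for finding in findings
    ]
```

An allow-list fails closed: a field added to the findings later stays out of the prompt until someone deliberately adds it to ALLOWED_FIELDS.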

🤝 Contributing

Fork the repository
Create feature branch (git checkout -b feature/amazing-feature)
Make changes and add tests
Run test suite (pytest)
Commit with conventional commits
Push and open Pull Request

📝 License

MIT License - See LICENSE for details

🎓 Use Cases

For Teams

Clone and run immediately (Docker)
Add to existing n8n workflows
Customize rules for your domain
Self-host on any VPS

For Developers

Clean architecture example
Production-ready FastAPI patterns
ML pipeline best practices
Docker deployment templates

For Researchers

Embedding-based clustering
LLM-optional design patterns
Deterministic + AI hybrid systems
Full audit trail for reproducibility

💡 Tips

First Time Setup:

Start with Docker (zero config)
Test with manual analysis trigger
Add sample data via API
Enable AI when ready

Customization:

Edit rules in src/insight/rules.py
Adjust thresholds in .env
Modify dashboard in static/index.html
Add new metrics in src/insight/services.py

Troubleshooting:

Check logs: docker-compose logs -f api
Test API: http://localhost:8000/docs
Verify DB: docker-compose exec db psql -U insightflow
LLM status: Dashboard header shows active/disabled

📞 Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: johnsavilesh@gmail.com

Transform your organization's communication into strategic intelligence 🚀
