
danijeun/foresee-app

AI-powered AutoML platform that transforms CSV files into ML insights in 60 seconds. Upload data → Get AI-ranked target recommendations (Gemini 2.5) → Auto-train 3 models (LR, DT, XGBoost) → Download professional PDF reports with visualizations. Built with React, Flask, Snowflake & Google Gemini AI.

ForeSee - AI-Powered AutoML Platform

Automated Machine Learning Analysis & Reporting with Google Gemini AI

ForeSee is an intelligent web application that transforms raw data into actionable ML insights. Upload a CSV file and get professional ML analysis reports with AI-powered target variable recommendations, automated model training, and comprehensive PDF reports—all in minutes.


🎯 What Does ForeSee Do?

From CSV to ML Insights in 5 Simple Steps:

  1. Upload your dataset (CSV format)
  2. AI Analysis - Google Gemini automatically suggests the best target variables to predict
  3. Select your prediction target from ranked recommendations
  4. Auto-Train - System trains 3 ML models in parallel (Logistic Regression, Decision Tree, XGBoost)
  5. Download a comprehensive PDF report with insights, metrics, and recommendations

✨ Key Features

🤖 AI-Powered Target Selection (Google Gemini 2.5)

  • Uses Google Gemini 2.5 Flash (gemini-2.5-flash-preview-05-20) to intelligently analyze your dataset
  • Recommends the top 5 most valuable prediction targets with importance scores (1-100)
  • Distinguishes between target variables (outcomes) and features (predictors)
  • Provides detailed business rationale, predictability assessment, and suggested features
  • Runs in parallel with EDA for faster results

📊 Automatic Exploratory Data Analysis (EDA)

  • Snowflake-based comprehensive statistical analysis
  • Analyzes all column types: numeric, categorical, datetime, text
  • Detects data types, missing values, duplicates, and cardinality
  • Calculates metrics: mean, std, quartiles, skewness, kurtosis, top values
  • Stores all results in Snowflake for querying and persistence
  • Parallel execution with Target Analysis for optimal performance

🚀 Multi-Model Machine Learning

Trains 3 models sequentially after target selection:

| Model | Description | Key Metrics |
|---|---|---|
| Logistic Regression | Fast, interpretable baseline | Accuracy, Precision, Recall, F1, ROC-AUC |
| Decision Tree | Non-linear pattern detection | Tree depth, leaves, feature importance |
| XGBoost | State-of-the-art gradient boosting | N-estimators, max depth, learning rate |

Each model provides:

  • Performance metrics (train & test)
  • Confusion matrices
  • Feature importance rankings
  • Model-specific recommendations
  • Data quality assessments
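As an illustration of one of these outputs, a binary confusion matrix can be tallied directly from predictions. This is a minimal, dependency-free sketch — the function name and tuple layout are illustrative, not the agents' actual code:

```python
def confusion_matrix_counts(y_true, y_pred):
    """Tally a binary confusion matrix as (tn, fp, fn, tp)."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tn, fp, fn, tp
```

In practice the agents compute this with scikit-learn's `confusion_matrix`, but the counts mean the same thing either way.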

📄 Professional PDF Reports (AI-Generated)

  • Natural language insights generated by Google Gemini
  • Executive summary with best-performing model
  • Data quality and EDA insights
  • Model performance comparisons
  • Feature importance analysis
  • Actionable recommendations
  • Professional charts and visualizations

❄️ Snowflake Data Platform

  • Isolated workflow schemas - Each upload creates WORKFLOW_<UUID> schema
  • Scalable data warehouse for enterprise datasets
  • SQL-based data processing and storage
  • Persistent storage for all EDA and ML results
  • Clean separation between workflows

Modern Web Interface (React + Tailwind)

  • Drag-and-drop file upload
  • Real-time progress tracking with rotating status messages
  • Interactive podium display for top 3 target recommendations
  • Responsive design for all devices
  • Smooth animations with AOS (Animate On Scroll)
  • In-browser PDF viewing and download

🏗️ Architecture

```text
┌─────────────────────────────────────────────────────────────────┐
│                    FRONTEND (React 19 + Vite)                   │
│                                                                  │
│  • Drag & drop file upload       • Podium target display       │
│  • Real-time progress tracking   • PDF viewer                  │
│  • Target variable selection     • Responsive UI               │
└─────────────────────┬───────────────────────────────────────────┘
                      │ REST API (CORS enabled)
                      │
┌─────────────────────▼───────────────────────────────────────────┐
│                  BACKEND (Flask 3.0 API)                        │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │               MULTI-AGENT SYSTEM                          │  │
│  │                                                            │  │
│  │  1️⃣ EDA Agent (Parallel)                                  │  │
│  │     → Analyze dataset                                     │  │
│  │     → Store stats in Snowflake                           │  │
│  │                                                            │  │
│  │  2️⃣ Target Variable Agent (Parallel) - Gemini 2.5        │  │
│  │     → Sample data                                         │  │
│  │     → LLM analysis                                        │  │
│  │     → Rank top 5 targets (importance scores)            │  │
│  │                                                            │  │
│  │  3️⃣ ML Training Agents (Sequential after target select)  │  │
│  │     → Logistic Regression Agent                          │  │
│  │     → Decision Tree Agent                                │  │
│  │     → XGBoost Agent                                      │  │
│  │                                                            │  │
│  │  4️⃣ Natural Language Agent - Gemini 2.5                  │  │
│  │     → Collect EDA & ML results                           │  │
│  │     → Generate insights (LLM)                            │  │
│  │     → Create PDF report (ReportLab)                      │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  Services:                                                      │
│  • Workflow Manager     • Snowflake Ingestion                  │
│  • EDA Service          • Config Management                    │
└─────────────────────┬───────────────────────────────────────────┘
                      │ Snowflake Connector
                      │
┌─────────────────────▼───────────────────────────────────────────┐
│                    SNOWFLAKE DATA PLATFORM                      │
│                                                                  │
│  Isolated Schemas: WORKFLOW_<UUID>                             │
│                                                                  │
│  Tables per Workflow:                                          │
│  • WORKFLOW_METADATA          → Workflow info                  │
│  • WORKFLOW_EDA_SUMMARY       → EDA results                    │
│  • COLUMN_STATS               → Column metrics                 │
│  • LOGISTIC_REGRESSION_SUMMARY → LR model results             │
│  • DECISION_TREE_SUMMARY       → DT model results             │
│  • XGBOOST_SUMMARY             → XGB model results            │
│  • RAW_DATA_TABLE              → Original CSV data            │
└─────────────────────────────────────────────────────────────────┘
```

🔄 Application Workflow

Phase 1: Upload & Parallel Analysis (45-60s)

```text
User uploads CSV
    │
    ├─→ Store in Snowflake (temp file → Snowflake table)
    │
    ├─→ 🧵 Thread 1: EDA Agent
    │       └─→ Analyze all columns
    │           └─→ Save to WORKFLOW_EDA_SUMMARY
    │
    └─→ 🧵 Thread 2: Target Variable Agent (Gemini 2.5)
            └─→ Sample 100 rows
                └─→ LLM analysis
                    └─→ Return top 5 targets (ranked)
```

Time Saved: ~40% faster than sequential execution
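
The parallel phase above can be sketched with Python's `ThreadPoolExecutor`. The two `run_*` functions below are placeholders standing in for the real Snowflake EDA analysis and the Gemini call — they are illustrative, not the app's actual agents:

```python
from concurrent.futures import ThreadPoolExecutor

def run_eda_agent(table_name):
    # Placeholder for the Snowflake-based EDA analysis
    return {"agent": "eda", "table": table_name}

def run_target_agent(table_name):
    # Placeholder for the Gemini-backed target recommendation call
    return {"agent": "target", "table": table_name}

def analyze_in_parallel(table_name):
    # The two agents are independent, so they can run on separate threads
    with ThreadPoolExecutor(max_workers=2) as pool:
        eda_future = pool.submit(run_eda_agent, table_name)
        target_future = pool.submit(run_target_agent, table_name)
        # result() blocks until each thread completes
        return eda_future.result(), target_future.result()
```

Because both agents are I/O-bound (network calls to Snowflake and Gemini), threads are sufficient here; no multiprocessing is needed.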


Phase 2: Target Selection (User interaction)

```text
Frontend displays 3 recommendations in podium:
    🥇 Gold   (Rank 1 - Highest importance)
    🥈 Silver (Rank 2)
    🥉 Bronze (Rank 3)

+ "Other Options" button → Shows all 5 recommendations

Each recommendation includes:
    • Importance Score (1-100)
    • Problem Type (regression/classification)
    • Why Important (business value)
    • Predictability (HIGH/MEDIUM/LOW)
    • Suggested Features (top predictors)

User selects target → Saved to workflow_metadata
```

Phase 3: Sequential ML Training (10-15s)

Automatic training after target selection:

```text
1. Logistic Regression Agent
    ├─→ Feature engineering
    ├─→ Train/test split (80/20)
    ├─→ Model training (max_iter=1000)
    ├─→ Performance evaluation
    └─→ Save to LOGISTIC_REGRESSION_SUMMARY

2. Decision Tree Agent
    ├─→ Feature engineering
    ├─→ Train/test split (80/20)
    ├─→ Model training (max_depth=10)
    ├─→ Performance evaluation
    └─→ Save to DECISION_TREE_SUMMARY

3. XGBoost Agent
    ├─→ Feature engineering
    ├─→ Train/test split (80/20)
    ├─→ Model training (n_estimators=100, max_depth=6)
    ├─→ Performance evaluation
    └─→ Save to XGBOOST_SUMMARY
```
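
Each agent's core loop resembles a standard scikit-learn pipeline. A minimal sketch of the Logistic Regression step, reusing the 80/20 split and `max_iter=1000` mentioned above (the helper name is illustrative, and scikit-learn is assumed to be installed per requirements.txt):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def train_baseline(X, y):
    # 80/20 split, matching the agents described above
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = LogisticRegression(max_iter=1000)  # max_iter as in the pipeline
    model.fit(X_train, y_train)
    # Test-set accuracy is one of the metrics written to the summary table
    return accuracy_score(y_test, model.predict(X_test))
```

The Decision Tree and XGBoost agents follow the same shape, swapping in `DecisionTreeClassifier(max_depth=10)` and `XGBClassifier(n_estimators=100, max_depth=6)` respectively.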

Phase 4: Report Generation (5-10s)

```text
Natural Language Agent (Gemini 2.5)
    │
    ├─→ Collect EDA insights from Snowflake
    ├─→ Collect ML results (all 3 models)
    ├─→ Generate narrative with Gemini LLM
    ├─→ Create visualizations (Matplotlib)
    ├─→ Generate PDF (ReportLab)
    └─→ Save to backend/pdf/
```

Total Time: ~60-85 seconds from upload to PDF


🛠️ Technology Stack

Frontend

| Library | Version | Purpose |
|---|---|---|
| React | 19.1.1 | UI framework |
| React Router | 7.9.3 | Navigation |
| Vite | 7.1.7 | Build tool & dev server |
| Tailwind CSS | 4.1.14 | Styling framework |
| AOS | 2.3.4 | Scroll animations |
| ESLint | 9.36.0 | Code linting |

Backend

| Library | Version | Purpose |
|---|---|---|
| Flask | 3.0.0 | REST API framework |
| Flask-CORS | 4.0.0 | Cross-origin support |
| python-dotenv | 1.0.1 | Environment config |

AI & Machine Learning

| Library | Version | Purpose |
|---|---|---|
| google-generativeai | 0.8.3 | Google Gemini 2.5 API |
| scikit-learn | 1.5.0 | Logistic Regression, Decision Tree |
| XGBoost | 2.1.0 | Gradient boosting |
| SHAP | 0.44.0 | Model explainability |
| pandas | 2.2.0 | Data manipulation |
| NumPy | 1.26.0 | Numerical operations |

Data Platform

| Library | Version | Purpose |
|---|---|---|
| snowflake-connector-python | 3.12.0 | Snowflake connectivity |
| snowflake-snowpark-python | 1.39.1 | Snowpark DataFrame API |

Reporting & Visualization

| Library | Version | Purpose |
|---|---|---|
| ReportLab | 4.0.7 | PDF generation |
| Matplotlib | 3.8.0 | Charts & visualizations |

📋 Project Structure

```text
foresee-app/
├── frontend/                           # React Application
│   ├── src/
│   │   ├── pages/
│   │   │   ├── Home.jsx               # Landing page
│   │   │   ├── Foresee.jsx            # Main app (upload, analysis, results)
│   │   │   ├── AboutUs.jsx            # Team information
│   │   │   └── Help.jsx               # User guide
│   │   ├── components/
│   │   │   ├── TopBanner.jsx          # Navigation header
│   │   │   └── Footer.jsx             # Footer
│   │   ├── App.jsx                    # Main component & routing
│   │   └── main.jsx                   # Entry point
│   ├── package.json                   # Frontend dependencies
│   └── vite.config.js                 # Vite configuration
│
├── backend/                            # Flask API + ML Agents
│   ├── app.py                         # Main Flask API (1949 lines)
│   │
│   ├── agents/                        # AI/ML Agents
│   │   ├── eda_agent/                 # EDA Agent (Snowflake-based)
│   │   │   ├── agent.py               # Main EDA orchestration
│   │   │   ├── config.py              # EDA configuration
│   │   │   ├── database/
│   │   │   │   ├── connection.py      # Snowflake connection
│   │   │   │   ├── schema.py          # Schema management
│   │   │   │   └── storage.py         # Results storage
│   │   │   ├── metrics/               # Metric calculators
│   │   │   │   ├── basic_metrics.py   # Basic stats
│   │   │   │   ├── numeric_metrics.py # Numeric stats
│   │   │   │   ├── categorical_metrics.py
│   │   │   │   ├── datetime_metrics.py
│   │   │   │   ├── text_metrics.py
│   │   │   │   └── target_metrics.py
│   │   │   └── utils/
│   │   │       ├── helpers.py
│   │   │       ├── logger.py
│   │   │       └── validators.py
│   │   │
│   │   ├── target_variable_agent.py   # Gemini-powered target suggestions
│   │   ├── logistic_regression_agent.py
│   │   ├── decision_tree_agent.py
│   │   ├── xgboost_agent.py
│   │   └── natural_language_agent.py  # Gemini-powered PDF generation
│   │
│   ├── services/
│   │   ├── workflow_manager.py        # Workflow & schema management
│   │   ├── snowflake_ingestion.py     # CSV → Snowflake
│   │   ├── eda_service.py             # EDA orchestration
│   │   └── config.py                  # Configuration loader
│   │
│   ├── insights/                      # Generated JSON insights
│   └── pdf/                           # Generated PDF reports
│
├── requirements.txt                   # Python dependencies
├── .env                               # Environment variables (not tracked)
├── .gitignore
├── start.bat                          # Windows startup script
├── start.sh                           # Linux/Mac startup script
└── README.md
```

🚀 Getting Started

Prerequisites

  • Python 3.11+
  • Node.js and npm
  • A Snowflake account with an active warehouse
  • A Google Gemini API key

Installation

1. Clone the repository

```bash
git clone https://github.com/yourusername/foresee-app.git
cd foresee-app
```

2. Set up backend

```bash
# Create virtual environment
python -m venv myenv

# Activate virtual environment
# Windows:
myenv\Scripts\activate
# macOS/Linux:
source myenv/bin/activate

# Install Python dependencies
pip install -r requirements.txt
```

3. Configure environment variables

Create a .env file in the project root:

```bash
# Snowflake Configuration
SNOWFLAKE_ACCOUNT=your_account_identifier
SNOWFLAKE_USER=your_username
SNOWFLAKE_PASSWORD=your_password
SNOWFLAKE_DATABASE=your_database
SNOWFLAKE_SCHEMA=PUBLIC
INGESTION_WAREHOUSE=your_warehouse

# Google Gemini API
GEMINI_API_KEY=your_gemini_api_key_here
```

Get your Gemini API key:

  1. Visit https://aistudio.google.com/app/apikey
  2. Sign in with Google account
  3. Click "Create API Key"
  4. Copy and paste into .env

4. Set up frontend

```bash
cd frontend
npm install
cd ..
```

🎬 Running the Application

Option 1: Startup Scripts (Recommended)

Windows:

```bash
start.bat
```

macOS/Linux:

```bash
chmod +x start.sh  # First time only
./start.sh
```

This automatically:

  1. Activates Python virtual environment
  2. Starts Flask backend (port 5000)
  3. Starts Vite frontend (port 5173)

Option 2: Using npm

```bash
cd frontend
npm run dev:all
```

Uses concurrently to run both servers simultaneously.


Option 3: Manual (Two Terminals)

Terminal 1 - Backend:

```bash
# Activate virtual environment
myenv\Scripts\activate  # Windows
# or
source myenv/bin/activate  # macOS/Linux

# Start Flask server
cd backend
python app.py
```

Terminal 2 - Frontend:

```bash
cd frontend
npm run dev
```

Access the Application

  • Frontend: http://localhost:5173
  • Backend API: http://localhost:5000/api

📖 Usage Guide

1. Upload Dataset

  1. Navigate to "Foresee" in the top menu
  2. Drag & drop your CSV file or click "Choose File"
  3. Click "Upload & Analyze"

The system will:

  • Upload data to Snowflake
  • Run parallel EDA + Target Analysis (~45-60s)
  • Display progress with rotating status messages

2. Select Target Variable

After analysis, you'll see:

Podium Display (Top 3):

  • 🥇 Gold - Most important target (Rank 1)
  • 🥈 Silver - Second best (Rank 2)
  • 🥉 Bronze - Third option (Rank 3)

Click "Other Options" to see all 5 recommendations.

Each recommendation shows:

  • Importance Score (1-100) - Quantitative ranking
  • Problem Type - regression/classification
  • Why Important - Business value explanation
  • Predictability - HIGH/MEDIUM/LOW feasibility
  • Suggested Features - Best predictor columns

3. Model Training (Automatic)

After selecting a target, the system automatically:

  1. Trains Logistic Regression model
  2. Trains Decision Tree model
  3. Trains XGBoost model
  4. Generates Natural Language Insights (Gemini)
  5. Creates PDF Report (ReportLab)

Total Time: 10-15 seconds


4. View/Download Report

When complete:

  • Click "View Report" → Opens PDF in browser
  • Click "Download Report" → Saves PDF to your computer

📊 What's in the PDF Report?

1. Executive Summary

  • Dataset overview (rows, columns)
  • Selected target variable
  • Best-performing model
  • Key findings

2. Data Quality Analysis

  • Missing value analysis
  • Duplicate detection
  • Column type breakdown
  • Data completeness metrics

3. Exploratory Data Analysis

  • Numeric column statistics (mean, std, quartiles, skewness, kurtosis)
  • Categorical distributions (top values, cardinality)
  • Datetime patterns
  • Text metrics

4. Model Performance Comparison

Example comparison table:

| Model | Accuracy | Precision | Recall | F1 Score | ROC-AUC |
|---|---|---|---|---|---|
| Logistic Regression | 0.85 | 0.82 | 0.88 | 0.85 | 0.91 |
| Decision Tree | 0.83 | 0.80 | 0.87 | 0.83 | 0.89 |
| XGBoost | 0.88 | 0.86 | 0.90 | 0.88 | 0.94 |

5. Feature Importance

  • Top 10 most important features
  • Feature importance scores
  • Model-specific interpretations

6. Model-Specific Insights

  • Confusion matrices
  • Decision tree depth/leaves
  • XGBoost hyperparameters
  • Performance summaries

7. Recommendations

  • Best model selection advice
  • Data quality improvements
  • Feature engineering suggestions
  • Next steps for deployment

🎯 API Reference

Core Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/health` | Health check |
| POST | `/api/upload` | Upload CSV & run parallel analysis |
| GET | `/api/workflows` | List all workflows |
| DELETE | `/api/workflow/<id>` | Delete workflow & schema |
| POST | `/api/query` | Execute SQL query on workflow |

Target Variable Selection

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/target-suggestions/<workflow_id>/<table_name>` | Get AI recommendations (Gemini) |
| POST | `/api/workflow/<id>/select-target` | Save target & auto-train models |

POST Body:

```json
{
  "target_variable": "column_name",
  "table_name": "table_name",
  "problem_type": "classification",
  "importance_score": 95
}
```
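
A sketch of how a client might prepare this call using only the standard library — the helper function is illustrative; the endpoint path and payload shape come from the table above, and actually sending the request requires the backend to be running:

```python
import json
import urllib.request

API_BASE_URL = "http://localhost:5000/api"  # matches the frontend default

def build_select_target_request(workflow_id, payload):
    # Prepares (but does not send) the POST request for target selection
    return urllib.request.Request(
        f"{API_BASE_URL}/workflow/{workflow_id}/select-target",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen(...)` (or use `requests.post` if that library is available).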

Manual Model Training (Optional)

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/workflow/<id>/train-logistic-regression` | Train LR model |
| POST | `/api/workflow/<id>/train-decision-tree` | Train DT model |
| POST | `/api/workflow/<id>/train-xgboost` | Train XGBoost model |

Model Results

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/workflow/<id>/logistic-regression-results` | Get LR results |
| GET | `/api/workflow/<id>/decision-tree-results` | Get DT results |
| GET | `/api/workflow/<id>/xgboost-results` | Get XGB results |

Report Generation

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/workflow/<id>/generate-insights` | Generate insights & PDF (Gemini) |
| GET | `/api/workflow/<id>/report/view` | View PDF in browser |
| GET | `/api/workflow/<id>/report/download` | Download PDF |

🗄️ Snowflake Database Schema

Each workflow creates an isolated schema: WORKFLOW_<UUID>
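
Generating such a schema name is straightforward. A sketch — the helper is hypothetical, and hyphens are stripped because they are not valid in unquoted Snowflake identifiers:

```python
import uuid

def new_workflow_schema():
    # One isolated schema per upload: WORKFLOW_<UUID>
    # uuid4().hex already omits hyphens; upper-case for Snowflake style
    return f"WORKFLOW_{uuid.uuid4().hex.upper()}"
```

The backend would then run `CREATE SCHEMA <name>` and create the per-workflow tables inside it.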

Tables Created Per Workflow

1. WORKFLOW_METADATA

```sql
CREATE TABLE WORKFLOW_METADATA (
    key VARCHAR,
    value VARIANT,
    updated_at TIMESTAMP
);
```

Stores workflow-level metadata (selected target, configuration).


2. WORKFLOW_EDA_SUMMARY

```sql
CREATE TABLE WORKFLOW_EDA_SUMMARY (
    analysis_id VARCHAR PRIMARY KEY,
    table_name VARCHAR,
    total_rows INTEGER,
    total_columns INTEGER,
    duplicate_rows INTEGER,
    target_column VARCHAR,
    analysis_type VARCHAR,
    created_at TIMESTAMP
);
```

3. COLUMN_STATS

```sql
CREATE TABLE COLUMN_STATS (
    column_name VARCHAR,
    data_type VARCHAR,
    null_count INTEGER,
    unique_count INTEGER,
    completeness FLOAT,
    -- Numeric metrics
    mean FLOAT,
    std FLOAT,
    min FLOAT,
    max FLOAT,
    q1 FLOAT,
    q2 FLOAT,
    q3 FLOAT,
    skewness FLOAT,
    kurtosis FLOAT,
    -- Categorical metrics
    mode VARCHAR,
    top_values VARIANT,
    cardinality INTEGER,
    -- Text metrics
    avg_length FLOAT,
    max_length INTEGER,
    -- Datetime metrics
    date_range VARIANT
);
```

4. LOGISTIC_REGRESSION_SUMMARY

```sql
CREATE TABLE LOGISTIC_REGRESSION_SUMMARY (
    analysis_id VARCHAR PRIMARY KEY,
    table_name VARCHAR,
    target_variable VARCHAR,
    model_type VARCHAR,
    problem_type VARCHAR,
    test_accuracy FLOAT,
    test_precision FLOAT,
    test_recall FLOAT,
    test_f1_score FLOAT,
    test_roc_auc FLOAT,
    train_accuracy FLOAT,
    total_samples INTEGER,
    total_features INTEGER,
    n_classes INTEGER,
    confusion_matrix ARRAY,
    top_features ARRAY,
    performance_summary VARCHAR,
    recommendations VARCHAR,
    created_at TIMESTAMP
);
```

5. DECISION_TREE_SUMMARY

Same as Logistic Regression + additional columns:

```sql
    tree_depth INTEGER,
    n_leaves INTEGER,
    max_depth INTEGER,
    min_samples_split INTEGER,
    min_samples_leaf INTEGER
```

6. XGBOOST_SUMMARY

Same as Logistic Regression + additional columns:

```sql
    n_estimators INTEGER,
    max_depth INTEGER,
    learning_rate FLOAT,
    subsample FLOAT,
    colsample_bytree FLOAT
```

⚙️ Configuration

Backend Configuration (backend/services/config.py)

```python
from dotenv import load_dotenv
import os

load_dotenv()

class Config:
    SNOWFLAKE_ACCOUNT = os.getenv("SNOWFLAKE_ACCOUNT")
    SNOWFLAKE_USER = os.getenv("SNOWFLAKE_USER")
    SNOWFLAKE_PASSWORD = os.getenv("SNOWFLAKE_PASSWORD")
    SNOWFLAKE_DATABASE = os.getenv("SNOWFLAKE_DATABASE")
    SNOWFLAKE_SCHEMA = os.getenv("SNOWFLAKE_SCHEMA")
    INGESTION_WAREHOUSE = os.getenv("INGESTION_WAREHOUSE")
```

Frontend Configuration (frontend/src/pages/Foresee.jsx)

```js
const API_BASE_URL = "http://localhost:5000/api";
```

Change this to your backend URL in production.


Flask Configuration (backend/app.py)

```python
ALLOWED_EXTENSIONS = {'csv'}
app.config['MAX_CONTENT_LENGTH'] = 500 * 1024 * 1024  # 500MB max
```
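
The CSV-only restriction typically pairs with an extension check before the file is accepted. This is the common Flask pattern, shown as a sketch consistent with the config above (not necessarily the app's exact code):

```python
ALLOWED_EXTENSIONS = {'csv'}

def allowed_file(filename):
    # Accept only filenames that have a whitelisted extension
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS
```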

🤖 Google Gemini Integration

Models Used

| Agent | Model | Purpose |
|---|---|---|
| Target Variable Agent | `gemini-2.5-flash-preview-05-20` | Analyze data & rank targets |
| Natural Language Agent | `gemini-2.5-flash-preview-05-20` | Generate PDF insights |

API Configuration

```python
import google.generativeai as genai

genai.configure(api_key=os.getenv('GEMINI_API_KEY'))
model = genai.GenerativeModel('models/gemini-2.5-flash-preview-05-20')

# Generate content
response = model.generate_content(prompt)
```

Rate Limits & Pricing


🔒 Security & Privacy

Data Isolation

  • Each workflow creates an isolated Snowflake schema (WORKFLOW_<UUID>)
  • No data mixing between workflows
  • Automatic cleanup on workflow deletion

API Security

  • CORS enabled for frontend-backend communication
  • File size limits (500MB max)
  • File type validation (CSV only)
  • Secure filename sanitization

Data Privacy

  • Data stored in your Snowflake account (not ours)
  • AI models (Gemini) don't retain your data
  • Stateless API calls
  • No data sent to third parties

Temporary Files

  • Uploaded files stored in system temp directory
  • Automatically deleted after Snowflake upload
  • No persistent local storage

🐛 Troubleshooting

Backend won't start

```bash
# Check Python version
python --version  # Should be 3.11+

# Verify .env file exists
cat .env  # macOS/Linux
type .env  # Windows

# Test Snowflake connection
python -c "from services.config import Config; print(Config.SNOWFLAKE_ACCOUNT)"
```

Frontend can't connect to backend

```bash
# Verify backend is running
curl http://localhost:5000/api/health

# Check CORS is enabled in backend/app.py
# CORS(app) should be present

# Verify API_BASE_URL in frontend matches backend port
```

Upload fails

Possible causes:

  1. Invalid CSV format - Verify file has headers and proper encoding
  2. Snowflake credentials - Check .env variables
  3. Warehouse not running - Start warehouse in Snowflake UI
  4. Insufficient credits - Check Snowflake billing

Gemini API errors

```bash
# Verify API key is set
echo $GEMINI_API_KEY  # macOS/Linux
echo %GEMINI_API_KEY%  # Windows

# Test API key manually
curl -H "Content-Type: application/json" \
     -d '{"contents":[{"parts":[{"text":"Hello"}]}]}' \
     "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-preview-05-20:generateContent?key=YOUR_API_KEY"
```

Common errors:

  • INVALID_ARGUMENT: Invalid API key
  • RESOURCE_EXHAUSTED: Rate limit exceeded (wait 1 minute)
  • PERMISSION_DENIED: API not enabled (enable in Google Cloud Console)

Model training fails

Possible causes:

  1. Target has only 1 unique value - Select different target
  2. Target has excessive nulls (>50%) - Data quality issue
  3. Insufficient samples (<50 rows) - Upload larger dataset
  4. All features are null - Data quality issue
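
These pre-flight conditions can be checked before training starts. A minimal sketch — the function and thresholds mirror the causes listed above but are illustrative, not the app's actual validator:

```python
def validate_target(values, min_rows=50, max_null_frac=0.5):
    """Return 'ok' or the first failure reason for a candidate target column."""
    if len(values) < min_rows:
        return "insufficient samples"
    non_null = [v for v in values if v is not None]
    if len(values) - len(non_null) > max_null_frac * len(values):
        return "excessive nulls"
    if len(set(non_null)) < 2:
        return "only one unique value"
    return "ok"
```

Running a check like this before kicking off the three agents gives the user an actionable error instead of a mid-training failure.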

📈 Performance Optimizations

Parallel Execution

  • EDA + Target Analysis run in parallel using ThreadPoolExecutor
  • Saves ~40% time compared to sequential execution
  • Typical time saved: 20-30 seconds

Snowflake Optimizations

  • Uses PUT command for fast bulk loading
  • Isolated schemas reduce query overhead
  • Warehouse auto-suspend to reduce costs
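
The PUT-then-COPY bulk load can be sketched as plain SQL strings. The statement details below (stage name, `FILE_FORMAT` options) are assumptions for illustration, not the app's exact commands:

```python
def bulk_load_statements(stage, table, local_path):
    # Two-step Snowflake bulk load: PUT stages the local file,
    # then COPY INTO loads it from the stage into the target table
    put_cmd = f"PUT file://{local_path} @{stage} AUTO_COMPRESS=TRUE"
    copy_cmd = (
        f"COPY INTO {table} FROM @{stage} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)"
    )
    return put_cmd, copy_cmd
```

Both statements would be executed through the Snowflake connector cursor; PUT is dramatically faster than row-by-row INSERTs for large CSVs.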

Frontend Optimizations

  • Vite for fast builds and HMR (Hot Module Replacement)
  • Code splitting with React Router
  • Lazy loading of heavy components

📝 Development Guidelines

Python Code Style

  • Follow PEP 8 style guide
  • Use type hints for function parameters
  • Docstrings for all functions/classes
  • Maximum line length: 100 characters

React Code Style

  • Use functional components with hooks
  • ESLint for code linting
  • Consistent file naming (PascalCase for components)
  • PropTypes for type checking (optional)

Git Workflow

```bash
# Create feature branch
git checkout -b feature/your-feature-name

# Make changes and commit
git add .
git commit -m "Add: your feature description"

# Push to remote
git push origin feature/your-feature-name

# Open Pull Request on GitHub
```

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License.


🙏 Acknowledgments

Technologies

  • Snowflake - Enterprise data platform
  • Google Gemini - AI-powered insights (gemini-2.5-flash-preview-05-20)
  • React - Modern web framework
  • Flask - Lightweight Python API
  • scikit-learn & XGBoost - ML frameworks
  • ReportLab - Professional PDF generation

Team

Built with ❤️ by the Foresee Team


📞 Support

For issues, questions, or suggestions:


🚀 Roadmap

Planned Features

  • Support for more ML models (Random Forest, Neural Networks)
  • Advanced hyperparameter tuning (GridSearchCV)
  • Time series forecasting support
  • Interactive charts in PDF reports
  • Model deployment API (FastAPI)
  • Scheduled re-training
  • User authentication (OAuth 2.0)
  • Multi-user workspaces
  • Excel file support (.xlsx)
  • Real-time model monitoring dashboard
  • SHAP value visualizations
  • Model versioning & comparison

Happy Analyzing! 🚀📊

Transform your data into insights with the power of AI.

Created October 4, 2025
Updated January 8, 2026