🧠 AI-Powered WebScraping Assistant
A real-time AI assistant built with Streamlit, FastAPI, LangGraph, and Groq. It scrapes Reddit and the open web, using Bright Data MCP to get past anti-bot mechanisms, speaks its answers through text-to-speech (TTS), and integrates Ollama for local LLM inference.
Features
- Web Scraping Engine
  Scrapes Reddit and browser-rendered content using automated headless browsing and anti-bot evasion (Bright Data MCP).
- 🧠 Groq LLM Integration
  Low-latency responses from Groq-hosted language models for natural dialogue.
- 🗺️ LangGraph-Based Flow
  Modular, multi-step reasoning pipeline built with LangGraph for structured agent behavior.
- 🧩 Local Model Support (Ollama)
  Plug-and-play support for running local LLMs via Ollama.
- 🗣️ Text-to-Speech (TTS)
  Converts AI responses into human-like speech for a natural conversation experience.
- 🖥️ Streamlit UI + FastAPI Backend
  Streamlit front end for real-time interaction, backed by FastAPI REST services.
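
To make the idea of a modular, multi-step flow concrete, here is a minimal sketch in plain Python. It is a stand-in for the concept only: it does not use the LangGraph API, and the step names (`scrape`, `summarize`, `respond`) and state keys are hypothetical, not taken from this project's code.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], State]


def scrape(state: State) -> State:
    # Placeholder: a real step would call the scraping backend here.
    return {**state, "documents": [f"stub result for: {state['query']}"]}


def summarize(state: State) -> State:
    # Placeholder: a real step would send the documents to an LLM.
    docs = state["documents"]
    return {**state, "summary": f"{len(docs)} document(s) collected"}


def respond(state: State) -> State:
    # Combine the earlier results into the final answer.
    return {**state, "answer": f"Q: {state['query']} | {state['summary']}"}


def run_pipeline(steps: List[Step], state: State) -> State:
    # Each step receives the shared state and returns an updated copy.
    for step in steps:
        state = step(state)
    return state


result = run_pipeline([scrape, summarize, respond], {"query": "python news"})
print(result["answer"])
```

Because every step takes and returns the same state shape, steps can be added, removed, or reordered without touching the others, which is the property a graph-based framework builds on.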
🛠️ Tech Stack
| Technology | Purpose |
|---|---|
| Streamlit | Frontend interface |
| FastAPI | Backend API server |
| Bright Data MCP | Web scraping and anti-bot bypassing |
| Groq | LLM for fast and intelligent responses |
| LangGraph | Multi-agent reasoning and logic graph |
| Ollama | Local LLM support (e.g. Llama 2, Mistral) |
| TTS | Voice output from textual response |
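
As an illustration of the local LLM row, the sketch below builds a request for Ollama's local REST endpoint (`/api/generate` on the default port 11434) using only the standard library. The helper names `build_generate_request` and `generate` are hypothetical, and this is not necessarily how the project itself talks to Ollama.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    # "stream": False asks Ollama for a single JSON object instead of a stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 60.0) -> str:
    # Sends the request; requires a running Ollama server with the model pulled.
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With Ollama running and a model available (for example after `ollama pull mistral`), calling `generate("mistral", "Summarize this thread in one sentence.")` should return the generated text.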
📸 Screenshots
Coming soon. Add your screenshots or demo GIFs here.
🧪 Installation

    # Clone the repo
    git clone https://github.com/yourusername/ai-webscraping-assistant.git
    cd ai-webscraping-assistant

    # Install dependencies
    pip install -r requirements.txt

    # Set up environment variables
    cp .env.example .env
    # Fill in required keys like GROQ_API_KEY, Bright Data credentials, etc.

    # Run backend
    python backend.py

    # Run Streamlit app
    streamlit run app.py
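
Before starting the backend, it can help to confirm that the required keys are actually set. The following is a minimal sketch, assuming the values from `.env` have already been loaded into the process environment (how this project loads them is not stated). `GROQ_API_KEY` comes from the setup steps above; `BRIGHTDATA_API_TOKEN` is a hypothetical placeholder for whatever Bright Data credential names `.env.example` defines.

```python
import os
from typing import Iterable, List, Mapping

# GROQ_API_KEY is named in the setup steps; BRIGHTDATA_API_TOKEN is a
# hypothetical placeholder. Use the names from .env.example instead.
REQUIRED_VARS = ("GROQ_API_KEY", "BRIGHTDATA_API_TOKEN")


def missing_vars(required: Iterable[str], env: Mapping[str, str]) -> List[str]:
    # A variable counts as missing if it is unset or only whitespace.
    return [name for name in required if not env.get(name, "").strip()]


def check_environment(env: Mapping[str, str] = os.environ) -> None:
    # Exit early with a readable message instead of failing mid-request.
    missing = missing_vars(REQUIRED_VARS, env)
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

Calling `check_environment()` at the top of the backend entry point would surface a missing key at startup rather than on the first API call.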