🧠 AI-Powered WebScraping Assistant

An advanced, real-time AI assistant powered by Streamlit, FastAPI, LangGraph, and Groq, capable of web scraping from Reddit and the open web, bypassing anti-bot mechanisms using Bright Data MCP. It supports natural voice responses using TTS, and integrates Ollama for powerful local LLM inference.

🚀 Features

🔍 Web Scraping Engine
Scrapes Reddit and browser-based content using automated headless browsing and anti-bot evasion (BrightData MCP).
🧠 Groq LLM Integration
Blazing-fast, accurate responses using Groq-powered language models for natural dialogue.
🗺️ LangGraph-Based Flow
Modular, multi-step reasoning pipeline using LangGraph for structured agent behavior.
🧩 Local Model Support (Ollama)
Plug-and-play support for running local LLMs via Ollama.
🗣️ Text-to-Speech (TTS)
Converts AI responses into human-like speech for a natural conversation experience.
🖥️ Streamlit UI + FastAPI Backend
Beautiful and fast front-end for real-time interaction, powered by FastAPI REST services.

🛠️ Tech Stack

Technology	Purpose
Streamlit	Frontend interface
FastAPI	Backend API server
BrightData MCP	Web scraping and anti-bot bypassing
Groq	LLM for fast and intelligent responses
LangGraph	Multi-agent reasoning and logic graph
Ollama	Local LLM support (e.g. LLaMA2, Mistral)
TTS	Voice output from textual response

📸 Screenshots

Coming Soon – Add your screenshots or demo GIFs here.

🧪 Installation

# Clone the repo
git clone https://github.com/yourusername/ai-webscraping-assistant.git
cd ai-webscraping-assistant

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Fill in required keys like GROQ_API_KEY, BRIGHTDATA credentials, etc.

# Run backend
python backend.py

# Run Streamlit app
streamlit run app.py

Ashis-Mishra07/WebScrapper

🧠 AI-Powered WebScraping Assistant

🚀 Features

🛠️ Tech Stack

📸 Screenshots

🧪 Installation

On this page

Contributors