# Chatbot Throttle

AI Chat Rate Limiter with Vercel AI SDK

## Project Overview

This project is a chatbot backend built with Node.js/Express using the Vercel AI SDK. It enforces rate limits for different user types to control AI usage costs.
| User Type | AI Questions per Hour | Notes |
|---|---|---|
| Guest | 3 | No Authorization header -> treated as Guest |
| Free | 10 | Logged-in user with Free plan |
| Premium | 50 | Logged-in user with Premium plan |
The backend tracks users by user ID (logged-in) or IP address (guests) and uses a fixed window rate limiting algorithm to prevent excessive AI calls.
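The per-plan limits and the window length live in `constants.js`. A minimal sketch of what that file might contain, matching the table above (the exact export names are assumptions, not copied from the repo):

```js
// constants.js (illustrative sketch; actual names in the repo may differ)
export const RATE_LIMITS = {
  GUEST: 3,    // no Authorization header
  FREE: 10,    // logged-in user on the Free plan
  PREMIUM: 50, // logged-in user on the Premium plan
};

export const WINDOW_MS = 60 * 60 * 1000; // fixed 1-hour window
```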
## Features

- Fixed Window Rate Limiting (1-hour window)
- User Types: `Guest`, `Free`, `Premium`
- JWT-based authentication for logged-in users
- In-memory request tracking
- Vercel AI SDK integration with OpenAI `gpt-4o-mini` (see the sketch below)
- Clear error messages when rate limits are exceeded
- Mock user data stored in `users.json`
- Environment configuration via `.env` and `config.js`
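The Vercel AI SDK integration is essentially a single `generateText` call. A minimal sketch, assuming ESM imports and the standard `ai` / `@ai-sdk/openai` packages (the actual handler in `server.js` may differ):

```js
// Sketch: ask gpt-4o-mini for a reply via the Vercel AI SDK
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function askModel(message) {
  const { text } = await generateText({
    model: openai('gpt-4o-mini'), // reads OPENAI_API_KEY from the environment
    prompt: message,
  });
  return text;
}
```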
## System Design

The system uses a fixed window rate limiter to manage AI usage.

### Architecture Diagram

*Architecture diagram (see `assets/`): illustrates rate limiting for the `/api/chat` endpoint.*
Flow (a wiring sketch follows this list):

1. The client sends a request to `/api/chat`.
2. The `authenticate` middleware checks the JWT token. If it is missing, the user is treated as a Guest.
3. `rateLimiter` checks usage for the current 1-hour window:
   - Logged-in users are tracked by `user.id`
   - Guests are tracked by `req.ip`
4. If the user is within the limit, the request is sent to the Vercel AI SDK to generate a response.
5. The AI response is returned along with the number of remaining requests.
6. If the limit is exceeded, the server responds with a 429 status and an error message.
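A minimal sketch of how this chain might be wired in `server.js`. The middleware names follow the flow above, but details such as `res.locals.remaining` are illustrative assumptions, not the project's actual code; `askModel` and `rateLimiter` are defined in the other sketches in this README.

```js
// Hypothetical route wiring for /api/chat (sketch only)
import express from 'express';
import jwt from 'jsonwebtoken';

const app = express();
app.use(express.json());

// authenticate: attach req.user when a valid JWT is supplied, otherwise fall through as Guest
function authenticate(req, res, next) {
  const header = req.headers.authorization || '';
  const token = header.startsWith('Bearer ') ? header.slice(7) : null;
  if (token) {
    try {
      req.user = jwt.verify(token, process.env.JWT_SECRET);
    } catch {
      req.user = null; // invalid or expired token -> treated as Guest
    }
  }
  next();
}

app.post('/api/chat', authenticate, rateLimiter, async (req, res) => {
  const reply = await askModel(req.body.message); // see the AI SDK sketch above
  res.json({ success: true, message: reply, remaining_requests: res.locals.remaining });
});
```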
## Project Structure

```
chatbot-throttle/
│
├─ assets/          # System design images
├─ config.js        # Environment variables and defaults
├─ constants.js     # Rate limits and time window constants
├─ server.js        # Express server & API endpoints
├─ users.json       # Mock user database
├─ package.json     # Project metadata and scripts
├─ README.md        # Documentation
├─ node_modules/
└─ pnpm-lock.yaml
```
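For reference, the mock `users.json` database might look roughly like the following. The field names (`id`, `plan`, etc.) are assumptions based on the login flow and user types described in this README, not copied from the repo:

```json
[
  { "id": 1, "username": "fahimahammed", "password": "fahim123", "plan": "free" },
  { "id": 2, "username": "firoz", "password": "firoz123", "plan": "premium" }
]
```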
## Setup Instructions

1. Clone the repository:

   ```bash
   git clone https://github.com/fahimahammed/chatbot-throttle.git
   cd chatbot-throttle
   ```

2. Install dependencies:

   ```bash
   pnpm install
   ```

3. Create a `.env` file (these values are read through `config.js`; see the sketch after these steps):

   ```env
   PORT=3000
   JWT_SECRET=supersecret
   JWT_EXPIRES_IN=1h
   OPENAI_API_KEY=your_openai_api_key
   ```

4. Run in development mode:

   ```bash
   pnpm dev
   ```

The server will start on http://localhost:3000.
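`config.js` centralizes these environment variables and their defaults. A minimal sketch, assuming ESM and `dotenv` (the real file may expose different names and defaults):

```js
// config.js (sketch; names and defaults are assumptions)
import 'dotenv/config'; // load .env into process.env

export default {
  port: process.env.PORT || 3000,
  jwtSecret: process.env.JWT_SECRET,
  jwtExpiresIn: process.env.JWT_EXPIRES_IN || '1h',
  openaiApiKey: process.env.OPENAI_API_KEY, // required for AI calls
};
```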
## API Endpoints

### 1. Login

`POST /api/login`

Description: Authenticates a user against the mock `users.json` data and returns a JWT token.

Request Body:

```json
{
  "username": "fahimahammed",
  "password": "fahim123"
}
```

Response (Success):

```json
{
  "success": true,
  "message": "User logged in successfully!",
  "token": "**********************************"
}
```

Response (Failure):

```json
{
  "success": false,
  "message": "Invalid username or password!",
  "error": "Invalid login request"
}
```

### 2. Chat (Rate-limited)
`POST /api/chat`

Headers:

```
Authorization: Bearer <jwt_token>
```

If the header is missing, the request is treated as a Guest.

Request Body:

```json
{
  "message": "I want to learn rate limiting. Give me a roadmap for this."
}
```

Response (Success):

```json
{
  "success": true,
  "message": "AI response here...",
  "remaining_requests": 7
}
```

Response (Rate Limit Exceeded):

```json
{
  "success": false,
  "error": "Too many requests. Free users can make 10 requests per hour.",
  "remaining_requests": 0
}
```

### 3. Check Status
`GET /api/status`

Headers:

```
Authorization: Bearer <jwt_token>
```

If the header is missing, the request is treated as a Guest.

Response:

```json
{
  "success": true,
  "message": "You are a free user & your request limit is 10 AI questions per hour.",
  "remaining_requests": 7
}
```

## Rate Limiter Implementation
- Fixed Window: requests are tracked in a 1-hour window.
- Tracking:
  - Logged-in users: by `user.id`
  - Guests: by `req.ip`
- In-memory store:

  ```js
  {
    key: {
      count: number,
      windowStart: timestamp
    }
  }
  ```

- Behavior (illustrated by the sketch below):
  - Increment `count` for requests within the window
  - Reset `count` and `windowStart` after 1 hour
  - Return 429 when the limit is exceeded
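Putting this together, a minimal fixed-window `rateLimiter` middleware could look like the sketch below. This is an illustration of the algorithm described above, not a copy of the project's `server.js`; the `constants.js` import follows the earlier sketch.

```js
// Fixed-window rate limiter sketch (in-memory; see Notes for production caveats)
import { RATE_LIMITS, WINDOW_MS } from './constants.js'; // assumed exports

const usage = new Map(); // key -> { count, windowStart }

export function rateLimiter(req, res, next) {
  const plan = req.user ? req.user.plan : 'guest';               // e.g. 'free' | 'premium' | 'guest'
  const key = req.user ? `user:${req.user.id}` : `ip:${req.ip}`; // track by user ID or IP
  const limit = RATE_LIMITS[plan.toUpperCase()] ?? RATE_LIMITS.GUEST;

  const now = Date.now();
  let entry = usage.get(key);

  // Start a new window if none exists or the previous one has expired
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    entry = { count: 0, windowStart: now };
    usage.set(key, entry);
  }

  if (entry.count >= limit) {
    return res.status(429).json({
      success: false,
      error: `Too many requests. ${plan} users can make ${limit} requests per hour.`,
      remaining_requests: 0,
    });
  }

  entry.count += 1;
  res.locals.remaining = limit - entry.count; // handed to the route handler
  next();
}
```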
## Testing Rate Limiting

1. Log in as a user, or stay a guest (no token).
2. Make multiple `/api/chat` requests within an hour.
3. Check `/api/status` to see the remaining requests.
4. Observe the 429 error once the limit is exceeded.
## Testing Rate Limiting (All User Types)

### 1. Guest User (No Authorization Header)

Limit: 3 requests per hour

```bash
# Guest request 1
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"Hello AI"}'

# Guest request 2
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"How are you?"}'

# Guest request 3
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"Tell me a joke"}'

# Guest request 4 → Exceeds limit
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"Another request"}'
```

Expected response for the 4th request:

```json
{
  "success": false,
  "error": "Too many requests. guest users can make 3 requests per hour.",
  "remaining_requests": 0
}
```

### 2. Free User (Logged-in)
Limit: 10 requests per hour

```bash
# Login as Free user (fahimahammed)
curl -X POST http://localhost:3000/api/login \
  -H "Content-Type: application/json" \
  -d '{"username":"fahimahammed","password":"fahim123"}'

# Copy the returned token
TOKEN=<paste_jwt_token_here>

# Make 11 chat requests
for i in {1..11}; do
  curl -X POST http://localhost:3000/api/chat \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"message":"Hello AI"}'
done
```

Expected response for the 11th request:

```json
{
  "success": false,
  "error": "Too many requests. free users can make 10 requests per hour.",
  "remaining_requests": 0
}
```

### 3. Premium User (Logged-in)
Limit: 50 requests per hour

```bash
# Login as Premium user (firoz)
curl -X POST http://localhost:3000/api/login \
  -H "Content-Type: application/json" \
  -d '{"username":"firoz","password":"firoz123"}'

# Copy the returned token
TOKEN=<paste_jwt_token_here>

# Make 51 chat requests
for i in {1..51}; do
  curl -X POST http://localhost:3000/api/chat \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"message":"Hello AI"}'
done
```

Expected response for the 51st request:

```json
{
  "success": false,
  "error": "Too many requests. premium users can make 50 requests per hour.",
  "remaining_requests": 0
}
```

### 4. Check Remaining Requests
You can check the remaining quota for any user type:

```bash
curl -X GET http://localhost:3000/api/status \
  -H "Authorization: Bearer $TOKEN"
```

Sample Response:

```json
{
  "success": true,
  "message": "You are a free user & your request limit is 10 AI questions per hour.",
  "remaining_requests": 7
}
```

## Dependencies
- `express` – Web server
- `body-parser` – Parse JSON request bodies
- `jsonwebtoken` – JWT authentication
- `dotenv` – Environment variable management
- `@ai-sdk/openai` + `ai` – Vercel AI SDK integration

Dev Dependencies:

- `nodemon` – Auto-reload for development
## Notes

- In-memory storage resets on server restart. Use Redis or a database for production (see the sketch below).
- Mock users are for testing purposes only.
- `OPENAI_API_KEY` must be set in `.env` for AI calls.
- Rate limits can be adjusted in `constants.js`.
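If you move the counter to Redis, each key can be an atomic counter with a TTL equal to the window, so limits survive restarts and work across multiple instances. A rough sketch with the official `redis` v4 client; this is not part of the repo, and the key naming is an assumption:

```js
// Redis-backed fixed-window counter (sketch; assumes `npm i redis` and a running Redis server)
import { createClient } from 'redis';

const redis = createClient(); // defaults to redis://localhost:6379
await redis.connect();

export async function consumeQuota(key, limit, windowSeconds = 3600) {
  const redisKey = `ratelimit:${key}`;
  const count = await redis.incr(redisKey);      // atomic increment per user/IP key
  if (count === 1) {
    await redis.expire(redisKey, windowSeconds); // start the 1-hour window on the first hit
  }
  return { allowed: count <= limit, remaining: Math.max(limit - count, 0) };
}
```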