# Chatbot Throttle

AI Chat Rate Limiter with Vercel AI SDK

## Project Overview

This project is a chatbot backend built with Node.js/Express using the Vercel AI SDK. It enforces rate limits for different user types to control AI usage costs.
| User Type | AI Questions per Hour | Notes |
|---|---|---|
| Guest | 3 | No Authorization header -> treated as Guest |
| Free | 10 | Logged-in user with Free plan |
| Premium | 50 | Logged-in user with Premium plan |
The backend tracks users by user ID (logged-in) or IP address (guests) and uses a fixed window rate limiting algorithm to prevent excessive AI calls.
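The per-plan limits and the window length live in `constants.js`. A minimal sketch of what that file might contain, matching the table above (the exact export names are assumptions, not copied from the repo):

```js
// constants.js (illustrative sketch; actual names in the repo may differ)
export const RATE_LIMITS = {
  GUEST: 3,    // no Authorization header
  FREE: 10,    // logged-in user on the Free plan
  PREMIUM: 50, // logged-in user on the Premium plan
};

export const WINDOW_MS = 60 * 60 * 1000; // fixed 1-hour window
```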
## Features

- Fixed Window Rate Limiting (1-hour window)
- User Types: `Guest`, `Free`, `Premium`
- JWT-based authentication for logged-in users
- In-memory request tracking
- Vercel AI SDK integration with OpenAI `gpt-4o-mini` (see the sketch below)
- Clear error messages when rate limits are exceeded
- Mock user data stored in `users.json`
- Environment configuration via `.env` and `config.js`
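The Vercel AI SDK integration is essentially a single `generateText` call. A minimal sketch, assuming ESM imports and the standard `ai` / `@ai-sdk/openai` packages (the actual handler in `server.js` may differ):

```js
// Sketch: ask gpt-4o-mini for a reply via the Vercel AI SDK
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function askModel(message) {
  const { text } = await generateText({
    model: openai('gpt-4o-mini'), // reads OPENAI_API_KEY from the environment
    prompt: message,
  });
  return text;
}
```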
## System Design

The system uses a fixed window rate limiter to manage AI usage.

### Architecture Diagram

*Architecture diagram (see `assets/`): illustrates rate limiting for the `/api/chat` endpoint.*
Flow (a wiring sketch follows this list):

1. The client sends a request to `/api/chat`.
2. The `authenticate` middleware checks the JWT token. If it is missing, the user is treated as a Guest.
3. `rateLimiter` checks usage for the current 1-hour window:
   - Logged-in users are tracked by `user.id`
   - Guests are tracked by `req.ip`
4. If the user is within the limit, the request is sent to the Vercel AI SDK to generate a response.
5. The AI response is returned along with the number of remaining requests.
6. If the limit is exceeded, the server responds with a 429 status and an error message.
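A minimal sketch of how this chain might be wired in `server.js`. The middleware names follow the flow above, but details such as `res.locals.remaining` are illustrative assumptions, not the project's actual code; `askModel` and `rateLimiter` are defined in the other sketches in this README.

```js
// Hypothetical route wiring for /api/chat (sketch only)
import express from 'express';
import jwt from 'jsonwebtoken';

const app = express();
app.use(express.json());

// authenticate: attach req.user when a valid JWT is supplied, otherwise fall through as Guest
function authenticate(req, res, next) {
  const header = req.headers.authorization || '';
  const token = header.startsWith('Bearer ') ? header.slice(7) : null;
  if (token) {
    try {
      req.user = jwt.verify(token, process.env.JWT_SECRET);
    } catch {
      req.user = null; // invalid or expired token -> treated as Guest
    }
  }
  next();
}

app.post('/api/chat', authenticate, rateLimiter, async (req, res) => {
  const reply = await askModel(req.body.message); // see the AI SDK sketch above
  res.json({ success: true, message: reply, remaining_requests: res.locals.remaining });
});
```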
## Project Structure

```
chatbot-throttle/
│
├─ assets/          # System design images
├─ config.js        # Environment variables and defaults
├─ constants.js     # Rate limits and time window constants
├─ server.js        # Express server & API endpoints
├─ users.json       # Mock user database
├─ package.json     # Project metadata and scripts
├─ README.md        # Documentation
├─ node_modules/
└─ pnpm-lock.yaml
```
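For reference, the mock `users.json` database might look roughly like the following. The field names (`id`, `plan`, etc.) are assumptions based on the login flow and user types described in this README, not copied from the repo:

```json
[
  { "id": 1, "username": "fahimahammed", "password": "fahim123", "plan": "free" },
  { "id": 2, "username": "firoz", "password": "firoz123", "plan": "premium" }
]
```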
## Setup Instructions

1. Clone the repository:

   ```bash
   git clone https://github.com/fahimahammed/chatbot-throttle.git
   cd chatbot-throttle
   ```

2. Install dependencies:

   ```bash
   pnpm install
   ```

3. Create a `.env` file (these values are read through `config.js`; see the sketch after these steps):

   ```env
   PORT=3000
   JWT_SECRET=supersecret
   JWT_EXPIRES_IN=1h
   OPENAI_API_KEY=your_openai_api_key
   ```

4. Run in development mode:

   ```bash
   pnpm dev
   ```

The server will start on http://localhost:3000.
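`config.js` centralizes these environment variables and their defaults. A minimal sketch, assuming ESM and `dotenv` (the real file may expose different names and defaults):

```js
// config.js (sketch; names and defaults are assumptions)
import 'dotenv/config'; // load .env into process.env

export default {
  port: process.env.PORT || 3000,
  jwtSecret: process.env.JWT_SECRET,
  jwtExpiresIn: process.env.JWT_EXPIRES_IN || '1h',
  openaiApiKey: process.env.OPENAI_API_KEY, // required for AI calls
};
```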
## API Endpoints

### 1. Login

`POST /api/login`

Description: Authenticates a user against the mock `users.json` data and returns a JWT token.

Request Body:

```json
{
  "username": "fahimahammed",
  "password": "fahim123"
}
```

Response (Success):

```json
{
  "success": true,
  "message": "User logged in successfully!",
  "token": "**********************************"
}
```

Response (Failure):

```json
{
  "success": false,
  "message": "Invalid username or password!",
  "error": "Invalid login request"
}
```

### 2. Chat (Rate-limited)
`POST /api/chat`

Headers:

```
Authorization: Bearer <jwt_token>
```

If the header is missing, the request is treated as a Guest.

Request Body:

```json
{
  "message": "I want to learn rate limiting. Give me a roadmap for this."
}
```

Response (Success):

```json
{
  "success": true,
  "message": "AI response here...",
  "remaining_requests": 7
}
```

Response (Rate Limit Exceeded):

```json
{
  "success": false,
  "error": "Too many requests. Free users can make 10 requests per hour.",
  "remaining_requests": 0
}
```

### 3. Check Status
`GET /api/status`

Headers:

```
Authorization: Bearer <jwt_token>
```

If the header is missing, the request is treated as a Guest.

Response:

```json
{
  "success": true,
  "message": "You are a free user & your request limit is 10 AI questions per hour.",
  "remaining_requests": 7
}
```

## Rate Limiter Implementation
- Fixed Window: requests are tracked in a 1-hour window.
- Tracking:
  - Logged-in users: by `user.id`
  - Guests: by `req.ip`
- In-memory store:

  ```js
  {
    key: {
      count: number,
      windowStart: timestamp
    }
  }
  ```

- Behavior (illustrated by the sketch below):
  - Increment `count` for requests within the window
  - Reset `count` and `windowStart` after 1 hour
  - Return 429 when the limit is exceeded
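Putting this together, a minimal fixed-window `rateLimiter` middleware could look like the sketch below. This is an illustration of the algorithm described above, not a copy of the project's `server.js`; the `constants.js` import follows the earlier sketch.

```js
// Fixed-window rate limiter sketch (in-memory; see Notes for production caveats)
import { RATE_LIMITS, WINDOW_MS } from './constants.js'; // assumed exports

const usage = new Map(); // key -> { count, windowStart }

export function rateLimiter(req, res, next) {
  const plan = req.user ? req.user.plan : 'guest';               // e.g. 'free' | 'premium' | 'guest'
  const key = req.user ? `user:${req.user.id}` : `ip:${req.ip}`; // track by user ID or IP
  const limit = RATE_LIMITS[plan.toUpperCase()] ?? RATE_LIMITS.GUEST;

  const now = Date.now();
  let entry = usage.get(key);

  // Start a new window if none exists or the previous one has expired
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    entry = { count: 0, windowStart: now };
    usage.set(key, entry);
  }

  if (entry.count >= limit) {
    return res.status(429).json({
      success: false,
      error: `Too many requests. ${plan} users can make ${limit} requests per hour.`,
      remaining_requests: 0,
    });
  }

  entry.count += 1;
  res.locals.remaining = limit - entry.count; // handed to the route handler
  next();
}
```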
## Testing Rate Limiting

1. Log in as a user, or stay a guest (no token).
2. Make multiple `/api/chat` requests within an hour.
3. Check `/api/status` to see the remaining requests.
4. Observe the 429 error once the limit is exceeded.
## Testing Rate Limiting (All User Types)

### 1. Guest User (No Authorization Header)

Limit: 3 requests per hour

```bash
# Guest request 1
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"Hello AI"}'

# Guest request 2
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"How are you?"}'

# Guest request 3
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"Tell me a joke"}'

# Guest request 4 → Exceeds limit
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"Another request"}'
```

Expected response for the 4th request:

```json
{
  "success": false,
  "error": "Too many requests. guest users can make 3 requests per hour.",
  "remaining_requests": 0
}
```

### 2. Free User (Logged-in)
Limit: 10 requests per hour

```bash
# Login as Free user (fahimahammed)
curl -X POST http://localhost:3000/api/login \
  -H "Content-Type: application/json" \
  -d '{"username":"fahimahammed","password":"fahim123"}'

# Copy the returned token
TOKEN=<paste_jwt_token_here>

# Make 11 chat requests
for i in {1..11}; do
  curl -X POST http://localhost:3000/api/chat \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"message":"Hello AI"}'
done
```

Expected response for the 11th request:

```json
{
  "success": false,
  "error": "Too many requests. free users can make 10 requests per hour.",
  "remaining_requests": 0
}
```

### 3. Premium User (Logged-in)
Limit: 50 requests per hour

```bash
# Login as Premium user (firoz)
curl -X POST http://localhost:3000/api/login \
  -H "Content-Type: application/json" \
  -d '{"username":"firoz","password":"firoz123"}'

# Copy the returned token
TOKEN=<paste_jwt_token_here>

# Make 51 chat requests
for i in {1..51}; do
  curl -X POST http://localhost:3000/api/chat \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"message":"Hello AI"}'
done
```

Expected response for the 51st request:

```json
{
  "success": false,
  "error": "Too many requests. premium users can make 50 requests per hour.",
  "remaining_requests": 0
}
```

### 4. Check Remaining Requests
You can check the remaining quota for any user type:

```bash
curl -X GET http://localhost:3000/api/status \
  -H "Authorization: Bearer $TOKEN"
```

Sample Response:

```json
{
  "success": true,
  "message": "You are a free user & your request limit is 10 AI questions per hour.",
  "remaining_requests": 7
}
```

## Dependencies
- `express` – Web server
- `body-parser` – Parse JSON request bodies
- `jsonwebtoken` – JWT authentication
- `dotenv` – Environment variable management
- `@ai-sdk/openai` + `ai` – Vercel AI SDK integration

Dev Dependencies:

- `nodemon` – Auto-reload for development
## Notes

- In-memory storage resets on server restart. Use Redis or a database for production (see the sketch below).
- Mock users are for testing purposes only.
- `OPENAI_API_KEY` must be set in `.env` for AI calls.
- Rate limits can be adjusted in `constants.js`.
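If you move the counter to Redis, each key can be an atomic counter with a TTL equal to the window, so limits survive restarts and work across multiple instances. A rough sketch with the official `redis` v4 client; this is not part of the repo, and the key naming is an assumption:

```js
// Redis-backed fixed-window counter (sketch; assumes `npm i redis` and a running Redis server)
import { createClient } from 'redis';

const redis = createClient(); // defaults to redis://localhost:6379
await redis.connect();

export async function consumeQuota(key, limit, windowSeconds = 3600) {
  const redisKey = `ratelimit:${key}`;
  const count = await redis.incr(redisKey);      // atomic increment per user/IP key
  if (count === 1) {
    await redis.expire(redisKey, windowSeconds); // start the 1-hour window on the first hit
  }
  return { allowed: count <= limit, remaining: Math.max(limit - count, 0) };
}
```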