# Token Guard
Stop your AI agent from burning through your wallet. Zero-config proxy for OpenClaw, Nanobot, and any LLM client.
Token Guard is a zero-config LLM API proxy that protects your budget from runaway agents. Works seamlessly with OpenClaw, Nanobot, and any OpenAI-compatible client—no code changes required.
## 😱 The Problem
| Scenario | What Happens | Cost |
|---|---|---|
| Infinite loop | Agent retries the same failed step 50x | 💸 Hundreds of dollars |
| Duplicate calls | Tool returns same result, agent calls LLM again | 💸 Wasted tokens |
| No guardrails | Single task runs until you notice | 💸 Surprise bill |
OpenClaw, Nanobot, and similar agent frameworks are powerful—but when they get stuck, they can drain your API budget in minutes. Token Guard stops that.
## ✨ What Token Guard Does
| Feature | Description |
|---|---|
| 🔄 Auto Deduplication | Identical requests within a time window return cached response instantly. No double-counting. |
| 🛑 Token Limit + Circuit Breaker | Hit your limit? Service pauses. Resume when you're ready. No surprise overages. |
| 🧮 Pre-calculation | Local tokenization (tiktoken) counts input tokens before a request is forwarded. If a request would exceed your limit, it is blocked before it ever reaches the API, saving that request's prompt and completion tokens instead of counting usage only after the response arrives. |
| 📊 Granular Limits | Per-platform, per-model, or per-task limits. |
| ⚙️ Zero Config | Set HTTP_PROXY and go. No SDK changes, no wrapper imports. |
| 🖥️ Web Admin | Real-time usage, pause/resume, adjust limits—all from a browser. |
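The pre-calculation feature above boils down to a gate: count tokens locally, compare against the remaining budget, and reject before any network call. Here is a minimal sketch of that idea (the `TokenBudget` class and `count_tokens` callable are illustrative, not Token Guard's internals; Token Guard uses tiktoken for the actual counting):

```python
class TokenBudget:
    """Tracks cumulative token usage against a hard limit."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def precheck(self, prompt, count_tokens):
        """Return True if the prompt fits the remaining budget.

        A rejected request is blocked *before* it reaches the API,
        so it costs zero prompt and zero completion tokens.
        """
        needed = count_tokens(prompt)
        if self.used + needed > self.limit:
            return False  # over budget: caller should pause, not retry
        self.used += needed
        return True


# Toy whitespace counter standing in for tiktoken's encoder.
naive_count = lambda text: len(text.split())

budget = TokenBudget(limit=5)
print(budget.precheck("hello world", naive_count))         # True: 2 of 5 used
print(budget.precheck("one two three four", naive_count))  # False: would exceed 5
```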
## 🚀 Quick Start

### Install
```bash
git clone https://github.com/LoveFishoO/TokenGuard.git
cd tokenguard
pip install -e .
```

### Start TokenGuard
```bash
# intercept mode (default)
tokenguard start

# proxy mode: first set `mode: proxy` in ~/.tokenguard/config.yaml
tokenguard start
```

### Point Your App
```bash
# intercept mode
bash merge_cert.sh
export SSL_CERT_FILE=$MERGED_CERT
export REQUESTS_CA_BUNDLE=$MERGED_CERT
export HTTP_PROXY=http://127.0.0.1:8080
export HTTPS_PROXY=http://127.0.0.1:8080
```
For proxy mode, change your client's base URL to point at Token Guard. Example:

```
from: https://dashscope.aliyuncs.com/compatible-mode/v1
to:   http://localhost:8080/dashscope.aliyuncs.com/compatible-mode/v1
```

### Run OpenClaw / Nanobot / Any Client
```bash
# Your agent runs as usual—Token Guard intercepts traffic transparently
openclaw gateway
nanobot gateway
...
```

That's it. Token Guard sits between your app and the LLM API. No code changes.
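The proxy-mode base URL rewrite shown above is mechanical: drop the scheme and prefix the upstream host with the Token Guard address. A hypothetical helper (not part of Token Guard) that performs the mapping:

```python
def to_tokenguard_url(upstream_base_url: str,
                      proxy: str = "http://localhost:8080") -> str:
    """Rewrite an OpenAI-compatible base URL to route through Token Guard.

    e.g. https://dashscope.aliyuncs.com/compatible-mode/v1
      -> http://localhost:8080/dashscope.aliyuncs.com/compatible-mode/v1
    """
    # Keep everything after the scheme, then prefix with the proxy address.
    host_and_path = upstream_base_url.split("://", 1)[-1]
    return f"{proxy}/{host_and_path}"


print(to_tokenguard_url("https://dashscope.aliyuncs.com/compatible-mode/v1"))
# http://localhost:8080/dashscope.aliyuncs.com/compatible-mode/v1
```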
## 📋 Two Modes
| Mode | Use Case | How |
|---|---|---|
| Intercept (default) | Transparent—set HTTP_PROXY, traffic flows through mitmproxy | `mode: intercept` |
| Proxy | Explicit—client sends requests to the Token Guard URL | `mode: proxy` |
## ⚙️ Configuration
Config lives at ~/.tokenguard/config.yaml (auto-created on first run).
```yaml
# Token limits (priority: platform+model > platform > model > default)
token_limits:
  default: 30000
  by_platform:
    dashscope: 50000
    openai: 20000
  by_model:
    qwen3.5-flash: 10000
  by_platform_model:
    dashscope:
      qwen3.5-flash: 1000
```
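The priority chain (platform+model > platform > model > default) works like a cascading lookup: the most specific match wins. A sketch of that resolution order using the values from the example config (the `resolve_limit` helper name is illustrative):

```python
TOKEN_LIMITS = {
    "default": 30000,
    "by_platform": {"dashscope": 50000, "openai": 20000},
    "by_model": {"qwen3.5-flash": 10000},
    "by_platform_model": {"dashscope": {"qwen3.5-flash": 1000}},
}

def resolve_limit(platform, model, limits=TOKEN_LIMITS):
    """Most specific match wins: platform+model > platform > model > default."""
    per_platform = limits["by_platform_model"].get(platform, {})
    if model in per_platform:
        return per_platform[model]
    if platform in limits["by_platform"]:
        return limits["by_platform"][platform]
    if model in limits["by_model"]:
        return limits["by_model"][model]
    return limits["default"]

print(resolve_limit("dashscope", "qwen3.5-flash"))  # 1000  (platform+model)
print(resolve_limit("openai", "gpt-4o"))            # 20000 (platform)
print(resolve_limit("other", "qwen3.5-flash"))      # 10000 (model)
print(resolve_limit("other", "other-model"))        # 30000 (default)
```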
```yaml
# Deduplication: same request within 60s = cached response
dedup:
  enabled: true
  window_seconds: 60
```
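The dedup behavior, where an identical request inside `window_seconds` returns the cached response instead of hitting the API, can be sketched with a hash-keyed cache (the `DedupCache` class is illustrative, not Token Guard's internals):

```python
import hashlib
import time

class DedupCache:
    """Returns a cached response for identical requests inside the window."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self._cache = {}  # request hash -> (timestamp, response)

    def _key(self, body: bytes) -> str:
        return hashlib.sha256(body).hexdigest()

    def get(self, body: bytes):
        entry = self._cache.get(self._key(body))
        if entry and time.monotonic() - entry[0] < self.window:
            return entry[1]  # cache hit: no tokens spent upstream
        return None

    def put(self, body: bytes, response):
        self._cache[self._key(body)] = (time.monotonic(), response)


cache = DedupCache(window_seconds=60)
body = b'{"model": "qwen3.5-flash", "messages": []}'
cache.put(body, {"usage": {"total_tokens": 42}})
print(cache.get(body) is not None)  # True within the window
```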
```yaml
# Supported platforms (add your own)
platforms:
  dashscope: https://dashscope.aliyuncs.com/compatible-mode/v1
  openai: https://api.openai.com/v1
  deepseek: https://api.deepseek.com/v1
  volcengine: https://ark.cn-beijing.volces.com/api/v3
  # ...
```

## 🖥️ Admin Panel
Open http://127.0.0.1:8081/admin in your browser:
- Real-time usage per platform/model
- Pause / Resume when limits are hit
- Adjust limits on the fly
- Live logs of token consumption
## 🛠️ CLI
```bash
tokenguard start                                # Start service
tokenguard stop                                 # Stop service
tokenguard status                               # Show usage and limits
tokenguard resume                               # Resume all paused channels
tokenguard resume dashscope qwen3.5-flash       # Resume a specific channel
tokenguard limit dashscope qwen3.5-flash 5000   # Set a limit
```

## 🤝 Contributing
PRs welcome! Open an issue for bugs and feature requests.
Protect your wallet. Run your agents with confidence.