# Token Guard
Stop your AI agent from burning through your wallet. Zero-config proxy for OpenClaw, Nanobot, and any LLM client.
Token Guard is a zero-config LLM API proxy that protects your budget from runaway agents. Works seamlessly with OpenClaw, Nanobot, and any OpenAI-compatible client—no code changes required.
## 😱 The Problem
| Scenario | What Happens | Cost |
|---|---|---|
| Infinite loop | Agent retries the same failed step 50x | 💸 Hundreds of dollars |
| Duplicate calls | Tool returns same result, agent calls LLM again | 💸 Wasted tokens |
| No guardrails | Single task runs until you notice | 💸 Surprise bill |
OpenClaw, Nanobot, and similar agent frameworks are powerful—but when they get stuck, they can drain your API budget in minutes. Token Guard stops that.
## ✨ What Token Guard Does
| Feature | Description |
|---|---|
| 🔄 Auto Deduplication | Identical requests within a time window return cached response instantly. No double-counting. |
| 🛑 Token Limit + Circuit Breaker | Hit your limit? Service pauses. Resume when you're ready. No surprise overages. |
| 🧮 Pre-calculation | Local tokenization (tiktoken) counts input tokens before a request is forwarded. If a request would exceed your limit, it is blocked before it ever reaches the API, saving that request's prompt and completion tokens instead of counting usage only after the response arrives. |
| 📊 Granular Limits | Per-platform, per-model, or per-task limits. |
| ⚙️ Zero Config | Set HTTP_PROXY and go. No SDK changes, no wrapper imports. |
| 🖥️ Web Admin | Real-time usage, pause/resume, adjust limits—all from a browser. |
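The pre-calculation feature above boils down to a gate: count tokens locally, compare against the remaining budget, and reject before any network call. Here is a minimal sketch of that idea (the `TokenBudget` class and `count_tokens` callable are illustrative, not Token Guard's internals; Token Guard uses tiktoken for the actual counting):

```python
class TokenBudget:
    """Tracks cumulative token usage against a hard limit."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def precheck(self, prompt, count_tokens):
        """Return True if the prompt fits the remaining budget.

        A rejected request is blocked *before* it reaches the API,
        so it costs zero prompt and zero completion tokens.
        """
        needed = count_tokens(prompt)
        if self.used + needed > self.limit:
            return False  # over budget: caller should pause, not retry
        self.used += needed
        return True


# Toy whitespace counter standing in for tiktoken's encoder.
naive_count = lambda text: len(text.split())

budget = TokenBudget(limit=5)
print(budget.precheck("hello world", naive_count))         # True: 2 of 5 used
print(budget.precheck("one two three four", naive_count))  # False: would exceed 5
```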
## 🚀 Quick Start

### Install
```bash
git clone https://github.com/LoveFishoO/TokenGuard.git
cd tokenguard
pip install -e .
```

### Start TokenGuard
```bash
# intercept mode (default)
tokenguard start

# proxy mode: first set `mode: proxy` in ~/.tokenguard/config.yaml
tokenguard start
```

### Point Your App
```bash
# intercept mode
bash merge_cert.sh
export SSL_CERT_FILE=$MERGED_CERT
export REQUESTS_CA_BUNDLE=$MERGED_CERT
export HTTP_PROXY=http://127.0.0.1:8080
export HTTPS_PROXY=http://127.0.0.1:8080
```
For proxy mode, change your client's base URL to point at Token Guard. Example:

```
from: https://dashscope.aliyuncs.com/compatible-mode/v1
to:   http://localhost:8080/dashscope.aliyuncs.com/compatible-mode/v1
```

### Run OpenClaw / Nanobot / Any Client
```bash
# Your agent runs as usual—Token Guard intercepts traffic transparently
openclaw gateway
nanobot gateway
...
```

That's it. Token Guard sits between your app and the LLM API. No code changes.
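The proxy-mode base URL rewrite shown above is mechanical: drop the scheme and prefix the upstream host with the Token Guard address. A hypothetical helper (not part of Token Guard) that performs the mapping:

```python
def to_tokenguard_url(upstream_base_url: str,
                      proxy: str = "http://localhost:8080") -> str:
    """Rewrite an OpenAI-compatible base URL to route through Token Guard.

    e.g. https://dashscope.aliyuncs.com/compatible-mode/v1
      -> http://localhost:8080/dashscope.aliyuncs.com/compatible-mode/v1
    """
    # Keep everything after the scheme, then prefix with the proxy address.
    host_and_path = upstream_base_url.split("://", 1)[-1]
    return f"{proxy}/{host_and_path}"


print(to_tokenguard_url("https://dashscope.aliyuncs.com/compatible-mode/v1"))
# http://localhost:8080/dashscope.aliyuncs.com/compatible-mode/v1
```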
## 📋 Two Modes
| Mode | Use Case | How |
|---|---|---|
| Intercept (default) | Transparent—set HTTP_PROXY, traffic flows through mitmproxy | `mode: intercept` |
| Proxy | Explicit—client sends requests to the Token Guard URL | `mode: proxy` |
## ⚙️ Configuration
Config lives at ~/.tokenguard/config.yaml (auto-created on first run).
```yaml
# Token limits (priority: platform+model > platform > model > default)
token_limits:
  default: 30000
  by_platform:
    dashscope: 50000
    openai: 20000
  by_model:
    qwen3.5-flash: 10000
  by_platform_model:
    dashscope:
      qwen3.5-flash: 1000
```
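The priority chain (platform+model > platform > model > default) works like a cascading lookup: the most specific match wins. A sketch of that resolution order using the values from the example config (the `resolve_limit` helper name is illustrative):

```python
TOKEN_LIMITS = {
    "default": 30000,
    "by_platform": {"dashscope": 50000, "openai": 20000},
    "by_model": {"qwen3.5-flash": 10000},
    "by_platform_model": {"dashscope": {"qwen3.5-flash": 1000}},
}

def resolve_limit(platform, model, limits=TOKEN_LIMITS):
    """Most specific match wins: platform+model > platform > model > default."""
    per_platform = limits["by_platform_model"].get(platform, {})
    if model in per_platform:
        return per_platform[model]
    if platform in limits["by_platform"]:
        return limits["by_platform"][platform]
    if model in limits["by_model"]:
        return limits["by_model"][model]
    return limits["default"]

print(resolve_limit("dashscope", "qwen3.5-flash"))  # 1000  (platform+model)
print(resolve_limit("openai", "gpt-4o"))            # 20000 (platform)
print(resolve_limit("other", "qwen3.5-flash"))      # 10000 (model)
print(resolve_limit("other", "other-model"))        # 30000 (default)
```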
```yaml
# Deduplication: same request within 60s = cached response
dedup:
  enabled: true
  window_seconds: 60
```
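The dedup behavior, where an identical request inside `window_seconds` returns the cached response instead of hitting the API, can be sketched with a hash-keyed cache (the `DedupCache` class is illustrative, not Token Guard's internals):

```python
import hashlib
import time

class DedupCache:
    """Returns a cached response for identical requests inside the window."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self._cache = {}  # request hash -> (timestamp, response)

    def _key(self, body: bytes) -> str:
        return hashlib.sha256(body).hexdigest()

    def get(self, body: bytes):
        entry = self._cache.get(self._key(body))
        if entry and time.monotonic() - entry[0] < self.window:
            return entry[1]  # cache hit: no tokens spent upstream
        return None

    def put(self, body: bytes, response):
        self._cache[self._key(body)] = (time.monotonic(), response)


cache = DedupCache(window_seconds=60)
body = b'{"model": "qwen3.5-flash", "messages": []}'
cache.put(body, {"usage": {"total_tokens": 42}})
print(cache.get(body) is not None)  # True within the window
```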
```yaml
# Supported platforms (add your own)
platforms:
  dashscope: https://dashscope.aliyuncs.com/compatible-mode/v1
  openai: https://api.openai.com/v1
  deepseek: https://api.deepseek.com/v1
  volcengine: https://ark.cn-beijing.volces.com/api/v3
  # ...
```

## 🖥️ Admin Panel
Open http://127.0.0.1:8081/admin in your browser:
- Real-time usage per platform/model
- Pause / Resume when limits are hit
- Adjust limits on the fly
- Live logs of token consumption
## 🛠️ CLI
```bash
tokenguard start                                # Start service
tokenguard stop                                 # Stop service
tokenguard status                               # Show usage and limits
tokenguard resume                               # Resume all paused channels
tokenguard resume dashscope qwen3.5-flash       # Resume a specific channel
tokenguard limit dashscope qwen3.5-flash 5000   # Set a limit
```

## 🤝 Contributing
PRs welcome! Open an issue for bugs and feature requests.
Protect your wallet. Run your agents with confidence.