# RustyGPT

A ChatGPT-inspired app in Rust!
RustyGPT is a workspace of Rust crates that together provide a chat assistant server, a Yew web UI, and a command line interface. The project focuses on end-to-end Rust implementations for authentication, threaded conversations, Server-Sent Event (SSE) streaming, and local LLM execution through a pluggable llama.cpp provider.
## Workspace layout
| Crate | Purpose |
|---|---|
| `rustygpt-server` | Axum HTTP server with authentication, rate limiting, SSE streaming, and OpenAPI documentation. |
| `rustygpt-web` | Yew single-page application that consumes the server APIs and renders threaded conversations. |
| `rustygpt-cli` | Command line client for logging in, inspecting conversations, following SSE streams, and running the server locally. |
| `rustygpt-shared` | Shared models, configuration loader, and llama.cpp integration code reused by all binaries. |
| `rustygpt-doc-indexer` | Helper used by the docs build to generate the machine-readable index. |
| `rustygpt-tools/confuse` | Development helper that runs frontend/backend watchers via the `just dev` recipe. |
Other notable directories include `scripts/pg` for schema/procedure SQL, `deploy/grafana` for metrics dashboards, and `docs` for the mdBook documentation.
## Capabilities
- **Threaded conversations** – `/api/conversations` and `/api/threads` endpoints manage conversation membership, invites, roots, and replies (`rustygpt-server/src/handlers/{conversations,threads}.rs`).
- **Streaming updates** – `conversation_stream` in `handlers/streaming.rs` broadcasts `ConversationStreamEvent` values over SSE at `/api/stream/conversations/:conversation_id`, with optional PostgreSQL persistence configured through `[sse.persistence]`.
- **Authentication** – cookie-backed sessions, refresh, and logout flows (see `handlers/auth.rs`), plus optional GitHub or Apple OAuth handlers when the relevant environment variables are present. First-time setup uses `/api/setup` to create the initial administrator (`handlers/setup.rs`).
- **Rate limiting** – `middleware::rate_limit` enforces per-route buckets populated from the database using stored procedures in `scripts/pg/procs/034_limits.sql`. Admin APIs under `/api/admin/limits/*` allow live updates when `rate_limits.admin_api_enabled` and `features.auth_v1` are enabled.
- **Local LLM inference** – `AssistantService` streams replies via llama.cpp models configured under `[llm]` in `config.toml`, with metrics such as `llm_model_cache_hits_total` and `llm_model_load_seconds`.
- **Observability** – Prometheus counters and gauges for health checks, bootstrap progress, rate limiting, and LLM usage, plus `/metrics`, `/healthz`, and `/readyz` endpoints. Grafana dashboards live in `deploy/grafana/`.
- **Typed configuration** – `rustygpt_shared::config::server::Config` loads layered TOML/YAML/JSON files with environment overrides (e.g. `RUSTYGPT__SERVER__PORT`). The template `config.example.toml` documents all sections.
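Because streaming updates use standard Server-Sent Events, any SSE-capable client can consume them. As an illustration of the wire format only (the event name and JSON payload below are placeholders, not the actual `ConversationStreamEvent` schema), here is a minimal frame parser in plain Rust:

```rust
/// Minimal SSE chunk parser: splits a raw event-stream chunk into
/// (event, data) pairs. A real client should use a streaming SSE library;
/// this sketch only illustrates the `event:`/`data:`/blank-line framing.
fn parse_sse(chunk: &str) -> Vec<(String, String)> {
    let mut events = Vec::new();
    let mut event = String::new();
    let mut data: Vec<String> = Vec::new();
    for line in chunk.lines() {
        if let Some(rest) = line.strip_prefix("event:") {
            event = rest.trim().to_string();
        } else if let Some(rest) = line.strip_prefix("data:") {
            data.push(rest.trim().to_string());
        } else if line.is_empty() && !data.is_empty() {
            // A blank line terminates the frame; multi-line data joins with '\n'.
            events.push((event.clone(), data.join("\n")));
            event.clear();
            data.clear();
        }
    }
    events
}

fn main() {
    // Hypothetical frame; the real payload shape is defined by the server.
    let chunk = "event: message\ndata: {\"conversation_id\":\"abc\"}\n\n";
    for (event, data) in parse_sse(chunk) {
        println!("{event}: {data}");
    }
}
```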
## Quick start
1. **Install prerequisites**

   - Rust 1.81+ (`rustup default stable`)
   - `just`, `cargo-watch`, and `trunk`
   - PostgreSQL 15+ (local install or Docker)
   - Optional: llama.cpp-compatible model files for streaming replies

2. **Create a configuration file**

   ```shell
   cp config.example.toml config.toml
   ```

   Adjust values as needed. For a full local experience set:

   ```toml
   [features]
   auth_v1 = true
   sse_v1 = true
   well_known = true
   ```

   Ensure `[db].url` points to your PostgreSQL instance and that the database already exists.

3. **Start PostgreSQL**

   You can use the provided Compose service:

   ```shell
   docker compose up postgres -d
   ```

   The server automatically runs the bootstrap SQL in `scripts/pg` on startup.

4. **Run the backend**

   ```shell
   just run-server
   ```

   The process listens on `http://127.0.0.1:8080` by default.

5. **Perform first-time setup**

   POST to `/api/setup` once to create the initial admin account:

   ```shell
   curl -X POST http://127.0.0.1:8080/api/setup \
     -H 'Content-Type: application/json' \
     -d '{"username":"admin","email":"admin@example.com","password":"change-me"}'
   ```

6. **Run the web client**

   ```shell
   just web-serve
   ```

   The SPA proxies API requests to the backend and renders conversations, presence, and streaming updates.

7. **Use the CLI**

   ```shell
   just cli login
   just cli chat --conversation <uuid>
   just cli follow --root <thread-uuid>
   ```

   Commands reuse the same configuration loader and session cookies as the server. See `rustygpt-cli/src/main.rs` for the full list of subcommands (`serve`, `chat`, `reply`, `follow`, `spec`, `completion`, `config`, `login`, `me`, `logout`).
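The configuration pieces touched in the steps above can be collected into one minimal `config.toml` sketch. Only the `[features]` flags, the `[db].url` key, and the default port come from this README; the `[server]` table name and the connection string are illustrative assumptions, and `config.example.toml` remains the authoritative template:

```toml
# Minimal local config.toml sketch (key names beyond [features] and [db].url
# are assumptions; see config.example.toml for the documented sections).

[server]
port = 8080   # overridable via RUSTYGPT__SERVER__PORT

[db]
url = "postgres://rustygpt:change-me@127.0.0.1:5432/rustygpt"  # placeholder credentials

[features]
auth_v1 = true
sse_v1 = true
well_known = true
```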
## Observability
Metrics are exposed at `/metrics` after calling `server::metrics_handle()`. Key instruments include:
| Metric | Description |
|---|---|
| `health_checks_total{endpoint,status}` | Count of `/healthz` and `/readyz` responses. |
| `db_bootstrap_batches_total{stage,status}` / `db_bootstrap_script_duration_seconds{stage,status}` | Bootstrap progress per SQL stage (schema, procedures, indexes, seed). |
| `db_liveness_checks_total{status}` / `db_readiness_checks_total{status}` | Database readiness probes. |
| `db_pool_max_connections`, `db_statement_timeout_ms` | Gauges reflecting the active configuration. |
| `http_rate_limit_requests_total{profile,result}` | Requests allowed or denied by the rate limit middleware. |
| `http_rate_limit_remaining{profile}` / `http_rate_limit_reset_seconds{profile}` | Current token state per bucket. |
| `rustygpt_limits_profiles`, `rustygpt_limits_assignments` | Gauges updated when admin routes reload configuration. |
| `llm_model_cache_hits_total{provider,model}` / `llm_model_load_seconds{provider,model}` | llama.cpp model cache activity. |
Import the Grafana dashboards in `deploy/grafana/*.json` to visualise these metrics.
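To have Prometheus collect these metrics, a minimal scrape job might look like the following sketch (the job name, interval, and target address are placeholders; the server serves metrics at the Prometheus-default `/metrics` path):

```yaml
# prometheus.yml fragment (illustrative; adjust the target to your deployment)
scrape_configs:
  - job_name: "rustygpt"
    scrape_interval: 15s
    static_configs:
      - targets: ["127.0.0.1:8080"]   # RustyGPT server's HTTP port
```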
## Documentation
The mdBook at `docs/` covers architecture, API reference, configuration keys, and operational guides. Run `just docs-serve` to preview it locally or browse the published version via GitHub Pages.
## Contributing
Contributions are welcome! Please review `CONTRIBUTING.md` and the code of conduct before opening a pull request. Run `just check` and `just test` prior to submitting changes. Security concerns should be reported via the `SECURITY.md` process.
## License
RustyGPT is available under the Apache 2.0 license. See LICENSE for details.