theagenticguy/code-context-agent
AI-powered CLI tool for automated codebase analysis and context generation
AI-powered codebase analysis with 48 tools, graph algorithms, and structured output for AI coding assistants.
code-context-agent
code-context-agent uses Claude Opus 4.6 (via Amazon Bedrock) with 48 tools to analyze unfamiliar codebases and produce structured context documentation for AI coding assistants. It combines semantic analysis (LSP), structural pattern matching (ast-grep), graph algorithms (NetworkX), git history analysis, and intelligent code bundling (repomix) to generate narrated markdown that helps developers and AI assistants quickly understand a codebase's architecture and business logic.
Caution
This CLI runs a fully autonomous AI agent loop. The agent decides which tools to invoke, what files to read, and what shell commands to run. While shell commands are restricted to a read-only allowlist and all inputs are validated, the agent makes its own decisions within those bounds. Review all generated output before using it in production.
Important
Generative AI can make mistakes. You should review all output and monitor costs generated by your chosen AI model. Analysis of a single repository typically consumes 50K-500K input tokens and 10K-50K output tokens on Claude Opus 4.6. See AWS Responsible AI Policy.
Note
Disclaimer: The author is an AWS employee. This is not an official AWS project or service. It is not maintained, supported, or endorsed by AWS. This project runs fully autonomous agent loops with access to your filesystem (read-only). You are solely responsible for any consequences of running this tool. The CLI and source code are provided AS IS without warranty of any kind. User discretion advised.
License
This project is licensed under the Apache License 2.0.
Tenets
These principles guide every design decision. See tenets.md for full details with tie-breakers.
- Measure, don't guess -- Rank code by graph metrics (centrality, PageRank, coupling), not by filename or directory structure
- Layer signals, read less -- Combine 5 signal types (AST, call graphs, git history, signatures, commit messages) across all files rather than reading a few files deeply
- Compress aggressively, expand selectively -- Start with the most compressed representation; only expand to full source for code that earns it through high scores
- The model picks the depth -- Analysis depth scales with codebase complexity automatically; no user-facing depth knobs
- Machines read it first -- Output is optimized for AI consumption: typed schemas, ranked tables, bounded diagrams over prose
- Fail loud, fill gaps -- When a signal source is unavailable, surface it explicitly and compensate with remaining signals
Features
- Claude Opus 4.6 with adaptive thinking and 1M context window
- 48 analysis tools: LSP, ast-grep, ripgrep, repomix, git history, NetworkX graph
- Multi-language LSP: Python (ty), TypeScript, Rust, Go, Java (configurable fallback chains)
- Graph-based insights: Hotspots (betweenness centrality), foundations (PageRank/TrustRank), modules (Louvain/Leiden), triangle detection
- Git-aware bundling: Embeds diffs, commit history, and coupling data directly in context bundles
- Tree-sitter compression: Extract signatures/types only, stripping function bodies for token efficiency
- Structured output: Pydantic-typed
AnalysisResultwith ranked business logic, risks, and graph stats --fullmode: Exhaustive analysis with no size limits, fail-fast error handling, and per-module output- Phase-aware TUI: 10-phase progress tracking with discovery feed and mode badge
- Rich terminal UI: Real-time progress display with Rich library
- MCP server: Expose graph algorithms and analysis as MCP tools for Claude Code, Cursor, and other agents
- context7 integration: Library documentation lookup during analysis via MCP
- Security hardened: Shell allowlist, input validation on all tool parameters, path traversal prevention, no network access from agent tools
Architecture
flowchart TD
A[CLI: cyclopts] --> B[run_analysis]
B --> C[create_agent]
C --> D[Strands Agent<br/>Opus 4.6 + adaptive thinking]
D --> E[Jinja2 System Prompt]
D --> F[HookProviders<br/>quality + efficiency + fail-fast]
D --> G[AnalysisResult<br/>structured output]
D --> H[Tool Execution]
H --> I[Discovery<br/>ripgrep, repomix]
H --> J[LSP<br/>ty, ts-server, rust-analyzer]
H --> K[AST<br/>ast-grep patterns]
H --> L[Graph<br/>NetworkX analysis]
H --> M[Git<br/>coupling, churn, blame]
H --> N[Shell<br/>bounded execution]
H --> O[Output Files<br/>.code-context/ directory]
H --> P[context7 MCP<br/>library docs]
D -.-> Q[FastMCP Server<br/>MCP protocol]
Q --> R[Claude Code / Cursor<br/>MCP clients]
Tool Categories
| Category | Tools | Purpose |
|---|---|---|
| Discovery | create_file_manifest, repomix_orientation, repomix_bundle, repomix_compressed_signatures, repomix_split_bundle, repomix_json_export |
File inventory, bundling, token-aware orientation |
| Search | rg_search, read_file_bounded |
Text search and bounded file reading |
| LSP | lsp_start, lsp_document_symbols, lsp_references, lsp_definition, lsp_hover, lsp_workspace_symbols, lsp_diagnostics |
Semantic analysis across multiple languages |
| AST | astgrep_scan, astgrep_scan_rule_pack, astgrep_inline_rule |
Structural pattern matching |
| Graph | code_graph_create, code_graph_analyze (hotspots, foundations, trust, modules, triangles, coupling), code_graph_explore, code_graph_export |
Dependency and structural analysis |
| Git | git_hotspots, git_files_changed_together, git_blame_summary, git_file_history, git_contributors, git_recent_commits, git_diff_file |
Temporal analysis and coupling detection |
| Shell | shell |
Read-only command execution (allowlisted programs only) |
Security Model
The agent operates under a defense-in-depth security model:
- Shell allowlist: Only read-only programs (
ls,git log,wc,head,grep, etc.) are permitted. Destructive commands (rm,mv,curl,wget,docker, etc.), shell interpreters (bash,sh), and network tools are blocked. - Git read-only: Git commands are restricted to read-only subcommands (
log,diff,blame,status, etc.). Write operations (push,commit,reset,checkout) are blocked. - Shell operator blocking: Command chaining (
;,&&,||,|), redirects (>,>>), backticks,$(), and${}expansion are all blocked. - Input validation: All tool parameters (
repo_path,file_path,glob,pattern) are validated before use. Path traversal to sensitive directories (/etc,/root,/proc,/sys) is prevented. - No
sh -cstring interpolation: All subprocess calls use list-based execution or stdin piping. No user-controlled values are interpolated into shell command strings. - Pinned dependencies: npm packages invoked at runtime (
context7,jscpd) are pinned to major versions.
Prerequisites
1. Python Environment
- Python 3.13+ (required)
- uv (Astral's fast package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh2. AWS Configuration
Requires AWS credentials configured for Amazon Bedrock access:
aws configure
# or set environment variables
export AWS_PROFILE=your-profile
export AWS_REGION=us-east-1Default model: global.anthropic.claude-opus-4-6-v1 (configurable via CODE_CONTEXT_MODEL_ID)
3. External CLI Tools
| Tool | Installation | Purpose |
|---|---|---|
| ripgrep | cargo install ripgrep |
File search and manifest creation |
| ast-grep | cargo install ast-grep |
Structural code search |
| repomix | npm install -g repomix |
Code bundling with Tree-sitter compression |
| typescript-language-server | npm install -g typescript-language-server |
TypeScript/JavaScript LSP |
| ty | uv tool install ty |
Python type checker/LSP server |
Installation
# Install from package
uv tool install code-context-agent
# Or development setup
git clone https://github.com/theagenticguy/code-context-agent.git
cd code-context-agent
uv sync --all-groups
uv run code-context-agentUsage
Analyze a Codebase
# Analyze current directory
code-context-agent analyze .
# Analyze specific repository
code-context-agent analyze /path/to/repo
# Focus on specific area
code-context-agent analyze . --focus "authentication system"
# Issue-focused analysis
code-context-agent analyze . --issue "gh:1694"
# Custom output directory
code-context-agent analyze . --output-dir ./analysis
# Quiet mode
code-context-agent analyze . --quiet
# Debug mode
code-context-agent analyze . --debugThe agent automatically determines analysis depth based on repository size and complexity. Use --full for exhaustive analysis.
Full Mode
# Exhaustive analysis (no size limits, all graph algorithms, fail-fast)
code-context-agent analyze . --full
# Full + focused on specific area
code-context-agent analyze . --full --focus "authentication"
# Verify external tool dependencies
code-context-agent checkMCP Server
Expose the analysis capabilities to coding agents (Claude Code, Cursor, etc.) via the Model Context Protocol:
# stdio transport (for Claude Desktop, Claude Code)
code-context-agent serve
# HTTP transport (for networked/multi-client access)
code-context-agent serve --transport http --port 8000The MCP server exposes the core differentiators — graph algorithms, progressive exploration, and the full analysis pipeline — as tools that any MCP client can use. Commodity tools (ripgrep, LSP, git) are intentionally not exposed since they're already available in every coding agent.
MCP Tools:
start_analysis/check_analysis— kickoff/poll for the full analysis pipelinequery_code_graph— run algorithms (PageRank, betweenness centrality, Louvain community detection, etc.)explore_code_graph— progressive drill-down into graph structureget_graph_stats— graph composition summary
Visualize Results
# Launch interactive web visualization
code-context-agent viz .Output Files
All outputs are written to .code-context/ (or custom --output-dir):
| File | Description |
|---|---|
CONTEXT.md |
Main narrated context (≤300 lines in standard mode) |
CONTEXT.orientation.md |
Token distribution tree |
CONTEXT.bundle.md |
Bundled source code (compressed) |
CONTEXT.signatures.md |
Signatures-only structural view |
files.all.txt |
Complete file manifest |
files.business.txt |
Curated business logic files |
code_graph.json |
Persisted graph data |
FILE_INDEX.md |
File index with graph metrics (complex repos) |
analysis_result.json |
Structured analysis result (Pydantic JSON) |
CONTEXT.modules/ |
Per-module context files (--full mode) |
Configuration
All configuration uses the CODE_CONTEXT_ prefix:
| Variable | Default | Description |
|---|---|---|
CODE_CONTEXT_MODEL_ID |
global.anthropic.claude-opus-4-6-v1 |
Bedrock model ID |
CODE_CONTEXT_REGION |
us-east-1 |
AWS region |
CODE_CONTEXT_TEMPERATURE |
1.0 |
Model temperature (must be 1.0 for thinking) |
CODE_CONTEXT_LSP_SERVERS |
{"ts": "typescript-language-server --stdio", "py": "ty server", ...} |
LSP server registry (JSON) |
CODE_CONTEXT_AGENT_MAX_TURNS |
1000 |
Max agent turns |
CODE_CONTEXT_AGENT_MAX_DURATION |
1200 |
Timeout in seconds (default: 20 min) |
CODE_CONTEXT_CONTEXT7_ENABLED |
true |
Enable context7 MCP for library doc lookup |
Development
| Task | Command |
|---|---|
| Install dependencies | uv sync --all-groups |
| Run CLI | uv run code-context-agent |
| Lint | uvx ruff check src/ |
| Format | uvx ruff format src/ |
| Type check | uvx ty check src/ |
| Test | uv run pytest |
| Commit (conventional) | uv run cz commit |
| Bump version | uv run cz bump |
Project Structure
src/code_context_agent/
├── cli.py # CLI entry point (cyclopts)
├── config.py # Configuration (pydantic-settings)
├── agent/ # Agent orchestration
│ ├── factory.py # Agent creation with tools + MCP providers
│ ├── runner.py # Analysis runner with event streaming
│ ├── prompts.py # Jinja2 template rendering
│ └── hooks.py # HookProviders: quality, efficiency, fail-fast
├── mcp/ # FastMCP v3 server
│ ├── __init__.py # Package init
│ └── server.py # MCP tools, resources, and server definition
├── templates/ # Jinja2 prompt templates
│ ├── system.md.j2 # Unified system prompt
│ ├── partials/ # Composable prompt sections
│ └── steering/ # Quality guidance fragments
├── models/ # Pydantic models
│ ├── base.py # StrictModel, FrozenModel
│ └── output.py # AnalysisResult, BusinessLogicItem, etc.
├── consumer/ # Phase-aware TUI (10 phases + discovery feed)
│ ├── phases.py # Phase detection, discovery events
│ ├── rich_consumer.py # Dashboard with phase indicator
│ └── state.py # Mutable display state
├── tools/ # Analysis tools (48)
│ ├── discovery.py # ripgrep, repomix, write_file (11 tools)
│ ├── astgrep.py # ast-grep (3 tools)
│ ├── git.py # git history (7 tools)
│ ├── lsp/ # LSP integration (8 tools)
│ └── graph/ # NetworkX analysis (14 tools)
└── rules/ # ast-grep rule packs
Contributing
Contributions are welcome. Please open an issue or pull request.
- All commits must follow Conventional Commits format
- Pre-commit hooks enforce lint, format, type check, and secret scanning
- Pre-push hooks run the full test suite
Related Projects
- strands-agents — Agent framework
- ast-grep — Structural code search
- repomix — Code bundling with Tree-sitter
- ty — Python type checker/LSP server
- NetworkX -- Graph algorithms
- FastMCP -- Model Context Protocol server framework
License
Copyright 2025 Laith Al-Saadoon
Licensed under the Apache License, Version 2.0. See LICENSE for the full text.
