Hammy

Codebase Intelligence Engine — deep structural and historical context for AI coding agents

Hammy gives AI coding agents a high-fidelity map of your codebase. It parses source code into a queryable call graph, tracks version control history, and exposes everything through an MCP server — so agents spend less time guessing and more time doing.

Best on: legacy monoliths, team codebases, anything where "just grep it" stops working.

Why Hammy?

Most coding agents navigate by reading files and hoping for the best. Hammy gives them:

A call graph — not just "this function exists" but "these 12 things call it, and here's what it calls"
Risk signals — which code is high-churn AND heavily depended on (the landmines)
Blast radius — before changing anything, know exactly what breaks
Orientation — understand a 200-file module in one call instead of reading every file

Features

Multi-language AST parsing — PHP, JavaScript, TypeScript, Python, Go (Tree-sitter based)
Full call expression indexing — stores $this->resolve(PaymentService::class) not just resolve, enabling argument-level filtering
Cross-language bridge detection — links fetch('/api/users') in JS to the backend handler
Semantic + hybrid search — dense vector search (Qdrant) + BM25, merged via Reciprocal Rank Fusion
Structural search — filter by visibility, async, param count, return type, complexity score
Impact analysis — N-hop call graph traversal: know the full blast radius before touching anything
PR diff analysis — parse a unified diff (or auto-diff uncommitted changes), get HIGH/MED/LOW risk per changed symbol
Hotspot scoring — log(callers) × log(churn): surfaces code that's both heavily depended on AND frequently modified
LLM enrichment — auto-generate plain-English summaries for every indexed function and class
Brain / memory layer — agents store and recall research findings across sessions (invaluable on large codebases)
Watch mode — incremental re-indexing on file change
VCS integration — Git log, blame, churn; Mercurial scaffolded
MCP server — all tools available to any MCP client (Cursor, Claude Desktop, VS Code, etc.)
Smart ignore — four-layer filtering: defaults → .gitignore → .hgignore → .hammyignore

Installation

Requires uv and Docker (for Qdrant).

git clone https://github.com/yourusername/hammy.git
cd hammy
uv sync --extra dev

# Start Qdrant
docker compose up -d

# Install as a CLI tool
uv tool install --editable .

Quick Start

# Point Hammy at your project
hammy init /path/to/your/project

# Index the codebase
cd /path/to/your/project
hammy index

# Start the MCP server (connect from your IDE)
hammy serve

# Query with the AI agent directly
hammy query "Where is the payment processing logic and who calls it?"

# Watch for changes and re-index incrementally
hammy watch

MCP Tools

All tools are available via hammy serve to any MCP client. Grouped by what you're trying to do:

Orientation — understand unfamiliar code fast

`explain_symbol` ⭐

The single most useful tool. One call returns everything about a symbol: full definition (file, line, params, return type, visibility, async, LLM summary), direct callers, direct callees, sibling symbols in the same file, and recent commits. Replaces lookup_symbol + find_usages + impact_analysis + ast_query in a single round trip.

explain_symbol("PaymentService")
→ definition, 8 callers, 3 callees, 12 siblings, last 5 commits

`module_summary`

Orient yourself on a directory without opening a single file. Groups all symbols under a path into a structured table of contents — classes with nested methods first, then functions — across every file in the directory. Use this before diving into an unfamiliar module.

module_summary("app/Services/Payment/")
→ 3 files, 47 symbols, organized by class hierarchy

`ast_query`

Parse any file and see its full symbol tree: every class, method, function, endpoint, and import with line numbers, visibility, and LLM summaries. Filter by type (classes, functions, methods, endpoints, imports).

`list_files`

List every indexed file with its language. Good first call on an unfamiliar project to understand scope before searching.

Search — find what you're looking for

`search_symbols`

Keyword search over symbol names, ranked by match quality (exact → prefix → substring → summary). Use when you know roughly what you're looking for but not the exact name.

`lookup_symbol`

You know the exact name — get the full definition immediately: file, line range, params, return type, visibility, async flag, and LLM summary. Falls back to word-boundary match if no exact hit.

`lookup_symbols_batch`

Look up multiple symbols in one call. Pass a comma-separated list, get all definitions back at once. Eliminates the lookup_symbol loop after a search result.

lookup_symbols_batch("UserController, PaymentService, getRenew")
→ 3 full definitions in one call

`search_code_hybrid`

Combines BM25 (exact identifiers) with semantic embeddings (conceptual matches), merged via RRF. Use when your query mixes exact terms and concepts: "sendPersonalInvite email logic". Requires Qdrant.

`structural_search`

Find symbols by shape, not name. Useful for refactoring sprints and code reviews.

structural_search(node_type="method", visibility="public", min_params=4)
→ all public methods with 4+ parameters

structural_search(min_complexity=15, file_filter="Services/")
→ high-complexity methods in the services layer

Parameters: node_type, language, visibility, async_only, min_params, max_params, return_type, name_pattern, file_filter, min_complexity, limit

Call Graph — trace dependencies

`find_usages`

Find every call site for a function or method. Word-boundary matched so save won't match saveAll. Now with argument_filter to narrow by what's passed in — critical for dependency-injection heavy codebases.

find_usages("resolve", argument_filter="PaymentService")
→ only calls to resolve() that pass PaymentService, not the other 40

Parameters: symbol_name, file_filter, argument_filter

`impact_analysis`

"If I change this, what breaks?" Traverses the call graph N hops deep. Use direction="callers" before any refactor to map the full dependency chain. Use direction="callees" to see what a function depends on. Use direction="both" for the full neighbourhood.

impact_analysis("charge", depth=3, direction="callers")
→ everything downstream that will break

`find_bridges`

Finds cross-language endpoint connections — e.g. links a fetch('/api/v1/users') in React to the backend route handler. Useful when tracing frontend→backend flows.

Risk — know before you touch

`hotspot_score` ⭐

Mandatory pre-work before any significant change. Scores each symbol by log(callers) × log(churn). High score = heavily depended on AND frequently modified = highest risk. Near zero = safe to change. Run this before touching any unfamiliar code.

hotspot_score(file_filter="app/Services/", top_n=10)
→ ranked list of landmines in the services layer

`pr_diff`

"What's the risk of this PR?" Parses a diff, identifies every changed symbol, and rates each one LOW/MED/HIGH based on caller count. Accepts raw diff text, a base ref, or working_tree=True to automatically diff your uncommitted changes.

pr_diff(working_tree=True)              # analyse uncommitted changes
pr_diff(base_ref="main")               # compare branch against main
pr_diff(diff_text="<paste from GitHub>") # analyse a PR diff

VCS History — understand context and ownership

`git_log`

Recent commit history for a file or the whole repo. Shows what changed, when, and by whom.

`git_blame`

Line-by-line authorship. Use when you need to understand intent, know who to ask about a tricky section, or check whether a suspicious line is recent or ancient.

`file_churn`

Commit frequency per file over a time window. High churn = actively changing or repeatedly fixed. Run before diving into a module to know whether you're on stable ground or in a churn zone.

Semantic Memory — retain research across sessions (requires Qdrant)

`store_context`

Save a research finding to persistent memory with a key, tags, and source files. Sub-agents and future sessions can retrieve it instantly instead of re-researching.

store_context(
  key="auth-flow-research",
  content="Authentication touches 40 files. Entry point is AuthController::login...",
  tags="auth,sprint-42"
)

`recall_context`

Retrieve stored research by exact key or semantic query.

`list_context`

List all stored memory entries with their tags and timestamps.

Housekeeping

`index_status`

Quick orientation: total symbols, files, edges, and languages indexed. Call first on an unfamiliar project, or to confirm the index is populated before searching.

`reindex`

Refresh the in-memory symbol index after editing files. Pass update_qdrant=true to also refresh semantic embeddings. Pass enrich=true to generate LLM summaries for new symbols.

Common Workflows

Before touching unfamiliar code:

hotspot_score(file_filter="app/Services/Payment/")   # find the landmines
explain_symbol("PaymentService")                      # understand the entry point
impact_analysis("charge", depth=3)                    # map the blast radius

Exploring an unfamiliar module:

module_summary("app/Services/Payment/")              # table of contents
lookup_symbols_batch("PaymentService, Webhook, StripeClient")  # drill into key symbols

Before merging a PR:

pr_diff(working_tree=True)                           # risk-rate your changes
pr_diff(base_ref="main")                             # or compare against main

Tracking DI dependencies:

find_usages("resolve", argument_filter="PaymentService")  # who injects PaymentService
find_usages("make", argument_filter="UserRepository")     # who creates UserRepository

Planning a refactoring sprint:

structural_search(min_complexity=15, node_type="method")  # find complexity hotspots
structural_search(min_params=5, visibility="public")      # candidates for parameter objects

Configuration

Each project keeps its own hammy.yaml. Hammy looks for it in two places (in order):

hammy.yaml — directly in the project root (recommended)
config/hammy.yaml — inside a config subdirectory (legacy)

If neither exists, all defaults apply.

Per-project isolation

project.name is the key setting for multi-project setups. Hammy uses it to derive a unique Qdrant collection prefix, so two projects on the same Qdrant instance never mix their indexes. Set it to something short and slug-friendly.

Project A: name: "storefront"   → collections: hammy_storefront_code_symbols, ...
Project B: name: "admin-panel"  → collections: hammy_admin_panel_code_symbols, ...

If you need manual control (e.g. sharing an index between environments), set qdrant.collection_prefix explicitly — that always takes precedence over the derived name.

Full reference

# hammy.yaml  (place in project root)

project:
  name: "my-project"   # Used to isolate Qdrant collections per project — set this!
  root: "."            # Project root, relative to this config file

parsing:
  languages:           # Which languages to index (all enabled by default)
    - php
    - javascript
    - typescript
    - python
    - go
  max_file_size_kb: 500  # Skip files larger than this

qdrant:
  host: "localhost"
  port: 6333
  collection_prefix: ""          # Leave blank to auto-derive from project.name (recommended)
                                 # Set explicitly to override, e.g. "prod" or "shared-index"
  embedding_model: "all-MiniLM-L6-v2"  # SentenceTransformer model for semantic search

vcs:
  max_commits: 5000       # How far back to scan commit history
  churn_window_days: 90   # Lookback window for hotspot churn scoring

enrichment:
  enabled: false          # Set true to auto-generate LLM summaries during indexing
  provider: "anthropic"   # LiteLLM provider string (anthropic, openai, etc.)
  model: "claude-haiku-4-5-20251001"  # Model to use for summaries
  batch_size: 10          # Symbols enriched per API call
  skip_if_summary: true   # Don't re-enrich symbols that already have summaries
  max_symbols: 0          # Cap on symbols to enrich per run (0 = no limit)

ignore:
  use_gitignore: true     # Respect .gitignore
  use_hgignore: true      # Respect .hgignore
  use_hammyignore: true   # Respect .hammyignore
  extra_patterns:         # Additional glob patterns to exclude
    - "vendor/**"
    - "*.generated.php"

.hammyignore

Create a .hammyignore in your project root using standard gitignore syntax. It's merged with .gitignore and .hgignore — anything excluded by any of the three files is skipped during indexing.

# .hammyignore
vendor/
node_modules/
*.min.js
storage/framework/
bootstrap/cache/

Project Structure

hammy/
├── src/hammy/
│   ├── cli.py              # Typer CLI (init, index, query, status, serve, watch)
│   ├── config.py           # Pydantic settings loader
│   ├── ignore.py           # Four-layer ignore system
│   ├── watcher.py          # watchfiles-based incremental re-indexer
│   ├── agents/             # CrewAI Explorer and Historian agents
│   ├── core/               # Crew orchestration and context pack generation
│   ├── indexer/            # File walking, parsing pipeline, incremental indexing
│   ├── mcp/                # MCP server (mcp-python)
│   ├── schema/             # Pydantic models (Node, Edge, ContextPack)
│   └── tools/
│       ├── languages/      # Tree-sitter extractors: php, js, ts, python, go
│       ├── ast_tools.py    # AST query tool
│       ├── bridge.py       # Cross-language bridge resolver
│       ├── diff_analysis.py# PR diff parser and blast radius
│       ├── hotspot.py      # Hotspot scoring
│       ├── hybrid_search.py# BM25 + dense RRF fusion
│       ├── parser.py       # ParserFactory dispatcher
│       ├── qdrant_tools.py # Qdrant embed, upsert, search, delete
│       └── vcs.py          # Git/Mercurial wrapper
├── config/                 # Default configuration files
├── tests/                  # 317 passing tests with fixtures
└── docker-compose.yml      # Qdrant service

Development

# Run the full test suite
uv run pytest

# Run with coverage
uv run pytest --cov=hammy --cov-report=term-missing

# Run a specific module
uv run pytest tests/test_parser.py

Requirements

Python 3.11+
Docker (for Qdrant)
Git (for VCS analysis)
LLM API key (OpenAI, Anthropic, or any LiteLLM-compatible provider) for agent queries and enrichment

Built With

Tree-sitter — incremental, multi-language AST parsing
Qdrant — vector database for semantic search and memory
CrewAI — multi-agent orchestration
mcp-python — Model Context Protocol server
rank-bm25 — BM25Plus for hybrid search
watchfiles — fast filesystem watching
uv — Python package management

n0nag0n/hammy

Hammy

Why Hammy?

Features

Installation

Quick Start

MCP Tools

Orientation — understand unfamiliar code fast

explain_symbol ⭐

module_summary

ast_query

list_files

Search — find what you're looking for

search_symbols

lookup_symbol

lookup_symbols_batch

search_code_hybrid

structural_search