shpitdev/palantir-compute-module-pipeline-template
MVP of streaming pipelines that emulates a trimmed version of the Palantir transaction and dataset APIs. Includes an AI search / enrichment flow using Gemini + Google Search. Written in Go; even with no pre-caching optimizations, startup for full local testing is 10x faster.
palantir-compute-module-pipeline-search
Pipeline-mode Foundry Compute Module (Go) that:
- Reads a dataset of email addresses
- Enriches each email via Gemini (grounding + URL context + structured output)
- Writes enriched rows to either:
  - a snapshot dataset (transactions), or
  - a streaming dataset (stream-proxy)
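The read, enrich, write flow listed above can be pictured as a bounded worker pool. The following is a minimal sketch under stated assumptions, not the repo's actual worker: `Row`, `enrichAll`, and `fakeEnrich` are hypothetical names, and `fakeEnrich` stands in for the real Gemini call.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Row is a hypothetical output row; the real schema contract lives in the repo.
type Row struct {
	Email  string
	Status string // "ok" or "error"
	Detail string
}

// enrichAll fans emails out to a fixed number of workers and returns one row
// per email, in input order. enrich stands in for the Gemini call.
func enrichAll(emails []string, workers int, enrich func(string) (string, error)) []Row {
	if workers < 1 {
		workers = 1
	}
	rows := make([]Row, len(emails))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handled by exactly one worker, so writes to
			// rows[i] never overlap.
			for i := range jobs {
				detail, err := enrich(emails[i])
				if err != nil {
					rows[i] = Row{Email: emails[i], Status: "error", Detail: err.Error()}
					continue
				}
				rows[i] = Row{Email: emails[i], Status: "ok", Detail: detail}
			}
		}()
	}
	for i := range emails {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return rows
}

// fakeEnrich is a stand-in enricher (an assumption for this sketch): it only
// extracts the domain and rejects inputs without an "@".
func fakeEnrich(email string) (string, error) {
	at := strings.Index(email, "@")
	if at < 0 {
		return "", fmt.Errorf("not an email: %q", email)
	}
	return "domain=" + email[at+1:], nil
}

func main() {
	for _, r := range enrichAll([]string{"a@example.com", "bogus"}, 2, fakeEnrich) {
		fmt.Println(r.Email, r.Status, r.Detail)
	}
}
```

Recording failures as `status=error` rows instead of aborting keeps one bad input from blocking the rest of the batch.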
Local-first workflow: iterate locally (mock Foundry APIs + real container) and deploy the same image into Foundry.
Note: compute modules run as long-lived containers. This module runs the pipeline once per container start and then keeps the process alive so the platform does not restart it (which would re-run the pipeline and can duplicate stream outputs).
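The run-once-then-stay-alive behavior described in the note can be sketched as follows. This is an illustration only: `runOnceThenIdle` and `runDemo` are hypothetical names, and idling even after a failed run is an assumption, not confirmed behavior of this module.

```go
package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// runOnceThenIdle runs the pipeline a single time and then blocks until ctx
// is cancelled, so the container stays up without re-running the pipeline.
// Idling after a failed run is an assumption made for this sketch.
func runOnceThenIdle(ctx context.Context, run func(context.Context) error) error {
	err := run(ctx)
	if err != nil {
		log.Printf("pipeline run failed: %v", err)
	}
	<-ctx.Done() // keep the process alive until shutdown is requested
	return err
}

// runDemo bounds the idle period so the example exits by itself; a real
// module would pass the signal context through without a timeout.
// It returns how many times the pipeline function ran.
func runDemo(maxIdle time.Duration) int {
	sigCtx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()
	ctx, cancel := context.WithTimeout(sigCtx, maxIdle)
	defer cancel()

	runs := 0
	_ = runOnceThenIdle(ctx, func(context.Context) error {
		runs++
		log.Println("pipeline ran")
		return nil
	})
	return runs
}

func main() {
	log.Printf("exiting after %d run(s)", runDemo(200*time.Millisecond))
}
```

Blocking on the context rather than returning from `main` is what prevents a platform restart from triggering a second run.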
Repo Layout
This repo is split into reusable kit packages and an example module:
- pkg/pipeline/...: reusable pipeline primitives (worker, local/foundry IO adapters, schema contract)
- pkg/foundry/...: Foundry env parsing and HTTP client
- pkg/mockfoundry/...: emulated Foundry server used by local harnesses and tests
- examples/email_enricher/...: example email enrichment domain logic and output mapping
- cmd/enricher: example binary wiring the kit + example
External-consumer contracts are validated in:
- test/consumer: imports reusable packages directly
- test/template: minimal new-module skeleton using pipeline kit APIs
Development
Canonical entrypoint:
./dev help
Verify (CI parity + external consumer checks):
./dev verify
Real e2e test run (Gemini + Foundry-emulated docker-compose):
./dev test
./dev test performs real Gemini calls and fails if committed output contains any status=error rows.
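A check of the kind described for `./dev test` (fail when committed output contains `status=error` rows) could look roughly like this. It is a sketch, not the repo's validator: the function names are hypothetical, and it assumes the output CSV has a header row with a column named `status`.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// countErrorRows returns how many data rows have status == "error".
// Assumption: the first record is a header containing a "status" column.
func countErrorRows(r io.Reader) (int, error) {
	cr := csv.NewReader(r)
	header, err := cr.Read()
	if err != nil {
		return 0, fmt.Errorf("read header: %w", err)
	}
	col := -1
	for i, name := range header {
		if strings.TrimSpace(name) == "status" {
			col = i
			break
		}
	}
	if col < 0 {
		return 0, fmt.Errorf("no status column in header %v", header)
	}
	n := 0
	for {
		// csv.Reader rejects records whose field count differs from the
		// header, so indexing rec[col] is safe here.
		rec, err := cr.Read()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		if rec[col] == "error" {
			n++
		}
	}
}

// countErrorRowsInString is a convenience wrapper for examples and tests.
func countErrorRowsInString(data string) (int, error) {
	return countErrorRows(strings.NewReader(data))
}

func main() {
	n, err := countErrorRowsInString("email,status\na@example.com,ok\nb@example.com,error\n")
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Println("error rows:", n)
}
```

A test harness would treat any non-zero count as a failure.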
Preflight diagnostics:
./dev doctor
./dev doctor --json
Run locally (no Foundry required, Gemini required):
export GEMINI_API_KEY=...
./dev run local -- --input /path/to/emails.csv --output /path/to/enriched.csv
GEMINI_MODEL is optional; default is gemini-2.5-flash.
Run Foundry-like flow locally (mock dataset API + real Gemini + real container):
./dev run foundry-emulated
Run a long-lived local dev loop (watches input CSV and reruns automatically):
./dev run foundry-emulated --watch
./dev run foundry-emulated --watch starts a tight local loop:
- starts mock-foundry + a real container
- runs once immediately, then reruns on input CSV edits
- reuses prior status=ok rows by email (best-effort incremental cache)
- stops cleanly on Ctrl+C
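The best-effort incremental cache in the list above (reuse prior `status=ok` rows by email) can be sketched as a simple partition step. `Row` and `splitByCache` are hypothetical names used for illustration; the repo's actual cache logic may differ.

```go
package main

import "fmt"

// Row is a hypothetical output row keyed by email.
type Row struct {
	Email  string
	Status string
	Detail string
}

// splitByCache partitions input emails into rows that can be reused from the
// prior run (status "ok") and emails that still need enrichment. Rows with
// any other status are not reused, so failed emails are retried.
func splitByCache(emails []string, prior []Row) (reused []Row, todo []string) {
	ok := make(map[string]Row, len(prior))
	for _, r := range prior {
		if r.Status == "ok" {
			ok[r.Email] = r
		}
	}
	for _, e := range emails {
		if r, hit := ok[e]; hit {
			reused = append(reused, r)
		} else {
			todo = append(todo, e)
		}
	}
	return reused, todo
}

func main() {
	prior := []Row{
		{Email: "a@example.com", Status: "ok", Detail: "cached"},
		{Email: "b@example.com", Status: "error", Detail: "timeout"},
	}
	reused, todo := splitByCache([]string{"a@example.com", "b@example.com", "c@example.com"}, prior)
	fmt.Println("reused:", len(reused), "todo:", todo)
}
```

Only the `todo` emails would be sent to Gemini on a rerun, which keeps the watch loop cheap when a single row changes.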
Local Watch Loop Quickstart
- Set a valid Gemini key in .env:
GEMINI_API_KEY=...
# GEMINI_MODEL is optional (default: gemini-2.5-flash)
- Edit input rows in:
.local/mock-foundry/inputs/ri.foundry.main.dataset.11111111-1111-1111-1111-111111111111.csv
- Start the local loop:
./dev run foundry-emulated --watch
- Read latest committed output at:
.local/mock-foundry/uploads/ri.foundry.main.dataset.22222222-2222-2222-2222-222222222222/_committed/readTable.csv
- Change and save the input CSV again to trigger another run.
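One way to detect "the input CSV was edited and saved" is to poll the file and compare content digests. The sketch below assumes that approach; how the watch loop is actually implemented in this repo is not stated here, and `watcher` and `newFileWatcher` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
)

// watcher remembers the digest of the content it saw last.
type watcher struct {
	read func() ([]byte, error) // how to load the watched content
	last [sha256.Size]byte
	seen bool
}

// newFileWatcher watches a file on disk by re-reading it on every poll.
func newFileWatcher(path string) *watcher {
	return &watcher{read: func() ([]byte, error) { return os.ReadFile(path) }}
}

// changed reports whether the content differs from the previous call.
// The first call always reports true, so the pipeline runs once immediately.
func (w *watcher) changed() (bool, error) {
	b, err := w.read()
	if err != nil {
		return false, err
	}
	sum := sha256.Sum256(b)
	if w.seen && sum == w.last {
		return false, nil
	}
	w.last, w.seen = sum, true
	return true, nil
}

func main() {
	dir, err := os.MkdirTemp("", "watch-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "emails.csv")
	write := func(s string) {
		if err := os.WriteFile(path, []byte(s), 0o644); err != nil {
			panic(err)
		}
	}
	write("email\na@example.com\n")

	w := newFileWatcher(path)
	// Three polls: initial run, no edit, then an edit that adds a row.
	for _, edit := range []string{"", "", "email\na@example.com\nb@example.com\n"} {
		if edit != "" {
			write(edit)
		}
		rerun, err := w.changed()
		if err != nil {
			panic(err)
		}
		fmt.Println("rerun:", rerun)
	}
}
```

Comparing digests instead of modification times means that saving the file without changing it does not trigger a rerun.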
Reset local compose state and clear mock-foundry uploads (inputs are preserved):
./dev clean
See docker-compose.local.yml for fixture mounts and output paths.
Run CI-style docker-compose E2E (fixed fixtures + output validation):
export GEMINI_API_KEY=...
./dev test -v
Note: CI jobs that require Gemini secrets are skipped automatically if GEMINI_API_KEY / GEMINI_MODEL GitHub secrets are not configured.
Docs
- docs/DESIGN.md: architecture, interfaces, local testing approach
- docs/RELEASE.md: Foundry configuration steps (Sources, egress, probes) and publishing guidance
- docs/TROUBLESHOOTING.md: common deployment failures and diagnosis
- docs/DIAGRAMS.md: Mermaid sequence diagrams + flowcharts for API usage scenarios
Defaults (high-signal)
Defaults differ between:
- binary internal fallbacks (used when env vars are unset in Foundry)
- local docker-compose harness defaults in docker-compose.local.yml
Key ones:
- REQUEST_TIMEOUT: 30s binary fallback; local compose sets 2m
- WORKERS: 10
- MAX_RETRIES: 3
- FAIL_FAST: false
For the full set of options and Foundry configuration, see docs/RELEASE.md.
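The binary-fallback idea (use a built-in default when an env var is unset) can be sketched as below. The default values are the ones listed above; the `Config` type, `loadConfig`, and the parsing rules (Go duration strings, `strconv` for integers and booleans) are assumptions for illustration, not the module's confirmed implementation.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config holds the high-signal options; field names are hypothetical.
type Config struct {
	RequestTimeout time.Duration
	Workers        int
	MaxRetries     int
	FailFast       bool
}

// loadConfig reads settings through getenv and falls back to the binary
// defaults when a variable is unset or empty.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{RequestTimeout: 30 * time.Second, Workers: 10, MaxRetries: 3, FailFast: false}
	var err error
	if v := getenv("REQUEST_TIMEOUT"); v != "" {
		if cfg.RequestTimeout, err = time.ParseDuration(v); err != nil {
			return cfg, fmt.Errorf("REQUEST_TIMEOUT: %w", err)
		}
	}
	if v := getenv("WORKERS"); v != "" {
		if cfg.Workers, err = strconv.Atoi(v); err != nil {
			return cfg, fmt.Errorf("WORKERS: %w", err)
		}
	}
	if v := getenv("MAX_RETRIES"); v != "" {
		if cfg.MaxRetries, err = strconv.Atoi(v); err != nil {
			return cfg, fmt.Errorf("MAX_RETRIES: %w", err)
		}
	}
	if v := getenv("FAIL_FAST"); v != "" {
		if cfg.FailFast, err = strconv.ParseBool(v); err != nil {
			return cfg, fmt.Errorf("FAIL_FAST: %w", err)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing `getenv` as a function keeps the loader testable and makes the difference between Foundry (variables unset, fallbacks apply) and local compose (for example `REQUEST_TIMEOUT=2m`) easy to reproduce.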
Screenshots
Put Foundry UI screenshots in docs/screenshots/ and reference them from this README.
- Convention: docs/screenshots/<short-topic>-<yyyy-mm-dd>.png
Current screenshots:
Compute module configuration (pipelines mode, sources + env vars):
Lineage overview (inputs, sources, egress, output):
Streaming dataset current transaction view:
Streaming dataset metrics:



