Moncey Concierge — Monce AI extraction memory & intelligence layer. Sonnet 4.6 on Bedrock.
Moncey Concierge
Live at: https://concierge.aws.monce.ai
Chat UI: https://concierge.aws.monce.ai/ui
Internal memory and intelligence layer for Monce AI. Tracks extraction pipeline activity, answers analytical questions from pre-computed digests, and bridges to Snake for synonym management.
Architecture
Route53 → EC2 (nginx/SSL → gunicorn) → FastAPI + Bedrock Sonnet
                                           │
                      ┌────────────────────┼──────────────────┐
               Memory (JSON)         monce_db (S3)    snake.aws.monce.ai
               - memories.json       - extractions    - article synonyms
               - conversations.json  - stats          - client synonyms
               - digests.json                         - rebuild triggers
Quick Start
# Chat
curl -X POST https://concierge.aws.monce.ai/chat \
-H 'Content-Type: application/json' \
-d '{"message": "What are the top clients this week?"}'
# Ingest last 14 days of extractions
curl -X POST https://concierge.aws.monce.ai/ingest \
-H 'Content-Type: application/json' \
-d '{"days": 14}'
# Search memories
curl 'https://concierge.aws.monce.ai/search?q=SGD'
# Add article synonym to Snake
curl -X POST https://concierge.aws.monce.ai/snake/synonym \
-H 'Content-Type: application/json' \
-d '{"text": "6mm", "num_article": "1006", "factory_id": "3"}'
# Add client synonym to Snake
curl -X POST https://concierge.aws.monce.ai/snake/synonym_client \
-H 'Content-Type: application/json' \
-d '{"text": "DUBOS MATERIAUX", "numero_client": "565", "factory_id": "4"}'
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Service health + memory/conversation counts |
| /ui | GET | Chat interface |
| /chat | POST | Chat with Concierge (Sonnet + context) |
| /remember | POST | Store a memory manually |
| /forget | POST | Forget memories matching query |
| /memories | GET | List memories (paginated, filterable by ?tag=) |
| /search | GET | Keyword search across memories (?q=) |
| /ingest | POST | Pull extractions from monce_db into memory |
| /ingest/stats | GET | Pull aggregate stats from monce_db |
| /digest | POST | Recompute aggregate digests |
| /digest | GET | Return current digests |
| /snake/synonym | POST | Push article synonym to Snake |
| /snake/synonym_client | POST | Push client synonym to Snake |
| /snake/synonyms_batch | POST | Batch push synonyms + rebuild |
| /snake/rebuild | POST | Trigger Snake rebuild_all |
How Concierge Answers Questions
Concierge uses a 3-layer context system so Sonnet can answer precisely:
- Digests — Pre-computed aggregates from ALL extraction data (top clients, daily volumes, glass types, matching quality, weekly rankings). Always included. Compact.
- Search results — Memories keyword-matched to the user's question (up to 20). Targeted.
- Recent memories — Last 10 raw ingestions. For "what just happened" questions.
When you call /ingest, digests are recomputed automatically. You can also trigger a recompute manually with POST /digest.
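The three layers can be pictured as a small context-assembly step. The sketch below is illustrative only: `build_context`, `keyword_search`, the memory shape (a dict with a `text` field, stored oldest first), and the naive word-overlap match are all assumptions, not the code in memory.py. Only the limits (up to 20 search results, last 10 memories) come from the description above.

```python
def keyword_search(memories, question, limit=20):
    """Return up to `limit` memories sharing at least one word with the question."""
    # Hypothetical matching rule: lowercase word overlap, ignoring very short words.
    words = {w.strip("?.,!").lower() for w in question.split() if len(w) > 2}
    hits = [m for m in memories if words & set(m["text"].lower().split())]
    return hits[:limit]


def build_context(digests, memories, question):
    """Assemble the three layers: digests, search results, recent memories."""
    return {
        "digests": digests,                                    # always included
        "search_results": keyword_search(memories, question),  # up to 20
        "recent": memories[-10:],                              # last 10 ingestions
    }
```

Keeping the digests as a separate, always-present layer is what lets aggregate questions be answered without scanning raw memories.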
Feeding Data to Concierge
From monce_db (extractions)
# Ingest last 14 days, all factories
curl -X POST https://concierge.aws.monce.ai/ingest \
-H 'Content-Type: application/json' \
-d '{"days": 14}'
# Ingest specific factory
curl -X POST https://concierge.aws.monce.ai/ingest \
-H 'Content-Type: application/json' \
-d '{"days": 14, "factory": "VIP"}'
# Only verified extractions
curl -X POST https://concierge.aws.monce.ai/ingest \
-H 'Content-Type: application/json' \
-d '{"days": 14, "status": "verified"}'
Ingestion deduplicates by extraction ID, so it is safe to call repeatedly.
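Deduplication by ID is what makes repeated ingestion idempotent. A minimal sketch of the idea, assuming each extraction carries an `id` field and using a plain dict as a hypothetical stand-in for the memory store (the real logic lives in ingest.py and may differ):

```python
def ingest(store, extractions):
    """Add extractions not seen before; return how many were newly stored.

    `store` maps extraction ID -> record (hypothetical stand-in for memories.json).
    """
    added = 0
    for extraction in extractions:
        if extraction["id"] in store:
            continue  # already ingested, skip silently
        store[extraction["id"]] = extraction
        added += 1
    return added
```

Calling `ingest` twice with the same batch leaves the store unchanged the second time, which is why overlapping ingestion windows are harmless.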
Manual memories
# Remember something
curl -X POST https://concierge.aws.monce.ai/remember \
-H 'Content-Type: application/json' \
-d '{"text": "Factory 4 had a major outage today", "tags": ["incident", "VIP"]}'
Programmatic best practices
To make Concierge the effective memory of Monce AI:
- Tag everything. Tags enable filtering and weighted search. Use consistent tags: extraction, synonym, incident, factory names.
- Ingest regularly. Set up a cron job or call /ingest daily. Concierge deduplicates, so overcalling is fine.
- Use /remember for non-extraction events. Deployments, incidents, configuration changes: anything Sonnet should know about when answering questions.
- Let digests do the heavy lifting. Don't ask Concierge to count raw memories; digests pre-compute totals, rankings, and trends. If you need a new aggregate, add it to compute_digests() in memory.py.
- Search before asking. For programmatic lookups, use /search?q=keyword instead of /chat. It's faster and doesn't consume Bedrock tokens.
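For scripts that follow these practices, the calls can be built with the Python standard library. This is a sketch under assumptions: the helper names `remember_request` and `search_url` are hypothetical, the request bodies mirror the curl examples in this README, and nothing is assumed about the response format.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://concierge.aws.monce.ai"


def remember_request(text, tags):
    """Build (but do not send) a POST /remember request with a JSON body."""
    body = json.dumps({"text": text, "tags": tags}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/remember",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_url(query):
    """Build the GET /search URL; the query string is percent-encoded."""
    return f"{BASE_URL}/search?" + urllib.parse.urlencode({"q": query})


# To actually send a request:
#   urllib.request.urlopen(remember_request("Deployed v2", ["deployment"]), timeout=30)
```

Separating request construction from sending keeps the helpers easy to test without touching the live service.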
Snake Synonym Integration
Concierge can push synonyms directly to snake.aws.monce.ai (article matching service). This is useful when extraction analysis reveals missing or incorrect synonym mappings.
Article synonyms
curl -X POST https://concierge.aws.monce.ai/snake/synonym \
-H 'Content-Type: application/json' \
-d '{"text": "PLANILUX 4MM", "num_article": "1004", "factory_id": "3"}'
Client synonyms
curl -X POST https://concierge.aws.monce.ai/snake/synonym_client \
-H 'Content-Type: application/json' \
-d '{"text": "SAINT GOBAIN PARIS", "numero_client": "7890", "factory_id": "4"}'
Batch workflow
# Push multiple synonyms without rebuilding each time
curl -X POST https://concierge.aws.monce.ai/snake/synonyms_batch \
-H 'Content-Type: application/json' \
-d '{
"synonym_type": "article",
"synonyms": [
{"text": "6mm", "num_article": "1006", "factory_id": "3"},
{"text": "FLOAT 6", "num_article": "1006", "factory_id": "3"},
{"text": "8mm clair", "num_article": "1008", "factory_id": "4"}
]
}'
The batch endpoint adds all synonyms with trigger_rebuild=false, then calls /rebuild_all once at the end.
Every synonym action is logged as a Concierge memory with tags [synonym, article/client, factory_id].
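The batch behaviour (suppress per-item rebuilds, rebuild once at the end) can be sketched as follows. `RecordingSnake` is a hypothetical test double that records calls instead of contacting Snake, and `push_batch` is an illustrative name; the actual client in snake.py may be structured differently.

```python
class RecordingSnake:
    """Hypothetical stand-in for the Snake client; records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def add_synonym(self, synonym, trigger_rebuild):
        self.calls.append(("add", synonym["text"], trigger_rebuild))

    def rebuild_all(self):
        self.calls.append(("rebuild_all",))


def push_batch(snake, synonyms):
    """Add every synonym with rebuilds suppressed, then rebuild exactly once."""
    for synonym in synonyms:
        snake.add_synonym(synonym, trigger_rebuild=False)
    if synonyms:  # assumption: skip the rebuild when nothing was added
        snake.rebuild_all()
    return len(synonyms)
```

With N synonyms this costs one rebuild instead of N, which is the point of using the batch endpoint over repeated single pushes.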
Claude Code Sync
To work on Concierge with Claude Code:
git clone git@github.com:Monce-AI/concierge.aws.monce.ai.git
cd concierge.aws.monce.ai
File structure
concierge.aws.monce.ai/
api/
__init__.py
main.py # FastAPI app entry
config.py # Env var config (Bedrock, data dir)
routes.py # All endpoints
sonnet.py # Bedrock Sonnet caller + system prompt
memory.py # Memory CRUD + digest engine + search
ingest.py # monce_db ingestion
snake.py # Snake API client (synonyms + rebuild)
static/
index.html # Landing page
ui.html # Chat interface
terraform/
main.tf # EC2 + SG + Route53
deploy.sh # Rsync + systemd + nginx
setup.py
Deploy
cd terraform
./deploy.sh # or ./deploy.sh <ip>
Environment variables (on server at /opt/concierge/.env)
AWS_BEARER_TOKEN_BEDROCK=... # Bedrock access
MONCE_S3_ACCESS_KEY=... # monce_db S3 access
MONCE_S3_SECRET_KEY=... # monce_db S3 secret
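A fail-fast check at startup makes a missing variable obvious before the first request. The sketch below is an assumption about how such a check could look, not the contents of config.py; `load_config` and `REQUIRED_VARS` are hypothetical names, and only the three variable names come from the list above.

```python
import os

REQUIRED_VARS = (
    "AWS_BEARER_TOKEN_BEDROCK",
    "MONCE_S3_ACCESS_KEY",
    "MONCE_S3_SECRET_KEY",
)


def load_config(env=None):
    """Return the required settings, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```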
Adding new capabilities
- New data source: Add an ingestion function in ingest.py, add a route in routes.py
- New digest type: Add computation logic in memory.py → compute_digests()
- New external service: Create a module (like snake.py), add routes
- Changing Sonnet's behavior: Edit SYSTEM_PROMPT in sonnet.py
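As an illustration of the "new digest type" case, a digest is just an aggregate computed once over all memories. The function below is a hypothetical example: the name `top_clients_digest` and the `client` field are assumptions about the memory shape, and how it would be wired into compute_digests() depends on the existing code in memory.py.

```python
from collections import Counter


def top_clients_digest(memories, limit=5):
    """Rank clients by number of extraction memories (hypothetical `client` field)."""
    counts = Counter(
        m["client"]
        for m in memories
        if "extraction" in m.get("tags", []) and m.get("client")
    )
    return [
        {"client": client, "extractions": n}
        for client, n in counts.most_common(limit)
    ]
```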
Infrastructure
| Spec | Value |
|---|---|
| Instance | t3.small (2 vCPU, 2 GB) |
| Region | eu-west-3 (Paris) |
| IP | 35.180.24.206 |
| Workers | 2 gunicorn/uvicorn |
| Timeout | 300s (for heavy ingestion) |
| SSL | Let's Encrypt via certbot |
| Model | Bedrock Sonnet 4.6 (bearer token) |