Shrutakeerti/Agentic-Orchestration
In this solution, agents are the modular decision-making components that perform autonomous actions.
Agentic AI Orchestration System using Google Cloud
This project implements an Agentic AI System that orchestrates prompts across multiple LLMs (Groq, Together, Ollama), selects the best response using evaluation logic or human feedback, and logs all results to BigQuery. It is built with Google Cloud Functions, Pub/Sub, and modular LLM APIs, supporting hot-swapping, prompt routing, and real-time monitoring.
Features
- Prompt routing across multiple LLMs (Groq, Together, Ollama)
- Config-based model enable/disable and versioning
- Response evaluation logic (length-based + human feedback)
- Logging to BigQuery (prompt, latency, best model, etc.)
- Zero-downtime model hot-swapping
- CLI-based local execution for debugging and validation
- Cloud-native (Pub/Sub, Cloud Functions, BigQuery)
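The length-based part of the evaluation logic can be illustrated with a short sketch. This is not the project's actual `selector.py`: the function name `select_best`, the skipping of failed models, and the tie-breaking rule are assumptions made for the example.

```python
from typing import Dict, Optional


def select_best(responses: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the name of the model whose response is longest.

    Hypothetical helper: models that failed (None or empty text) are skipped,
    and ties go to the model that appears first in the mapping.
    """
    best_model = None
    best_length = 0
    for model, text in responses.items():
        length = len(text.strip()) if text else 0
        if length > best_length:
            best_model, best_length = model, length
    return best_model


if __name__ == "__main__":
    # Placeholder responses, not real model output.
    sample = {
        "groq": "short answer",
        "together": "a noticeably longer answer",
        "ollama": None,  # simulates a model that returned nothing
    }
    print(select_best(sample))
```

Length is a crude proxy for quality, which is presumably why the project pairs it with human feedback.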
Architecture Overview
[User Prompt]
↓
[Pub/Sub Topic: agentic-prompts]
↓
[Cloud Function Triggered: subscriber.py]
↓
[Prompt Routing (prompt_router.py)]
↓
[Selected LLMs: GROQ, TOGETHER, OLLAMA]
↓
[Response Evaluation / Human Feedback]
↓
[Best Response Selected]
↓
[Logged in BigQuery + Printed in CLI]
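The flow above can be sketched end to end in a few functions. This is a rough illustration under stated assumptions, not the code in `subscriber.py`: it assumes the event carries the message as base64-encoded JSON in `event["data"]`, and the names `decode_prompt`, `run_models`, and `handle_event` are hypothetical. Real LLM clients are replaced by plain callables.

```python
import base64
import json
import time
from typing import Callable, Dict


def decode_prompt(event: dict) -> str:
    """Extract the prompt from an event whose data field is base64-encoded JSON."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return payload["prompt"]


def run_models(prompt: str, models: Dict[str, Callable[[str], str]]) -> dict:
    """Call each model in turn, recording its response, latency, and any error."""
    responses, latency, errors = {}, {}, {}
    for name, call in models.items():
        start = time.perf_counter()
        try:
            responses[name] = call(prompt)
        except Exception as exc:  # one failing model should not stop the others
            errors[name] = repr(exc)
        latency[name] = time.perf_counter() - start
    return {"responses": responses, "latency": latency, "errors": errors}


def handle_event(event: dict, models: Dict[str, Callable[[str], str]]) -> dict:
    """Decode the prompt, query every model, and pick the longest response."""
    prompt = decode_prompt(event)
    result = run_models(prompt, models)
    answers = result["responses"]
    # best_model is None when every model failed.
    best = max(answers, key=lambda name: len(answers[name]), default=None)
    return {"prompt": prompt, "best_model": best, **result}
```

Catching exceptions per model keeps one unavailable backend from failing the whole request, and the recorded latency and error details are the kind of data the BigQuery log is meant to hold.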
🛠️ Project Structure
agentic-ai-orchestration/
├── cloud_functions/
│ ├── route_prompt/main.py # Pub/Sub publisher
│ └── run_agents/main.py # Cloud Function subscriber
├── llms/
│ ├── config.py # Enabled models + versions
│ ├── groq_api.py # GROQ LLM API
│ ├── together_api.py # Together API
│ └── ollama_api.py # Local Ollama API
├── utils/
│ ├── logger.py # Log results to BigQuery
│ ├── prompt_router.py # Dynamic model selection logic
│ ├── selector.py # Evaluation/Scoring function
│ └── human_feedback.py # CLI-based feedback
├── evaluators/
│ └── judge.py # Simple evaluation logic
├── local_runner.py # Local CLI interface
├── subscriber.py # Main Cloud Function logic
├── requirements.txt
├── .env # API keys and project vars
└── README.md
Setup Instructions
1. Environment Setup
Create a .env file in the root directory:
GCP_PROJECT_ID=your-gcp-project-id
GROQ_API_KEY=your-groq-key
TOGETHER_API_KEY=your-together-key
# Path to GCP service account JSON
GOOGLE_APPLICATION_CREDENTIALS=cred.json
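A quick check that these variables are set can save a confusing failure later. The sketch below assumes the values have already been loaded into the process environment (for example by a loader such as python-dotenv); `REQUIRED_VARS` and `missing_settings` are names invented for this example.

```python
import os
from typing import List, Mapping

# The four variables listed in the .env example.
REQUIRED_VARS = (
    "GCP_PROJECT_ID",
    "GROQ_API_KEY",
    "TOGETHER_API_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```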
## Google Cloud Setup
```bash
gcloud pubsub topics create agentic-prompts
gcloud pubsub subscriptions create agentic-sub --topic=agentic-prompts
bq mk --table agentic.logs \
  prompt:STRING,responses:STRING,best_model:STRING,timestamp:TIMESTAMP
```
Install Python Requirements
```bash
pip install -r requirements.txt
```
How to Run
Option 1: Local Testing (with CLI feedback)
```bash
python local_runner.py
```
Option 2: Cloud Function Execution
A. Start the Cloud Subscriber
```bash
python subscriber.py
```
B. Publish a Prompt
```bash
gcloud pubsub topics publish agentic-prompts \
  --message="{\"prompt\": \"Who was the first female F1 driver?\"}"
```
BigQuery Schema
Table: agentic.logs
| Field | Type | Description |
|---|---|---|
| prompt | STRING | User input prompt |
| responses | STRING | JSON of model outputs |
| best_model | STRING | Model with highest score/selected best |
| latency_info | RECORD | Timing and performance data |
| error_info | RECORD | Any error traces |
| timestamp | TIMESTAMP | Request time |
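A row matching this schema could be assembled as in the sketch below. The function name `build_log_row` is hypothetical, and because the sub-fields of the two RECORD columns are not specified here, the nested dictionaries are passed through unchanged as placeholders. The actual insert (for instance via `insert_rows_json` in the `google-cloud-bigquery` client) is left out.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Optional


def build_log_row(
    prompt: str,
    responses: Dict[str, str],
    best_model: Optional[str],
    latency_info: dict,
    error_info: dict,
    now: Optional[datetime] = None,
) -> dict:
    """Assemble one log row with the six fields of the agentic.logs table."""
    now = now or datetime.now(timezone.utc)
    return {
        "prompt": prompt,
        # STRING column: model outputs serialized as a JSON object.
        "responses": json.dumps(responses, sort_keys=True),
        "best_model": best_model,
        # RECORD columns: nested dicts; their sub-fields are assumptions.
        "latency_info": latency_info,
        "error_info": error_info,
        # TIMESTAMP column: ISO 8601 string in UTC.
        "timestamp": now.isoformat(),
    }
```

Serializing `responses` to a JSON string keeps the column type fixed even as models are added or removed.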
Model Management and Hot-Swapping
All model configurations are managed in llms/config.py:
"groq": {
"enabled": True,
"model": "llama3-70b-8192",
"version": "v1.0"
},
"together": {
"enabled": True,
"model": "meta-llama/Llama-3-8b-chat-hf",
"version": "v1.1"
},
"ollama": {
"enabled": True,
"model": "mistral",
"version": "v1.1"
}
To Hot-Swap Models
- Add a new model to llms/config.py with "enabled": False
- Test the model locally using: python local_runner.py
- Once validated, set "enabled": True → ready for production
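To show how an "enabled" flag can drive routing, here is a minimal sketch. The variable name `MODEL_CONFIG`, the `new-model` entry, and the `include_disabled` parameter are assumptions for illustration. How `local_runner.py` actually exercises a disabled model is not described in this README.

```python
from typing import Dict, Iterable, List

# Mirrors the structure shown for llms/config.py. The "new-model" entry is a
# made-up placeholder representing a model that is still being validated.
MODEL_CONFIG: Dict[str, dict] = {
    "groq": {"enabled": True, "model": "llama3-70b-8192", "version": "v1.0"},
    "together": {
        "enabled": True,
        "model": "meta-llama/Llama-3-8b-chat-hf",
        "version": "v1.1",
    },
    "ollama": {"enabled": True, "model": "mistral", "version": "v1.1"},
    "new-model": {"enabled": False, "model": "placeholder-model", "version": "v0.1"},
}


def enabled_models(
    config: Dict[str, dict], include_disabled: Iterable[str] = ()
) -> List[str]:
    """Names of models to route to.

    Enabled models are always returned; disabled ones only when they are
    explicitly named in include_disabled (for example during a local test).
    """
    opted_in = set(include_disabled)
    return [
        name
        for name, entry in config.items()
        if entry.get("enabled") or name in opted_in
    ]


if __name__ == "__main__":
    print(enabled_models(MODEL_CONFIG))
    print(enabled_models(MODEL_CONFIG, include_disabled=["new-model"]))
```

Because routing reads the flag at call time, flipping "enabled" changes which models receive prompts without touching the calling code.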
Monitoring and Observability
- Logs stored in BigQuery
- View latency, model performance, and failure rates
- Analyze usage patterns over time
- GCP Alerting & Monitoring (future scope)
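As an example of the kind of analysis these logs allow, the sketch below computes per-model request counts, failure rates, and mean latency from rows already fetched from BigQuery. It assumes `latency_info` and `error_info` are dictionaries keyed by model name, which this README does not specify. The function name `summarize` is hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable


def summarize(rows: Iterable[dict]) -> Dict[str, dict]:
    """Per-model request count, failure rate, and mean latency in seconds."""
    latencies = defaultdict(list)
    failures = defaultdict(int)
    for row in rows:
        errors = row.get("error_info", {})
        for model, seconds in row.get("latency_info", {}).items():
            latencies[model].append(seconds)
            if model in errors:
                failures[model] += 1
    return {
        model: {
            "requests": len(values),
            "failure_rate": failures[model] / len(values),
            "mean_latency_s": mean(values),
        }
        for model, values in latencies.items()
    }


if __name__ == "__main__":
    # Made-up rows for demonstration only.
    demo_rows = [
        {"latency_info": {"groq": 0.5, "ollama": 2.0}, "error_info": {}},
        {"latency_info": {"groq": 1.5, "ollama": 4.0}, "error_info": {"ollama": "timeout"}},
    ]
    for model, stats in summarize(demo_rows).items():
        print(model, stats)
```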
Human-in-the-Loop (CLI Feedback)
In local_runner.py, all LLM responses are displayed for manual selection.
This supports:
- Validating auto-selection logic
- Collecting human-labeled data for fine-tuning evaluation functions
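A CLI feedback loop of this kind might look like the sketch below. It is not the contents of `human_feedback.py`: the names `ask_human` and `feedback_record` are assumptions, and the `read` parameter exists only so the prompt can be replaced in tests.

```python
from typing import Callable, Dict, Optional


def ask_human(responses: Dict[str, str], read: Callable[[str], str] = input) -> str:
    """Print numbered responses and return the model name the reviewer picks."""
    names = list(responses)
    for index, name in enumerate(names, start=1):
        print(f"[{index}] {name}: {responses[name]}")
    while True:
        choice = read("Pick the best response by number: ").strip()
        if choice.isdecimal() and 1 <= int(choice) <= len(names):
            return names[int(choice) - 1]
        print("Please enter a number from the list.")


def feedback_record(prompt: str, auto_choice: Optional[str], human_choice: str) -> dict:
    """One labeled example noting whether automatic and human choices agree."""
    return {
        "prompt": prompt,
        "auto_choice": auto_choice,
        "human_choice": human_choice,
        "agreed": auto_choice == human_choice,
    }
```

Records where `agreed` is false are the interesting ones: they show where the automatic selection diverges from a reviewer and can serve as labeled data for tuning the evaluation function.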
Future Improvements
- Web-based frontend for prompt submission
- Semantic evaluation using fine-tuned scoring models
- Prompt classification using a lightweight LLM
- GCP alerting and observability integration
- Web UI for human-in-the-loop feedback
License
MIT License
Author
Made by Shrutakeerti with love.