benitomartin/github-issues-multiagent-intelligence
Agentic GitHub Issues Retrieval on Kubernetes
GitHub Issues Multiagent Intelligence
A modular pipeline for searching and analyzing GitHub issues and providing recommendations for issue classification
This project provides an intelligent, multi-agent system for processing, analyzing, and managing GitHub issues at scale. It leverages LLMs, vector databases, and cloud-native Kubernetes infrastructure to automate search, triage, and enrichment of GitHub issues, supporting advanced workflows and integration using LangGraph agents.
The detailed implementation of the project can be found in the following blog.
Overview
- Multi-agent orchestration for issue processing
- Integration with GitHub, PostgreSQL, and Qdrant vector store
- Modular architecture for agents, guardrails, tools, and pipelines
- Infrastructure-as-code with AWS CDK and Kubernetes support
Project Structure
├── LICENSE
├── Makefile
├── README.md
├── SETUP.md
├── alembic.ini
├── aws_cdk_infra
│   ├── README.md
│   ├── app.py
│   ├── aws_eks_rds
│   │   ├── __init__.py
│   │   ├── eks_stack.py
│   │   ├── rds_stack.py
│   │   └── vpc_stack.py
│   ├── requirements.txt
│   └── source.bat
├── docker
│   ├── dev.Dockerfile
│   ├── docker-compose.yml
│   └── prod.Dockerfile
├── env.example
├── kubernetes
│   ├── fastapi-deployment.yaml
│   ├── fastapi-service.yaml
│   ├── iam_policy.json
│   └── test-pod.yaml
├── langgraph.json
├── migrations
│   ├── README
│   ├── env.py
│   ├── script.py.mako
│   └── versions
│       └── 77e4d0a13aa8_create_comments_and_issues_table.py
├── pyproject.toml
├── scripts
│   └── lint-makefile.sh
├── src
│   ├── __init__.py
│   ├── agents
│   │   ├── __init__.py
│   │   ├── agents.py
│   │   ├── graph.py
│   │   └── graph_service.py
│   ├── api
│   │   ├── __init__.py
│   │   └── main.py
│   ├── config
│   │   ├── guardrails.yaml
│   │   └── repos.yaml
│   ├── data_pipeline
│   │   ├── __init__.py
│   │   ├── ingest_embeddings.py
│   │   └── ingest_raw_data.py
│   ├── database
│   │   ├── __init__.py
│   │   ├── drop_tables.py
│   │   ├── init_db.py
│   │   └── session.py
│   ├── models
│   │   ├── __init__.py
│   │   ├── agent_models.py
│   │   ├── api_model.py
│   │   ├── db_models.py
│   │   ├── github_models.py
│   │   ├── guardrails_models.py
│   │   └── repo_models.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── config.py
│   │   ├── error_handler.py
│   │   ├── guardrails.py
│   │   └── promps.py
│   └── vectorstore
│       ├── __init__.py
│       ├── create_collection.py
│       ├── create_index.py
│       ├── delete_collection.py
│       ├── payload_builder.py
│       ├── qdrant_store.py
│       └── qdrant_store_sync.py
├── tests
│   ├── integration
│   │   ├── test_api_process_issue.py
│   │   ├── test_full_graph_output_guardrails.py
│   │   └── test_query_search.py
│   └── unit
│       ├── test_db_ingest_qdrant.py
│       ├── test_input_guardrail_agent.py
│       ├── test_output_guardrail_agent.py
│       └── test_qdrant_collection.py
└── uv.lock
Prerequisites
- Python 3.12+
- uv
- Docker & Docker Compose
- PostgreSQL
- Qdrant
- AWS CLI (for CDK)
- Node.js (for AWS CDK)
- Kubernetes CLI (kubectl)
- OpenAI API Key
- Guardrails AI API Key
- GitHub Token
Installation
Clone the repository
git clone https://github.com/benitomartin/github-issues-multiagent-intelligence.git
cd github-issues-multiagent-intelligence
Configure Environment
uv sync --all-groups
source ./.venv/bin/activate
Copy and configure environment variables
Create two environment files, one for development and one for production:
cp env.example .env.dev
cp env.example .env.prod
Development mode runs on localhost. Production mode uses AWS RDS as the database and AWS EKS with Fargate to run FastAPI.
Usage
Configuration
Follow SETUP.md before running the commands below:
PostgreSQL
Start the database and supporting services, either in development or production mode:
make docker-build APP_ENV=dev
Access Adminer at http://localhost:8080.
Alembic Migrations
Update the database schema:
alembic upgrade head
Repositories
The src/config/repos.yaml file defines which repositories to pull issues from, how many issues to pull, and in which state (open, closed, or all).
- owner: scikit-learn
  repo: scikit-learn
  state: all
  per_page: 100
  max_pages: 1
Guardrails
The src/config/guardrails.yaml file configures the Guardrails checks for jailbreak, toxicity, and secrets detection, including their thresholds and failure actions.
jailbreak:
  threshold: 0.8
  on_fail: "filter"
toxicity:
  threshold: 0.5
  validation_method: "full"
  on_fail: "filter"
secrets:
  on_fail: "filter"
AWS CDK
Install dependencies in a separate virtual environment:
pip install -r aws_cdk_infra/requirements.txt
Deploy infrastructure:
- AWS EKS with Fargate and Load Balancer
- AWS RDS
- VPC
cd aws_cdk_infra
cdk bootstrap
cdk deploy
Testing
Run all tests (unit and integration):
make all-tests
Or run individual test suites.
API
The FastAPI server is defined in src/api/main.py.
Start the API server (example):
uvicorn src.api.main:app --reload
Example request body for the /process-issue endpoint:
{
  "title": "Test Issue",
  "body": "Test Issue"
}
API docs are available at /docs while the server is running.
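The request body above can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the `/process-issue` path used in the curl example of the Kubernetes section and a `POST` with a JSON body. The helper name `build_issue_request` is hypothetical, and the response schema is not documented here, so the sketch only builds the request.

```python
import json
import urllib.request


def build_issue_request(base_url: str, title: str, body: str) -> urllib.request.Request:
    """Build a POST request carrying an issue title and body as JSON.

    The /process-issue path is an assumption taken from the curl example
    in this README, not from the API source.
    """
    payload = json.dumps({"title": title, "body": body}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process-issue",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running locally, passing `build_issue_request("http://127.0.0.1:8000", "Test Issue", "Test Issue")` to `urllib.request.urlopen` sends the request. The address assumes uvicorn's default host and port.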
Kubernetes
Kubernetes manifests are in the kubernetes directory. Once the CDK stack has been deployed, adapt the environment variables to the deployed resources. Sensitive values can be found in AWS Secrets Manager.
Cluster Configuration
Update your cluster configuration and add a new namespace:
aws eks --region <aws-region> update-kubeconfig --name <cluster-name>
kubectl create namespace my-app
Secrets and Configmap
Add the environment variables to the Kubernetes cluster as a ConfigMap and a Secret:
kubectl create configmap app-config \
--from-literal=APP_ENV=prod \
--from-literal=AWS_REGION= \
--from-literal=POSTGRES_DB= \
--from-literal=POSTGRES_PORT= \
--from-literal=ADMINER_PORT= \
--from-literal=ISSUES_TABLE_NAME= \
--from-literal=COMMENTS_TABLE_NAME= \
--from-literal=DENSE_MODEL_NAME= \
--from-literal=SPARSE_MODEL_NAME= \
--from-literal=LEN_EMBEDDINGS= \
--from-literal=COLLECTION_NAME= \
--from-literal=CHUNK_SIZE= \
--from-literal=BATCH_SIZE= \
--from-literal=CONCURRENT_COMMENTS= \
--from-literal=LLM_MODEL_NAME= \
--from-literal=TEMPERATURE= \
--from-literal=REPOS_CONFIG=src/config/repos.yaml \
--from-literal=GUARDRAILS_CONFIG=src/config/guardrails.yaml \
-n my-app
kubectl create secret generic app-secrets \
--from-literal=GH_TOKEN= \
--from-literal=POSTGRES_USER= \
--from-literal=POSTGRES_HOST= \
--from-literal=POSTGRES_PASSWORD= \
--from-literal=QDRANT_API_KEY= \
--from-literal=QDRANT_URL= \
--from-literal=LANGSMITH_API_KEY= \
--from-literal=OPENAI_API_KEY= \
--from-literal=GUARDRAILS_API_KEY= \
--from-literal=SECRET_NAME= \
-n my-app
Production Docker Image
Build the production image and push it to AWS ECR:
aws ecr get-login-password --region <aws-region> | docker login --username AWS --password-stdin <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com
aws ecr create-repository --repository-name fastapi-app --region <aws-region>
docker tag myapp-prod-image:latest <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/fastapi-app:latest
docker push <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/fastapi-app:latest
Then adapt the image name in the deployment manifest and apply it:
kubectl apply -f kubernetes/fastapi-deployment.yaml
Because the VPC is private, you cannot send requests to the service from your local machine. You can either forward the port, create an EC2 instance in the same network, or add a load balancer to your Kubernetes cluster following these instructions.
Then you can apply the load balancer manifest:
kubectl apply -f kubernetes/fastapi-service.yaml
This exposes an external IP that can be used to make requests:
curl -X POST "http://k8s-myapp-fastapie-96d739e92d-4d28b27c27683b40.elb.eu-central-1.amazonaws.com/process-issue" \
-H "Content-Type: application/json" \
-d '{
"title": "Test Issue",
"body": "Test Issue"
}'
License
This project is licensed under the MIT License. See the LICENSE file.

