pree-dew/protegi

ProTeGi: Prompt Optimization with Textual Gradients

A Python library for automatically improving prompts using LLM feedback and bandit algorithms.

Quick Start

pip install -r requirements.txt
export ANTHROPIC_API_KEY="your_key"
python intent_classification_example.py

Simple Example

import os
from llm import ProviderFactory, LLMConfig
from evaluation import ClassificationDataset, DatasetItem, PromptEvaluator
from optimization import GradientGenerator, PromptEditor, BanditBeamSearch, BanditBeamConfig

# Setup
api_key = os.getenv("ANTHROPIC_API_KEY")
config = LLMConfig(model="claude-3-5-haiku-20241022")
provider = ProviderFactory.create("anthropic", api_key=api_key, config=config)

# Create dataset
dataset = ClassificationDataset("intents", [
    DatasetItem("I want my money back", "refund"),
    DatasetItem("The app keeps crashing", "technical_support"),
    DatasetItem("What are your shipping options?", "general_inquiry"),
])

# Optimize prompt
evaluator = PromptEvaluator(provider)
generator = GradientGenerator(provider)
editor = PromptEditor(provider)
# Use a separate name so the LLMConfig above is not shadowed
search_config = BanditBeamConfig(beam_width=2, num_iterations=2)

protegi = BanditBeamSearch(evaluator, generator, editor, search_config)
best = protegi.optimize("What is the customer asking for?", dataset)

print(f"Optimized prompt: {best.prompt}")
print(f"F1 score: {best.mean_score:.3f}")
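
The example reports an F1 score, but this README does not say how it is averaged across classes. As a rough illustration only, the sketch below computes a macro-averaged F1 in plain Python. The function name `macro_f1` and the choice of macro averaging are assumptions made here, not the library's confirmed behavior.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list.

    Hypothetical helper for illustration; PromptEvaluator may compute F1 differently.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


y_true = ["refund", "technical_support", "general_inquiry"]
y_pred = ["refund", "general_inquiry", "general_inquiry"]
print(f"Macro F1: {macro_f1(y_true, y_pred):.3f}")
```

Under this definition, a label that is predicted but never correct contributes an F1 of zero to the average.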

Example Output

python intent_classification_example.py
Initial prompt: 'What is the customer asking for?'
Initial F1 score: 0.062
Initial accuracy: 0.167

Optimizing prompt...

🔄 Iteration 1/2
   Current beam: 1 candidates
   Best score: 0.062
   Generated: 2 new variants
   Pruned to: 2 candidates
   New best: 0.350

🔄 Iteration 2/2
   Current beam: 2 candidates
   Best score: 0.350
   Generated: 4 new variants
   Pruned to: 2 candidates
   New best: 0.760

Optimized prompt: 'Identify the customer's request by selecting from these predefined categories: [refund, technical_support, login_assistance, billing, general_inquiry]. If the request does not clearly match the first four categories and involves product information, availability, or open-ended questions, use 'general_inquiry'. Respond with the EXACT matching category name.'
Optimized F1 score: 0.760
Improvement: 1135.0%
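
The log above follows an expand-then-prune pattern: each prompt in the beam produces new variants, and the pool is cut back to `beam_width` candidates. The sketch below mimics that loop with toy stand-ins. `beam_search`, `toy_expand`, and `toy_score` are hypothetical names, every candidate is scored exhaustively instead of using a bandit selection step, and keeping parent prompts in the pool is an assumption. In the library, expansion and scoring would presumably be handled by `GradientGenerator`, `PromptEditor`, and `PromptEvaluator`.

```python
from typing import Callable, List, Tuple


def beam_search(
    initial: str,
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    beam_width: int = 2,
    num_iterations: int = 2,
) -> Tuple[str, float]:
    """Simplified beam search over prompts (no bandit step; illustration only)."""
    beam = [initial]
    for step in range(1, num_iterations + 1):
        # Expand every prompt currently in the beam.
        variants = [v for prompt in beam for v in expand(prompt)]
        # Assumption: parents stay in the pool, so the best score never drops.
        pool = list(dict.fromkeys(beam + variants))  # de-duplicate, keep order
        # Prune to the top beam_width candidates by score.
        beam = sorted(pool, key=score, reverse=True)[:beam_width]
        print(f"Iteration {step}/{num_iterations}: "
              f"generated {len(variants)}, kept {len(beam)}, "
              f"best {score(beam[0]):.3f}")
    return beam[0], score(beam[0])


def toy_expand(prompt: str) -> List[str]:
    # Stand-in for LLM-driven edits: append one of two fixed instructions.
    return [prompt + " Be specific.", prompt + " Answer with one label."]


def toy_score(prompt: str) -> float:
    # Stand-in for dataset evaluation: reward two keywords.
    return 0.1 + 0.4 * ("label" in prompt) + 0.2 * ("specific" in prompt)


best_prompt, best_score = beam_search(
    "What is the customer asking for?", toy_expand, toy_score
)
print(f"Best prompt: {best_prompt!r}")
```

With `beam_width=2` and two variants per prompt, the candidate counts match the log: 2 variants from a single-prompt beam in the first iteration, then 4 variants from a two-prompt beam in the second.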
