pree-dew/protegi
ProTeGi: Prompt Optimization With Textual Gradient
A Python library for automatically improving prompts using LLM feedback and bandit algorithms.
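The general idea of pairing beam search with a bandit can be sketched without any LLM calls. The block below is a toy illustration, not this library's implementation: `toy_expand` and `toy_score` are hypothetical stand-ins for "criticize and edit the prompt" and "score the prompt on one example", and UCB1 is assumed as the bandit rule because it is a common choice. Only `beam_width` and `num_iterations` mirror names used in this README.

```python
import math
import random

# Hypothetical edits a real system would obtain from LLM feedback.
HINTS = ["List the allowed categories.", "Answer with the exact category name."]
EXAMPLES = ["I want my money back", "The app keeps crashing",
            "What are your shipping options?"]


def toy_expand(prompt):
    """Stand-in for gradient + edit: append each hint the prompt lacks."""
    return [f"{prompt} {hint}" for hint in HINTS if hint not in prompt]


def toy_score(prompt, example):
    """Stand-in for evaluating one example: fraction of hints present."""
    return sum(hint in prompt for hint in HINTS) / len(HINTS)


def ucb_select(candidates, score_fn, examples, keep, rounds=20, batch=2, c=1.0):
    """Keep the `keep` best candidates, estimated with UCB1 on minibatches."""
    rng = random.Random(0)
    pulls = {cand: 0 for cand in candidates}
    total = {cand: 0.0 for cand in candidates}

    def mean(cand):
        return total[cand] / pulls[cand] if pulls[cand] else 0.0

    for t in range(1, rounds + 1):
        def upper_bound(cand):
            # Untested candidates are tried first.
            if pulls[cand] == 0:
                return float("inf")
            return mean(cand) + c * math.sqrt(math.log(t) / pulls[cand])

        arm = max(candidates, key=upper_bound)
        sample = rng.sample(examples, min(batch, len(examples)))
        total[arm] += sum(score_fn(arm, ex) for ex in sample) / len(sample)
        pulls[arm] += 1

    return sorted(candidates, key=mean, reverse=True)[:keep]


def beam_search(initial, expand_fn, score_fn, examples,
                beam_width=2, num_iterations=2):
    """Expand every prompt in the beam, then prune back to `beam_width`."""
    beam = [initial]
    for _ in range(num_iterations):
        variants = [v for prompt in beam for v in expand_fn(prompt)]
        pool = list(dict.fromkeys(beam + variants))  # dedupe, keep order
        beam = ucb_select(pool, score_fn, examples, keep=beam_width)
    return beam[0]


best = beam_search("What is the customer asking for?",
                   toy_expand, toy_score, EXAMPLES)
print(best)
```

The bandit step matters when scoring is expensive: rather than evaluating every candidate on the full dataset, it spends most evaluations on the candidates that look promising.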
Quick Start
pip install -r requirements.txt
export ANTHROPIC_API_KEY="your_key"
python intent_classification_example.py
Simple Example
import os
from llm import ProviderFactory, LLMConfig
from evaluation import ClassificationDataset, DatasetItem, PromptEvaluator
from optimization import GradientGenerator, PromptEditor, BanditBeamSearch, BanditBeamConfig
# Setup
api_key = os.getenv("ANTHROPIC_API_KEY")
config = LLMConfig(model="claude-3-5-haiku-20241022")
provider = ProviderFactory.create("anthropic", api_key=api_key, config=config)
# Create dataset
dataset = ClassificationDataset("intents", [
    DatasetItem("I want my money back", "refund"),
    DatasetItem("The app keeps crashing", "technical_support"),
    DatasetItem("What are your shipping options?", "general_inquiry"),
])
# Optimize prompt
evaluator = PromptEvaluator(provider)
generator = GradientGenerator(provider)
editor = PromptEditor(provider)
beam_config = BanditBeamConfig(beam_width=2, num_iterations=2)  # separate name so the LLMConfig above is not shadowed
protegi = BanditBeamSearch(evaluator, generator, editor, beam_config)
best = protegi.optimize("What is the customer asking for?", dataset)
print(f"Optimized prompt: {best.prompt}")
print(f"F1 score: {best.mean_score:.3f}")
Example Output
python intent_classification_example.py
Initial prompt: 'What is the customer asking for?'
Initial F1 score: 0.062
Initial accuracy: 0.167
Optimizing prompt...
Iteration 1/2
Current beam: 1 candidates
Best score: 0.062
Generated: 2 new variants
Pruned to: 2 candidates
New best: 0.350
Iteration 2/2
Current beam: 2 candidates
Best score: 0.350
Generated: 4 new variants
Pruned to: 2 candidates
New best: 0.760
Optimized prompt: 'Identify the customer's request by selecting from these predefined categories: [refund, technical_support, login_assistance, billing, general_inquiry]. If the request does not clearly match the first four categories and involves product information, availability, or open-ended questions, use 'general_inquiry'. Respond with the EXACT matching category name.'
Optimized F1 score: 0.760
Improvement: 1135.0%
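The improvement figure reads as a relative gain over the initial score. A minimal sketch of that calculation follows; `relative_improvement` is a hypothetical helper, not a function from this library. Applied to the rounded scores printed above it gives about 1125.8%, so the logged 1135.0% was presumably computed from the unrounded scores (an assumption; the README does not say).

```python
def relative_improvement(initial: float, final: float) -> float:
    """Percentage gain of `final` over `initial`."""
    if initial == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return (final - initial) / initial * 100.0


# Rounded scores taken from the example output above.
gain = relative_improvement(0.062, 0.760)
print(f"Improvement from rounded scores: {gain:.1f}%")
```

With a baseline this close to zero, small rounding differences in the initial score shift the percentage noticeably, so the absolute F1 values are the more reliable thing to compare.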