# Backprop Bandits: The Gradient Arena

Backprop Bandits is a terminal-based strategy game where you train, battle, and optimize miniature GPT models. Step into the arena, manage your compute "energy", and see if you can build the most efficient learner!
## 🚀 Features
- Train Micro-GPTs: Train language models from scratch on character-level data.
- Battle Mode: Pit agents against each other to see who learns faster.
- Efficiency Tracking: Balance model performance (perplexity) against computational cost (FLOPs/Energy).
- Visualization: View real-time loss curves and gradient distributions.
- Persistence: Save and load your champion agents.
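To make the efficiency mechanic concrete, here is a minimal sketch of the two quantities involved. The function names `perplexity` and `efficiency` are illustrative, not the game's actual API: perplexity is the exponential of the mean cross-entropy loss, and efficiency divides the score earned by the energy spent.

```python
import math

def perplexity(mean_loss: float) -> float:
    """Perplexity is exp of the mean cross-entropy loss (in nats)."""
    return math.exp(mean_loss)

def efficiency(score: float, energy: float) -> float:
    """Illustrative efficiency metric: score earned per unit of energy spent."""
    return score / energy

# A lower loss means lower perplexity (a better model):
print(perplexity(2.0))  # ≈ 7.39
```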
## 📦 Installation
1. Clone the repository:

   ```bash
   git clone https://github.com/pronzzz/backdrop-bandits.git
   cd backdrop-bandits
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

   (Note: the game relies on `matplotlib` for visualization and otherwise uses only standard libraries.)
## 🎮 Walkthrough & Usage
Start the game:

```bash
python3 -m src.game
```

You will enter the `(bandit)` command shell.
### Core Commands
#### 1. Training & Generation
- `create_agent [name] [n_embd] [n_layer] [n_head]`: Create a custom agent.
  - Example: `create_agent Tiny 8 1 2`
- `switch_agent [name]`: Select which agent to control.
- `train [steps]`: Train the active agent on the dataset.
  - Example: `train 100` (watch the loss go down!)
- `generate [n]`: Generate `n` names and see the game score.
- [NEW] `save [filename]`: Save the current agent state to disk.
- [NEW] `load [filename]`: Load an agent from disk.
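A save/load pair like this is commonly implemented with `pickle`. The sketch below is a hypothetical illustration; `save_agent` and `load_agent` are invented names, and the game's actual serialization format is not specified here.

```python
import pickle

def save_agent(agent_state: dict, filename: str) -> None:
    """Serialize an agent's state (parameters, stats, etc.) to disk."""
    with open(filename, "wb") as f:
        pickle.dump(agent_state, f)

def load_agent(filename: str) -> dict:
    """Restore an agent's state from disk."""
    with open(filename, "rb") as f:
        return pickle.load(f)
```

As with any `pickle`-based persistence, saved files should only be loaded from trusted sources.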
#### 2. Visualization & Diagnostics
- `visualize loss`: Plot the training loss curve.
- `visualize gradients`: Plot the gradient distribution (requires `matplotlib`).
- `diagnose`: Check model health for exploding gradients or dead neurons.
#### 3. Competition
- `battle [agent1] [agent2] [steps]`: Pit two agents against each other. Both train for `steps`, then generate names; the one with the better balance of perplexity and uniqueness wins.
- `leaderboard`: Check which agent is the most efficient (Score / Energy).
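The uniqueness half of the battle score can be sketched as the fraction of distinct names an agent generates. The `uniqueness` helper below is an illustrative assumption, not the game's actual code:

```python
def uniqueness(names: list[str]) -> float:
    """Fraction of generated names that are distinct (0.0 for an empty batch)."""
    return len(set(names)) / len(names) if names else 0.0

# Two distinct names out of three generated:
print(uniqueness(["ana", "bo", "ana"]))  # ≈ 0.667
```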
## 📈 Strategies
- Efficiency Mode: Large models learn faster but cost massive "Energy". Small models are efficient but might underfit. Find the balance!
- Stability: If you set the learning rate too high (e.g. `config learning_rate 10.0`), your gradients might explode. Use `diagnose` to check.
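A `diagnose`-style health check can be sketched by inspecting gradient magnitudes. The `diagnose_gradients` helper and its thresholds below are illustrative assumptions, not the game's actual implementation:

```python
def diagnose_gradients(grads, explode_threshold=100.0, dead_threshold=1e-8):
    """Classify a model's health from its gradient magnitudes.

    Very large magnitudes suggest exploding gradients; magnitudes near
    zero everywhere suggest dead neurons (vanishing gradients).
    """
    max_g = max(abs(g) for g in grads)
    if max_g > explode_threshold:
        return "exploding"
    if max_g < dead_threshold:
        return "dead"
    return "healthy"

print(diagnose_gradients([0.5, 1e6]))  # exploding
```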
## 📁 Project Structure
- `src/game.py`: Main game loop and CLI.
- `src/model.py`: The GPT architecture (based on microgpt).
- `src/engine.py`: Autograd engine (`Value` class).
- `src/agent.py`: Agent wrapper handling training and energy tracking.
- `src/mechanics.py`: Stability and health monitoring system.
- `src/viz.py`: Matplotlib visualization tools.
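To give a feel for the `Value` class at the heart of `src/engine.py`, here is a minimal micrograd-style scalar autograd sketch supporting addition and multiplication. It illustrates the general technique only; the project's actual engine may differ.

```python
class Value:
    """Minimal scalar autograd node: stores a value, its gradient,
    and a closure that backpropagates through the op that made it."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = 1, d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# d(a*b + a)/da = b + 1 = 4, d(a*b + a)/db = a = 2
a, b = Value(2.0), Value(3.0)
y = a * b + a
y.backward()
print(a.grad)  # 4.0
```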
## 🤝 Contributing
Contributions are welcome! Please check out the CONTRIBUTING.md guide for details.
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ by [Your Name] using MicroGPT.