# Backprop Bandits: The Gradient Arena

Backprop Bandits is a terminal-based strategy game where you train, battle, and optimize miniature GPT models. Step into the arena, manage your compute "energy", and see if you can build the most efficient learner!
## 🚀 Features
- Train Micro-GPTs: Train language models from scratch on character-level data.
- Battle Mode: Pit agents against each other to see who learns faster.
- Efficiency Tracking: Balance model performance (perplexity) against computational cost (FLOPs/Energy).
- Visualization: View real-time loss curves and gradient distributions.
- Persistence: Save and load your champion agents.
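To make the efficiency mechanic concrete, here is a minimal sketch of the two quantities involved. The function names `perplexity` and `efficiency` are illustrative, not the game's actual API: perplexity is the exponential of the mean cross-entropy loss, and efficiency divides the score earned by the energy spent.

```python
import math

def perplexity(mean_loss: float) -> float:
    """Perplexity is exp of the mean cross-entropy loss (in nats)."""
    return math.exp(mean_loss)

def efficiency(score: float, energy: float) -> float:
    """Illustrative efficiency metric: score earned per unit of energy spent."""
    return score / energy

# A lower loss means lower perplexity (a better model):
print(perplexity(2.0))  # ≈ 7.39
```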
## 📦 Installation
1. Clone the repository:

   ```bash
   git clone https://github.com/pronzzz/backdrop-bandits.git
   cd backdrop-bandits
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

   (Note: the game relies on `matplotlib` for visualization and otherwise uses only standard libraries.)
## 🎮 Walkthrough & Usage
Start the game:

```bash
python3 -m src.game
```

You will enter the `(bandit)` command shell.
### Core Commands
#### 1. Training & Generation
- `create_agent [name] [n_embd] [n_layer] [n_head]`: Create a custom agent.
  - Example: `create_agent Tiny 8 1 2`
- `switch_agent [name]`: Select which agent to control.
- `train [steps]`: Train the active agent on the dataset.
  - Example: `train 100` (watch the loss go down!)
- `generate [n]`: Generate `n` names and see the game score.
- [NEW] `save [filename]`: Save the current agent state to disk.
- [NEW] `load [filename]`: Load an agent from disk.
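A save/load pair like this is commonly implemented with `pickle`. The sketch below is a hypothetical illustration; `save_agent` and `load_agent` are invented names, and the game's actual serialization format is not specified here.

```python
import pickle

def save_agent(agent_state: dict, filename: str) -> None:
    """Serialize an agent's state (parameters, stats, etc.) to disk."""
    with open(filename, "wb") as f:
        pickle.dump(agent_state, f)

def load_agent(filename: str) -> dict:
    """Restore an agent's state from disk."""
    with open(filename, "rb") as f:
        return pickle.load(f)
```

As with any `pickle`-based persistence, saved files should only be loaded from trusted sources.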
#### 2. Visualization & Diagnostics
- `visualize loss`: Plot the training loss curve.
- `visualize gradients`: Plot the gradient distribution (requires `matplotlib`).
- `diagnose`: Check model health for exploding gradients or dead neurons.
#### 3. Competition
- `battle [agent1] [agent2] [steps]`: Pit two agents against each other. Both train for `steps`, then generate names; the one with the better balance of perplexity and uniqueness wins.
- `leaderboard`: Check which agent is the most efficient (Score / Energy).
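The uniqueness half of the battle score can be sketched as the fraction of distinct names an agent generates. The `uniqueness` helper below is an illustrative assumption, not the game's actual code:

```python
def uniqueness(names: list[str]) -> float:
    """Fraction of generated names that are distinct (0.0 for an empty batch)."""
    return len(set(names)) / len(names) if names else 0.0

# Two distinct names out of three generated:
print(uniqueness(["ana", "bo", "ana"]))  # ≈ 0.667
```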
## 📈 Strategies
- Efficiency Mode: Large models learn faster but cost massive "Energy". Small models are efficient but might underfit. Find the balance!
- Stability: If you set the learning rate too high (e.g. `config learning_rate 10.0`), your gradients might explode. Use `diagnose` to check.
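A `diagnose`-style health check can be sketched by inspecting gradient magnitudes. The `diagnose_gradients` helper and its thresholds below are illustrative assumptions, not the game's actual implementation:

```python
def diagnose_gradients(grads, explode_threshold=100.0, dead_threshold=1e-8):
    """Classify a model's health from its gradient magnitudes.

    Very large magnitudes suggest exploding gradients; magnitudes near
    zero everywhere suggest dead neurons (vanishing gradients).
    """
    max_g = max(abs(g) for g in grads)
    if max_g > explode_threshold:
        return "exploding"
    if max_g < dead_threshold:
        return "dead"
    return "healthy"

print(diagnose_gradients([0.5, 1e6]))  # exploding
```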
## 📁 Project Structure
- `src/game.py`: Main game loop and CLI.
- `src/model.py`: The GPT architecture (based on microgpt).
- `src/engine.py`: Autograd engine (`Value` class).
- `src/agent.py`: Agent wrapper handling training and energy tracking.
- `src/mechanics.py`: Stability and health monitoring system.
- `src/viz.py`: Matplotlib visualization tools.
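To give a feel for the `Value` class at the heart of `src/engine.py`, here is a minimal micrograd-style scalar autograd sketch supporting addition and multiplication. It illustrates the general technique only; the project's actual engine may differ.

```python
class Value:
    """Minimal scalar autograd node: stores a value, its gradient,
    and a closure that backpropagates through the op that made it."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = 1, d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# d(a*b + a)/da = b + 1 = 4, d(a*b + a)/db = a = 2
a, b = Value(2.0), Value(3.0)
y = a * b + a
y.backward()
print(a.grad)  # 4.0
```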
## 🤝 Contributing
Contributions are welcome! Please check out the CONTRIBUTING.md guide for details.
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ by [Your Name] using MicroGPT.