pronzzz/backdrop-bandits

Backprop Bandits is a terminal-based strategy game where players train, battle, and optimize miniature GPT models. It features an interactive CLI for managing compute energy, visualizing real-time loss curves and gradients, and competing to build the most efficient learner from scratch.

Backprop Bandits: The Gradient Arena

Backprop Bandits is a terminal-based game where you train, battle, and optimize miniature GPT models. Step into the arena, manage your compute "energy", and see if you can build the most efficient learner!

🚀 Features

  • Train Micro-GPTs: Train language models from scratch on character-level data.
  • Battle Mode: Pit agents against each other to see who learns faster.
  • Efficiency Tracking: Balance model performance (perplexity) against computational cost (FLOPs/Energy).
  • Visualization: View real-time loss curves and gradient distributions.
  • Persistence: Save and load your champion agents.
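
Efficiency tracking weighs perplexity against compute cost. As a rough illustration of the perplexity side, the sketch below computes it from the probabilities a model assigned to the correct characters. The function name `perplexity` and the sample probabilities are assumptions for illustration, not taken from the game's code.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood) over the predicted tokens."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A model that assigns probability 0.25 to every correct character
# is as uncertain as a uniform guess over 4 options.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])

# Higher probabilities on the correct characters give lower perplexity.
confident = perplexity([0.9, 0.8, 0.95, 0.85])
```

Lower perplexity means the model is less "surprised" by the data, which is why it works as the performance half of an efficiency score.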

📦 Installation

  1. Clone the repository:

    git clone https://github.com/pronzzz/backdrop-bandits.git
    cd backdrop-bandits
  2. Install dependencies:

    pip install -r requirements.txt

    (Note: The game relies on matplotlib for visualization; everything else uses the standard library.)

🎮 Walkthrough & Usage

Start the game:

python3 -m src.game

You will enter the (bandit) command shell.

Core Commands

1. Training & Generation

  • create_agent [name] [n_embd] [n_layer] [n_head]: Create a custom agent.
    • Example: create_agent Tiny 8 1 2
  • switch_agent [name]: Select which agent to control.
  • train [steps]: Train the active agent on the dataset.
    • Example: train 100 (Watch the loss go down!)
  • generate [n]: Generate n names and see the game score.
  • [NEW] save [filename]: Save the current agent state to disk.
  • [NEW] load [filename]: Load an agent from disk.
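
For readers curious how a prompt like (bandit) can be built, here is a minimal sketch using Python's standard cmd module. It is a hypothetical skeleton, not the code in src/game.py: the class name `BanditShell`, the `agents` dictionary, and the choice to make a newly created agent active are all assumptions.

```python
import cmd

class BanditShell(cmd.Cmd):
    """Hypothetical skeleton of a (bandit) shell; not the game's real code."""
    prompt = "(bandit) "

    def __init__(self):
        super().__init__()
        self.agents = {}    # name -> (n_embd, n_layer, n_head)
        self.active = None  # name of the agent currently being controlled

    def do_create_agent(self, arg):
        """create_agent [name] [n_embd] [n_layer] [n_head]"""
        name, n_embd, n_layer, n_head = arg.split()
        self.agents[name] = (int(n_embd), int(n_layer), int(n_head))
        self.active = name  # assumption: a new agent becomes the active one

    def do_switch_agent(self, arg):
        """switch_agent [name]"""
        if arg in self.agents:
            self.active = arg

    def do_quit(self, arg):
        """Leave the shell."""
        return True

shell = BanditShell()
# onecmd() dispatches one line exactly as the interactive loop would.
shell.onecmd("create_agent Tiny 8 1 2")
shell.onecmd("create_agent Wide 16 1 4")
shell.onecmd("switch_agent Tiny")
```

Calling `shell.cmdloop()` instead of `onecmd()` would start the interactive prompt.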

2. Visualization & Diagnostics

  • visualize loss: Plot the training loss curve.
  • visualize gradients: Plot the gradient distribution (requires matplotlib).
  • diagnose: Check model health for exploding gradients or dead neurons.
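
To make the two failure modes concrete, the sketch below shows one way a health check could flag them. It assumes gradients arrive as a flat list and that each neuron has a short history of activations. The function signature and the threshold of 100.0 are illustrative choices, not the behavior of src/mechanics.py.

```python
def diagnose(grads, activations, explode_threshold=100.0):
    """Return warnings for a flat list of gradients and per-neuron
    activation histories (both hypothetical inputs)."""
    warnings = []

    # Exploding gradients: some gradient magnitude is far larger than usual.
    max_grad = max(abs(g) for g in grads)
    if max_grad > explode_threshold:
        warnings.append(f"exploding gradients: max |grad| = {max_grad:.1f}")

    # Dead neurons: a unit whose activation was zero on every input seen.
    dead = [i for i, acts in enumerate(activations) if all(a == 0.0 for a in acts)]
    if dead:
        warnings.append(f"dead neurons: {dead}")

    return warnings

healthy = diagnose([0.1, -0.3], [[0.0, 1.2], [0.5, 0.0]])
unhealthy = diagnose([0.1, -250.0], [[0.0, 0.0], [0.5, 0.0]])
```

A neuron that outputs zero for every input also passes no gradient back (for ReLU-style activations), so it stops learning, which is why it is worth flagging.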

3. Competition

  • battle [agent1] [agent2] [steps]: Pit two agents against each other. Both agents train for the given number of steps, then generate names. The agent with the better balance of perplexity and uniqueness wins.
  • leaderboard: Check which agent is the most efficient (Score / Energy).
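
The Score / Energy ranking can be sketched in a few lines. The `AgentStats` fields and the numbers below are made up for illustration. How the game actually computes score and energy is not shown here.

```python
from dataclasses import dataclass

@dataclass
class AgentStats:
    name: str
    score: float   # game score from generation (higher is better)
    energy: float  # compute spent during training

def leaderboard(agents):
    """Sort agents by efficiency = score / energy, best first."""
    return sorted(agents, key=lambda a: a.score / a.energy, reverse=True)

ranking = leaderboard([
    AgentStats("Big", score=90.0, energy=60.0),   # efficiency 1.5
    AgentStats("Tiny", score=40.0, energy=10.0),  # efficiency 4.0
])

for rank, agent in enumerate(ranking, start=1):
    print(rank, agent.name, round(agent.score / agent.energy, 2))
```

In this made-up example the smaller agent ranks first despite its lower raw score, because it spent far less energy to get there.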

๐Ÿ† Strategies

  • Efficiency Mode: Large models learn faster but cost massive "Energy". Small models are efficient but might underfit. Find the balance!
  • Stability: If you set learning_rate too high (for example, with config learning_rate 10.0), your gradients might explode. Use diagnose to check.
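
The stability point can be seen on a toy problem. The sketch below runs plain gradient descent on f(w) = w**2 rather than on a GPT, so it only illustrates the mechanism. The `descend` helper is hypothetical.

```python
def descend(lr, steps=5, w=1.0):
    """Run gradient descent on f(w) = w**2 and return the final |w|."""
    for _ in range(steps):
        grad = 2.0 * w  # derivative of w**2
        w -= lr * grad  # each step multiplies w by (1 - 2*lr)
    return abs(w)

stable = descend(lr=0.1)     # |1 - 0.2| = 0.8 < 1, so w shrinks toward 0
unstable = descend(lr=10.0)  # |1 - 20| = 19 > 1, so |w| grows 19x per step
```

With a learning rate of 10.0 every update overshoots the minimum by more than it corrects, so the parameter and its gradient grow without bound.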

📂 Project Structure

  • src/game.py: Main game loop and CLI.
  • src/model.py: The GPT architecture (based on microgpt).
  • src/engine.py: Autograd engine (Value class).
  • src/agent.py: Agent wrapper handling training and energy tracking.
  • src/mechanics.py: Stability and health monitoring system.
  • src/viz.py: Matplotlib visualization tools.
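
For orientation, a scalar autograd engine built around a Value class usually looks something like the sketch below. This is a minimal illustration of the general technique with only addition and multiplication. The attribute names are assumptions and the real src/engine.py may differ.

```python
class Value:
    """Scalar that records how it was computed so gradients can flow back."""

    def __init__(self, data, children=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._local_grads = local_grads  # d(self)/d(child) for each child

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(v):
            if v not in seen:
                seen.add(v)
                for child in v._children:
                    visit(child)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for child, local in zip(v._children, v._local_grads):
                child.grad += local * v.grad

a, b = Value(2.0), Value(3.0)
loss = a * b + a  # d(loss)/da = b + 1 = 4, d(loss)/db = a = 2
loss.backward()
```

After `loss.backward()`, `a.grad` and `b.grad` hold the derivatives of `loss` with respect to each input, which is what a training step uses to update parameters.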

๐Ÿค Contributing

Contributions are welcome! Please check out the CONTRIBUTING.md guide for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❤️ by [Your Name] using MicroGPT.