KohakuBlueleaf/KohakuEngine
KohakuEngine
All-in-Python Configuration and Execution Engine for R&D Workloads
KohakuEngine bridges the gap between quick prototyping with global variables and production-ready configuration systems. Write configs as pure Python - no YAML, no JSON, just Python variables.
How It Works
KohakuEngine takes your existing Python scripts and runs them with different configurations without modifying your code:
- Import your script as a Python module
- Inject global variables from your config
- Find and call your entrypoint (the function in if __name__ == "__main__")
Your script runs exactly as if you executed it directly, but with different configuration values!
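The three steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not KohakuEngine's actual implementation: the helper names find_entrypoint and run_with_globals are hypothetical, and the entrypoint is taken to be the first plain function call inside the __main__ guard.

```python
import ast
import importlib.util
import tempfile
from pathlib import Path

# A stand-in script, written to a temporary file so the sketch is self-contained.
SCRIPT_SOURCE = '''
learning_rate = 0.001
epochs = 10

def train():
    return f"LR={learning_rate}, Epochs={epochs}"

if __name__ == "__main__":
    train()
'''


def find_entrypoint(source):
    """Return the name of the first function called in the __main__ guard, or None."""
    for node in ast.parse(source).body:
        if not isinstance(node, ast.If):
            continue
        test = node.test
        is_main_guard = (
            isinstance(test, ast.Compare)
            and isinstance(test.left, ast.Name)
            and test.left.id == "__name__"
        )
        if not is_main_guard:
            continue
        for stmt in node.body:
            if (
                isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.value.func, ast.Name)
            ):
                return stmt.value.func.id
    return None


def run_with_globals(path, overrides):
    """Import a script, override its module-level variables, call its entrypoint."""
    source = Path(path).read_text()
    # 1. Import the script as a module. Its __name__ is not "__main__",
    #    so the guarded block does not run during import.
    spec = importlib.util.spec_from_file_location("user_script", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # 2. Inject config values over the script's module-level defaults.
    for name, value in overrides.items():
        setattr(module, name, value)
    # 3. Find the entrypoint named in the __main__ guard and call it.
    entrypoint = find_entrypoint(source)
    if entrypoint is None:
        raise RuntimeError("no entrypoint found in the __main__ guard")
    return getattr(module, entrypoint)()


with tempfile.TemporaryDirectory() as tmp:
    script_path = Path(tmp) / "train.py"
    script_path.write_text(SCRIPT_SOURCE)
    result = run_with_globals(script_path, {"learning_rate": 0.01, "epochs": 5})

print(result)
```

Because functions look up module-level names at call time, overriding the attributes after import is enough for the entrypoint to see the new values.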
Features
- Python-First Configs: Just define variables - from_globals() captures them automatically
- Include Anything: Use use() to include functions, classes, objects in your config
- Iterative Workflows: Generator-based configs for hyperparameter sweeps
- Non-Invasive: Works with existing scripts - no refactoring required
- Parallel Execution: Run multiple configs in parallel with subprocess isolation
Quick Start
Installation
pip install -e .
Complete Example
Step 1: Your Script (No Changes Needed!)
File: train.py
# Global variables - will be overridden by config
learning_rate = 0.001
batch_size = 32
epochs = 10
def train():
    print(f"Training: LR={learning_rate}, BS={batch_size}, Epochs={epochs}")
    for epoch in range(epochs):
        loss = 1.0 / (epoch + 1)
        print(f"Epoch {epoch+1}/{epochs} - Loss: {loss:.4f}")
    return {"final_loss": loss}
if __name__ == "__main__":
    train()
Step 2: Create a Config
File: config.py - Just define your variables!
from kohakuengine import Config
# Define config as normal Python variables
learning_rate = 0.01
batch_size = 64
epochs = 5
def config_gen():
    return Config.from_globals()  # Automatically captures all variables!
That's it! No manual dict building. from_globals() captures the variables defined above for you.
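To make the capture step less magical, here is a rough sketch of how variables could be collected from a module namespace. It is an assumption-laden illustration, not the library's code: capture_module_globals is a hypothetical helper, it takes the namespace explicitly rather than inspecting the caller, and the filtering rules (skip private names, modules, classes, and functions) are a guess modeled on the documented behavior that from_globals() skips functions and classes by default.

```python
import inspect
import types


def capture_module_globals(namespace):
    """Return the plain-data variables found in a module namespace (hypothetical helper)."""
    captured = {}
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # skip dunder and private names such as __builtins__
        if isinstance(value, types.ModuleType):
            continue  # skip imported modules
        if inspect.isclass(value) or inspect.isroutine(value):
            continue  # skip classes and functions
        captured[name] = value
    return captured


# Simulate a config file by executing its source into a fresh namespace.
CONFIG_SOURCE = '''
import math

learning_rate = 0.01
batch_size = 64
epochs = 5

def config_gen():
    pass
'''

namespace = {}
exec(CONFIG_SOURCE, namespace)
captured = capture_module_globals(namespace)
print(captured)
```

Only the three plain variables survive the filter; the imported module and config_gen itself are left out.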
Step 3: Run It!
Python API:
from kohakuengine import Config, Script
config = Config.from_file('config.py')
script = Script('train.py', config=config)
result = script.run()
Command Line:
kogine run train.py --config config.py
Output:
Training: LR=0.01, BS=64, Epochs=5
Epoch 1/5 - Loss: 1.0000
Epoch 2/5 - Loss: 0.5000
Epoch 3/5 - Loss: 0.3333
Epoch 4/5 - Loss: 0.2500
Epoch 5/5 - Loss: 0.2000
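Step 3 depends on the config file exposing a config_gen() function. As a hedged sketch of how such a file might be loaded, the snippet below imports a config file and normalizes the result to a list, treating a generator as one config per run. The load_configs helper is hypothetical, and plain dicts stand in for Config objects so the example runs without the package installed.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path


def load_configs(path):
    """Import a config file and return the configs produced by its config_gen()."""
    spec = importlib.util.spec_from_file_location("user_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    result = module.config_gen()
    # A generator yields one config per run; a plain return value is a single run.
    if inspect.isgenerator(result):
        return list(result)
    return [result]


# Stand-in config files (dicts replace Config objects in this sketch).
SINGLE = '''
learning_rate = 0.01
batch_size = 64

def config_gen():
    return {"learning_rate": learning_rate, "batch_size": batch_size}
'''

SWEEP = '''
def config_gen():
    for lr in [0.001, 0.01, 0.1]:
        yield {"learning_rate": lr}
'''

with tempfile.TemporaryDirectory() as tmp:
    single_path = Path(tmp) / "config.py"
    single_path.write_text(SINGLE)
    sweep_path = Path(tmp) / "sweep_config.py"
    sweep_path.write_text(SWEEP)
    single = load_configs(single_path)
    sweep = load_configs(sweep_path)

print(single)
print(len(sweep))
```

Normalizing to a list lets the caller treat a single config and a sweep the same way: run the script once per entry.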
Config Methods
from_globals() - Auto-capture Variables (Recommended)
The simplest way - just define variables and call from_globals():
from kohakuengine import Config
learning_rate = 0.01
batch_size = 64
epochs = 5
def config_gen():
    return Config.from_globals()
use() - Include Functions/Classes
By default, from_globals() skips functions and classes. Use use() to include them:
from kohakuengine import Config, use
import torch
learning_rate = 0.01
batch_size = 64
# Wrap functions/classes to include them
optimizer = use(torch.optim.Adam)
model_class = use(MyModel)  # MyModel: your own model class, defined or imported elsewhere
loss_fn = use(lambda x, y: (x - y).pow(2).mean())
def config_gen():
    return Config.from_globals()
capture_globals() - Context Manager
Capture everything defined within a block (including modules, functions, etc.):
from kohakuengine import capture_globals, Config
with capture_globals() as ctx:
    import numpy as np
    learning_rate = 0.01
    batch_size = 64
def config_gen():
    return Config.from_context(ctx)
Generator Configs - Hyperparameter Sweeps
Use generators to yield multiple configs:
from kohakuengine import Config
def config_gen():
    for lr in [0.001, 0.01, 0.1]:
        for bs in [16, 32, 64]:
            yield Config(globals_dict={
                'learning_rate': lr,
                'batch_size': bs
            })
Workflows
Sequential Execution
from kohakuengine import Config, Script, Flow
# Define one config per stage with explicit globals_dict values
preprocess_config = Config(globals_dict={'input': 'data.csv', 'output': 'processed.csv'})
train_config = Config(globals_dict={'data': 'processed.csv', 'epochs': 50})
eval_config = Config(globals_dict={'model': 'model.pt'})
scripts = [
    Script('preprocess.py', config=preprocess_config),
    Script('train.py', config=train_config),
    Script('evaluate.py', config=eval_config),
]
flow = Flow(scripts, mode='sequential')
results = flow.run()
Parallel Execution
Run multiple scripts or configs in parallel:
from kohakuengine import Config, Script, Flow
# Same script with different configs
scripts = [
    Script('train.py', config=Config(globals_dict={'learning_rate': 0.001})),
    Script('train.py', config=Config(globals_dict={'learning_rate': 0.01})),
    Script('train.py', config=Config(globals_dict={'learning_rate': 0.1})),
]
flow = Flow(scripts, mode='parallel', max_workers=3)
results = flow.run()
CLI Reference
# Run single script
kogine run script.py --config config.py
# Sequential workflow
kogine workflow sequential script1.py script2.py --config config.py
# Parallel execution
kogine workflow parallel script.py --config sweep_config.py --workers 4
Advanced: Manual Config Dict
For explicit control, you can still use the dict-based approach:
from kohakuengine import Config
def config_gen():
    return Config(
        globals_dict={
            'learning_rate': 0.01,
            'batch_size': 64,
        },
        args=[],      # Positional args for entrypoint
        kwargs={},    # Keyword args for entrypoint
        metadata={}   # Optional tracking metadata
    )
Documentation
- API.md - Complete API reference
- GOAL.md - Project vision and objectives
- PLAN.md - Technical architecture and design
- TODO.md - Implementation status and roadmap
License
Apache-2.0