aliyanz85/crisis-sim
AI-powered emergency response simulation combining Mesa ABM with advanced LLM reasoning strategies (ReAct, Reflexion, Plan-Execute) for optimized crisis management and multi-agent coordination
CrisisSim: AI-Powered Emergency Response Simulation
Advanced multi-agent simulation platform combining Mesa ABM with LLM-driven reasoning strategies for optimized crisis response planning
Overview
CrisisSim is an emergency response simulation that models crisis scenarios with LLM-driven agents. It integrates several Large Language Model (LLM) reasoning strategies (ReAct, Reflexion, Plan-Execute, Chain-of-Thought, and Tree-of-Thought) to plan rescue operations, resource allocation, and emergency response coordination.
Key Features
Multi-Strategy AI Planning
- ReAct: Reasoning and Acting in iterative cycles
- Reflexion: Self-reflection and memory-driven improvements
- Plan-Execute: Hierarchical planning with tactical execution
- Chain-of-Thought (CoT): Sequential reasoning chains
- Tree-of-Thought (ToT): Branched reasoning exploration
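To make the first strategy in this list concrete, here is a minimal sketch of a ReAct-style loop: the agent alternates between a reasoning step (a "thought") and an action, then feeds the resulting observation back in. This is not the code in `reasoning/react.py`; `mock_llm`, `make_toy_env`, and `react_loop` are hypothetical names, and the keyword-matching "LLM" is only a stand-in so the example runs without API keys.

```python
def mock_llm(observation: str) -> tuple[str, str]:
    """Hypothetical stand-in for an LLM call: returns (thought, action)."""
    if "fire" in observation:
        return ("A fire is reported; extinguish it first.", "extinguish")
    if "survivor" in observation:
        return ("A survivor is waiting; pick them up.", "rescue")
    return ("Nothing left to do.", "stop")


def make_toy_env(events):
    """Return (initial observation, step function) over a queue of pending events."""
    pending = list(events)

    def observe():
        return pending[0] if pending else "all clear"

    def step(action):
        # In this toy world, any action resolves the event at the head of the queue.
        if pending:
            pending.pop(0)
        return observe()

    return observe(), step


def react_loop(llm, env_step, observation, max_steps=5):
    """Alternate reasoning and acting until the LLM answers 'stop' or steps run out."""
    trace = []
    for _ in range(max_steps):
        thought, action = llm(observation)   # reason about the current observation
        trace.append((thought, action))
        if action == "stop":
            break
        observation = env_step(action)       # act, then observe the new state
    return trace


obs, step = make_toy_env(["fire at (3, 4)", "survivor at (1, 2)"])
trace = react_loop(mock_llm, step, obs)
actions = [action for _, action in trace]
print(actions)
```

The other strategies would differ mainly in what happens between observation and action: for example, a Reflexion-style variant would also carry notes from earlier failed attempts into the prompt.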
Realistic Crisis Environment
- Dynamic Fire Spread: Realistic fire propagation mechanics
- Aftershock Events: Earthquake aftermath simulations
- Resource Constraints: Battery, water, and tool limitations
- Hospital Triage: FIFO and priority-based patient management
- Multi-Agent Coordination: Drones, medics, and trucks working together
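As an illustration of what dynamic fire spread on a grid can look like, the sketch below uses a simple probabilistic rule: on each tick, every burning cell may ignite each of its four neighbours with a fixed probability. The rule, the function name `spread_fire`, and the parameter `p_spread` are assumptions for illustration; the project's actual propagation mechanics may differ.

```python
import random


def spread_fire(burning, width, height, p_spread, rng):
    """One tick: each burning cell ignites each in-bounds 4-neighbour
    with probability p_spread. Returns the new set of burning cells."""
    new_burning = set(burning)
    # Sort so the sequence of random draws is reproducible for a given seed.
    for x, y in sorted(burning):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            in_bounds = 0 <= nx < width and 0 <= ny < height
            if in_bounds and rng.random() < p_spread:
                new_burning.add((nx, ny))
    return new_burning


rng = random.Random(42)          # seeded generator, so runs are repeatable
fires = {(2, 2)}
for tick in range(3):
    fires = spread_fire(fires, width=5, height=5, p_spread=0.3, rng=rng)
    print(tick, len(fires))
```

Passing the random generator in explicitly keeps runs reproducible per seed, which matters when strategies are compared across multiple seeds.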
Comprehensive Evaluation Framework
- Performance Metrics: Rescue efficiency, response time, resource utilization
- Batch Evaluation: Multi-seed statistical analysis
- Visualization: Real-time web UI and detailed performance plots
- Comparative Analysis: Strategy performance across different scenarios
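Multi-seed evaluation usually boils down to grouping per-run results by strategy and reporting a mean and spread. The sketch below shows that aggregation with the standard library only. The record layout and the numbers are made-up placeholders for illustration, not results from CrisisSim, and `summarize` is a hypothetical helper rather than part of `eval/harness.py`.

```python
import statistics
from collections import defaultdict

# Placeholder records: one dict per (strategy, seed) run. Values are invented.
runs = [
    {"strategy": "react", "seed": 0, "rescued": 10},
    {"strategy": "react", "seed": 1, "rescued": 12},
    {"strategy": "react", "seed": 2, "rescued": 14},
    {"strategy": "reflexion", "seed": 0, "rescued": 11},
    {"strategy": "reflexion", "seed": 1, "rescued": 13},
]


def summarize(records, metric):
    """Group runs by strategy and report mean, sample stdev, and run count."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["strategy"]].append(record[metric])
    summary = {}
    for strategy, values in grouped.items():
        summary[strategy] = {
            "mean": statistics.mean(values),
            # Sample standard deviation needs at least two runs.
            "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
            "n": len(values),
        }
    return summary


summary = summarize(runs, "rescued")
for strategy, stats in summary.items():
    print(strategy, stats)
```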
Quick Start
Installation
# Clone the repository
git clone https://github.com/aliyanz85/crisis-sim.git
cd crisis-sim
# Setup environment
python -m venv .venv && source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
Basic Usage
Interactive Web Interface
python server.py
# Open http://127.0.0.1:8522 in your browser
Command Line Simulation
# Quick demo with mock LLM (no API keys required)
python main.py --map configs/map_small.yaml --provider mock --strategy react --seed 42 --ticks 150
# Advanced run with Groq API
export LLM_PROVIDER=groq
export GROQ_API_KEY=your_api_key_here
python main.py --map configs/map_medium.yaml --provider groq --strategy plan_execute --ticks 200
Batch Evaluation & Analysis
# Run comprehensive evaluation
python eval/harness.py --n_seeds 5 --maps configs/map_small.yaml configs/map_medium.yaml configs/map_hard.yaml --strategies react reflexion plan_execute --ticks 200
# Generate performance plots
python eval/plots.py --summary results/agg/summary.csv --out results/plots
Architecture
CrisisSim/
├── reasoning/            # LLM strategy implementations
│   ├── react.py          # ReAct reasoning loops
│   ├── reflexion.py      # Memory-driven self-improvement
│   ├── plan_execute.py   # Hierarchical planning
│   ├── cot.py            # Chain-of-thought reasoning
│   └── tot.py            # Tree-of-thought exploration
├── env/                  # Simulation environment
│   ├── world.py          # Mesa model & crisis dynamics
│   ├── agents.py         # Agent behaviors (drones, medics, trucks)
│   ├── dynamics.py       # Fire spread, aftershocks
│   └── sensors.py        # State observation system
├── tools/                # Agent capabilities
│   ├── hospital.py       # Medical facility management
│   ├── resources.py      # Resource tracking & constraints
│   └── routing.py        # Pathfinding & navigation
├── eval/                 # Performance evaluation
│   ├── harness.py        # Batch experiment runner
│   └── plots.py          # Visualization generation
└── configs/              # Scenario configurations
    ├── map_small.yaml    # Training scenarios
    ├── map_medium.yaml   # Standard benchmarks
    └── map_hard.yaml     # Challenge scenarios
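The layout above places medical facility management in `tools/hospital.py`, and the feature list mentions both FIFO and priority-based triage. The sketch below contrasts the two queueing policies. It is an illustration only: `FifoTriage`, `PriorityTriage`, and the integer severity scale (higher means more urgent) are assumptions, not the module's actual interface.

```python
import heapq
from collections import deque
from itertools import count


class FifoTriage:
    """Treat patients strictly in arrival order; severity is ignored."""

    def __init__(self):
        self._queue = deque()

    def admit(self, patient, severity):
        self._queue.append(patient)

    def next_patient(self):
        return self._queue.popleft() if self._queue else None


class PriorityTriage:
    """Treat the most severe patient first; ties fall back to arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # monotonically increasing tie-breaker

    def admit(self, patient, severity):
        # heapq is a min-heap, so negate severity to pop the highest first.
        heapq.heappush(self._heap, (-severity, next(self._arrival), patient))

    def next_patient(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


arrivals = [("A", 1), ("B", 3), ("C", 2)]
fifo, priority = FifoTriage(), PriorityTriage()
for name, severity in arrivals:
    fifo.admit(name, severity)
    priority.admit(name, severity)

fifo_order = [fifo.next_patient() for _ in arrivals]
priority_order = [priority.next_patient() for _ in arrivals]
print(fifo_order, priority_order)
```

The arrival counter in the priority queue keeps ordering stable among patients of equal severity, so the policy degrades to FIFO when severities tie.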
Supported Scenarios
| Scenario | Size | Complexity | Survivors | Key Challenges |
|---|---|---|---|---|
| Small | 20×20 | Beginner | 15 | Basic coordination |
| Medium | 25×25 | Intermediate | 25 | Resource management |
| Hard | 30×30 | Advanced | 40 | Multi-crisis events |
Performance Metrics
- Rescue Efficiency: Survivors saved vs. casualties
- Response Time: Average rescue completion time
- Resource Utilization: Energy and tool consumption
- Crisis Mitigation: Fires extinguished, roads cleared
- Hospital Management: Triage efficiency, overflow events
- AI Performance: JSON validity, replanning frequency
LLM Provider Support
| Provider | Models | Setup |
|---|---|---|
| Groq | Llama 3.3 70B | export GROQ_API_KEY=your_key |
| Google Gemini | Gemini 1.5 Flash | export GEMINI_API_KEY=your_key |
| Mock | Heuristic Fallback | No setup required |
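One way to wire up the table above is to read the provider from `LLM_PROVIDER` and check that its API key is present. The sketch below does that and falls back to the mock provider when the key is missing. The fallback behaviour, the provider name `gemini`, and the helper `resolve_provider` are assumptions for illustration; only the environment variable names come from this README.

```python
import os

# Environment variable each provider needs; None means no key is required.
REQUIRED_KEYS = {
    "groq": "GROQ_API_KEY",
    "gemini": "GEMINI_API_KEY",  # provider name assumed
    "mock": None,
}


def resolve_provider(env=None):
    """Pick a provider from LLM_PROVIDER, defaulting to 'mock'.

    Falls back to 'mock' when the chosen provider's API key is not set
    (assumed behaviour, chosen here so the demo always runs).
    """
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "mock").lower()
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"unknown provider: {provider}")
    key_name = REQUIRED_KEYS[provider]
    if key_name and not env.get(key_name):
        return "mock"
    return provider


print(resolve_provider({"LLM_PROVIDER": "groq", "GROQ_API_KEY": "your_key"}))
print(resolve_provider({"LLM_PROVIDER": "groq"}))
```

Accepting the environment as a plain mapping keeps the function easy to test without touching the real process environment.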
Visualization Features
Real-time Web Interface
- Interactive crisis map visualization
- Live performance dashboards
- Real-time metrics tracking
- Manual control override capabilities
Performance Analytics
- Strategy comparison charts
- Rescue efficiency trends
- Resource utilization heatmaps
- Statistical significance testing
Safety & Ethics
This simulation is designed for:
- Research: Emergency response optimization
- Education: Crisis management training
- Planning: Resource allocation strategies
- Development: AI reasoning system testing
Contributing
We welcome contributions! Areas of focus:
- New LLM reasoning strategies
- Additional crisis scenarios
- Advanced evaluation metrics
- Visualization improvements
License
This project is licensed under the MIT License; see the LICENSE file for details.
Acknowledgments
- Mesa Project: Agent-based modeling framework
- OpenAI: LLM reasoning methodologies
- Crisis Response Community: Domain expertise and validation
Built with ❤️ for emergency response optimization and AI reasoning research