AzulMARL

PettingZoo AI env for Azul multiplayer board game to enable AI agent training.

Most important libraries used

Usage

Initiating the env via PettingZoo

from azul_marl_env import azul_v1_2players, azul_v1_3players, azul_v1_4players

env_2players = azul_v1_2players()
env_3players = azul_v1_3players()
env_4players = azul_v1_4players()

env_2players_custom_max_moves = azul_v1_2players(max_moves=100)

Initiating the env directly

from azul_marl_env import AzulEnv

env = AzulEnv(player_count=2)
env = AzulEnv(player_count=3)
env = AzulEnv(player_count=4) 

env = AzulEnv(player_count=2, max_moves=100)

Making moves

from azul_marl_env import azul_v1_2players
import random

# Create and reset the environment
env = azul_v1_2players()
observation, info = env.reset()

# Iterate through agents
for agent in env.agent_iter():
    # Get current agent's observation and info
    observation, reward, termination, truncation, info = env.last()
    
    if termination or truncation:
        break
        
    # Get valid moves for current agent
    valid_moves = info["valid_moves"]
    # Select a random valid move
    action = random.choice(valid_moves)
    # Execute the move
    env.step(action)
    
    # Render the environment (optional)
    env.render()

# Close the environment
env.close()

Example of a complete game using random valid moves

from azul_marl_env import azul_v1_2players
import random

def play_random_game():
    env = azul_v1_2players()
    observation, info = env.reset()
    
    for agent in env.agent_iter():
        observation, reward, termination, truncation, info = env.last()
        
        if termination or truncation:
            print(f"Game finished! Final scores: {[player['score'] for player in observation['players']]}")
            break
            
        # Get valid moves and make a random move
        valid_moves = info["valid_moves"]
        if valid_moves:
            action = random.choice(valid_moves)
            env.step(action)
    
    env.close()

play_random_game()

Environment Details

Factory count (num_factories):
    2 player game -> 5
    3 player game -> 7
    4 player game -> 9

Action Space: MultiDiscrete([num_factories + 1, 5, 20, 5])
- First value: Factory index. Index 0 is taken for the center so the factory indexes are: 0 based factory index + 1.
- Second value: Tile color (0-4 representing different colors)
- Third value: Number of tiles to place on floor (0-19)
- Fourth value: Pattern line index (0-4)
Observation Space: Dictionary containing:
- factories: Box(0, 4, (num_factories, 5), int32) - Tile counts in each factory
- center: Box(0, 3 * num_factories, (5,), int32) - Tile counts in center
- players: Tuple of player states, each containing:
  - pattern_lines: Box(0, 5, (5, 5), int32) - Current pattern lines
  - wall: Box(0, 5, (5, 5), int32) - Wall state
  - floor: Box(0, 5, (7,), int32) - Floor tiles
  - is_starting: Discrete(2) - First player marker
  - score: Discrete(241) - Player's score
- bag: Box(0, 100, (5,), int32) - Remaining tiles in bag
- lid: Box(0, 100, (5,), int32) - Discarded tiles
Reward:
- -1 for each step until game end
- -2 for invalid moves
- Final Azul score is added to cumulative reward at game end
Done: True when:
- Game is completed (at least one player filled at least one horizontal wall)
- False otherwise
- Truncated: True when:
- Maximum moves reached (player_count * 150 by default)
- False otherwise
Info: Contains valid_moves list for the current player

AzulImplementation/AzulMARL

AzulMARL

Most important libraries used

Usage

Initiating the env via PettingZoo

Initiating the env directly

Making moves

Example of a complete game using random valid moves

Environment Details

On this page

Languages

Contributors