System Overview

Welcome to the Algorithmic Game Theory (AGT) system! This page provides a comprehensive overview of how the system works, from the high-level architecture down to the detailed interactions that enable students to learn game theory through hands-on experience.

High-Level Architecture

The AGT system is a distributed game engine that enables students to create intelligent agents and compete in various game theory scenarios. The system follows a layered architecture where each layer has specific responsibilities and communicates with adjacent layers through well-defined interfaces.

┌─────────────────────────────────────────────────────────────┐
│                    STUDENT CLIENTS                          │
├─────────────────────────────────────────────────────────────┤
│  Student 1  │  Student 2  │  Student 3  │  Student N     │
│  Agent      │  Agent      │  Agent      │  Agent         │
└────────────────────┼─────────────────────┼──────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│                      AGT SERVER                            │
├─────────────────────────────────────────────────────────────┤
│  Connection Manager → Game Manager → Results Manager       │
└─────────────────────────┼──────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                       GAME ENGINE                          │
├─────────────────────────────────────────────────────────────┤
│  Game Coordinator → Action Collector → Reward Distributor  │
└─────────────────────────┼──────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                        GAME LAYER                          │
├─────────────────────────────────────────────────────────────┤
│  RPS Game  │  BOS Game  │  Chicken Game  │  Lemonade Game │
└────────────┼────────────┼────────────────┼────────────────┘
             │
             ▼
┌─────────────────────────────────────────────────────────────┐
│                       STAGE LAYER                          │
├─────────────────────────────────────────────────────────────┤
│  Matrix Stage  │  Spatial Stage  │  Auction Stage        │
└────────────────┼─────────────────┼────────────────────────┘

The architecture diagram illustrates how the system scales from individual student agents to a centralized server that manages all game execution. Student agents connect over the network to the server, which coordinates game sessions and manages the overall competition. The game engine handles the core game logic, the game layer provides the different game types, and the stage layer breaks complex games into manageable phases.

Core Components

Student Clients

Student clients run on individual student machines and contain the intelligent agents that students implement. Each agent must implement three core methods: get_action() to choose an action based on the current observation, update() to learn from rewards and feedback, and reset() to prepare for a new game. Agents connect to the server over TCP sockets and communicate in real time during games, sending their chosen actions and receiving game state updates and rewards.

AGT Server

The AGT Server acts as the central coordination hub, running on the instructor’s machine and managing all game sessions and tournaments. It handles client connections, coordinates game sessions between multiple students, manages tournament brackets and matchmaking, and collects and stores all competition results. The server communicates with student agents through TCP connections and orchestrates the game engine to execute actual game logic.

Game Engine

The Game Engine is the core execution unit that manages individual game sessions. It collects actions from all participating agents, executes the game logic by calling the appropriate game implementation, distributes rewards and new observations back to agents, and tracks game progress and completion. The engine ensures all agents are synchronized and maintains consistent game state throughout each session.

Game Layer

The Game Layer contains the actual game implementations that define the rules, mechanics, and payoff structures for different game types. Each game (Rock Paper Scissors (RPS), Battle of the Sexes (BOS), Chicken, Lemonade, Auctions) implements a standard interface with methods for resetting the game state, processing player actions, calculating rewards, and determining when the game is complete. Games can contain multiple stages for complex scenarios like auctions.

Stage Layer

The Stage Layer breaks complex games into manageable phases or rounds. Simple games like matrix games use a single stage that repeats, while complex games like auctions use multiple stages with different rules and objectives. Each stage processes actions for the current phase, manages state transitions between phases, and handles phase-specific logic and reward calculations.

Key Interfaces

Agent Interface

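from typing import Any, Dict

class BaseAgent:
    def get_action(self, observation: Dict[str, Any]) -> Any: ...  # choose an action
    def update(self, reward: float, info: Dict[str, Any]) -> None: ...  # learn from feedback
    def reset(self) -> None: ...  # prepare for a new game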
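As an illustration of the interface above, here is a minimal sketch of a concrete agent. The stand-in `BaseAgent`, the `CountingRandomAgent` class, and its `actions` and `seed` arguments are hypothetical names for this example; the real base class ships with the AGT package and may take different constructor arguments.

```python
import random
from typing import Any, Dict, List, Optional


class BaseAgent:
    """Stand-in for the framework's base class (assumed shape, not the real one)."""

    def __init__(self, name: str) -> None:
        self.name = name


class CountingRandomAgent(BaseAgent):
    """Picks uniformly at random and keeps a running total of its rewards."""

    def __init__(self, name: str, actions: List[Any], seed: Optional[int] = None) -> None:
        super().__init__(name)
        self.actions = list(actions)
        self.rng = random.Random(seed)
        self.total_reward = 0.0
        self.rounds_played = 0

    def get_action(self, observation: Dict[str, Any]) -> Any:
        # Ignores the observation: a purely random baseline.
        return self.rng.choice(self.actions)

    def update(self, reward: float, info: Dict[str, Any]) -> None:
        self.total_reward += reward
        self.rounds_played += 1

    def reset(self) -> None:
        self.total_reward = 0.0
        self.rounds_played = 0


agent = CountingRandomAgent("Alice", actions=[0, 1, 2], seed=0)
action = agent.get_action({})
agent.update(1.0, {})
```

A random baseline like this is mainly useful as an opponent to test smarter agents against.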

Game Interface

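from __future__ import annotations  # ObsDict, ActionDict, etc. are type aliases defined elsewhere
from typing import List, Tuple

class BaseGame:
    def reset(self, seed: int | None = None) -> ObsDict: ...
    def step(self, actions: ActionDict) -> Tuple[ObsDict, RewardDict, bool, InfoDict]: ...
    def players_to_move(self) -> List[PlayerId]: ...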
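To make the game interface concrete, the following sketch implements it for repeated Rock Paper Scissors. It assumes player ids 0 and 1, plain dictionaries keyed by player id for observations, rewards, and info, and payoffs of +1 for a win, -1 for a loss, and 0 for a tie. `TinyRPSGame` and the `"opponent_action"` info key are hypothetical; the real RPS game may differ on all of these points.

```python
from typing import Any, Dict, List, Optional, Tuple

PlayerId = int
ActionDict = Dict[PlayerId, int]
ObsDict = Dict[PlayerId, Dict[str, Any]]
RewardDict = Dict[PlayerId, float]
InfoDict = Dict[PlayerId, Dict[str, Any]]


class TinyRPSGame:
    """Repeated Rock Paper Scissors: 0=Rock, 1=Paper, 2=Scissors."""

    def __init__(self, rounds: int = 10) -> None:
        self.rounds = rounds
        self.round = 0

    def reset(self, seed: Optional[int] = None) -> ObsDict:
        # The seed is unused because this game has no randomness of its own.
        self.round = 0
        return {0: {}, 1: {}}

    def players_to_move(self) -> List[PlayerId]:
        # Simultaneous-move game: both players act every round until it ends.
        return [] if self.round >= self.rounds else [0, 1]

    def step(self, actions: ActionDict) -> Tuple[ObsDict, RewardDict, bool, InfoDict]:
        a, b = actions[0], actions[1]
        # (a - b) % 3 == 1 means a beats b: Paper>Rock, Scissors>Paper, Rock>Scissors.
        diff = (a - b) % 3
        r0 = 0.0 if diff == 0 else (1.0 if diff == 1 else -1.0)
        self.round += 1
        done = self.round >= self.rounds
        obs = {0: {}, 1: {}}
        rewards = {0: r0, 1: -r0}
        info = {0: {"opponent_action": b}, 1: {"opponent_action": a}}
        return obs, rewards, done, info


game = TinyRPSGame(rounds=1)
game.reset()
_, rewards, done, _ = game.step({0: 0, 1: 1})  # Rock vs Paper
```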

Stage Interface

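from __future__ import annotations  # ObsDict, ActionDict, etc. are type aliases defined elsewhere
from typing import Any, Tuple

class BaseStage:
    def step(self, actions: ActionDict) -> Tuple[ObsDict, RewardDict, bool, InfoDict]: ...
    def legal_actions(self, player: PlayerId) -> Any: ...
    def is_done(self) -> bool: ...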
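One way a game might delegate to its stages is sketched below. `FixedLengthStage` and `StagedGame` are hypothetical classes written for this example: the game forwards each step to the current stage and advances to the next stage once the current one reports that it is done. How the real games chain their stages is not specified here.

```python
from typing import Any, Dict, List, Tuple

StepResult = Tuple[Dict[int, Any], Dict[int, float], bool, Dict[int, Any]]


class FixedLengthStage:
    """Hypothetical stage: lasts `length` steps and pays each player `payoff` per step."""

    def __init__(self, length: int, payoff: float) -> None:
        self.remaining = length
        self.payoff = payoff

    def legal_actions(self, player: int) -> List[int]:
        return [0, 1]

    def is_done(self) -> bool:
        return self.remaining <= 0

    def step(self, actions: Dict[int, Any]) -> StepResult:
        self.remaining -= 1
        rewards = {p: self.payoff for p in actions}
        obs = {p: {} for p in actions}
        info = {p: {} for p in actions}
        return obs, rewards, self.is_done(), info


class StagedGame:
    """Runs its stages in order; the game ends when the last stage is done."""

    def __init__(self, stages: List[FixedLengthStage]) -> None:
        self.stages = stages
        self.index = 0

    def step(self, actions: Dict[int, Any]) -> StepResult:
        stage = self.stages[self.index]
        obs, rewards, stage_done, info = stage.step(actions)
        if stage_done and self.index < len(self.stages) - 1:
            self.index += 1  # move on to the next phase
            return obs, rewards, False, info
        return obs, rewards, stage_done, info


game = StagedGame([FixedLengthStage(1, 1.0), FixedLengthStage(2, 5.0)])
history = [game.step({0: 0, 1: 1}) for _ in range(3)]
```

A single-stage matrix game fits the same shape: it is a `StagedGame` with one stage whose length is the number of rounds.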

How Everything Works Together

The Game Loop

Here’s how a typical game session flows:

1. Engine creates a game and agents
2. Game.reset() → Initial observations
3. For each round:
   a. Engine asks agents: "What's your action?"
   b. Each agent answers through get_action(observation)
   c. Engine collects all actions
   d. Game.step(actions) → New state and rewards
   e. Engine calls agent.update(reward, info)
4. Repeat until game is done
5. Engine returns final scores
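The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine's actual code: `run_game`, `MatchingPenniesGame`, and `ConstantAgent` are hypothetical, and observations, rewards, and info are assumed to be dictionaries keyed by player index.

```python
from typing import Any, Dict, List


class MatchingPenniesGame:
    """Toy two-player game used only to exercise the loop (hypothetical)."""

    def __init__(self, rounds: int) -> None:
        self.rounds = rounds
        self.round = 0

    def reset(self):
        self.round = 0
        return {0: {}, 1: {}}

    def step(self, actions):
        # Player 0 wins when the pennies match, player 1 wins otherwise.
        r0 = 1.0 if actions[0] == actions[1] else -1.0
        self.round += 1
        done = self.round >= self.rounds
        return {0: {}, 1: {}}, {0: r0, 1: -r0}, done, {0: {}, 1: {}}


class ConstantAgent:
    """Always plays the same action and records the rewards it receives."""

    def __init__(self, action: int) -> None:
        self.action = action
        self.rewards: List[float] = []

    def get_action(self, observation: Dict[str, Any]) -> int:
        return self.action

    def update(self, reward: float, info: Dict[str, Any]) -> None:
        self.rewards.append(reward)

    def reset(self) -> None:
        self.rewards = []


def run_game(game, agents) -> Dict[int, float]:
    """Steps 2-5 of the loop: reset, play rounds until done, return total scores."""
    for agent in agents:
        agent.reset()
    observations = game.reset()
    totals = {i: 0.0 for i in range(len(agents))}
    done = False
    while not done:
        actions = {i: agent.get_action(observations[i]) for i, agent in enumerate(agents)}
        observations, rewards, done, info = game.step(actions)
        for i, agent in enumerate(agents):
            agent.update(rewards[i], info[i])
            totals[i] += rewards[i]
    return totals


scores = run_game(MatchingPenniesGame(rounds=3), [ConstantAgent(0), ConstantAgent(1)])
```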

Example: Rock Paper Scissors

Let’s trace through a simple RPS game:

# 1. Setup
game = RPSGame(rounds=10)
my_agent = MyRPSAgent("Alice")
opponent = RandomAgent("Bob")
engine = Engine(game, [my_agent, opponent])

# 2. Game starts
observations = game.reset()  # Both agents see: {}

# 3. First round
my_action = my_agent.get_action({})      # Alice chooses Rock (action 0)
opponent_action = opponent.get_action({}) # Bob chooses Paper (action 1)

# 4. Process round
new_obs, rewards, done, info = game.step({
    0: my_action,      # Alice's action
    1: opponent_action # Bob's action
})

# 5. Update agents
my_agent.update(rewards[0], info[0])      # Alice learns she lost
opponent.update(rewards[1], info[1])      # Bob learns he won

# 6. Continue for 10 rounds...

Data Flow

The system processes data through several key flows, each serving a specific purpose in the overall game execution.

1. Game Session Flow

Client Connect → Server Validation → Game Initialization → 
Round Loop: [Collect Actions → Process Game → Distribute Results] → 
Game Complete → Results Collection

The game session flow shows the complete lifecycle of a single game, from initial connection through multiple rounds of play to final results collection. This flow ensures that all players are properly synchronized and that game state is maintained consistently throughout the session.

2. Agent Interaction Flow

Agent receives observation → Agent chooses action → 
Engine collects all actions → Game processes actions → 
Engine distributes rewards → Agent learns from experience

The agent interaction flow represents the core learning loop that enables agents to improve over time. Each iteration of this flow allows agents to observe the game state, make decisions, and learn from the outcomes of their actions.

3. Tournament Flow

Tournament Start → Match Players → Run Games → 
Collect Results → Calculate Rankings → Save Results

The tournament flow manages the higher-level competition structure, coordinating multiple games between different players and aggregating results to determine overall rankings and performance metrics.
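As a rough sketch of the tournament flow, the example below uses a round-robin schedule in which every pair of players meets once. The round-robin format, the `round_robin` function, and the `play_match` callback are assumptions made for illustration; the server's actual matchmaking and ranking rules may differ.

```python
import itertools
from typing import Callable, Dict, List, Tuple


def round_robin(
    players: List[str],
    play_match: Callable[[str, str], Tuple[float, float]],
) -> List[Tuple[str, float]]:
    """Plays every pair once and returns (player, total score) sorted best-first."""
    totals: Dict[str, float] = {name: 0.0 for name in players}
    for a, b in itertools.combinations(players, 2):
        score_a, score_b = play_match(a, b)
        totals[a] += score_a
        totals[b] += score_b
    # Sort by score (descending), then by name so ties are deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical stand-in for "run a game": the stronger player wins 1 point.
strength = {"Alice": 3, "Bob": 1, "Carol": 2}


def play_match(a: str, b: str) -> Tuple[float, float]:
    return (1.0, 0.0) if strength[a] > strength[b] else (0.0, 1.0)


rankings = round_robin(list(strength), play_match)
```

In a real tournament, `play_match` would run a full game session between the two agents and return their final scores.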

Lab-Specific Patterns

Lab 01: Matrix Games (RPS, BOS, Chicken)

Learning Focus: Basic game theory, Nash equilibria, best responses

Game Structure:

  • One stage that repeats for many rounds

  • Two players making simultaneous moves

  • Payoff matrix defines rewards for each action combination

Observation Structure:

observation = {}  # Often empty - agents know the game structure

Action Format:

action = 0  # Integer representing action (0=Rock, 1=Paper, 2=Scissors)
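To illustrate how a payoff matrix drives best responses, the sketch below encodes RPS with the action numbering above and computes the expected payoff of each action against a mixed opponent strategy. The +1/0/-1 payoff values and the helper names are assumptions for this example, not the lab's actual payoff table.

```python
from typing import List

# Row player's payoffs, indexed [my_action][opponent_action]; 0=Rock, 1=Paper, 2=Scissors.
# The +1/0/-1 values are an assumption for illustration.
RPS_PAYOFFS = [
    [0, -1, 1],   # Rock: ties Rock, loses to Paper, beats Scissors
    [1, 0, -1],   # Paper: beats Rock, ties Paper, loses to Scissors
    [-1, 1, 0],   # Scissors: loses to Rock, beats Paper, ties Scissors
]


def expected_payoffs(payoffs: List[List[float]], opponent_mix: List[float]) -> List[float]:
    """Expected payoff of each of my actions against a mixed opponent strategy."""
    return [sum(p * q for p, q in zip(row, opponent_mix)) for row in payoffs]


def best_response(payoffs: List[List[float]], opponent_mix: List[float]) -> int:
    """Index of the action with the highest expected payoff (first one on ties)."""
    values = expected_payoffs(payoffs, opponent_mix)
    return max(range(len(values)), key=values.__getitem__)


# Against an opponent who plays Rock half the time, Paper is the best response.
mostly_rock = [0.5, 0.25, 0.25]
choice = best_response(RPS_PAYOFFS, mostly_rock)
uniform_values = expected_payoffs(RPS_PAYOFFS, [1 / 3, 1 / 3, 1 / 3])
```

Against the uniform mix every action earns the same expected payoff, so no deviation helps. With these symmetric payoffs, that is why uniform mixing is the Nash equilibrium of RPS.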

Lab 02: Finite State Machines

Learning Focus: State machines, coordination, repeated games

Game Structure:

  • Multiple states with different rules

  • State transitions based on actions

  • Coordination challenges between players
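A finite state machine agent can be sketched as a table of states, the action each state plays, and the transitions triggered by the opponent's last action. `FSMAgent` and the `"opponent_last_action"` observation key are hypothetical, and tit-for-tat is used only as a familiar example machine; the lab's own observation format may differ.

```python
from typing import Any, Dict, Optional, Tuple


class FSMAgent:
    """Agent driven by a finite state machine over the opponent's last action."""

    def __init__(
        self,
        start_state: str,
        outputs: Dict[str, int],
        transitions: Dict[Tuple[str, int], str],
    ) -> None:
        self.start_state = start_state
        self.outputs = outputs          # state -> action to play in that state
        self.transitions = transitions  # (state, opponent action) -> next state
        self.state = start_state

    def get_action(self, observation: Dict[str, Any]) -> int:
        # "opponent_last_action" is a hypothetical observation key.
        last: Optional[int] = observation.get("opponent_last_action")
        if last is not None:
            # Stay in the current state if no transition is defined.
            self.state = self.transitions.get((self.state, last), self.state)
        return self.outputs[self.state]

    def update(self, reward: float, info: Dict[str, Any]) -> None:
        pass  # A fixed machine does not learn from rewards.

    def reset(self) -> None:
        self.state = self.start_state


# Tit-for-tat as a two-state machine: 0 = cooperate, 1 = defect.
tit_for_tat = FSMAgent(
    start_state="nice",
    outputs={"nice": 0, "punish": 1},
    transitions={
        ("nice", 1): "punish",
        ("punish", 0): "nice",
    },
)
opponent_moves = [None, 0, 1, 1, 0]
played = [tit_for_tat.get_action({"opponent_last_action": m}) for m in opponent_moves]
```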

Lab 03: Q-Learning

Learning Focus: Reinforcement learning, exploration vs exploitation

Game Structure:

  • Repeated interactions with the same opponent

  • State-based observations (previous actions)

  • Learning opportunities across many rounds

Learning Pattern:

import random

# Methods of the agent class (epsilon-greedy action selection)
def get_action(self, observation):
    state = self.get_state_key(observation)
    if random.random() < self.epsilon:  # Explore
        return random.choice(self.legal_actions)
    else:  # Exploit
        return self.get_best_action(state)

def update(self, reward, info):
    # Update Q-table based on experience
    # Q(s,a) ← Q(s,a) + α[r + γ*max(Q(s',a')) - Q(s,a)]
    pass
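The pattern above leaves the Q-table update as a stub. One possible way to fill it in is sketched below, under two assumptions: the state is the opponent's previous action, read from a hypothetical `"opponent_last_action"` observation key, and because update() does not receive the next state, the Q-update for a step is finished at the start of the following get_action() call. `QLearningAgent` and `_learn` are names invented for this example.

```python
import random
from collections import defaultdict
from typing import Any, Dict, Hashable, List, Optional


class QLearningAgent:
    """Tabular Q-learning with epsilon-greedy exploration (illustrative sketch)."""

    def __init__(self, legal_actions: List[int], alpha: float = 0.1,
                 gamma: float = 0.9, epsilon: float = 0.1,
                 seed: Optional[int] = None) -> None:
        self.legal_actions = legal_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q: Dict[Hashable, Dict[int, float]] = defaultdict(
            lambda: {a: 0.0 for a in self.legal_actions}
        )
        self.reset()

    def reset(self) -> None:
        # Forget the pending transition but keep the learned Q-table.
        self.prev_state = None
        self.prev_action = None
        self.prev_reward = None

    def get_state_key(self, observation: Dict[str, Any]) -> Hashable:
        # Hypothetical key: the state is the opponent's previous action.
        return observation.get("opponent_last_action")

    def get_best_action(self, state: Hashable) -> int:
        values = self.q[state]
        return max(self.legal_actions, key=lambda a: values[a])

    def get_action(self, observation: Dict[str, Any]) -> int:
        state = self.get_state_key(observation)
        # The next state s' is only known now, so finish the previous update here.
        if self.prev_reward is not None:
            self._learn(self.prev_state, self.prev_action, self.prev_reward, state)
        if self.rng.random() < self.epsilon:
            action = self.rng.choice(self.legal_actions)  # explore
        else:
            action = self.get_best_action(state)  # exploit
        self.prev_state, self.prev_action, self.prev_reward = state, action, None
        return action

    def update(self, reward: float, info: Dict[str, Any]) -> None:
        self.prev_reward = reward  # applied on the next get_action call

    def _learn(self, s, a, r, s_next) -> None:
        # Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]
        target = r + self.gamma * max(self.q[s_next].values())
        self.q[s][a] += self.alpha * (target - self.q[s][a])


agent = QLearningAgent([0, 1], alpha=0.5, gamma=0.0, epsilon=0.0)
first = agent.get_action({})
agent.update(1.0, {})
second = agent.get_action({})
```

One limitation of deferring the update this way is that the reward from the final round of a game is never applied, since no further get_action() call follows it. An agent that needs that last reward would have to handle it separately, for example in reset().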

Lab 04: Spatial Games (Lemonade Stand)

Learning Focus: Spatial competition, location-based strategies

Game Structure:

  • Multiple locations to choose from

  • Spatial competition with other players

  • Customer distribution affects profits

Observation Structure:

observation = {
    "locations": [0, 1, 2, 3],  # Available locations
    "opponent_positions": [1, 3],  # Where opponents are
    "customer_distribution": {...}  # Customer flow info
}

Action Format:

action = 2  # Integer representing location (0-3 for 4 locations)
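As a small example of turning this observation into an action, the sketch below picks the location whose nearest opponent is farthest away. It assumes that locations are numbered 0 to n-1 and arranged in a ring; both the layout and the `farthest_location` heuristic are assumptions for illustration, and whether avoiding opponents actually pays off depends on the customer distribution.

```python
from typing import Any, Dict, List


def circular_distance(a: int, b: int, n: int) -> int:
    """Steps between positions a and b on a ring of n locations."""
    d = abs(a - b) % n
    return min(d, n - d)


def farthest_location(observation: Dict[str, Any]) -> int:
    """Pick the location whose nearest opponent is as far away as possible."""
    locations: List[int] = observation["locations"]
    opponents: List[int] = observation["opponent_positions"]
    n = len(locations)
    if not opponents:
        return locations[0]

    def nearest_opponent(loc: int) -> int:
        return min(circular_distance(loc, o, n) for o in opponents)

    # max() keeps the first of several equally good locations.
    return max(locations, key=nearest_opponent)


observation = {"locations": [0, 1, 2, 3], "opponent_positions": [1, 3]}
action = farthest_location(observation)
```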

Lab 06: Auctions

Learning Focus: Auction theory, bidding strategies, mechanism design

Game Structure:

  • Multiple items to bid on

  • Valuation functions for each player

  • Auction mechanisms determine winners and payments

Observation Structure:

observation = {
    "items": [{"id": 1, "value": 10}, {"id": 2, "value": 15}],
    "my_valuations": [8, 12],  # Agent's values for items
    "bidding_history": [...]  # Previous bids
}

Action Format:

action = [5, 8]  # List of bids for each item
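The sketch below shows one simple bidding rule and one possible auction mechanism. Bidding a fixed fraction of each valuation and resolving a single item with a sealed-bid second-price rule are both assumptions chosen for illustration; the lab's actual mechanisms are not specified here, and `shaded_bids` and `second_price_outcome` are hypothetical helpers.

```python
from typing import Dict, List, Tuple


def shaded_bids(valuations: List[float], shade: float = 0.5) -> List[float]:
    """Bid a fixed fraction of each valuation (shade=1.0 is truthful bidding)."""
    return [shade * v for v in valuations]


def second_price_outcome(bids: Dict[str, float]) -> Tuple[str, float]:
    """Winner and payment for one item in a sealed-bid second-price auction."""
    # Highest bid first; ties are broken alphabetically by bidder name.
    ranked = sorted(bids.items(), key=lambda item: (-item[1], item[0]))
    winner = ranked[0][0]
    # The winner pays the second-highest bid (or 0 if there is no other bidder).
    payment = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, payment


my_bids = shaded_bids([8, 12], shade=0.5)
winner, payment = second_price_outcome({"Alice": 6.0, "Bob": 5.0, "Carol": 2.0})
```

In a single-item second-price auction, bidding one's true valuation is a dominant strategy, so shading is mainly relevant for first-price formats.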

Key Concepts for Understanding

1. Observation Structure

Observations tell agents what’s happening in the game. The structure varies by lab:

  • Matrix games (Labs 01-03): Often empty observations since agents know the game structure; Lab 03 adds state-based observations such as previous actions

  • Spatial games (Lab 04): Include location information and opponent positions

  • Auctions (Lab 06): Include item valuations and bidding history

2. Action Formats

Different games expect different action types:

  • Matrix games: Integer representing action choice

  • Spatial games: Integer representing location choice

  • Auctions: List of bids for each item

3. Learning Patterns

Fictitious Play (Lab 01):

def get_action(self, observation):
    # Track opponent's action frequencies
    # Choose best response to their most likely action
    pass
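A fuller sketch of fictitious play is given below. Note that it best-responds to the whole empirical distribution of the opponent's past actions, which is the standard formulation, rather than only to the single most likely action. The `FictitiousPlay` class, its `observe` method, and the +1/0/-1 RPS payoffs are assumptions for this example; how the opponent's action reaches the agent (through the observation or through info) is left to the caller.

```python
from collections import Counter
from typing import List


class FictitiousPlay:
    """Best-responds to the empirical distribution of the opponent's past actions."""

    def __init__(self, payoffs: List[List[float]]) -> None:
        self.payoffs = payoffs  # payoffs[my_action][opponent_action]
        self.counts: Counter = Counter()

    def observe(self, opponent_action: int) -> None:
        # Hypothetical hook: the caller reports each opponent action here.
        self.counts[opponent_action] += 1

    def get_action(self) -> int:
        n_actions = len(self.payoffs)
        total = sum(self.counts.values())
        if total == 0:
            return 0  # no history yet: arbitrary default
        beliefs = [self.counts[a] / total for a in range(n_actions)]
        expected = [sum(p * q for p, q in zip(row, beliefs)) for row in self.payoffs]
        return max(range(n_actions), key=expected.__getitem__)


RPS = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # 0=Rock, 1=Paper, 2=Scissors
fp = FictitiousPlay(RPS)
for opp in [0, 0, 1]:  # opponent played Rock, Rock, Paper
    fp.observe(opp)
action = fp.get_action()
```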

Q-Learning (Lab 03):

import random

# Methods of the agent class (epsilon-greedy action selection)
def get_action(self, observation):
    state = self.get_state_key(observation)
    if random.random() < self.epsilon:  # Explore
        return random.choice(self.legal_actions)
    else:  # Exploit
        return self.get_best_action(state)

def update(self, reward, info):
    # Update Q-table based on experience
    # Q(s,a) ← Q(s,a) + α[r + γ*max(Q(s',a')) - Q(s,a)]
    pass

Component Interactions

Students → Agents → Engine → Games → Stages
    ↓
Server ← Results ← Engine ← Games
    ↓
Administrators → Server → Tournaments → Results

Error Handling

The system handles various error conditions:

  • Timeouts: Use default actions and continue

  • Invalid Actions: Reject and request new actions

  • Connection Loss: Disconnect player and notify others

  • Game Errors: End game early and save partial results
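The first two policies in the list above could look roughly like the sketch below: a timed-out agent gets a default action, and an invalid action is rejected and requested again. The `collect_action` helper, its parameters, and the choice to fall back to the default after `max_attempts` invalid answers are assumptions for illustration, not the server's actual behavior.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Any, Callable, List


def collect_action(
    get_action: Callable[[], Any],
    legal_actions: List[Any],
    default_action: Any,
    timeout: float,
    max_attempts: int = 2,
) -> Any:
    """Ask for an action; fall back to the default on timeout or repeated invalid input."""
    for _ in range(max_attempts):
        executor = ThreadPoolExecutor(max_workers=1)
        future = executor.submit(get_action)
        try:
            action = future.result(timeout=timeout)
        except FutureTimeout:
            return default_action  # Timeout: use the default action and continue
        finally:
            executor.shutdown(wait=False)  # do not block on a slow agent
        if action in legal_actions:
            return action
        # Invalid action: reject it and ask again on the next loop iteration.
    return default_action


def slow_agent() -> int:
    time.sleep(0.5)
    return 1


timed_out = collect_action(slow_agent, [0, 1, 2], default_action=0, timeout=0.05)
invalid = collect_action(lambda: "banana", [0, 1, 2], default_action=0, timeout=1.0)
valid = collect_action(lambda: 2, [0, 1, 2], default_action=0, timeout=1.0)
```

Running the agent call in a worker thread lets the caller give up after `timeout` seconds without waiting for the slow agent to finish.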

Next Steps

This overview provides the foundation for understanding the AGT system. From here:

  • Students should proceed to the “For Students” section to learn how to create and run agents

  • Administrators should proceed to the “For Administrators” section to learn how to manage the system

The system provides a consistent interface for learning game theory concepts while handling the networking, game execution, and results infrastructure behind the scenes.