# Lab 03: Q-Learning

This lab introduces reinforcement learning through Q-learning in repeated games.

## Game Overview

**Type:** Reinforcement learning in repeated games
**Players:** 2 players
**Rounds:** 1000 rounds per game
**Stages:** Single stage with state-based observations

## Games

### Chicken Game with Q-Learning

- **Actions:** Swerve (0), Straight (1)
- **State Space:** Previous action combinations
- **Key Concept:** Learning optimal strategies through experience

## State Space

### Observations

```python
observation = {
    "last_actions": [0, 1],  # Previous actions from both players
    "round_count": 45        # Current round number
}
```

### Actions

```python
action = 0  # Swerve
action = 1  # Straight
```

### Rewards

Immediate payoffs from the Chicken game payoff matrix:

```python
# Chicken game payoffs
if my_action == 0 and opponent_action == 0:    # Both swerve
    reward = 0
elif my_action == 0 and opponent_action == 1:  # I swerve, they go straight
    reward = -1
elif my_action == 1 and opponent_action == 0:  # I go straight, they swerve
    reward = 1
else:                                          # Both go straight
    reward = -10
```

## Game Structure

### Stage Type

- **Single stage** that repeats for all rounds
- **State-based observations** - previous actions influence current decisions
- **Learning opportunities** - agents can improve over time

### Learning Opportunities

- **Q-table updates** - learn the value of state-action pairs
- **Exploration vs exploitation** - balance trying new actions with using what works
- **Convergence** - strategies may converge to optimal policies

## Testing

### Local Testing

```python
from core.engine import Engine
from core.game.ChickenGame import ChickenGame
from core.agents.lab03.random_agent import RandomAgent

# MyAgent is your own agent class; define or import it before running this snippet.
my_agent = MyAgent("MyAgent")
opponent = RandomAgent("Random")

engine = Engine(ChickenGame(), [my_agent, opponent], rounds=1000)
results = engine.run()

print(f"My score: {results[0]}")
print(f"Opponent score: {results[1]}")
```

### Q-Table Analysis

```python
# Add this method to your agent class to inspect what it has learned.
def analyze_q_table(self):
    if hasattr(self, 'q_table'):
        print(f"Q-table size: {len(self.q_table)}")
        for state, actions in self.q_table.items():
            print(f"State {state}: {actions}")
```

## Next Steps

1. **Implement a Q-learning agent** using the common patterns
2. **Track state transitions** to understand the learning process
3. **Test exploration strategies** against different opponents
4. **Compete against other students**

Focus on understanding Q-learning and state-based strategies!
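
### Q-Learning Agent Sketch

As a starting point for step 1, the following is a minimal, standalone sketch of tabular Q-learning on the Chicken payoffs listed under Rewards. It does not use the lab's `core` framework: the class name `QLearningSketch`, the methods `choose_action` and `update`, the helper `run_episode`, the hyperparameter values, and the initial state `(0, 0)` are all assumptions made for illustration, and the real agent base class may expose a different interface. The state is the pair of previous actions, matching `last_actions` in the observation, and the update is the standard rule Q(s, a) ← Q(s, a) + α · (r + γ · max Q(s', ·) − Q(s, a)).

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = Swerve, 1 = Straight (from the lab description)


def chicken_reward(my_action, opponent_action):
    """Payoffs copied from the Rewards section of this lab."""
    if my_action == 0 and opponent_action == 0:
        return 0
    if my_action == 0 and opponent_action == 1:
        return -1
    if my_action == 1 and opponent_action == 0:
        return 1
    return -10


class QLearningSketch:
    """Hypothetical tabular Q-learning agent (not the course base class)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Maps state -> [Q(state, Swerve), Q(state, Straight)], initialised to 0.
        self.q_table = defaultdict(lambda: [0.0, 0.0])

    def choose_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        q_values = self.q_table[state]
        # Ties resolve to the first action in ACTIONS (Swerve).
        return max(ACTIONS, key=lambda a: q_values[a])

    def update(self, state, action, reward, next_state):
        # Move Q(state, action) toward the one-step TD target.
        best_next = max(self.q_table[next_state])
        td_target = reward + self.gamma * best_next
        self.q_table[state][action] += self.alpha * (
            td_target - self.q_table[state][action]
        )


def run_episode(agent, rounds=1000, seed=1):
    """Play against a uniformly random opponent and return the total reward."""
    opponent_rng = random.Random(seed)
    state = (0, 0)  # assumed initial state: pretend both players swerved
    total = 0
    for _ in range(rounds):
        action = agent.choose_action(state)
        opponent_action = opponent_rng.choice(ACTIONS)
        reward = chicken_reward(action, opponent_action)
        next_state = (action, opponent_action)
        agent.update(state, action, reward, next_state)
        state = next_state
        total += reward
    return total


agent = QLearningSketch()
total_reward = run_episode(agent)
print(f"Total reward over 1000 rounds: {total_reward}")
for state, values in sorted(agent.q_table.items()):
    print(f"State {state}: {values}")
```

Against a uniformly random opponent, swerving has the higher expected immediate payoff (-0.5 versus -4.5 for going straight), so the learned Q-values would be expected to favour Swerve in most states. Adapting the sketch to the lab means moving `choose_action` and `update` into whatever methods the framework's agent class actually calls each round.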