Adversarial Search
Adversarial search is a method of reaching a goal while making moves against another player. This allows the agent to react to the moves of another player.
In search we make all moves. In games we play against an unpredictable opponent:
- Solution is a strategy specifying a move for every possible move of the opponent.
- A method is need for selecting good moves that stand a good chance of achieving a winning state whatever the opponent does.
- Because of combinatorial explosion, in practice we must approximate using heuristics.
Types of Games
- Information
- In some games we have fully observable environment. These are called games with prefect information.
- In others we have partially observable environments. These are called games with imperfect information.
- Determinism
- Some games are deterministic
- Other games have an element of chance.
In addition we will limit ourselves to two-player games, with a zero-sum. This means:
- The utility values at the end are equal and opposite
- If one player wins then the other player loses.
As a result we can calculate a perfect strategy for the player.
Problem Formulation
The difference between single player searches and adversarial search is that the set of goal states are replace by the utility function.
- Initial state $S_{start}$
- Initial board position. Which player moves first.
- Successor function
- Provides for every state $s$ and move the new state after the move.
- Terminal test
- Determines when the game is over.
- Utility function
- Numeric value for terminal states
- E.g. chess $+1,-1,0$
- Numeric value for terminal states
Game Tree
- Each level labelled with player to move.
- Each level represents a ply.
- Half a turn.
- Represents whit happen with competing agents.
minimax Algorithm
MIN and MAX are two players
- MAX want to win (maximise utility)
- Min wants to MAX to lose (minimise utility for MAX)
- MIN is the opponent
Both players will play to the best of their ability
- MAX wants a strategy for maximising utility assuming MIN will do best to minimise MAX’s utility.
- Considers the minimax value of each state
- The utility of a state given that both players play optimally.
minimax Value
- The utility (=minimax value) of a terminal state is given by its utility already (as an input).
- The utility (=minimaxvalue) of a MAX-state (when MAX moves) is the maximum of the utilities of its successor states.
- The utility (=minimax value) of MIN-state (when MIN moves) is the minimum of the utilities of its successor states.
Thus we can compute the minimax value recursively starting from the terminal states.
Formally, let $\text{Suc}(s)$ denote the set of successor state of state $s$. Define the function $\text{MinimaxV}(s)$ recursively as follows:
\[\text{MV}(s)= \begin{cases} \text{Utility}(s) & s\ \text{is terminal}\\ \max_{n\in\text{Suc}(s)}\text{MV}(n) & \max\text{movs in}\ s\\ \min_{n\in\text{Suc}(s)}\text{MV}(n) & \min\text{movs in}\ s \end{cases}\]This this definition $\text{MinimaxV}$ has been shortened to $\text{MV}$.
- Calculate minimax value of each state using the equation above starting from the terminal states.
- Games tree as minimax tree
- $\bigtriangleup$ max node, $\bigtriangledown$ min node
graph TD
subgraph max1[MAX]
A0
end
subgraph MIN
A1
A2
A3
end
subgraph max2[MAX]
A11
A21
A31
A12
A22
A32
A13
A23
A33
end
A0[3] -->|A1| A1[3]
A0 -->|A2| A2[2]
A0 -->|A3| A3[2]
A1 -->|A11| A11[3]
A1 -->|A12| A12[12]
A1 -->|A13| A13[8]
A2 -->|A21| A21[2]
A2 -->|A22| A22[4]
A2 -->|A23| A23[6]
A3 -->|A31| A31[14]
A3 -->|A32| A32[5]
A3 -->|A33| A33[2]
Properties of minimax
Assuming MAX always moves to the state with the maximal minimax value.
- Optimal
- Against an optimal opponent. Otherwise MAX will do even better. There may however, be better strategies against suboptimal opponents.
- Time complexity
- Can be implemented DFS so that space complexity is $b^m$ (branching factor $b$, depth $m$).
- Space complexity
- can be implemented DFS so that space complexity is $bm$.
For chess, $b\approx 35,\ m\approx 100$ for reasonable games.
- 10^154 paths to explore
- Infeasible
But do we need to explore every path?