Othello is a turn-based two-player strategy board game. The players take turns placing pieces--one player white and the other player black--on an 8x8 board in such a way that captures some of the opponent's pieces, with the goal of finishing the game with more pieces of their color on the board.
Every move must capture one more more of the opponent's pieces. To capture, player A places a piece adjacent to one of player B's pieces so that there is a straight line (horizontal, vertical, or diagonal) of adjacent pieces that begins with one of player A's pieces, continues with one more more of player B's pieces, and ends with one of player A's pieces.
For example, if Black places a piece on square (5, 1), he will capture all of Black's pieces between (5, 1) and (5, 6):
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
1 . . . . . . . . 1 . . . . . . . .
2 . . . . . . . . 2 . . . . . . . .
3 . . o @ . o . . 3 . . o @ . o . .
4 . . o o @ @ . . 4 . . o o @ @ . .
5 . o o o o @ . . 5 @ @ @ @ @ @ . .
6 . . . @ o . . . 6 . . . @ o . . .
7 . . . . . . . . 7 . . . . . . . .
8 . . . . . . . . 8 . . . . . . . .
For more more information about the game (which is also known as Reversi) including detailed rules, see the entry on Wikipedia. Additionally, this implementation doesn't take into account some tournament-style Othello details, such as game time limits and a different indexing scheme.
We will implement representations for the board and pieces and the mechanics of playing a game. We will then explore several game-playing strategies. There is a simple command-line program provided for playing against the computer or comparing two strategies.
Written by Daniel Connelly. This implementation follows chapter 18 of Peter Norvig's "Paradigms of Artificial Intelligence".
We represent the board as a 100-element list, which includes each square on the board as well as the outside edge. Each consecutive sublist of ten elements represents a single row, and each list element stores a piece. An initial board contains four pieces in the center:
? ? ? ? ? ? ? ? ? ?
? . . . . . . . . ?
? . . . . . . . . ?
? . . . . . . . . ?
? . . . o @ . . . ?
? . . . @ o . . . ?
? . . . . . . . . ?
? . . . . . . . . ?
? . . . . . . . . ?
? ? ? ? ? ? ? ? ? ?
This representation has two useful properties:
board[mn]
. This avoids the need to write
functions that convert between square locations and list indexes.The outside edge is marked ?, empty squares are ., black is @, and white is o. The black and white pieces represent the two players.
EMPTY, BLACK, WHITE, OUTER = '.', '@', 'o', '?'
PIECES = (EMPTY, BLACK, WHITE, OUTER)
PLAYERS = {BLACK: 'Black', WHITE: 'White'}
To refer to neighbor squares we can add a direction to a square.
UP, DOWN, LEFT, RIGHT = -10, 10, -1, 1
UP_RIGHT, DOWN_RIGHT, DOWN_LEFT, UP_LEFT = -9, 11, 9, -11
DIRECTIONS = (UP, UP_RIGHT, RIGHT, DOWN_RIGHT, DOWN, DOWN_LEFT, LEFT, UP_LEFT)
List all the valid squares on the board.
def squares():
return [i for i in xrange(11, 89) if 1 <= (i % 10) <= 8]
Create a new board with the initial black and white positions filled.
def initial_board():
board = [OUTER] * 100
for i in squares():
board[i] = EMPTY
The middle four squares should hold the initial piece positions.
board[44], board[45] = WHITE, BLACK
board[54], board[55] = BLACK, WHITE
return board
Get a string representation of the board.
def print_board(board):
rep = ''
rep += ' %s\n' % ' '.join(map(str, range(1, 9)))
for row in xrange(1, 9):
begin, end = 10*row + 1, 10*row + 9
rep += '%d %s\n' % (row, ' '.join(board[begin:end]))
return rep
We need functions to get moves from players, check to make sure that the moves are legal, apply the moves to the board, and detect when the game is over.
A move must be both valid and legal: it must refer to a real square, and it must form a bracket with another piece of the same color with pieces of the opposite color in between.
Is move a square on the board?
def is_valid(move):
return isinstance(move, int) and move in squares()
Get player's opponent piece.
def opponent(player):
return BLACK if player is WHITE else WHITE
Find a square that forms a bracket with square
for player
in the given
direction
. Returns None if no such square exists.
def find_bracket(square, player, board, direction):
bracket = square + direction
if board[bracket] == player:
return None
opp = opponent(player)
while board[bracket] == opp:
bracket += direction
return None if board[bracket] in (OUTER, EMPTY) else bracket
Is this a legal move for the player?
def is_legal(move, player, board):
hasbracket = lambda direction: find_bracket(move, player, board, direction)
return board[move] == EMPTY and any(map(hasbracket, DIRECTIONS))
When the player makes a move, we need to update the board and flip all the bracketed pieces.
Update the board to reflect the move by the specified player.
def make_move(move, player, board):
board[move] = player
for d in DIRECTIONS:
make_flips(move, player, board, d)
return board
Flip pieces in the given direction as a result of the move by player.
def make_flips(move, player, board, direction):
bracket = find_bracket(move, player, board, direction)
if not bracket:
return
square = move + direction
while square != bracket:
board[square] = player
square += direction
class IllegalMoveError(Exception):
def __init__(self, player, move, board):
self.player = player
self.move = move
self.board = board
def __str__(self):
return '%s cannot move to square %d' % (PLAYERS[self.player], self.move)
Get a list of all legal moves for player.
def legal_moves(player, board):
return [sq for sq in squares() if is_legal(sq, player, board)]
Can player make any moves?
def any_legal_move(player, board):
return any(is_legal(sq, player, board) for sq in squares())
Each round consists of:
Play a game of Othello and return the final board and score.
def play(black_strategy, white_strategy):
board = initial_board()
player = BLACK
strategy = lambda who: black_strategy if who == BLACK else white_strategy
while player is not None:
move = get_move(strategy(player), player, board)
make_move(move, player, board)
player = next_player(board, player)
return board, score(BLACK, board)
Which player should move next? Returns None if no legal moves exist.
def next_player(board, prev_player):
opp = opponent(prev_player)
if any_legal_move(opp, board):
return opp
elif any_legal_move(prev_player, board):
return prev_player
return None
Call strategy(player, board) to get a move.
def get_move(strategy, player, board):
copy = list(board) # copy the board to prevent cheating
move = strategy(player, copy)
if not is_valid(move) or not is_legal(move, player, board):
raise IllegalMoveError(player, move, copy)
return move
Compute player's score (number of player's pieces minus opponent's).
def score(player, board):
mine, theirs = 0, 0
opp = opponent(player)
for sq in squares():
piece = board[sq]
if piece == player: mine += 1
elif piece == opp: theirs += 1
return mine - theirs
The easiest strategy to implement simply picks a move at random.
import random
A strategy that always chooses a random legal move.
def random_strategy(player, board):
return random.choice(legal_moves(player, board))
A more sophisticated strategy could look at every available move and evaluate them in some way. This consists of getting a list of legal moves, applying each one to a copy of the board, and choosing the move that results in the "best" board.
Construct a strategy that chooses the best move by maximizing evaluate(player, board) over all boards resulting from legal moves.
def maximizer(evaluate):
def strategy(player, board):
def score_move(move):
return evaluate(player, make_move(move, player, list(board)))
return max(legal_moves(player, board), key=score_move)
return strategy
One possible evaluation function is score
. A strategy constructed with
maximizer(score)
will always make the move that results in the largest
immediate gain in pieces.
A more advanced evaluation function might consider the relative worth of each square on the board and weight the score by the value of the pieces held by each player. Since corners and (most) edge squares are very valuable, we could weight those more heavily, and add negative weights to the squares that, if acquired, could lead to the opponent capturing the corners or edges.
SQUARE_WEIGHTS = [
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 120, -20, 20, 5, 5, 20, -20, 120, 0,
0, -20, -40, -5, -5, -5, -5, -40, -20, 0,
0, 20, -5, 15, 3, 3, 15, -5, 20, 0,
0, 5, -5, 3, 3, 3, 3, -5, 5, 0,
0, 5, -5, 3, 3, 3, 3, -5, 5, 0,
0, 20, -5, 15, 3, 3, 15, -5, 20, 0,
0, -20, -40, -5, -5, -5, -5, -40, -20, 0,
0, 120, -20, 20, 5, 5, 20, -20, 120, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
]
A strategy constructed as maximizer(weighted_score)
, then, will always
return the move that results in the largest immediate weighted gain in
pieces.
Compute the difference between the sum of the weights of player's squares and the sum of the weights of opponent's squares.
def weighted_score(player, board):
opp = opponent(player)
total = 0
for sq in squares():
if board[sq] == player:
total += SQUARE_WEIGHTS[sq]
elif board[sq] == opp:
total -= SQUARE_WEIGHTS[sq]
return total
The maximizer strategies are very short-sighted, and a player who can consider the implications of a move several turns in advance could have a significant advantage. The minimax algorithm does just that.
Find the best legal move for player, searching to the specified depth. Returns a tuple (move, min_score), where min_score is the guaranteed minimum score achievable for player if the move is made.
def minimax(player, board, depth, evaluate):
We define the value of a board to be the opposite of its value to our
opponent, computed by recursively applying minimax
for our opponent.
def value(board):
return -minimax(opponent(player), board, depth-1, evaluate)[0]
When depth is zero, don't examine possible moves--just determine the value of this board to the player.
if depth == 0:
return evaluate(player, board), None
We want to evaluate all the legal moves by considering their implications
depth
turns in advance. First, find all the legal moves.
moves = legal_moves(player, board)
If player has no legal moves, then either:
if not moves:
the game is over, so the best achievable score is victory or defeat;
if not any_legal_move(opponent(player), board):
return final_value(player, board), None
or we have to pass this turn, so just find the value of this board.
return value(board), None
When there are multiple legal moves available, choose the best one by maximizing the value of the resulting boards.
return max((value(make_move(m, player, list(board))), m) for m in moves)
Values for endgame boards are big constants.
MAX_VALUE = sum(map(abs, SQUARE_WEIGHTS))
MIN_VALUE = -MAX_VALUE
The game is over--find the value of this board to player.
def final_value(player, board):
diff = score(player, board)
if diff < 0:
return MIN_VALUE
elif diff > 0:
return MAX_VALUE
return diff
Construct a strategy that uses minimax
with the specified leaf board
evaluation function.
def minimax_searcher(depth, evaluate):
def strategy(player, board):
return minimax(player, board, depth, evaluate)[1]
return strategy
Minimax is very effective, but it does too much work: it evaluates many search trees that should be ignored.
Consider what happens when minimax is evaluating two moves, M1 and M2, on one level of a search tree. Suppose minimax determines that M1 can result in a score of S. While evaluating M2, if minimax finds a move in its subtree that could result in a better score than S, the algorithm should immediately quit evaluating M2: the opponent will force us to play M1 to avoid the higher score resulting from M1, so we shouldn't waste time determining just how much better M2 is than M1.
We need to keep track of two values:
When the algorithm begins, alpha is the smallest value and beta is the largest
value. During evaluation, if we find a move that causes alpha >= beta
, then
we can quit searching this subtree since the opponent can prevent us from
playing it.
Find the best legal move for player, searching to the specified depth. Like minimax, but uses the bounds alpha and beta to prune branches.
def alphabeta(player, board, alpha, beta, depth, evaluate):
if depth == 0:
return evaluate(player, board), None
def value(board, alpha, beta):
Like in minimax
, the value of a board is the opposite of its value
to the opponent. We pass in -beta
and -alpha
as the alpha and
beta values, respectively, for the opponent, since alpha
represents
the best score we know we can achieve and is therefore the worst score
achievable by the opponent. Similarly, beta
is the worst score that
our opponent can hold us to, so it is the best score that they can
achieve.
return -alphabeta(opponent(player), board, -beta, -alpha, depth-1, evaluate)[0]
moves = legal_moves(player, board)
if not moves:
if not any_legal_move(opponent(player), board):
return final_value(player, board), None
return value(board, alpha, beta), None
best_move = moves[0]
for move in moves:
if alpha >= beta:
If one of the legal moves leads to a better score than beta, then the opponent will avoid this branch, so we can quit looking.
break
val = value(make_move(move, player, list(board)), alpha, beta)
if val > alpha:
If one of the moves leads to a better score than the current best achievable score, then replace it with this one.
alpha = val
best_move = move
return alpha, best_move
def alphabeta_searcher(depth, evaluate):
def strategy(player, board):
return alphabeta(player, board, MIN_VALUE, MAX_VALUE, depth, evaluate)[1]
return strategy
The strategies we've discussed are very general and are applicable to a broad range of strategy games, such as Chess, Checkers, and Go. More advanced strategies for Othello exist that apply various gameplay heuristics; some of these are discussed in "Paradigms of Artificial Intelligence Programming" by Peter Norvig.
See Wikipedia for more details on minimax and alpha-beta search.