Tags

, , , , , , , , , , , , , , , ,

A couple of days ago, I finished coding the UI for a tic tac toe program that I’ve been working on for the past weeks. Should this series be about how I built the game and what I learned from it.

The idea behind

When I was in college studying AI, the final project was to write a simplified version of the chess game and let it play against the ones other people in the class coded in an all-around tournament, being the one with less defeats the winner.

The approach we took to calculate the next play was the alpha-beta pruning; in short, a search algorithm that stops searching in the current branch of a tree if at least one possible movement is worse than any of the previously examined.

This search is widely used in two-player board games (chess, go, tic tac toe) and we (back in the class) started studying the algorithm using tic tac toe although we never coded anything for it, so I decided that it was time for me to restart my AI self-training with this game.

How does Tic Tac Toe really works?

Pretty much everyone has played tic tac toe:

  • there are two players
  • each one chooses a symbol (commonly O and X)
  • by turns, each one writes its symbol on an empty space in a 3×3 grid
  • the game ends when any of the players fills three consecutive spaces (horizontally, vertically or diagonally) or when there are no more space to play (in this case the game is declared a draw).

That sounds nice and simple from the human perspective, but how a program would do this?

Let’s start considering P1, P2 and the 3×3 grid which I’m going to represent with this tiny square:

3x3 tic tac toe grid

Note: P1 is going to be denoted as the space colored in black while P2 will be blue colored, for simplicity

For the first move in the game, P1 has 9 possible plays (one for each empty space):

All possible plays for the first move

The whole purpose of the game is to determine which one of the possible plays is the best for the current player.

Growing the tree

In order to be able to understand how the decision has to be made, I started trying to build up a three in the minimax way (one level is P1 and the next one is P2); getting something like this, for the first play on the left from the ones shown above:

Second level of the three for the first movement

When I did this graph in my notebook I wondered how many squares I had to draw to get the full picture, in other words, how many possible plays exist in Tic Tac Toe? My first rough estimate was:

9 ( first level)
+ 9x8 ( 9 at the first level, 8 new at the one )
+ 9x8x7 ( 9 at the first level, 8 at the second and 7 at the third )
. . . (and so on...)

And this gave me 986409, which is a little more than I was willing to draw. However I noticed that this number was wrong, because these two reasons:

1. Some of the branches in the tree are shorter (they don’t go all the way down to get a full board) because those are the winning ones (for either P1 or P2)

2. The Tic Tac Toe board can be rotated or symmetrical to itself, in the sense that these four boards for example are basically the same:

Symmetrical boards

In this case, each new board is rotated 90° ccw from the previous one, meaning that some branches are going to be exactly as others just rotated. Thinking that this could potentially be a way to help me with the game, I started analyzing matrix rotations in the geometrical way.

Regardless the search strategy, the underlying remaining questions were still the same: which one is the best? how do the program would know that a movement is better or more convenient than the others?

The value of a possible move

For the tree searching strategies to be successful, each node in the tree should be able to provide a heuristic value which will be used to compare towards other nodes.

What is this heuristic value depends on the problem we are solving, with the only restriction that it must be anything comparable among the nodes.

For instance, it can be a numerical value that represents a property of the node (relative weight, a delta, proximity to a fixed value, etc). However, most of the time, to compare if a node is better than other you would need to consider multiple factors and not just one only property.

For the game, I started to analyze which things I (as a human) would consider to pick a space over another, so given a player and a position I would check for:

  • Possible moves: from the given position, how many possible ways are to play? This is actually a known number:
    Possible moves per position in the board
  • Places to go: from the given position, how many more moves this player needs to win in any direction? When multiple movements are possible, the option is to pick the smallest one (the one that takes us closer) for example:
    Places to go
  • Blocked movements: from the given position, how many of the movements are blocked (meaning that there is a cell occupied by the opposite player)? Keep in mind that this property implies that the opponent can’t play either:
    Blocked movements for a given position

The heuristic function

The next thing was to understand how these three values will work together to determine the heuristic value of a given play. So I started an imaginary game with myself.

Remember that we can skip any movement that is rotated or symmetrical to another one.

  • T = pieces to go
  • P = possible movements
  • B = blocked movements
First movement
Board Player 1 Player 2
Fig8 T:2
P:3
B:0
T:3
P:inf
B:inf
Fig9 T:2
P:2
B:0
T:3
P:inf
B:inf
Fig10 T:2
P:4
B:0
T:3
P:inf
B:inf

From these table we can see that all of the movements have 2 pieces to go (which makes sense since we have just played one only space) and none of them have a blocked movement.

But, the possible movements in each case is different: for the corner we have 3 possible movements, while the side gives us 2, finally the center gives us 4 possibilities. And here is where Alice falls into the rabbit hole.

Should we pick the one with more possibilities or the one with the less?

We can think of the possible movements also as the number of possible blocks I can get (some sort of risk of the play), so let’s start picking the less risky (the second one, the side) and see what happens. The next movement looks like this:

Second movement
Board Player 1 Player 2
Fig11 T:2
P:1
B:1
T:2
P:2
B:1
Fig12 T:2
P:2
B:0
T:2
P:2
B:0
Fig13 T:2
P:1
B:1
T:2
P:3
B:1
Fig14 T:2
P:2
B:0
T:2
P:3
B:0
Fig15 T:2
P:1
B:1
T:2
P:1
B:1

If we check these values, we can see that there are some movements of Player 2 that affect the current values of Player 1, while there are some others that do not.

At any rate, if we keep on going through the way of picking the less risky, we can see that the last movement (playing on the right-middle space) is the less risky and also the one that blocks the opponent to play.

Before moving forward, I also noticed that there are two movements that has the same risk (first and second movements, both with risk = 2), the difference here will be, which one has the bigger impact, which in this case will be decided by the blocks, the first movement is blocking the opponent while the second is not.

While writing this article, I also considered the possibility of adding to the heuristic function something like the “amount of movements of the opponent”, meaning that I would also consider how many movements the opponent can make after this, obviously the less the better.

For this consideration we will again, consider the rotated positions and symmetrical values. Consider the following game:

  1. Player 1 plays in left-middle
  2. Player 2 plays in right-middle

For the next turn, Player 1 has 4 different options:

Fig16

The bottom spaces are not considered because they are symmetrical to those at the top and if calculated will return the same value:

Fig17

Anyway, at this point I realized that if I wanted to test actually if my triad of values (possible movements, pending places to go and number of blocks) or my decision of taking the less risky way I needed to start coding something.

And that’s what the next post is about😉