Tags

, , , , , , , , , , , , , , ,

While doing my first analysis on the Tic Tac Toe game, I got to the point that it was needed start coding something for me to test my heuristic values. For this application I decided to use python because I’ve been learning recently the language and I like the way the manage data structures.

Still considering the alpha-beta pruning as the solving strategy, I organized the development as follows:

  • Representation of the board
  • Calculation of the heuristic values
  • Tree building and search

Let’s review what I did and what I learned.

Representation of the board

To determine the best way to represent the tic tac toe board, we need to know it’s characteristics:

We have 9 cells in a 3×3 grid

   |   |   
---+---+---
   |   |   
---+---+---
   |   |

This can be represented actually in multiple ways. The first one I can think of is as an array:

 0 | 1 | 2 
---+---+---
 3 | 4 | 5    board = [ 0, 1, 2, 3, 4, 5, 6, 7, 8 ]
---+---+---
 6 | 7 | 8

Or as a 3×3 matrix:

 0 | 1 | 2    board = [
---+---+---             [ 0, 1, 2 ],
 3 | 4 | 5              [ 3, 4, 5 ],
---+---+---             [ 6, 7, 8 ]
 6 | 7 | 8            ]

There are other ways of creating a 3×3 board depending on our needs. One that I’ve used for other problems in the past is a 2 byte array using bit level logic to access the data (mostly AND operations):

This will keep all the data in two bytes (with some unused bits):

0x00 = 0000 0000  0000 0000

This comes in handy when I need some binary array only, for multiple states per space like our tic tac toe it doesn’t work. So let’s back to our first two options.

In the first case (keeping the board as an array) the data is stored sequentially, this allows me to iterate through the board in one single cycle. This is particularly useful if I’m trying to count the number of empty cells, or the number of cells that a player has covered. But no so much if I’m referring to the board by row or column, in which case I will need to do some transformation.

The coordinates i,j in the board (being i the row and j the column) can be calculated as follows from any given index:

i = ind // 3   # floor division or, integer division
j = ind % 3    # modulus or offset in the column

The second case (store the board as a matrix) works exactly the opposite, perfectly well when I’m trying to access the board by row or column, but no when I want to read it all.

My final decision was to use the 3×3 matrix since it makes more sense in terms of the type of data and to apply some logic when needing to read sequentially the board. Python for instance, offers me comprehensions to do the cycles, so if I’m trying to get which cells are empty, I can do something like this:

vector_board = [ item for row in board for item in row ]
empty_cells = [ index for index, value in enumerate(vector_board) if value == 0 ]

The states of each cell

In that last line of code I am considering for example, that a cell is empty if its value is zero. Any given cell in the board can be in multiple states:

  • Empty
  • Played by player 1
  • Played by player 2

So I took the easiest approach I could think of, assign 1 to the matrix cell if it was player 1 who played it and 2 if it was player 2.

This approach gave me a nice advantage in the python implementation, the capability of counting elements in a row, or a column or even a diagonal.

So let’s create a 3×3 matrix using the python comprehensions and initialize it to zeros:

board = [ [ 0 for i in range(3) ] for j in range(3) ]

Calculation of the Heuristic Values

From the previous post, I determined that each possible play was going to be valued based on possible movements, pending places to go and the number of blocks by the opponent.

Let’s take this board for example:

 o |   |   
---+---+---
   | x |   
---+---+---
   |   |

There are a some movements already done; let’s consider that we are going to play the next in position 2 with coordinates (0,2) (the top right corner).

 o |   | o 
---+---+---
   | x |   
---+---+---
   |   |

In code:

coord = [0,2]

Extracting from the board

From the given coordinate to play, I’m counting the number of cells occupied after the play by both players, and I’m doing this by row, by column and by diagonal (if possible):

Count at (0,2)
Count of o’s Count of x’s
Row 0 2 0
Column 2 1 0
Left diagonal 1 1

How to accomplish all of this in code is tricky, so let’s do it piece by piece. First, let’s get the row from the board, which is straightforward, since board is already an array of arrays (which happen to be the rows):

row = board[ coord[0] ]

For the column we are going to iterate through each row in the board and extract the column subelement on each one:

column = [ row[ coord[1] ] for row in board ]

And finally, for the diagonals; if we are getting the right diagonal ( \ ) we just have to take the first element of the first row, the second of the second row and the third element of the third row:

diagonal = [ board[i][i] for i in range(3) ]

For the left diagonal ( / ) we have to take do an inverse progression on the sub index:

diagonal = [ board[i][2-i] for i in range(3) ]

Once I have the row, column and diagonals (if any) I count the number of elements of each player in that particular row, column or diagonal. And here, I am taking advantage of Python’s built-in count functionality for lists (note that the row, column and diagonals got above are kept as lists).

# I'm keeping the results in a dictionary named counts, 
# which elements are also dictionaries
counts = {
           'row' : {},
           'column' : {},
           'diagonal' : {}
         }

counts['row']['o'] = row.count(1)
counts['row']['x'] = row.count(2)

counts['column']['o'] = column.count(1)
counts['column']['x'] = column.count(2)

counts['diagonal']['o'] = diagonal.count(1)
counts['diagonal']['x'] = diagonal.count(2)

Getting the heuristic values

A possible play would be that in which the opponent hasn’t played yet; considering we are trying to play an o at (0,2).

for element in counts:
    if counts[element]['x'] == 0:
         possibles += 1

We are iterating through the elements in the dictionary and checking those belonging to the opponent (in this case, the x). However, in python we can simplify everything like this:

possibles = sum( 1 for element in counts if counts[element]['x'] == 0 )

For my places to go value, I am interested in the best position based on the one that would get me closer to win (meaning the smallest value), so I start with the maximum possible places:

togo = 3

for element in counts:
    if 3 - counts[element]['o'] < togo:
        togo = 3 - counts[element]['o']

For my number of blocks, I need to know if the opponent has already played:

for element in counts:
    if counts[element]['x'] == 1:
         blocks += 1

Additionally, this approach allows me to know if the position to be played is a winner or a risk solver, in either case, this will give more value to the movement.

for element in counts:
    if count[element]['o'] == 3:
        winner = True
    elif count[element]['x'] == 2:
        risk = True

Adding an equation to count

For our search strategy, we need the heuristic value of the node in order to do comparisons, so given that my calculations are numeric, the logical thing to do is to create an equation that helps me determine it.

After sketching a lot in paper and considering multiple scenarios and cases, I ended up with the following equation:

if togo > blocks:
    value = ( blocks + togo ) / possibles
else:
    value = ( possibles + togo ) / blocks

The first part of the equation falls into the empty board, when the chances of playing are higher than the number of blocks. The second part falls into the rest of values.

Tree building and search

While I was coding this last part I was iterating through every possible play in the board and comparing results manually, but it hit me that I could use that strategy to determine the next move, so I stopped constructing the decision tree there for several reasons.

First, the broad heuristic value.

If I apply the heuristic calculations to all possible plays in the board, I can easily get which would be the best option to play considering the whole board at each turn.

I concluded that in a full decision tree approach I would have to just take the local information for each node, and extrapolate from there the next level.

Second, the depth of the tree.

In Tic Tac Toe, if we create a tree for each possible play, then the next level being the possible plays for the opponent, we don’t have a huge depth before falling in a risk situation (in which the opponent has played two in a row and we must play to block).

If this would be a different type of game (Chess for instance) the tree would be more helpful since considering the whole board for a single play is not necessarily straightforward, therefore we need to use a local value and be confident on the search and prune.

Third, the number of states.

For each cell I don’t have many options, it can be either empty or taken by any player and even if this last case, the course of action is the same for each case.

Comparing it to Chess in which a movement into a cell would give different values depending on the piece moved (pawns, bishops, rooks, etc.), the tree seem more helpful there than my simple Tic Tac Toe.

Conclusion

More than the code itself or the actual application, coding a Tic Tac Toe was a good exercise to practice python, get back to code and get the dust out of algorithm design.

I have put the code in my GitHub repo along with other code I have been working on, if you want to take a look at the final version.