How to Teach a Bot to Play Checkers Using Deep Learning

Like us, a bot must first be able to play checkers before it can learn to play checkers well. This includes having its own representation of the board, knowing which moves are valid, being able to jump pieces, making queens, and knowing when the game is over. For this, a bot was created using Bot Libre, along with a script written in Self. The script has various functions that allow the bot to create a board, allow the player to move, and have the bot move randomly. Then, using a neural network created with deep learning on Bot Libre, the bot was gradually able to make more educated moves.

About the Script

The checkers Self script can be found here: https://www.botlibre.com/script?id=30180493.

The Board

The way the bot sees the board is different from how we see it. For the bot, the board is a string of 64 characters, one for each square. Each character is either an "r" for a red piece, a "b" for a black piece, an "R" for a red queen, a "B" for a black queen, a "_" for an empty square, or a "/" for an out-of-bounds square.

"/r/r/r/rr/r/r/r//r/r/r/r_/_/_/_//_/_/_/_b/b/b/b//b/b/b/bb/b/b/b/"

To create the board that we see, the files game-sdk.js and games.css are used to turn the avatar into the board and to turn characters such as "r" and "b" into pieces.
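To make the string layout concrete, here is a minimal Python sketch (not the Self script itself) that splits the 64-character board into 8 rows and reads a single square. The names START, rows, and square are hypothetical helpers introduced only for this illustration, and the row-major indexing (index = row * 8 + column) is an assumption that matches the starting board shown above.

```python
# The starting board from the article, written one row per line for readability.
START = (
    "/r/r/r/r"
    "r/r/r/r/"
    "/r/r/r/r"
    "_/_/_/_/"
    "/_/_/_/_"
    "b/b/b/b/"
    "/b/b/b/b"
    "b/b/b/b/"
)


def rows(board):
    """Split the 64-character board string into 8 rows of 8 characters."""
    if len(board) != 64:
        raise ValueError("board must be exactly 64 characters")
    return [board[i:i + 8] for i in range(0, 64, 8)]


def square(board, row, col):
    """Return the character at (row, col), assuming row-major order."""
    return board[row * 8 + col]


if __name__ == "__main__":
    for line in rows(START):
        print(line)
```

Printed this way, the red pieces sit in the top three rows and the black pieces in the bottom three, with "/" marking the squares that are never used.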

play

Essentially, this function takes the previous board, determines whether your move is valid, applies your move to create a new board, and calls a function for the bot to make a move. When you click where you want to move from and where you want to move to, the bot receives two coordinates, which are used to determine whether your move is valid. The function then updates the board by moving your piece from the first coordinate to the second coordinate and removing a piece if you jumped it. Once the new board has been created, the function checks whether you have won, whether your piece can be made into a queen, and, if your previous move was a jump, whether you can jump another piece. Then the function "makeMove" is called, where the bot updates the board with its move and returns the board to you.
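The board update described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code of the Self script: apply_move is a hypothetical name, coordinates are assumed to be indexes 0 to 63 into the board string, and red is assumed to move toward the bottom of the board while black moves toward the top.

```python
def apply_move(board, src, dst):
    """Return a new board with the piece at src moved to dst.

    src and dst are indexes into the 64-character board string (assumed
    row-major). A two-row move is treated as a jump and removes the
    opposing piece in between.
    """
    cells = list(board)
    piece = cells[src]
    if piece not in ("r", "R", "b", "B"):
        raise ValueError("no piece on the source square")
    if cells[dst] != "_":
        raise ValueError("destination is not an empty square")

    r1, c1 = divmod(src, 8)
    r2, c2 = divmod(dst, 8)
    dr, dc = r2 - r1, c2 - c1
    if abs(dr) != abs(dc) or abs(dr) not in (1, 2):
        raise ValueError("not a diagonal step or jump")
    # Assumption: regular red pieces move down, regular black pieces move up.
    if (piece == "r" and dr < 0) or (piece == "b" and dr > 0):
        raise ValueError("regular pieces cannot move backwards")

    if abs(dr) == 2:
        mid = (r1 + dr // 2) * 8 + (c1 + dc // 2)
        victim = cells[mid]
        if victim not in ("r", "R", "b", "B") or victim.lower() == piece.lower():
            raise ValueError("a jump must pass over an opposing piece")
        cells[mid] = "_"  # remove the jumped piece

    cells[src] = "_"
    cells[dst] = piece
    return "".join(cells)
```

Returning a new string instead of changing the old one keeps the previous board available, which is convenient if the move turns out to be invalid.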

makeMove

This is where the bot makes its move. If no jumps are available, it uses either the function "randomMove" or "deepLearningMove" to determine where it is moving from and where it is moving to. The function then updates the board to show its move and checks the new board to see whether the bot has won, whether its piece can be made into a queen, and whether its piece is able to jump again.

movesFrom

movesFrom finds all of the bot's pieces that have a valid move and puts them in an array.

movesTo

movesTo finds all the valid moves for one of the bot's pieces and puts them in an array.
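A rough Python sketch of these two functions might look like the following. It is an illustration, not the Self implementation: moves_to and moves_from are hypothetical names, the color argument is an assumption (the article does not say which color the bot plays), and only simple diagonal steps are covered because jumps are handled by a separate check.

```python
def moves_to(board, src):
    """Return the indexes a piece at src can step to (simple steps only)."""
    piece = board[src]
    if piece == "r":
        row_steps = (1,)       # assumption: red moves down the board
    elif piece == "b":
        row_steps = (-1,)      # assumption: black moves up the board
    elif piece in ("R", "B"):
        row_steps = (-1, 1)    # queens move in both directions
    else:
        return []

    row, col = divmod(src, 8)
    result = []
    for dr in row_steps:
        for dc in (-1, 1):
            r, c = row + dr, col + dc
            if 0 <= r < 8 and 0 <= c < 8 and board[r * 8 + c] == "_":
                result.append(r * 8 + c)
    return result


def moves_from(board, color):
    """Return the indexes of all pieces of color ('r' or 'b') that can move."""
    return [
        i for i, ch in enumerate(board)
        if ch.lower() == color and moves_to(board, i)
    ]
```

On the starting board, only the front row of each side can move, so moves_from returns four squares for either color.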

randomMove

This function selects a piece randomly from the array created by the function "movesFrom" and then uses the function "movesTo" in order to randomly select where that piece will move.
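The two-stage random choice can be sketched in Python as shown here. The function name random_move and the legal_moves dictionary are assumptions made for this example: the dictionary maps each movable piece to its list of destinations and stands in for the results of "movesFrom" and "movesTo".

```python
import random


def random_move(legal_moves, rng=random):
    """Pick a random piece, then a random destination for that piece.

    legal_moves maps a source index to a list of destination indexes.
    Returns (src, dst), or None when no piece can move.
    """
    sources = [src for src, dests in legal_moves.items() if dests]
    if not sources:
        return None
    src = rng.choice(sources)
    return src, rng.choice(legal_moves[src])
```

Note that choosing the piece first and the destination second does not make every move equally likely: a piece with a single move has that move picked more often than any one move of a piece with two options.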

checkGameOver

The function checkGameOver is used to find out whether the bot has won or lost. It checks every square on the board to see whether you or the bot have no pieces left.
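As a minimal Python sketch of this check (check_game_over is a hypothetical name, and returning the winning color is an assumption about how the result is reported):

```python
def check_game_over(board):
    """Return the winning color ('red' or 'black'), or None if play continues."""
    red_left = sum(1 for ch in board if ch in ("r", "R"))
    black_left = sum(1 for ch in board if ch in ("b", "B"))
    if red_left == 0:
        return "black"
    if black_left == 0:
        return "red"
    return None
```

This only counts pieces, as described above. A side that still has pieces but no valid move is not detected by this sketch.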

checkQueen

checkQueen checks whether a black piece has made it to one of the first 8 squares of the board string or a red piece has made it to one of the last 8 squares. These are the far rows for each side.
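One way to express this check in Python is sketched below. The name check_queen is hypothetical, and returning an updated board (rather than, say, true or false) is an assumption made to keep the example short.

```python
def check_queen(board):
    """Return a board where pieces on their far row have become queens."""
    cells = list(board)
    for i in range(0, 8):        # first 8 squares: far row for black
        if cells[i] == "b":
            cells[i] = "B"
    for i in range(56, 64):      # last 8 squares: far row for red
        if cells[i] == "r":
            cells[i] = "R"
    return "".join(cells)
```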

checkJump

This function returns true or false depending on whether the bot or player can make a jump. Additionally, if a jump is possible for the bot, the function sets the bot's "move from" and "move to" squares to that jump.
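The jump test can be sketched in Python as follows. This is an illustration under assumptions, not the Self code: find_jump is a hypothetical name, it returns the first jump found as a (from, to) pair instead of setting variables, and red is assumed to move down the board while black moves up.

```python
def find_jump(board, color):
    """Return (src, dst) for the first jump available to color, else None.

    color is 'r' or 'b'. A jump passes over an adjacent opposing piece
    and lands on the empty square directly behind it.
    """
    for src, piece in enumerate(board):
        if piece.lower() != color:
            continue
        if piece == "r":
            row_steps = (1,)      # assumption: red moves down
        elif piece == "b":
            row_steps = (-1,)     # assumption: black moves up
        else:
            row_steps = (-1, 1)   # queens jump in both directions
        row, col = divmod(src, 8)
        for dr in row_steps:
            for dc in (-1, 1):
                land_r, land_c = row + 2 * dr, col + 2 * dc
                if not (0 <= land_r < 8 and 0 <= land_c < 8):
                    continue
                over = board[(row + dr) * 8 + (col + dc)]
                is_opponent = over in ("r", "R", "b", "B") and over.lower() != color
                if is_opponent and board[land_r * 8 + land_c] == "_":
                    return src, land_r * 8 + land_c
    return None
```

A true or false answer, as in the description above, is then simply whether the result is not None.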

Deep Learning

With the bot now able to play checkers using random moves, the next step was creating a neural network that the bot can use to make more educated moves. A three-layer network was created on Bot Libre, consisting of 32 input nodes, 256 intermediate nodes, and 128 output nodes. To use the network to make a move, the bot assigns each reachable square a value. The input consists of 32 numbers from -1.0 to 1.0: 0.5 means a black piece, -0.5 means a red piece, 1.0 means a black queen, and -1.0 means a red queen. The input is taken by going left to right from the top row (red side) to the bottom row (black side) of the board, skipping the unreachable squares. These inputs are then sent to the network, which returns 128 values. The 128 values represent moves from each of the 32 reachable squares in each of the 4 diagonal directions. The highest value is the best move as determined by the neural network.
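The encoding of the board into 32 inputs, and the reading of one of the 128 outputs as a move, can be sketched in Python like this. Several details here are assumptions that the description above does not specify: the value 0.0 for an empty square, the order of the 4 directions, and the layout of the outputs as square * 4 + direction. The names encode and decode are hypothetical.

```python
# Piece values from the article; 0.0 for an empty square is an assumption.
VALUES = {"b": 0.5, "r": -0.5, "B": 1.0, "R": -1.0, "_": 0.0}

# Indexes of the 32 reachable squares, top row first, left to right.
REACHABLE = [i for i in range(64) if (i // 8 + i % 8) % 2 == 1]

# Assumed direction order: up-left, up-right, down-left, down-right.
DIRECTIONS = [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def encode(board):
    """Turn the 64-character board string into the 32 network inputs."""
    return [VALUES[ch] for ch in board if ch != "/"]


def decode(output_index):
    """Map an output index 0..127 to (source index, (row step, col step)).

    Assumes the outputs are laid out as square * 4 + direction.
    """
    square, direction = divmod(output_index, 4)
    return REACHABLE[square], DIRECTIONS[direction]
```

Because the board string is already stored row by row from the red side to the black side, skipping the "/" characters yields the inputs in the order described above.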

Training the Network

Before the network can output values that indicate a good move, it must first learn what a good move is. It watches a game being played and records all of the winner's moves as good moves by increasing their value. Because a game can still be won after a bad move, and because of the enormous number of possible moves, it takes many games to reach accurate values. To decrease the time it takes to train the network, bots play each other. Here, one can experiment with having the bots use different strategies, such as making the first possible move, making a random move, or making a jump whenever possible. Some randomness is beneficial because it explores more possible moves, and some basic strategy, such as always jumping pieces, can help the network learn good moves faster. The network can also be made to record the moves that resulted in a tie if the goal is simply not to lose.
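The idea of recording the winner's moves can be sketched in Python as below. This is a simplified illustration, not the training procedure used on Bot Libre: the function name training_examples, the (player, inputs, move index) history format, and the use of a target vector with 1.0 at the chosen move are all assumptions made for this example.

```python
def training_examples(history, winner, outputs=128):
    """Build (inputs, target) pairs from the winner's moves in one game.

    history is a list of (player, inputs, move_index) tuples, one per move.
    Each target is all zeros except for 1.0 at the move that was played.
    """
    examples = []
    for player, inputs, move_index in history:
        if player != winner:
            continue  # only the winner's moves are treated as good moves
        target = [0.0] * outputs
        target[move_index] = 1.0
        examples.append((inputs, target))
    return examples
```

To also learn from ties, as mentioned above, the same function could be called once for each player of a tied game.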

deepLearningMove

With a trained network, the bot can now call on the network to find a good move. In the script, a function can be written to create the inputs, send them to the network, and read the outputs. The highest of the output values is then taken as the best move, as determined by the neural network.
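Selecting the highest output can be sketched in Python as follows. The name best_output is hypothetical, and the outputs list stands in for whatever the network returns. The optional legal_indexes argument is an addition not described in the article: it restricts the choice to outputs that correspond to valid moves, which is one way to avoid picking an impossible move.

```python
def best_output(outputs, legal_indexes=None):
    """Return the index of the highest output value, or None if no candidates.

    If legal_indexes is given, only those output indexes are considered.
    """
    if legal_indexes is None:
        candidates = list(range(len(outputs)))
    else:
        candidates = list(legal_indexes)
    if not candidates:
        return None
    return max(candidates, key=lambda i: outputs[i])
```

The returned index can then be translated back into a square and a direction, and the move applied to the board as in "makeMove".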