Monte Carlo Tree Search for Game AI

I have recently been implementing an Othello AI using the Monte Carlo Tree Search (MCTS) algorithm. One of the super cool things about MCTS (and actually the main reason I was attracted to it) is that you can use the same core algorithm for a whole class of games: Chess, Go, Othello, and almost any board game you can think of. Read more on Monte Carlo Tree Search for Game AI…
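The full post goes into the details; as a rough illustration of the core MCTS loop (selection, expansion, simulation, backpropagation), here is a minimal Python sketch applied to a toy single-pile Nim game. The game, class names, and constants are hypothetical stand-ins of mine, not code from the Othello AI.

```python
import math
import random

class NimState:
    """Toy game: players alternately take 1-3 stones; taking the last stone wins."""
    def __init__(self, stones, player=1):
        self.stones, self.player = stones, player
    def moves(self):
        return [n for n in (1, 2, 3) if n <= self.stones]
    def play(self, n):
        return NimState(self.stones - n, -self.player)
    def winner(self):
        # The player who just moved took the last stone and wins.
        return -self.player if self.stones == 0 else None

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.untried = [], state.moves()
        self.visits, self.wins = 0, 0.0

def uct_select(node, c=1.4):
    # Upper Confidence Bound applied to trees: exploit win rate, explore rarely-visited moves.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits +
                              c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts(root_state, iterations=2000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while fully expanded and non-terminal.
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one unexplored child.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.state.play(move), node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout to the end of the game.
        state = node.state
        while state.winner() is None:
            state = state.play(random.choice(state.moves()))
        # 4. Backpropagation: credit each node from the perspective of the player who moved into it.
        winner = state.winner()
        while node:
            node.visits += 1
            if winner == -node.state.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

random.seed(3)
best = mcts(NimState(5), iterations=5000)   # optimal play: take 1, leaving 4
```

Notice that nothing in the search loop knows the rules of Nim; swap in a state class for Othello or Go and the same four phases apply unchanged, which is exactly the appeal described above.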

Visualizing Garbage Collection Algorithms

Most developers take automatic garbage collection for granted. It’s just another amazing feature provided by our language run-times to make our jobs easier.

But if you try to peek inside a modern garbage collector, it’s very difficult to see how it actually works. There are thousands of implementation details that will confuse you unless you already have a good understanding of what the collector is trying to do and how those details can go fantastically wrong.

I’ve built a toy with five different garbage collection algorithms. Small animations were created from the run-time behavior; you can find larger animations, and the code to create them, in the full post. It surprised me how much a simple animation reveals about these important algorithms.

Cleanup At The End: aka No GC

The simplest possible way of cleaning up garbage is to just wait until a task is done and dispose of everything at once. This is a surprisingly useful technique, especially if you have a way of breaking up a task into pieces. The Apache web server, for example, creates a small pool of memory per request and throws the entire pool away when the request completes.
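That allocate-together, free-together idea can be sketched in a few lines. The `Pool` class and names below are hypothetical, just to show the shape of the technique, not how Apache actually implements its memory pools:

```python
import io

class Pool:
    """Request-scoped pool: everything acquired through it is released together."""
    def __init__(self):
        self._resources = []

    def acquire(self, resource):
        # Track the resource; no individual cleanup needed later.
        self._resources.append(resource)
        return resource

    def release_all(self):
        # Dispose of everything at once, in reverse acquisition order.
        for resource in reversed(self._resources):
            close = getattr(resource, "close", None)
            if close:
                close()
        self._resources.clear()

# Usage sketch: resources tied to one "request".
pool = Pool()
stream = pool.acquire(io.StringIO("per-request scratch data"))
data = stream.read()
pool.release_all()   # one cleanup step closes everything at once
```

The per-resource bookkeeping disappears: code inside the request never calls `close` itself, so there is nothing to forget, and the cleanup cost is paid once at a well-defined point.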

The small animation to the right represents a running program. The entire image represents the program’s memory. Memory starts out colored black, which means it isn’t used. Areas that flash bright green or yellow are memory reads or writes. The color decays over time so you can see how memory was used, but also see current activity. If you watch carefully, you can see patterns emerge where the program begins to ignore some memory. Those areas have become garbage — they are not used and not reachable by the program. Everything else that isn’t garbage is “live”. Read more on Visualizing Garbage Collection Algorithms…

Fisher-Yates Shuffle – An Algorithm Every Developer Should Know

Problem statement: You have a list of items that you want to randomize.

I’ve found myself in this situation many times. If the language you’re working in has a shuffle or randomize function, you’re set. However, there are plenty of languages that don’t provide built in support for such a function, leaving you on your own. The first time I was faced with this problem, I wrote a shuffle algorithm that looked something like this:

import random

def incorrect_shuffle(items):
    # Flawed: the swap target is drawn from the whole list on every pass.
    for i in range(len(items)):
        random_index = random.randint(0, len(items) - 1)
        items[i], items[random_index] = items[random_index], items[i]
    return items

The above algorithm swaps every element in the list with another randomly-chosen element in the list. But there are three problems with this algorithm: Read more on Fisher-Yates Shuffle – An Algorithm Every Developer Should Know…
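The full post walks through those problems; for contrast, here is a minimal sketch of the standard Fisher-Yates fix, which draws each swap partner only from the not-yet-shuffled portion of the list (the function name is mine):

```python
import random

def fisher_yates_shuffle(items):
    """In-place Fisher-Yates shuffle: every permutation is equally likely."""
    for i in range(len(items) - 1, 0, -1):
        # Pick the swap partner only from the untouched prefix [0, i].
        j = random.randint(0, i)
        items[i], items[j] = items[j], items[i]
    return items

random.seed(7)
deck = fisher_yates_shuffle(list(range(10)))
```

Restricting the random index to the unshuffled prefix is the whole trick: it yields exactly n! equally likely orderings, where the naive version above produces n^n outcomes that cannot map uniformly onto n! permutations.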

An Introduction to Gradient Descent and Linear Regression

Gradient descent is one of those “greatest hits” algorithms that can offer a new perspective for solving problems. Unfortunately, it’s rarely taught in undergraduate computer science programs. In this post I’ll give an introduction to the gradient descent algorithm, and walk through an example that demonstrates how gradient descent can be used to solve machine learning problems such as linear regression.

At a theoretical level, gradient descent is an algorithm that minimizes functions. Given a function defined by a set of parameters, gradient descent starts with an initial set of parameter values and iteratively moves toward a set of parameter values that minimize the function. This iterative minimization is achieved using calculus, taking steps in the negative direction of the function gradient.

It’s sometimes difficult to see how this mathematical explanation translates into a practical setting, so it’s helpful to look at an example. The canonical example when explaining gradient descent is linear regression. Read more on An Introduction to Gradient Descent and Linear Regression…
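As a taste of that example, here is a minimal Python sketch (the function and variable names are mine) that fits y = m*x + b by repeatedly stepping against the gradient of the mean squared error:

```python
def gradient_descent_linear(points, learning_rate=0.01, iterations=1000):
    """Fit y = m*x + b to (x, y) pairs by gradient descent on mean squared error."""
    m = b = 0.0
    n = len(points)
    for _ in range(iterations):
        # Partial derivatives of (1/n) * sum((y - (m*x + b))^2) w.r.t. m and b.
        grad_m = sum(-2 * x * (y - (m * x + b)) for x, y in points) / n
        grad_b = sum(-2 * (y - (m * x + b)) for x, y in points) / n
        # Step in the negative gradient direction.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

# Points drawn from the line y = 2x + 1; the fit should recover m ≈ 2, b ≈ 1.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
m, b = gradient_descent_linear(points, learning_rate=0.01, iterations=5000)
```

Each iteration computes the gradient over the whole data set and takes one small step downhill; the learning rate trades off speed against overshooting the minimum.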

Basic Machine Learning with KNN and Racket

The web is allowing us to obtain an enormous amount of data about our behavior, which raises an interesting question: how do we manage and use it? One way is with intelligent algorithms, which are being used to bring us ad targeting, recommendation engines, and spam detection, among others.

In this post, I’ll show you how to implement the K Nearest Neighbor (KNN) algorithm in Racket, a dialect of Scheme. Don’t worry: even though some algorithms have roots in neuroscience, machine learning will not make your computer self-aware.

Machine Learning with K Nearest Neighbor

KNN is a machine learning classification algorithm that’s lazy (it defers computation until classification is needed) and supervised (it is provided an initial set of training data with class labels). KNN is a simple, easy-to-understand algorithm and requires no prior knowledge of statistics.

The basic premise of KNN is simple and reasonable: given an instance of data, look for the k closest neighboring instances, and choose the most popular class among them.
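The post builds this in Racket; as a language-neutral illustration of that premise, here is a short Python sketch (function and parameter names are mine, not the post's):

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """training: list of (feature_vector, label) pairs; query: a feature vector."""
    def dist(a, b):
        # Euclidean distance between two feature vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    # Find the k training instances closest to the query...
    neighbors = sorted(training, key=lambda item: dist(item[0], query))[:k]
    # ...and return the most popular class among them.
    labels = [label for _, label in neighbors]
    return Counter(labels).most_common(1)[0][0]

training = [((0.0, 0.0), "red"), ((0.0, 1.0), "red"), ((1.0, 0.0), "red"),
            ((5.0, 5.0), "blue"), ((5.0, 6.0), "blue"), ((6.0, 5.0), "blue")]
label = knn_classify(training, (0.5, 0.5), k=3)
```

Note how the "lazy" and "supervised" labels show up directly in the code: all the work happens at classification time, and the training data arrives already tagged with class labels.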

Read more on Basic Machine Learning with KNN and Racket…

Solving Sudoku in C with Recursive Backtracking

One of my favorite types of algorithms in computer science is recursive backtracking. By following a shockingly simple procedure, you can solve complex problems in reasonable amounts of time, with no bookkeeping.

As a practical demonstration, I’ll walk you through solving Sudoku puzzles with the lingua franca of programmers, C.

A recursive backtracking algorithm follows a really simple formula:

  1. Find a possible solution. If out of possibilities, go up a level.
  2. Move to the next cell.
  3. ????
  4. PROFIT.
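The post implements this in C; purely as an illustration of the formula above, here is a compact sketch of the same recursive backtracking idea in Python (the board representation and helper names are my own, not the post's):

```python
def valid(board, r, c, v):
    """Can value v legally go at row r, column c?"""
    if v in board[r]:
        return False
    if any(board[i][c] == v for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(board[br + i][bc + j] != v for i in range(3) for j in range(3))

def solve(board):
    """Recursive backtracking on a 9x9 grid; 0 marks an empty cell."""
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                for v in range(1, 10):        # find a possible solution
                    if valid(board, r, c, v):
                        board[r][c] = v
                        if solve(board):      # move to the next cell
                            return True
                        board[r][c] = 0       # undo and try the next value
                return False                  # out of possibilities: go up a level
    return True                               # no empty cells left: solved

# A well-known sample puzzle (0 = empty), used here as test input.
puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = solve(puzzle)
```

The "no bookkeeping" claim is visible here: the only state is the board itself, and undoing a bad guess is a single assignment back to 0 before returning up a level.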

Read more on Solving Sudoku in C with Recursive Backtracking…