Article summary
Genetic algorithms are a class of algorithms designed to explore a large search space and find optimal solutions by mimicking evolution and natural selection. Potential solutions are randomly found, evaluated, and bred with one another in hopes of producing better solutions.
Basic Steps
The process of using genetic algorithms goes like this:
- Determine the problem and goal
- Break down the solution to bite-sized properties (genomes)
- Build a population by randomizing said properties
- Evaluate each unit in the population
- Selectively breed (pick genomes from each parent)
- Rinse and repeat
When to Use Genetic Algorithms
GAs are not good for all kinds of problems. They’re best for problems where there is a clear way to evaluate fitness. If your search space is not well constrained or your evaluation process is computationally expensive, GAs may not find solutions in a sane amount of time. In my experience, they’re most helpful when there is a decent algorithm in place, but the “knobs” just need to be tweaked.
For Example…
On a recent project, we had an algorithm that we knew would work, but we needed to find the best possible settings for the tune-able parameters.
We were using pressure data from inside a testing lung to parse out breath waves and calculate derived values such as breath rate, tidal volumes, and peak pressures. We had many recordings of pressure data and knew the expected results. The goal was simple: find the set of variables that would get us as close as possible to the expected results. Our properties were a handful of settings for our existing algorithm: low/high thresholds, smoothing factors, averaging window sizes, etc.
A simpler illustration
To walk through how it works, let’s use something a little simpler: a cannon.
Like the old Cannon Fodder game, we have a cannon that has two variables: powder and elevation. Our goal is to shoot as far as possible. (Let’s pretend we don’t know about physics and don’t know that 45 degrees is the answer.)
Our goal: max distance
Our genomes: powder (0-100) P
and elevation (-90-90 degrees) E
Initial population
We start by generating a sample population and evaluating its members. Normally, this sample would be much larger, but for our example, we’ll keep it small:
Powder | Elevation | Distance |
---|---|---|
5 | -90 | 0 |
100 | 30 | 80 |
75 | 44 | 65 |
25 | 90 | 35 |
33 | 70 | 20 |
Given these results, we can use an elitist strategy to select the top X percent of the population to “reproduce.” Once selected, we use crossover/recombination to blend the best of the population.
Crossover and mutation
Our “Elites” include:
Powder | Elevation | Distance |
---|---|---|
100 | 30 | 80 |
75 | 44 | 65 |
You may use more than two “parents,” but to keep things simple, I’ll just use two. We mix and match values from the parents (just like in biology). We also mutate a percentage of the values to introduce some randomness into our genomes.
The amount of mutation can greatly affect your results and should be tweaked based on your domain and the problem you are trying to solve. To keep our gene pool from becoming too focused, we’ll also include a couple of non-elites from the previous generation.
After crossover:
Powder | Elevation |
---|---|
100 | 30 |
100 | 75 |
75 | 30 |
75 | 44 |
With non-elites for diversity:
Powder | Elevation |
---|---|
100 | 30 |
100 | 75 |
75 | 30 |
75 | 44 |
25 | 90 |
After mutation:
Powder | Elevation |
---|---|
100 | 30 |
92 | 75 |
75 | 30 |
95 | 44 |
25 | 80 |
Keeping the non-elites and mutating some of the values will keep us from reaching local optima.
Are we done yet?
Now, we just repeat this process until we’re done. But, how do we know we’re done? GAs are usually terminated by:
- Fixed number of generations
- Fixed amount of processing time
- Optimal/good enough solution is found
- Good ol’ manual killing
For our example here, the algorithm will trend toward a full powder shot at 45 degrees, and we’ll have our clear winner.
Depending on runtime, problem, and domain, any of these terminators are acceptable. For our breath parsing from pressure samples, we used a manual intervention. If possible, I recommend saving off your populations so you can resume long-running simulations.
In the end, using a GA helped us get our algorithm within tolerances fairly quickly. When algorithms changed or new genomes were discovered, we simply added the genome and started over again with a few of our saved elite values. GAs aren’t a perfect fit for a lot of problems, but they’re definitely a fun and interesting tool to have in the toolbox.