Tuesday, November 22, 2011

Genetic Algorithms Overview


Genetic Algorithms Overview

Introduction

Genetic algorithms are one of the best ways to solve a problem for which little is known. They are a very general algorithm and so will work well in any search space. All you need to know is what you need the solution to be able to do well, and a genetic algorithm will be able to create a high quality solution. Genetic algorithms use the principles of selection and evolution to produce several solutions to a given problem.
Genetic algorithms tend to thrive in an environment in which there is a very large set of candidate solutions and in which the search space is uneven and has many hills and valleys. True, genetic algorithms will do well in any environment, but they will be greatly outclassed by more situation specific algorithms in the simpler search spaces. Therefore you must keep in mind that genetic algorithms are not always the best choice. Sometimes they can take quite a while to run and are therefore not always feasible for real time use. They are, however, one of the most powerful methods with which to (relatively) quickly create high quality solutions to a problem. Now, before we start, I'm going to provide you with some key terms so that this article makes sense.
  • Individual - Any possible solution
  • Population - Group of all individuals
  • Search Space - All possible solutions to the problem
  • Chromosome - Blueprint for an individual
  • Trait - Possible aspect of an individual
  • Allele - Possible settings for a trait
  • Locus - The position of a gene on the chromosome
  • Genome - Collection of all chromosomes for an individual

Foundations in Science

In the mid 1800s, 1859 to be exact, a British naturalist named Charles Darwin published a book that would change the way humans view the world. In this book, The Origin of Species, Darwin proposed that humans, and in fact all creatures, were not put on this planet by God and made unchanging, but rather that they evolved from other creatures. At the time, the idea sounded preposterous, but time and time again have we discovered that he may be correct. Advances in technology have made it possible for us to read our DNA and that of other creatures, and what it has shown us is that we aren't as different from other creatures as we think. Over time, creatures change to adapt to their environment to survive and thrive.
One of the most striking examples of this is the Galapagos Islands. Located in the Pacific Ocean, off the coast of Ecuador, this series of islands is one of the most prominent examples of evolution and adaptation. The island contains many species not found anywhere else on the planet, including several species of birds that share many characteristics; too many for it to be a coincidence. It is believed that many birds were blown to the islands by winds and were unable to get back. Over time, the birds spread throughout the islands and began to change to better survive in the differing environments of the islands. Some birds developed large, strong beaks suited to cracking nuts, others long, narrow beaks more suitable for digging bugs out of wood. The birds that had these characteristics when blown to the island survived longer than other birds. This allowed them to reproduce more and therefore have more offspring that also had this unique characteristic. Those without the characteristic gradually died out from starvation. Eventually all of the birds had a type of beak that helped it survive on its island. This is the process of natural selection and evolution. The individuals themselves do not change, but those that survive better, or have a higher fitness, will survive longer and produce more offspring. This continues to happen, with the individuals becoming more suited to their environment every generation. It was this continuous improvement that inspired computer scientists, one of the most prominent being John Holland, to create genetic algorithms.

Basics of Genetic Algorithms

The most common type of genetic algorithm works like this: a population is created with a group of individuals created randomly. The individuals in the population are then evaluated. The evaluation function is provided by the programmer and gives the individuals a score based on how well they perform at the given task. Two individuals are then selected based on their fitness, the higher the fitness, the higher the chance of being selected. These individuals then "reproduce" to create one or more offspring, after which the offspring are mutated randomly. This continues until a suitable solution has been found or a certain number of generations have passed, depending on the needs of the programmer.

Selection

While there are many different types of selection, I will cover the most common type - roulette wheel selection. In roulette wheel selection, individuals are given a probability of being selected that is directly proportionate to their fitness. Two individuals are then chosen randomly based on these probabilities and produce offspring. Pseudo-code for a roulette wheel selection algorithm is shown below.
for all members of population
    sum += fitness of this individual
end for
    
for all members of population
    probability = sum of probabilities + (fitness / sum)
    sum of probabilities += probability
end for
    
loop until new population is full
     do this twice
         number = Random between 0 and 1
       for all members of population
           if number > probability but less than next probability 
                then you have been selected
       end for
     end
     create offspring
end loop
While this code is very general and will obviously not compile, it illustrates the basic structure of a selection algorithm. Besides, you should write the code yourself, you learn better that way.

Crossover

So now you have selected your individuals, and you know that you are supposed to somehow produce offspring with them, but how should you go about doing it? The most common solution is something called crossover, and while there are many different kinds of crossover, the most common type is single point crossover. In single point crossover, you choose a locus at which you swap the remaining alleles from on parent to the other. This is complex and is best understood visually.
As you can see, the children take one section of the chromosome from each parent. The point at which the chromosome is broken depends on the randomly selected crossover point. This particular method is called single point crossover because only one crossover point exists. Sometimes only child 1 or child 2 is created, but oftentimes both offspring are created and put into the new population. Crossover does not always occur, however. Sometimes, based on a set probability, no crossover occurs and the parents are copied directly to the new population. The probability of crossover occurring is usually 60% to 70%.

Mutation

After selection and crossover, you now have a new population full of individuals. Some are directly copied, and others are produced by crossover. In order to ensure that the individuals are not all exactly the same, you allow for a small chance of mutation. You loop through all the alleles of all the individuals, and if that allele is selected for mutation, you can either change it by a small amount or replace it with a new value. The probability of mutation is usually between 1 and 2 tenths of a percent. A visual for mutation is shown below.
As you can easily see, mutation is fairly simple. You just change the selected alleles based on what you feel is necessary and move on. Mutation is, however, vital to ensuring genetic diversity within the population.

Applications

Genetic algorithms are a very effective way of quickly finding a reasonable solution to a complex problem. Granted they aren't instantaneous, or even close, but they do an excellent job of searching through a large and complex search space. Genetic algorithms are most effective in a search space for which little is known. You may know exactly what you want a solution to do but have no idea how you want it to go about doing it. This is where genetic algorithms thrive. They produce solutions that solve the problem in ways you may never have even considered. Then again, they can also produce solutions that only work within the test environment and flounder once you try to use them in the real world. Put simply: use genetic algorithms for everything you cannot easily do with another algorithm.

Last Words

So, now you understand the basics of genetic algorithms. You may be wondering if code is provided. The answer is, aside from the pseudo-code above, no. The only way for you to truly understand this sort of thing is to write it yourself. They aren't very complex. A very basic genetic algorithm class will only take up a few hundred lines. However, if you run across problems or get stuck with something, feel free to contact me atmike91285@aol.com. I also recommend that you read other articles on this topic. There is no one right way to go about doing this sort of thing. Do what you feel is right and seems to work for you. That is the only "right" solution.

0 comments: