Introduction
Genetic Algorithms are a family of computational models inspired by evolution. They are used for a number of different application areas. An example of this would be multidimensional optimization problems in which the character string of the chromosome can be used to encode the values for the different parameters being optimized.
Genetic algorithms are good at finding the better locations on a global scale. GAs have been successfully applied to solve optimization problems, both for continuous (whether differentiable or not) and discrete functions.
Main purpose of GAs are:
Optimization is a process that finds a best, or optimal, solution for a problem. The Optimization problems are centered around three factors :
An optimization problem is defined as: Finding values of the variables that minimize or maximize the objective function while satisfying the constrains.
There are at least three situations where genetic algorithms are useful:
The typical algorithm selects two "parents" from the population, and commingles their "genes" to get a new solution. This new solution may or may not enter the population depending on how good its objective is.
GAs are based on an analogy with the genetic structure and behaviour of chromosomes within a population of individuals using the following foundations:
A GAs consist of three things:
The most commonly used methods of selecting chromosomes for parents to crossover are :
Modeling GAs in R
The following is example from http://fishyoperations.com/. For demonstration it explains the Knapsack problem.
Task description: you are going to spend a month in the wilderness. You’re taking a backpack with you, however, the maximum weight it can carry is 20 kilograms. You have a number of survival items available, each with its own number of "survival points". You’re objective is to maximize the number of survival points.
ITEM | SURVIVALPOINTS | WEIGHT |
---|---|---|
pocketknife | 10 | 1 |
beans | 20 | 5 |
potatoes | 15 | 10 |
unions | 2 | 1 |
sleeping bag | 30 | 7 |
rope | 10 | 5 |
compass | 30 | 1 |
In R I have used the package genalg to set-up the model.
First of all, let's install genalg.
install.packages("genalg")
Let's define the dataset and weight constraint.
library(genalg) dataset = data.frame(item = c("pocketknife", "beans", "potatoes", "unions", "sleeping bag", "rope", "compass"), survivalpoints = c(10, 20, 15, 2, 30, 10, 30), weight = c(1, 5, 10, 1, 7, 5, 1)) weightlimit = 20
Before creating the model we have to set-up an evaluation function. The evaluation function will evaluate the different individuals (chromosomes) of the population on the value of their gene configuration.
An individual can for example have the following gene configuration: 1001100.
Each number in this binary string represents whether or not to take an item with you. A value of 1 refers to putting the specific item in the knapsack while a 0 refers to leave the item at home. Given the example gene configuration we would take the following items.
chromosome = c(1, 0, 0, 1, 1, 0, 0) dataset[chromosome == 1, ] ## item survivalpoints weight ## 1 pocketknife 10 1 ## 4 unions 2 1 ## 5 sleeping bag 30 7
We can check to what amount of surivival points this configuration sums up.
cat(chromosome %*% dataset$survivalpoints) ## 42
Above we gave a value to the gene configuration of a given chromosome. This is exactly what the evaluation function does.
The genalg algorithm tries to optimize towards the minimum value. Therefore, the value is calculated as above and multiplied with -1. A configuration which leads to exceeding the weight constraint returns a value of 0 (a higher value can also be given).
We define the evaluation function as follows.
evalFunc = function(x) { current_solution_survivalpoints = x %*% dataset$survivalpoints current_solution_weight = x %*% dataset$weight if (current_solution_weight > weightlimit) return(0) else return(-current_solution_survivalpoints) }
Next, we choose the number of iterations, design and run the model.
iter = 100 GAmodel = rbga.bin(size = 7, popSize = 200, iters = iter, mutationChance = 0.01, elitism = T, evalFunc = evalFunc) cat(summary.rbga(GAmodel)) ## GA Settings ## Type = binary chromosome ## Population size = 200 ## Number of Generations = 100 ## Elitism = TRUE ## Mutation Chance = 0.01 ## ## Search Domain ## Var 1 = [,] ## Var 0 = [,] ## ## GA Results ## Best Solution : 1 1 0 1 1 1 1
The best solution is found to be 1111101. This leads us to take the following items with us on our trip into the wild.
solution = c(1, 1, 1, 1, 1, 0, 1) dataset[solution == 1, ] ## item survivalpoints weight ## 1 pocketknife 10 1 ## 2 beans 20 5 ## 3 potatoes 15 10 ## 4 unions 2 1 ## 5 sleeping bag 30 7 ## 7 compass 30 1
This in turn gives us the total number of survival points.
# solution vs available cat(paste(solution %*% dataset$survivalpoints, "/", sum(dataset$survivalpoints))) ## 107 / 117
Let's visualize how the model evolves.
plot(GAmodel)
Vocabulary
Useful links