# Genetic Algorithm: Optimizing the Traveling Salesman

The traveling salesman problem is a good way to test a simple genetic algorithm on something more complex.

Let’s check how it’s done in Python.

## What is the traveling salesman problem?

Given any number of points on a map: What is the shortest route between them? We start at any point, visit each point exactly once, and return to the first one. Humans are good at solving this up to a point; computers are not. There are more efficient algorithms, but let’s assume we don’t have them yet. To solve even a simple 7-point problem exactly, we would have to check every possible path.

How many are there?

7 * 6 * 5 * 4 * 3 * 2 * 1 = 5040

That’s doable. But assume 25 points:

25 * 24 * 23 * 22 * … * 3 * 2 * 1 = 15,511,210,043,330,985,984,000,000

That’s, for all practical purposes, impossible.
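You can check both numbers with Python’s standard library:

```
import math

#number of orderings to check for 7 and 25 points
print(math.factorial(7))   # 5040
print(math.factorial(25))  # 15511210043330985984000000
```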

## Why can a genetic algorithm solve this?

“Solve” is such a strong word. But genetic algorithms optimize genetic information: everything that can be written as a row of numbers can be optimized, as long as we can tell a good piece of information from a bad one. Also check “What is a genetic algorithm?”

If we number these points from 0 to 6 (seven points), we can write a route like this:

2-1-4-3-6-0-5

Also, we can easily compute the length of a path to compare routes. On a 2D grid, the (Manhattan) distance between two points a and b is:

absolute(xa - xb) + absolute(ya - yb)

We just add up all the distances between consecutive points and get the length of the route. The shorter, the better.
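Under this Manhattan metric, the scoring can be sketched as a small standalone function (`route_length` is a hypothetical name, not part of the class we build later):

```
import numpy as np

def route_length(targets, route):
    """Total Manhattan length of a closed route over the given points."""
    ordered = targets[route]                          #points in visiting order
    closed = np.append(ordered, ordered[:1], axis=0)  #return to the start
    steps = np.abs(np.diff(closed, axis=0))           #|xa - xb| and |ya - yb| per leg
    return steps.sum()

targets = np.array([[0, 0], [3, 0], [3, 4]])
print(route_length(targets, np.array([0, 1, 2])))  # 3 + 4 + 7 = 14
```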

## So how do we do it?

Remember the steps of a genetic algorithm:

1. create some random routes
2. find the best routes among them
3. breed new routes from the best ones
4. change some routes randomly
5. repeat 2-4 as long as we want
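The steps above can be sketched as a generic loop. This is not the implementation we build below; `fitness`, `breed`, and `mutate` are placeholder functions you supply:

```
import random

def run_ga(population, fitness, breed, mutate, iterations):
    """Generic GA loop: keep the best half, breed and mutate the rest."""
    for _ in range(iterations):
        #2. find the best specimens (higher fitness is better)
        population.sort(key=fitness, reverse=True)
        best = population[: len(population) // 2]
        #3. breed new specimens from the best ones
        children = [breed(random.choice(best), random.choice(best))
                    for _ in range(len(population) - len(best))]
        #4. change some specimens randomly
        population = best + [mutate(child) for child in children]
    return max(population, key=fitness)
```

For the traveling salesman, the population would be a set of routes and the fitness the negative route length.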

Some of these steps are more difficult than others. We’ll see them in detail soon.

## Let’s get coding

We are doing this in Python. Let’s start by importing all dependencies:

```
#import dependencies
import numpy as np
import random
import matplotlib.pyplot as plt
```

We need a little helper function. It takes a NumPy array a and deletes all values from it that also appear in b.

```
#helper function that deletes all values from a that are also in b
def deletebfroma(a, b):
    index = np.array([], dtype=np.int16)
    for number in range(len(a)):
        if a[number] in b:
            index = np.append(index, number)
    return np.delete(a, index)
```
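For what it’s worth, NumPy can express the same behavior in one line with np.isin; this drop-in variant keeps the order of a:

```
import numpy as np

#equivalent one-liner: keep only the values of a that are not in b
def deletebfroma(a, b):
    return a[~np.isin(a, b)]

a = np.array([1, 2, 0, 3, 6, 5, 4])
b = np.array([3, 5, 2, 1])
print(deletebfroma(a, b))  # [0 6 4]
```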

We build a class and initialize it with some parameters that control its behavior:

```
# the whole thing is an object
class salesman(object):
    #initialize the object
    def __init__(self, xymax, numberofstops, maxmen, mutationrate, verbose = False, mutatebest = True):

        self.numberofstops = numberofstops #number of points to connect
        self.mutatebest = mutatebest #whether the best path at each iteration should be mutated or not
        self.verbose = verbose #whether the best and worst paths for each iteration should be shown
        self.maxmen = maxmen #maximum number of specimens
        self.xymax = xymax #size of the "map"
        self.mutationrate = mutationrate #rate of mutation, 0.1 is plenty

        #randomly initialize the targets
        self.targets = np.random.randint(xymax, size=(numberofstops, 2))
        #randomly initialize the specimens
        self.men = np.empty((maxmen, numberofstops), dtype = np.int32)
        for number in range(maxmen):
            tempman = np.arange(numberofstops, dtype = np.int32)
            np.random.shuffle(tempman)
            self.men[number] = tempman

        #find the best of the specimens just created
        self.best = np.array(self.getbestsalesmen())[...,0][-1]
```

This method scores all routes and returns the best 50% in order (longest to shortest):

```
#method that returns the best routes at runtime
def getbestsalesmen(self):
    #initiate a temporary order
    temporder = np.empty([len(self.men), 2], dtype = np.int32)
    #write the indexes of the routes to temporder before sorting changes them
    for number in range(len(self.men)):
        temporder[number] = [number, 0]
    #get length of path for all routes
    for number in range(len(self.men)):
        templength = 0
        #get length of path
        for target in range(len(self.targets) - 1):
            diffx = abs(self.targets[self.men[number][target]][0] - self.targets[self.men[number][target + 1]][0])
            diffy = abs(self.targets[self.men[number][target]][1] - self.targets[self.men[number][target + 1]][1])
            templength = templength + diffx + diffy
        #close the loop: distance from the last point back to the first
        diffx = abs(self.targets[self.men[number][0]][0] - self.targets[self.men[number][-1]][0])
        diffy = abs(self.targets[self.men[number][0]][1] - self.targets[self.men[number][-1]][1])
        templength = templength + diffx + diffy
        temporder[number][1] = templength
    #sort routes by length of path, longest first
    temporder = sorted(temporder, key=lambda x: -x[1])
    #return the best half of the routes (rounded up)
    return temporder[int(len(temporder)/2):]
```

We breed two routes by taking half of parent A’s information and keeping it exactly where it was in parent A. Then we fill the remaining positions with parent B’s stops in their original order. Like this:

Parent A: 6435210

Parent B: 1203654

Offspring: 0635214
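Assuming the slice taken from parent A starts at index 2 and is four stops long, the example can be reproduced with a small standalone sketch (`crossover` is a hypothetical name for the same idea):

```
import numpy as np

def crossover(parent_a, parent_b, start, halflength):
    """Take a slice of parent A, keep its position, fill the rest from B."""
    dna1 = parent_a[start:start + halflength]  #slice from parent A
    dna2 = parent_b[~np.isin(parent_b, dna1)]  #parent B without A's slice
    return np.insert(dna2, start, dna1)        #reinsert the slice at its old spot

a = np.array([6, 4, 3, 5, 2, 1, 0])
b = np.array([1, 2, 0, 3, 6, 5, 4])
print(crossover(a, b, start=2, halflength=4))  # [0 6 3 5 2 1 4]
```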

We do it with these methods:

```
#method that breeds two routes and returns the offspring
def breed(self, parent1, parent2):
    #get length of first dna strand (half of it)
    halflength = round(len(self.men[parent1])/2)
    #get a random starting point for taking the first dna sample
    start = np.random.randint(halflength)
    #get first dna sample
    dna1 = self.men[parent1][start:start+halflength]
    #get second dna sample by deleting first sample from dna of second parent
    dna2 = self.men[parent2][:]
    dna2 = deletebfroma(dna2, dna1)
    #create offspring by using dna2 as base and inserting dna1 at the position it had in parent 1
    offspring = np.insert(dna2, start, dna1)
    return offspring
```
```
#Edit: forgot this one, thanks Dorian
#fill the population up with new offspring
def breednewgeneration(self):
    #get best route indexes
    best = np.array(self.getbestsalesmen())[...,0]
    #replace routes by newly bred offspring if they are not among the best
    for i in range(len(self.men)):
        if i not in best:
            self.men[i] = self.breed(random.choice(best), random.choice(best))
```

The next step is mutating them. The mutation rate parameter, which should be between 0 and 1, controls how often we swap random points. We can’t just replace a point with a new one, as no point may appear twice, so we swap two points instead. 0.1 is a good value for the mutation rate. There is also a parameter “mutatebest”: setting it to False blocks the mutation of the best path, which makes sure we never get worse through mutation.

```
#mutate routes
def mutate(self):
    for i in range(len(self.men)): #for each route
        if self.mutatebest == True or i != self.best: #skip the best route unless mutatebest is set
            for j in range(len(self.men[i]) - 1): #for each piece of dna
                if random.random() < self.mutationrate: #if random is < mutationrate
                    #switch places with a random other gene, might hit itself, but whatever
                    rand = random.randint(0, self.numberofstops - 1)
                    temp = self.men[i][j]
                    self.men[i][j] = self.men[i][rand]
                    self.men[i][rand] = temp
```
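A standalone sketch of the same swap idea (`swap_mutate` is a hypothetical name) shows why swapping is safe: the result is always a valid permutation, so no stop is ever lost or duplicated.

```
import random
import numpy as np

def swap_mutate(route, mutationrate):
    """Swap random pairs of stops; the result stays a valid permutation."""
    route = route.copy()
    for j in range(len(route)):
        if random.random() < mutationrate:
            k = random.randrange(len(route))  #may hit j itself, which is a no-op
            route[j], route[k] = route[k], route[j]
    return route

route = np.arange(7)
mutated = swap_mutate(route, 0.5)
print(sorted(mutated))  #still every stop exactly once
```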

Let’s put it all together. The “verbose” parameter switches on a best/worst line for each iteration.

```
#start calculation
def calculate(self, iterations):
    #get the best route and its length
    best = np.array(self.getbestsalesmen())[...,0][-1]
    bestlength = np.array(self.getbestsalesmen())[...,1][-1]
    #print best and bestlength at start
    print(self.men[best])
    print('best length: ', bestlength)
    self.draw(best)
    #repeat for number of iterations
    for number in range(iterations):
        #give an indication of the status
        if number == round(iterations/4):
            print("Status: 1/4")
        if number == round((iterations/4) * 2):
            print("Status: 2/4")
        if number == round((iterations/4) * 3):
            print("Status: 3/4")
        #create new generation
        self.breednewgeneration()
        #set new best as parameter for next iteration
        self.best = np.array(self.getbestsalesmen())[...,0][-1]
        #mutate routes
        self.mutate()
        #print best and worst length of each iteration if verbose is true
        if self.verbose:
            bestlength = np.array(self.getbestsalesmen())[...,1][-1]
            print('best length: ', bestlength)
            worstlength = np.array(self.getbestsalesmen())[...,1][0]
            print('worst length: ', worstlength)
    #print and draw final best
    best = np.array(self.getbestsalesmen())[...,0][-1]
    bestlength = np.array(self.getbestsalesmen())[...,1][-1]
    print(self.men[best])
    print('best length: ', bestlength)
    self.draw(best)
```

The last method is used to draw all the shiny graphics:

```
#draw shiny graphics
def draw(self, index):
    #show the targets on their own first
    plt.scatter(self.targets[...,0], self.targets[...,1], s=20)
    plt.show()
    #then show the route between them
    plt.scatter(self.targets[...,0], self.targets[...,1], s=20)
    linearray = self.targets[self.men[index]]
    #append the first point again to close the loop
    linearray = np.append(linearray, [linearray[0]], axis = 0)
    plt.plot(linearray[...,0], linearray[...,1])
    plt.show()
```

Cool. We are done. Let’s give it a go:

```
#initialize object
man = salesman(1000, 7, 5, 0.1, verbose = False, mutatebest = False)

#start calculation
man.calculate(500)
```

The code first shows the points to connect, followed by the best random route and then the best route after all iterations. Looks pretty good.

You need to understand that while these algorithms optimize, there is no guarantee the solution is optimal. You can improve it by using more iterations or more routes, but you can never be sure it’s the best one. This is the flaw in genetic algorithms: they can find “pretty good” solutions for problems that are too complex to even attempt with other algorithms, but those solutions might never be optimal. Take a run for 25 points after 10000 iterations: it sure looks good, but we can’t be sure it is the best. For more complex or multidimensional problems it’s the same. We use these algorithms until we have a solution that is as good as we need it to be.

Ok. Your turn now. Try getting a really good solution for 100 points.

Mine’s here: 100 points, 25 routes, mutation rate 0.1, 100000 iterations.

It’s not even close to optimal, but it’s half the length of the random path.

By now you might guess that each crossing in the path is an error that might get resolved if we let this run longer. Try to do better than me.

Find this as a Jupyter notebook on GitHub: GeneticMLExamples

Have fun.

## 4 Replies to “Genetic Algorithm: Optimizing the Traveling Salesman”

1. kage says:

What would be the big O notation for this algorithm? And could you provide pseudocode for it?

2. Mike says:

Hi kage,

I am not entirely sure. I like the short explanation here: https://www.researchgate.net/post/How_Can_I_compute_the_time_complexity_Big-O_Notation_of_Genetic_Algorithm2

The fitness function IS the biggest problem of it though so the simplest form: “O(NG)” that’s described there isn’t really giving you an answer that helps you out much. Optimization of the fitness function is the main priority when trying to optimize the performance.

No pseudocode from me for this, sorry. Just not willing to put in the time right now. Maybe this can help you: http://www.cleveralgorithms.com/nature-inspired/evolution/genetic_algorithm.html

3. Dorian says:

Hmm, am I missing something? What is
“self.breednewgeneration()”
referring to? Doesn’t seem to be defined as far as I and my search function can see.

1. Mike says:

Thank you very much Dorian.
You were right. I added it.
I also uploaded the notebook to GitHub. The link is at the end of the article.

Have fun.