Genetic Algorithm: Optimizing the Traveling Salesman

The traveling salesman is an interesting problem to test a simple genetic algorithm on something more complex.

Let’s check how it’s done in python.

What is the traveling salesman problem?

When we talk about the traveling salesmen problem we talk about a simple task.

On any number of points on a map: What is the shortest route between the points? We start at any point, visit each point once and go back to the first one. So a pretty good solution for the problem here would look like this:

Humans are good at doing this up to a point. Computers are not. There are more efficient algorithms, but let’s assume we don’t have them yet. To solve even this simple 7 point problem we would have to check every possible path.

How many are there? Well like:

7 * 6 * 5 * 4 * 3* 2 * 1 = 5040

That’s doable. But assume 25 points:

25*24*23*22 … *3*2*1 = 15511210043331000000000000

Thats for all practical purposes impossible.

Why can a genetic algorithm solve this?

“Solve” is such a strong word. But genetic algorithms optimize genetic information. Everything that can be written as a row of numbers can be optimized if we can differ between a good and a bad piece of information. Also check “What is a genetic algorithm?”

If we number these points from 0 to 6(seven points) we can write a route like this:


Also, we can easily check the length of a path to compare it. On a 2d grid, the distance between two points a and b is:

absolute(xa -xb) + absolute(ya – yb)

We just add all the distances between the points and get the length of the route. The shorter the better.

So how do we do it?

Remember the steps of a genetic algorithm:

  1. create some random routes
  2. find the best routes among them
  3. breed new routes from the best ones
  4. change some routes randomly
  5. repeat 2-4 as long as we want

Some of that is more or less difficult. Well see it in detail soon.

Let’s get coding

We are doing this in python. Let’s start by importing all dependencies:

We need a little helper function. It just takes a numpy array a and deletes all values from it that are in b.

We build an object and initialize it. Therefore we create some parameters to control its behavior:

This is a method to get the best routes. We do this for all routes and then return the best 50% in order(longest to shortest).

We breed them by using half of the information of parent a and add them exactly where it was in parent a. Then we fill it up with the rest from parent b in order. Like this:

Parent A: 6435210

ParentB: 1203654

Offspring: 0635214

We do it with this methods:

The next step is mutating them. We use the mutation rate parameter which should be between 0 and 1 to switch out random points. We can’t just replace them as no point can be in there twice. 0.1 is a good value for the mutation rate. Also, there is a parameter “mutatebest” we use it to block the mutation of the best path. This makes sure we never get worse by mutation.

Lets put it all together. The “verbose” parameter switches a best/worse line for each iteration.

The last method is used to draw all the shiny graphics:

Cool. We are done. Let’s give it a go:

The code shows the points to connect first, followed by the best random route and then the best after all iterations:

Looks pretty good.

You need to understand that while these algorithms are optimizing, there is no guarantee the solution is optimal. You can make it better by using more iterations or more routes. But you can never be sure its the best one. This is the flaw in genetic algorithms. You can find “pretty good” solutions for problems that are too complex to even try with other algorithms. But they might never be optimal. Like this one for 25 points after 10000 iterations:

It sure looks good. But we can’t be sure it is the best. For more complex or multidimensional problems it’s the same. We use these algorithms until we have a solution that is as good as we need it.

Ok. Your turn now. Try getting a really good solution for 100 points.

Mines here:

100 Points, 25 routes, mutation rate 0.1, 100000 iterations.

It’s not even close to optimal, but half the length of the random path.

By now you might guess that each crossover is an error that might get solved if we let this run longer. Try to do better than me.

Find this as Jupyter notebook at GitHub: GeneticMLExamples

Have fun.

4 Replies to “Genetic Algorithm: Optimizing the Traveling Salesman”

  1. Hi kage,

    I am not entirely sure. I like the short explanation here:

    The fitness function IS the biggest problem of it though so the simplest form: “O(NG)” that’s described there isn’t really giving you an answer that helps you out much. Optimization of the fitness function is the main priority when trying to optimize the performance.

    No pseudocode from me for this, sorry. Just not willing to put in the time right now. Maybe this can help you:

  2. Hmm, am I missing something? What is
    referring to? Doesn’t seem to be defined as far as I and my search function can see.

    1. Thank you very much Dorian.
      You were right. I added it.
      I also uploaded the notebook to GitHub. The link is at the end of the article.

      Have fun.

Leave a Reply

Your email address will not be published. Required fields are marked *