Greedy algorithm and simulated annealing algorithm

The greedy algorithm is a fairly common one. The essence of greed is to choose the local optimum at each stage in order to reach the global optimum. Simulated annealing is more common in artificial intelligence.

The reason these two algorithms are discussed together is that simulated annealing behaves like an enhanced version of the greedy algorithm.

If choosing the local optimum at each step of a problem cannot produce the global optimum, the usual line of thinking for us conventional front-end and back-end programmers is Dynamic Programming.

But in AI development, where a model may have hundreds of millions of parameters, Dynamic Programming is simply not feasible. In that case you can try simulated annealing, which jumps out of a local optimum with a certain probability to find a better answer globally. Note, however, that unlike a conventional algorithm, it does not guarantee a deterministic result.

This article summarizes and compares the two algorithms together to see where they are applicable and how to use them.

Greedy algorithm

What is a greedy algorithm

The essence of greed is to choose the local optimum of each stage, so as to achieve the global optimum.

This is a bit abstract, let’s take an example:

For example, suppose you have a pile of banknotes and you may take ten of them. How should you take them to end up with the largest total amount?

Simply take the largest remaining note each time, and the end result is the largest possible total.

Taking the largest note each time is the local optimum, and ending up with the largest total is the global optimum derived from it.
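The banknote example can be sketched in a few lines. The note values and the pick count below are made-up sample data, not from the article:

```python
# Greedy strategy for the banknote example: taking the largest remaining note
# each time yields the largest possible total.
def greedy_pick(notes, k):
    """Sort descending and take the k largest notes."""
    return sorted(notes, reverse=True)[:k]

notes = [100, 5, 50, 20, 100, 1, 10, 50, 20, 5, 100, 10]
picked = greedy_pick(notes, 10)
print(sum(picked))  # → 465
```

Because each pick is independent of the others, the local optimum (largest single note) directly stacks into the global optimum (largest total).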

As another example, suppose there are a bunch of boxes and you have a backpack of volume n. How do you fill the backpack as full as possible? Picking the largest box each time no longer works; this is where Dynamic Programming is needed.

When to use a greedy algorithm

To be honest, there is no fixed routine for greedy algorithms.

The only difficulty is figuring out how to derive the global optimum from the local optimum.

So how can we see if the local optimum can lead to the overall optimum? Are there any fixed strategies or routines?

Sorry, no! You have to simulate by hand. If the simulation works, you can try a greedy strategy; if it does not, Dynamic Programming may be required.

A colleague once asked: how do you verify whether a greedy algorithm can be used?

The best strategy is to look for counterexamples. If you cannot think of one, try greed.

Greedy algorithm general steps:

  • Break down the problem into several sub-problems
  • Find the right greedy strategy
  • Solve the optimal solution for each sub-problem
  • Stack locally optimal solutions into globally optimal solutions

Frankly, these four steps are too theoretical. When we actually work on greedy problems, it is hard to think along them; they are a bit like "chicken ribs": of some value, but not much.

When solving a problem, it is actually enough to think clearly about what the local optimum is and whether it can be used to derive the global optimum.

Simulated annealing algorithm

Metal annealing principle

Metal annealing is a heat treatment process in which a metal is heated to a certain temperature, held there for a sufficient time, and then cooled at a suitable speed (usually slowly, sometimes under controlled cooling). The simulated annealing algorithm is derived from this principle of solid annealing: heat the solid to a sufficiently high temperature, then let it cool slowly. When heated, the internal particles of the solid become disordered as the temperature rises and the internal energy increases; when slowly cooled, the particles gradually become orderly, reaching an equilibrium state at each temperature, and finally reaching the ground state at room temperature, where the internal energy is minimized.

When in a low temperature state, the molecules in the solid have very low internal energy and vibrate in a small range in their original position. If the solid is heated to a certain temperature, the internal energy of the molecules will increase, the thermal movement will intensify, and the disorder of the molecular arrangement will increase. At this time, the temperature is slowly lowered, and at each temperature an equilibrium state (that is, a quasi-static process) is reached, the energy of the molecules gradually decreases, and eventually returns to the state of orderly arrangement, and the internal energy of the molecules also drops to a minimum.

Simulated annealing algorithm

The earliest idea of the simulated annealing algorithm (Simulated Annealing, SA) was proposed by N. Metropolis et al. in 1953. In 1983, S. Kirkpatrick et al. successfully introduced the annealing idea into the field of combinatorial optimization. It is a stochastic optimization algorithm based on a Monte Carlo iterative solution strategy, and its starting point is the similarity between the annealing process of solid matter in physics and general combinatorial optimization problems.

Before introducing simulated annealing, it is necessary to introduce the hill climbing algorithm.

Hill climbing algorithm

The hill climbing algorithm is a simple greedy search: at each step it selects the best solution from the neighborhood of the current solution as the new current solution, until a local optimum is reached.

The hill climbing algorithm is very simple to implement, and its main disadvantage is that it falls into local optima and may never find the global optimum. Imagine an energy landscape with a local peak at point A and a higher peak elsewhere: if point C is the current solution, hill climbing stops searching once it reaches the local optimum at A, because moving A in either direction cannot yield a better solution.
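A minimal sketch of hill climbing on a one-dimensional function. The objective `f`, step size, and iteration cap are illustrative choices, not from the article:

```python
# Hill climbing: at each step move to the best neighboring solution, and stop
# once no neighbor improves on the current one.
def hill_climb(f, x0, step=0.1, max_iter=1000):
    x = x0
    for _ in range(max_iter):
        best = max((x - step, x + step), key=f)  # best neighbor
        if f(best) <= f(x):  # no neighbor is better: a local optimum
            return x
        x = best
    return x

# A landscape with a local peak at x = 1 and a higher global peak at x = 3:
# starting at x = 0, hill climbing stops at the local peak and never
# reaches the global one.
f = lambda x: -(x - 1) ** 2 if x < 2 else 4 - (x - 3) ** 2
print(round(hill_climb(f, 0.0), 1))  # → 1.0
```

Started from the other side of the valley (say x = 2.5), the same routine climbs to the global peak instead; the result depends entirely on the starting point, which is exactly the weakness simulated annealing addresses.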

Core idea of simulated annealing

Simulated annealing is essentially a greedy algorithm, but its search process introduces random factors. It accepts a solution that is worse than the current one with a certain probability, so it can jump out of a local optimum and may reach the global optimum.

The simulated annealing algorithm starts from a high initial temperature. As the temperature parameter continuously falls, combined with a probabilistic jump characteristic, it randomly searches the solution space for the global optimum of the target function; that is, it can probabilistically escape a local optimum and eventually tend toward the global optimum.

The calculation of this "certain probability" refers to the annealing process in metal smelting, which is also the origin of the algorithm's name. The temperature T is treated as the control parameter, the target function value f as the internal energy E, and a state of the solid at temperature T corresponds to a solution. The algorithm then attempts to reduce the target function f (internal energy E) as the control parameter T decreases, until it tends to the global minimum (the lowest energy state at low temperature in annealing), just like the metal annealing process.

Mathematical principles of simulated annealing

From the above, we know that the global optimum of the target function is found by randomly searching the solution space in combination with the probabilistic jump characteristic. So what is the specific mechanism for updating the solution? If the new solution is better than the current one, it is accepted; otherwise, whether to accept it is decided by the Metropolis criterion. The acceptance probability is:

\begin{equation}
P = \begin{cases}
1 & E_{t+1} < E_t \\
e^{-\frac{E_{t+1} - E_t}{kT}} & E_{t+1} \geq E_t
\end{cases}
\end{equation}

Assume the current search solution is $x_t$, with corresponding system energy (target function value) $E_t$. A random perturbation is applied to the search point to generate a new solution $x_{t+1}$, with system energy $E_{t+1}$; the probability of accepting the transition is given by the formula above.

That is, if the new solution has lower energy, the acceptance probability is 1: it is always accepted. If the new solution has higher energy, it is accepted with probability $e^{-\frac{E_{t+1} - E_t}{kT}}$, i.e. with that probability the search jumps out of the local optimum.
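The Metropolis acceptance rule can be sketched in a few lines. For simplicity, the Boltzmann constant k is folded into the temperature T here:

```python
import math
import random

# Metropolis criterion: a better solution is always accepted; a worse one is
# accepted with probability e^(-delta_e / T).
def accept(delta_e, T):
    if delta_e < 0:
        return True  # lower energy: always accept
    return random.random() < math.exp(-delta_e / T)

print(accept(-1.0, 1.0))  # → True (better solution, always accepted)
```

At high T the exponent is close to 0 and even large energy increases are often accepted; as T falls toward 0, the acceptance probability for worse solutions vanishes and the rule degenerates into plain hill climbing.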

This process randomly samples x and looks for the x with the lowest corresponding E; in our concrete algorithm, a lower E means being closer to the goal we want.

There is another question here: when we randomly perturb x, how large should each change be?

This is actually the origin of the word 'annealing': the temperature gradually decreases, and the magnitude of the change in x becomes smaller and smaller.

Simulated annealing process

The essence of the algorithm is divided into two layers of cycles. At any temperature level, random disturbances generate a new solution, and calculate the change of the target function value to decide whether to accept it. Since the initial temperature of the algorithm is relatively high, the new solution that increases E may also be accepted initially, so it can jump out of the local minimum, and then by slowly reducing the temperature, the algorithm may eventually converge to the global optimal solution. The specific process is:

  1. Let $T = T_0$, the initial temperature at which annealing starts; randomly generate an initial solution $x_0$ and compute the corresponding target function value $E_0$;
  2. Let $T = kT$, where $k$ is between 0 and 1 and is the rate of temperature decrease;
  3. Apply a random perturbation to the current solution $x_t$ to generate a new solution $x_{t+1}$ in its neighborhood, compute the corresponding target function value $E_{t+1}$, and let $\Delta E = E_{t+1} - E_t$;
  4. If $\Delta E < 0$, accept the new solution as the current solution; otherwise decide whether to accept it with probability $e^{-\frac{\Delta E}{kT}}$;
  5. Repeat the perturbation-and-acceptance process (steps 3 and 4) L times at temperature T;
  6. Check whether the temperature has reached the termination level; if so, stop the algorithm, otherwise return to step 2.
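The steps above can be sketched as a two-level loop, here minimizing a simple 1-D objective. T0, the cooling rate k, the inner-loop length L, the termination temperature, and the perturbation scheme are all illustrative choices, not prescribed by the article:

```python
import math
import random

def simulated_annealing(f, x0, T0=100.0, k=0.95, L=50, T_end=1e-3):
    x, best = x0, x0
    T = T0  # step 1: start at the initial temperature
    while T > T_end:  # step 6: stop at the termination temperature
        for _ in range(L):  # step 5: L perturbations per temperature level
            # step 3: random perturbation; the step size shrinks as T falls
            x_new = x + random.uniform(-1, 1) * T / T0
            delta_e = f(x_new) - f(x)
            # step 4: Metropolis criterion
            if delta_e < 0 or random.random() < math.exp(-delta_e / T):
                x = x_new
            if f(x) < f(best):
                best = x  # remember the best solution seen so far
        T *= k  # step 2: cool down
    return best

random.seed(0)
f = lambda x: (x - 3) ** 2 + 1  # minimum at x = 3
print(round(simulated_annealing(f, x0=0.0), 2))
```

Tracking `best` separately from the current solution is a common practical touch: the walk is allowed to wander uphill, but the answer returned is the lowest-energy point it ever visited.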

There are a few points to note:

  • The choice of initial point has some impact on the result; it is best to run the algorithm several times and judge the results comprehensively.
  • In the early stage, the temperature drops quickly to avoid accepting too many poor results; as running time increases, the temperature drop slows down so that the result can stabilize.
  • After enough iterations, the result may already be stable while the algorithm still has time before it ends. It is worth adding appropriate early-termination conditions to the program and stopping once they are met.

Application of simulated annealing

  • VLSI design. Using simulated annealing to optimize VLSI (Very Large Scale Integration) design is currently one of its most successful applications. Simulated annealing can handle nearly all VLSI optimization tasks well, such as global routing, board placement, layout, and logic minimization.

  • Image restoration. Simulated annealing can recover a clear original image from a contaminated one and filter out the distorted parts, so its application prospects in image processing are broad.

  • Neural network computing. Simulated annealing has the ability to escape the trap of local optima: in a Boltzmann machine, even if the system falls into a local optimum, after some time it can jump out again, and the system eventually converges toward the global optimum.

  • The force-directed layout algorithm implicitly contains the idea of simulated annealing. At the start, all nodes are placed randomly, and then the forces between nodes change their positions. A position change corresponds to a change in x; if the resultant force on a node decreases after the move, which corresponds to a decrease in energy E, we accept the new solution. As the resultant force decreases, it is effectively a decrease in T, so the effect on positions becomes smaller. Once the total resultant force falls below a threshold and positions barely change, we consider the algorithm to have converged.