Overview

Josiann is a Python library that implements the simulated annealing algorithm for noisy cost functions. It supports vectorized cost functions and multiprocessing, and provides a parallel mode for optimizing several similar but independent problems at once.

Description

A classical problem in optimization is finding the global minimum of a cost function. In many cases, however, measurements of the cost function carry noise, either because of imprecise measurement techniques or because of intrinsic randomness. Josiann is an implementation of the modified Simulated Annealing algorithm introduced in [1].

Warning

The algorithm converges under the condition that the cost function has the form \(f(x) = g(x) + e(x)\), where \(e(x)\) is a random term drawn from a symmetric distribution with mean 0.

Briefly, the algorithm takes multiple evaluations of the cost function and averages them to reduce the noise. The number of evaluations to average depends on the Temperature parameter and on the maximum allowed number of evaluations, and grows with the number of iterations (so that the standard error of the noise decreases as \(O(k^{-\gamma})\), with \(k\) the iteration number and \(\gamma\) an arbitrary constant larger than 1).
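For illustration, a cost function of this form could look like the following sketch, where the quadratic \(g\) and the noise level are arbitrary examples (not part of Josiann):

    import numpy as np

    rng = np.random.default_rng(42)

    def g(x: np.ndarray) -> float:
        # Underlying deterministic cost: a simple quadratic bowl.
        return float(np.sum(x ** 2))

    def f(x: np.ndarray) -> float:
        # Observed cost: g(x) plus a symmetric, zero-mean noise term e(x).
        return g(x) + rng.normal(loc=0.0, scale=0.5)

    # Averaging m independent evaluations divides the noise's standard error
    # by sqrt(m); Josiann increases this number of averaged evaluations as
    # the iterations progress.
    x = np.array([1.0, -2.0])
    averaged = np.mean([f(x) for _ in range(100)])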

Implementations

Josiann provides four implementations of the algorithm: three sequential modes and one parallel mode.

Sequential modes

In sequential modes, one problem is solved at a time, although the cost function can have any number of dimensions.

  • josiann.sa() is the base version of the algorithm.

  • josiann.vsa() expects a vectorized cost function to compute multiple evaluations of the function at once: it either evaluates the cost function multiple times at the same position or, when possible, evaluates it at different positions (see the sketch after the note below).

  • josiann.mcsa() expects a regular cost function but runs on multiple CPU cores to obtain multiple evaluations faster.

Note

As opposed to josiann.sa(), josiann.vsa() expects a cost function of the form f: (n, m) matrix ⟶ m-dim vector, where m is the number of evaluations to compute.
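For instance, a vectorized cost function matching this shape could look like the following sketch (the quadratic is only an illustration, and we assume here that all other parameters of josiann.vsa() can be left at their defaults):

    import numpy as np
    import josiann

    def vectorized_cost(x: np.ndarray) -> np.ndarray:
        # x is an (n, m) matrix holding m positions of dimension n, one per
        # column; one noisy cost is returned per column.
        return np.sum(x ** 2, axis=0) + np.random.normal(scale=0.1, size=x.shape[1])

    result = josiann.vsa(vectorized_cost, x0=np.zeros(2))

josiann.sa() and josiann.mcsa(), by contrast, take the regular form f: n-dim vector ⟶ float.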

Parallel mode

In parallel mode, multiple independent optimization problems are solved in parallel. The cost function can have any number of dimensions but must be the same for every parallel optimization task.

Note

As with josiann.vsa(), josiann.parallel.psa() expects a cost function of the form f: (n, m) matrix ⟶ m-dim vector.

More precisely, the cost function should take as its first argument a josiann.parallel.ParallelArgument object, which holds all the instructions for one iteration's computation.

The function evaluations should not be returned by the cost function but rather stored in the josiann.parallel.ParallelArgument.result attribute, which lets Josiann check that all function evaluations were indeed computed.
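A minimal sketch of such a cost function is shown below; note that only the result attribute is documented here, so the positions attribute used to read the requested positions is an assumption to be checked against the API reference:

    import numpy as np
    from josiann.parallel import ParallelArgument

    def parallel_cost(arg: ParallelArgument) -> None:
        # Assumed: the requested positions are exposed as an (n, m) matrix
        # through a 'positions' attribute, following the convention above.
        x = arg.positions
        # Store the evaluations instead of returning them, so that Josiann
        # can check that all requested evaluations were computed.
        arg.result = np.sum(x ** 2, axis=0)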

Moves

Moves define how a position vector (at which the cost function is evaluated) is modified at each iteration to generate new candidate positions. Some commonly used moves have been implemented:

Sequential moves (josiann.moves.sequential)

Ensemble moves (josiann.moves.ensemble): moves involving multiple walkers

Discrete moves (josiann.moves.discrete):

  • josiann.moves.discrete.SetStep (a set of possible positions \(\{a_1, \ldots, a_p\}\) is defined for each dimension of the position vector; at each iteration, element \(e = a_i\), with \(i \in \{1, \ldots, p\}\), of the current position vector is updated by picking at random its left or right neighbor in the set, i.e. \(e\) is set to \(a_{i-1}\) or \(a_{i+1}\); a toy sketch is given after this list).

  • josiann.moves.discrete.SetStretch (a modified version of the Stretch move with a defined set of possible positions: during the stretch, the closest value in the set is chosen).

Parallel moves (josiann.parallel.moves.discrete)
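For intuition, the update rule of josiann.moves.discrete.SetStep can be re-implemented in a few lines for a single dimension (a toy illustration only, not Josiann's actual implementation):

    import numpy as np

    rng = np.random.default_rng(0)

    # Set of possible positions {a_1, ..., a_p} for one dimension.
    positions = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

    def set_step_1d(e: float) -> float:
        # e equals some a_i; pick its left or right neighbor at random,
        # clipping at the ends of the set.
        i = int(np.searchsorted(positions, e))
        j = int(np.clip(i + rng.choice([-1, 1]), 0, len(positions) - 1))
        return float(positions[j])

    new_e = set_step_1d(1.0)  # either 0.5 or 1.5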

Walkers

A walker is a point that Josiann tracks through the search space. In regular settings, Josiann only tracks a single position vector, so the number of walkers is 1.

With some special moves (ensemble moves), however, it is possible to obtain more information about the cost function by evaluating it at multiple positions. Each position Josiann keeps track of is a distinct walker, defined by its position vector and associated cost. At each iteration, the positions of all walkers are updated by the ensemble move, usually by taking the positions of the other walkers into account.

Parameter nb_walkers allows the user to choose how many walkers to run.

Note

When using josiann.vsa(), it is possible to vectorize on walkers (instead of vectorizing on the number of function evaluations): the costs for the positions of all walkers are computed at once. To do so, set the vectorized_on_evaluations parameter to False, as in the sketch below.
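A sketch of this setup follows; the Stretch move import path, the exact array layout and the call signature are assumptions to be checked against the API reference:

    import numpy as np
    import josiann
    from josiann.moves.ensemble import Stretch  # assumed import path

    def cost(x: np.ndarray) -> np.ndarray:
        # Assumed layout: one column per walker, one noisy cost per walker.
        return np.sum(x ** 2, axis=0) + np.random.normal(scale=0.1, size=x.shape[1])

    result = josiann.vsa(
        cost,
        x0=np.zeros((2, 3)),              # 2-dimensional problem, 3 walkers
        moves=Stretch(),                  # ensemble move using all walkers
        nb_walkers=3,
        vectorized_on_evaluations=False,  # vectorize on walkers instead
    )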

Backup

When using discrete moves, the probability for a walker to reach the same position twice is high enough that caching past function evaluations saves computation later on. This is especially true during the final iterations, when the walker does not move much. When the backup is active, positions and associated costs are stored so that only \(n - l\) evaluations need to be computed instead of \(n\), where \(l\) is the number of function evaluations already computed at the same position.

To activate the backup, set the backup parameter to True when calling josiann.sa(), josiann.vsa(), josiann.mcsa() or josiann.parallel.psa().
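For example, enabling the cache together with the josiann.moves.discrete.SetStep move from above could look like this (the SetStep argument name is an assumption; the quadratic is for illustration only):

    import numpy as np
    import josiann
    from josiann.moves.discrete import SetStep

    def cost(x: np.ndarray) -> float:
        # Noisy quadratic, for illustration only.
        return float(np.sum(x ** 2) + np.random.normal(scale=0.1))

    result = josiann.sa(
        cost,
        x0=np.array([4.0, -3.0]),
        moves=SetStep(position_set=[np.arange(-5, 5.5, 0.5)] * 2),  # argument name assumed
        backup=True,  # cache (position, cost) pairs across iterations
    )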

Convergence detection

As the Temperature parameter drops, the fraction of accepted candidate moves decreases significantly. When a strong minimum exists, a large fraction of the last iterations is spent repeatedly evaluating the cost function at the same few neighboring positions. To detect such cases of early convergence, Josiann can compute the Root-Mean-Square Deviation (RMSD) of the costs of the last \(w\) accepted positions. When the RMSD drops below a threshold value, convergence is declared and the algorithm stops early.

Convergence detection is activated by default but can be controlled with the detect_convergence parameter. The number \(w\) of past accepted positions to consider and the threshold value can be configured through the window_size and tol parameters, respectively.
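Concretely, writing \(c_1, \ldots, c_w\) for the costs of the last \(w\) accepted positions and \(\bar{c}\) for their mean, the stopping criterion (assuming the deviation is measured from the window mean) reads:

\[\mathrm{RMSD} = \sqrt{\frac{1}{w} \sum_{i=1}^{w} \left(c_i - \bar{c}\right)^2} < \mathrm{tol}\]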

Result and Trace objects

All four algorithms return a josiann.Result object, which bundles the outcome of the optimization with a josiann.Trace object recording the history of the run; the trace can be plotted for inspection.

Warning

Plotting requires the Python library plotly.

Parameters

Common parameters are:

  • fun: the cost function to optimize

  • x0: a vector (or matrix) of initial positions

  • args: additional arguments to be passed to fun. Those arguments are constant and will not be updated by the optimization algorithm.

  • bounds: optional min and max bounds for each dimension, limiting the optimization search space.

  • moves: one or more moves to update position vectors.

  • max_iter: maximum number of iterations to compute before stopping the optimization algorithm.

  • max_measures: maximum number of function evaluations to compute at a single position.

  • T_0: initial value for the Temperature parameter.

  • detect_convergence: whether to stop the optimization algorithm before reaching max_iter if convergence is detected.

  • tol: tolerance for detecting convergence.

  • window_size: number of iterations to consider for detecting convergence, computing the fraction of accepted moves and finding the best position reached (the position with the lowest associated cost).

  • seed: an optional seed value to fix the randomness.
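Putting these together, a call to josiann.sa() using the common parameters could look like the following sketch (all values are illustrative, and the bounds format and the SetStep argument name are assumptions to be checked against the API reference):

    import numpy as np
    import josiann
    from josiann.moves.discrete import SetStep

    def cost(x: np.ndarray, scale: float) -> float:
        # Noisy quadratic; 'scale' is passed through 'args' and never updated.
        return float(np.sum(x ** 2) + np.random.normal(scale=scale))

    result = josiann.sa(
        fun=cost,
        x0=np.array([3.0, -1.0]),
        args=(0.1,),
        bounds=[(-5.0, 5.0), (-5.0, 5.0)],  # assumed (min, max) per dimension
        moves=SetStep(position_set=[np.arange(-5, 5.5, 0.5)] * 2),  # argument name assumed
        max_iter=500,
        max_measures=50,
        T_0=5.0,
        detect_convergence=True,
        tol=1e-3,
        window_size=100,
        seed=42,
    )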