# Overview

Josiann is a Python library that implements the simulated annealing algorithm for noisy cost functions. It supports vectorized functions and multiprocessing, and provides a parallel mode for optimizing several similar but independent problems at once.

## Description

A classical problem in optimization consists of finding the global minimum of some cost function. In many cases, however, evaluations of the cost function carry noise, because of imprecise measurement techniques or because of intrinsic randomness. Josiann is an implementation of the modified simulated annealing algorithm introduced in [1].

Warning

The algorithm converges under the condition that the cost function has the form \(f(x) = g(x) + e(x)\), where \(e(x)\) is a random term drawn from a symmetric distribution with mean 0.

Briefly, the algorithm takes multiple evaluations of the cost function and averages them to reduce the noise. The number of evaluations to average depends on the Temperature parameter and on the maximum allowed number of evaluations, and grows with the number of iterations, so that the standard error of the noise decreases on the order of \(O(k^{-\gamma})\), with \(k\) the iteration number and \(\gamma\) an arbitrary constant larger than 1.
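The effect of averaging repeated noisy evaluations can be illustrated with a toy example (this is a standalone sketch of the principle, not josiann code):

```python
import numpy as np

# Toy cost of the required form f(x) = g(x) + e(x), with e(x) Gaussian
# noise of mean 0. Averaging n evaluations at the same position shrinks
# the standard error of the noise like sigma / sqrt(n).

rng = np.random.default_rng(0)

def g(x):
    return x ** 2

def noisy_f(x, sigma=1.0, n=1):
    # n noisy evaluations at the same position, averaged together
    return float(np.mean(g(x) + rng.normal(0.0, sigma, size=n)))

# estimates of g(1.0) = 1.0 using few vs. many averaged evaluations
few = [noisy_f(1.0, n=5) for _ in range(1000)]
many = [noisy_f(1.0, n=500) for _ in range(1000)]
```

The spread of the `many` estimates is roughly ten times smaller than that of the `few` estimates, which is why later (colder) iterations use more averaged evaluations.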

## Implementations

Josiann provides four implementations of the algorithm: three sequential modes and one parallel mode.

### Sequential modes

In sequential modes, **ONE** problem is solved at a time, although the cost function can have any number of dimensions.

- `josiann.sa()` is the base version of the algorithm.
- `josiann.vsa()` expects a vectorized cost function to compute multiple evaluations of the function at once (it either evaluates the cost function multiple times at the same position or evaluates it at different positions when possible).
- `josiann.mcsa()` expects a regular cost function but runs on multiple CPU cores to obtain multiple evaluations faster.

Note

`josiann.sa()` and `josiann.mcsa()` expect a cost function of the form f: n-dim vector ⟶ float. `josiann.vsa()` expects a function of the form f: (n, m) matrix ⟶ m-dim vector, where m is the number of evaluations to compute.
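The two signatures can be illustrated on a toy quadratic (only the shapes follow this page's description; the functions themselves are made up for illustration):

```python
import numpy as np

def cost(x):
    """Scalar form, as expected by josiann.sa() / josiann.mcsa():
    n-dim vector -> float."""
    return float(np.sum(x ** 2))

def vectorized_cost(x):
    """Vectorized form, as expected by josiann.vsa():
    (n, m) matrix -> m-dim vector, one cost per column (= per evaluation)."""
    return np.sum(x ** 2, axis=0)

x = np.array([1.0, 2.0])              # a single position, n = 2
X = np.stack([x, x, x], axis=1)       # (n, m) matrix with m = 3 evaluations
```

Here `vectorized_cost(X)` returns three costs at once, one per column of `X`.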

### Parallel mode

In parallel mode, multiple independent optimization problems are solved in parallel. The cost functions can have any number of dimensions but must be the same across all parallel optimization tasks.

`josiann.parallel.psa()` is the parallel version of the algorithm.

Note

As for `josiann.vsa()`, `josiann.parallel.psa()` expects a cost function of the form f: (n, m) matrix ⟶ m-dim vector.

More precisely, the cost function should take as first argument a `josiann.parallel.ParallelArgument` object which holds all instructions for one iteration's computation:

- `josiann.parallel.ParallelArgument.positions`: the position vectors (an (n, m) matrix)
- `josiann.parallel.ParallelArgument.nb_evaluations`: the number of evaluations required per position vector

The function evaluations should not be returned by the cost function but rather stored in the `josiann.parallel.ParallelArgument.result` attribute, which ensures that all function evaluations were indeed computed.
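The store-instead-of-return convention can be sketched with a minimal stand-in class. `MockParallelArgument` below is hypothetical, written only to illustrate the protocol described above; the real `josiann.parallel.ParallelArgument` differs in its details:

```python
import numpy as np

class MockParallelArgument:
    """Toy stand-in for josiann.parallel.ParallelArgument (illustration only)."""

    def __init__(self, positions, nb_evaluations):
        self.positions = positions            # (n, m) matrix of position vectors
        self.nb_evaluations = nb_evaluations  # evaluations required per position
        self.result = None                    # to be filled by the cost function

def parallel_cost(arg):
    # compute one cost per position vector (one per column) and STORE it
    # in arg.result instead of returning it
    arg.result = np.sum(arg.positions ** 2, axis=0)

arg = MockParallelArgument(
    positions=np.array([[1.0, 0.0],
                        [2.0, 3.0]]),   # n = 2 dimensions, m = 2 positions
    nb_evaluations=np.array([1, 1]),
)
parallel_cost(arg)
```

After the call, `arg.result` holds one cost per column of `arg.positions`.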

## Moves

*Moves* define how a position vector (at which the cost function is evaluated) is modified at each
iteration to generate new candidate positions.
Some commonly used moves have been implemented:

### Sequential moves `josiann.moves.sequential`

- `josiann.moves.sequential.Metropolis`: the next position is drawn at random from a multivariate normal distribution centered at the current position.
- `josiann.moves.sequential.Metropolis1D`: same as Metropolis, but only one dimension is updated per iteration.
- `josiann.moves.sequential.RandomStep`: the next position is generated by incrementing one coordinate of the current position vector by a random value. The updated coordinate is chosen at random.

### Ensemble moves `josiann.moves.ensemble`

(moves with multiple walkers)

- `josiann.moves.ensemble.Stretch`: adapted from [2].
- `josiann.moves.ensemble.StretchAdaptive`: adapted from [2], with a varying parameter a.
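The stretch move of [2] proposes a new position for a walker along the line joining it to another randomly chosen walker. A toy sketch of this principle (not josiann's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def stretch_move(walkers, k, a=2.0):
    """Propose a new position for walker k using another walker drawn at random.

    The scale factor z is drawn from g(z) ∝ 1/sqrt(z) on [1/a, a], and the
    proposal is x_j + z * (x_k - x_j): a stretch along the line through
    walkers j and k.
    """
    n_walkers = walkers.shape[0]
    j = rng.choice([i for i in range(n_walkers) if i != k])
    z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
    return walkers[j] + z * (walkers[k] - walkers[j])

# three walkers in 2 dimensions
walkers = np.array([[0.0, 0.0],
                    [1.0, 1.0],
                    [2.0, 0.0]])
proposal = stretch_move(walkers, k=0)
```

Because each proposal depends on the positions of the other walkers, the ensemble adapts its step geometry to the shape of the cost landscape.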

### Discrete moves `josiann.moves.discrete`

- `josiann.moves.discrete.SetStep`: a set of possible positions \(\{a_1, ..., a_p\}\) is defined for each dimension of the position vector. At each iteration, an element \(e\) of the current position vector (\(e = a_i, i \in \{1, ..., p\}\)) is updated by picking at random the left or right neighbor in the set of positions (\(e\) is set to \(a_{i-1}\) or \(a_{i+1}\)).
- `josiann.moves.discrete.SetStretch`: a modified version of the Stretch move with a defined set of possible positions (during the stretch process, the closest value in the set is chosen).
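The neighbor-picking idea behind SetStep can be sketched in a few lines of plain Python (an illustration of the principle only, not josiann's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def set_step(value, allowed):
    """Move `value` to a random neighbor within the sorted `allowed` set."""
    i = allowed.index(value)
    if i == 0:                     # left edge: can only move right
        return allowed[1]
    if i == len(allowed) - 1:      # right edge: can only move left
        return allowed[-2]
    return allowed[i + rng.choice([-1, 1])]

allowed = [0.0, 0.5, 1.0, 1.5, 2.0]
new = set_step(1.0, allowed)       # either 0.5 or 1.5
```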

### Parallel moves `josiann.parallel.moves.discrete`

- `josiann.parallel.moves.discrete.ParallelSetStep`: a version of `josiann.moves.discrete.SetStep` suited for the parallel mode.

## Walkers

A *walker* is a position vector that Josiann tracks during the optimization.
In regular settings, Josiann only tracks a single position vector, so the number of walkers is 1.

With some special moves (ensemble moves), however, it is possible to obtain more information about the cost function by evaluating it at multiple positions. Each position Josiann keeps track of is a distinct walker, defined by its position vector and associated cost. At each iteration, the positions of all walkers are updated by the ensemble move, usually by taking into account the positions of the other walkers.

The nb_walkers parameter allows the user to choose how many walkers to run.

Note

When using `josiann.vsa()`, it is possible to vectorize on walkers (instead of vectorizing on the number of function evaluations): the cost for the positions of all walkers will be computed at once. To do so, set parameter vectorized_on_evaluations to False.

## Backup

When using discrete moves, the probability for a walker to reach the same position twice is high enough that caching past function evaluations saves computation. This is especially true during the final iterations, when the walker does not move much. When the backup is active, positions and associated costs are stored so that only \(n - l\) evaluations need to be computed instead of \(n\), where \(l\) is the number of function evaluations already computed at the same position.

To activate the backup, set the backup parameter to True when calling `josiann.sa()`, `josiann.vsa()`, `josiann.mcsa()` or `josiann.parallel.psa()`.
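The \(n - l\) accounting can be sketched with a simple cache keyed by position (a toy illustration of the mechanism, not josiann's internal backup):

```python
# cache: position (as a hashable tuple) -> evaluations already computed there
cache = {}

def evaluations_at(position, n, evaluate):
    """Return n evaluations at `position`, computing only the missing n - l."""
    stored = cache.setdefault(position, [])
    missing = n - len(stored)          # n - l new evaluations needed
    stored.extend(evaluate() for _ in range(missing))
    return stored[:n]

calls = 0
def evaluate():
    global calls
    calls += 1
    return 1.0                         # stand-in for a costly noisy evaluation

evaluations_at((0.5, 0.5), 3, evaluate)   # computes 3 fresh evaluations
evaluations_at((0.5, 0.5), 5, evaluate)   # computes only 5 - 3 = 2 more
```

On the second visit to the same position, only the two missing evaluations are actually computed.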

## Convergence detection

As the Temperature parameter drops, the fraction of accepted candidate moves drops significantly. When a strong minimum exists, a large fraction of the last iterations is spent repeatedly evaluating the cost function at the same neighboring positions. To detect such cases of early convergence, Josiann can compute the Root-Mean-Square Deviation (RMSD) of the costs of the last \(w\) accepted positions. When the RMSD drops below a threshold value, convergence is declared and the algorithm stops early.

Convergence detection is activated by default but can be managed with the detect_convergence parameter. The number \(w\) of past accepted positions to consider and the threshold value can be configured through the window_size and tol parameters respectively.
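The test itself is simple to state in code. The sketch below illustrates the RMSD criterion described above (the helper names and default values are made up for the example):

```python
import numpy as np

def rmsd(costs):
    """Root-Mean-Square Deviation of a sequence of costs from their mean."""
    costs = np.asarray(costs, dtype=float)
    return float(np.sqrt(np.mean((costs - costs.mean()) ** 2)))

def has_converged(last_costs, window_size=5, tol=1e-3):
    """Declare convergence when the last `window_size` accepted costs
    barely vary anymore (RMSD below `tol`)."""
    if len(last_costs) < window_size:
        return False
    return rmsd(last_costs[-window_size:]) < tol

still_moving = [3.0, 2.0, 1.5, 1.2, 1.0, 0.8]       # costs still decreasing
stuck = [1.0, 1.0001, 0.9999, 1.0, 1.0001]          # costs stalled near 1.0
```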

## Result and Trace objects

All 4 algorithms return a `josiann.Result` object which contains:

- a `josiann.Result.success` boolean value indicating whether the optimization process was successful.
- a final `josiann.Result.message` string with details on the success.
- a `josiann.Result.trace` (`josiann.Trace`) object which stores the coordinates of the positions reached at each iteration and provides functions for plotting those positions.
- a `josiann.Result.parameters` object which stores the run parameters.

Warning

Plotting requires the Python library plotly.

## Parameters

Common parameters are:

- fun: the cost function to optimize.
- x0: a vector (or matrix) of initial positions.
- args: additional arguments to pass to fun. These arguments are constant and will not be updated by the optimization algorithm.
- bounds: optional min and max bounds for each dimension, limiting the optimization search space.
- moves: one or more moves used to update position vectors.
- max_iter: maximum number of iterations to compute before stopping the optimization algorithm.
- max_measures: maximum number of function evaluations to compute at a single position.
- T_0: initial value for the Temperature parameter.
- detect_convergence: whether to stop the optimization algorithm before reaching max_iter if convergence is detected.
- tol: tolerance for detecting convergence.
- window_size: number of iterations to consider for detecting convergence, computing the fraction of accepted moves, and finding the best position reached (the position with the lowest associated cost).
- seed: an optional seed value to fix the randomness.
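The conventions for fun, args, x0 and bounds can be made concrete with a toy setup (only the shapes follow this page's description; everything else here is illustrative, and no josiann call is shown):

```python
import numpy as np

def fun(x, offset):
    # `offset` arrives through args and is never modified by the optimizer
    return float(np.sum((x - offset) ** 2))

x0 = np.array([2.0, -1.0])              # initial position, n = 2 dimensions
args = (np.array([0.5, 0.5]),)          # constant extra arguments for fun
bounds = [(-5.0, 5.0), (-5.0, 5.0)]     # optional per-dimension (min, max)

# a candidate position proposed by a move must stay inside the bounds:
lo, hi = np.array(bounds).T
candidate = np.clip(x0 + 10.0, lo, hi)
```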