How to do multistart optimizations#
Sometimes you want to make sure that your optimization is robust to the initial parameter values, i.e. that it does not get stuck at a local optimum. This is where multistart comes in handy.
What does multistart (not) do#
In short, multistart iteratively runs local optimizations from different initial conditions. If enough local optimizations converge to the same point, it stops. Importantly, multistart cannot guarantee that the result is the global optimum, but it can increase your confidence in the result.
TL;DR#
To activate multistart with the default options, pass multistart=True to the minimize or maximize function, as well as finite bounds on the parameters (which are used to sample the initial points). The default options are discussed below.
import numpy as np
import optimagic as om


def fun(x):
    return x @ x


x0 = np.arange(7) - 4

bounds = om.Bounds(
    lower=np.full_like(x0, -5),
    upper=np.full_like(x0, 10),
)

algo_options = {"stopping_maxfun": 1_000}

res = om.minimize(
    fun=fun,
    x0=x0,
    algorithm="scipy_neldermead",
    algo_options=algo_options,
    bounds=bounds,
    multistart=True,
)
In this example, we limited each local optimization to 1_000 function evaluations. In general, it is a good idea to limit the number of iterations and function evaluations for the local optimization. Because of the iterative nature of multistart, this limitation will usually not result in a precision issue.
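If you also want to cap the number of iterations, you can add a second stopping criterion to algo_options. A minimal sketch (the stopping_maxiter option is assumed to be supported by the chosen algorithm):

# Sketch: cap both function evaluations and iterations of each local optimization.
# stopping_maxiter is assumed to be available for the chosen algorithm.
strict_algo_options = {"stopping_maxfun": 1_000, "stopping_maxiter": 500}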
What does multistart mean in optimagic?#
Our multistart optimizations are inspired by the TikTak algorithm and consist of the following steps:
Draw a large exploration sample of parameter vectors randomly or using a low-discrepancy sequence.
Evaluate the objective function in parallel on the exploration sample.
Sort the parameter vectors from best to worst according to their objective function values.
Run local optimizations iteratively. That is, the first local optimization is started from the best parameter vector in the sample. All subsequent ones are started from a convex combination of the currently best known parameter vector and the next sample point (see the sketch below).
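To make the last step concrete, here is a rough sketch of how such a start point could be formed. This is only an illustration, not optimagic's internal code; the mixing weight of 0.3 is an arbitrary assumption (in TikTak, the weight on the best known point grows as more local optimizations finish):

import numpy as np

# Illustrative convex combination that forms the next start point.
best_so_far = np.zeros(7)  # best parameters found by earlier local optimizations
next_sample_point = np.array([-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0])  # next point from the sorted sample
weight = 0.3  # arbitrary value for illustration
next_start = weight * next_sample_point + (1 - weight) * best_so_far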
Visualizing multistart results#
To illustrate the multistart results, we will consider the optimization of a slightly
more complex objective function, compared to fun
from above. We also limit the
number of exploration samples to 100.
def alpine(x):
    return np.sum(np.abs(x * np.sin(x) + 0.1 * x))


res = om.minimize(
    alpine,
    x0=x0,
    algorithm="scipy_neldermead",
    bounds=bounds,
    algo_options=algo_options,
    multistart=om.MultistartOptions(n_samples=100, seed=0),
)
fig = om.criterion_plot(res, monotone=True)
fig.show("png")
In the image above, we see the optimization histories of all local optimizations run by multistart. The turquoise line represents the history of the local optimization that found the overall best parameters.
We see that running a single optimization would not have sufficed, as some of the local optimizations got stuck at local optima.
Multistart does not always run many optimizations#
Since the local optimizations are run iteratively by multistart, it is possible that only a handful of optimizations are actually run if all of them converge to the same point. This convergence is determined by the convergence_max_discoveries option, which defaults to 2. This means that if 2 local optimizations report the same point, multistart will stop. Below we see that if we use the simpler objective function (fun) and the scipy_lbfgsb algorithm, multistart runs only 2 local optimizations and then stops, as both of them converge to the same point. Note that the scipy_lbfgsb algorithm can solve this simple problem precisely, without reaching the maximum number of function evaluations.
res = om.minimize(
    fun,
    x0=x0,
    algorithm="scipy_lbfgsb",
    bounds=bounds,
    algo_options=algo_options,
    multistart=om.MultistartOptions(n_samples=100, seed=0),
)

fig = om.criterion_plot(res)
fig.show("png")
How to configure multistart#
Configuration of multistart can be done by passing an instance of optimagic.MultistartOptions to minimize or maximize. Let’s look at a few example configurations.
How to run a specific number of optimizations#
To run a specific number of local optimizations, you need to set the stopping_maxopt option. Note that this does not set the number of exploration samples, which is controlled by the n_samples option. The number of exploration samples always needs to be at least as large as the number of local optimizations.
Note that, as long as convergence_max_discoveries is smaller than stopping_maxopt, fewer local optimizations may be run. To avoid this, set convergence_max_discoveries to a value at least as large as stopping_maxopt.
To run, for example, 10 local optimizations from 15 exploration samples, do:
res = om.minimize(
    alpine,
    x0=x0,
    algorithm="scipy_neldermead",
    bounds=bounds,
    algo_options=algo_options,
    multistart=om.MultistartOptions(
        n_samples=15,
        stopping_maxopt=10,
        convergence_max_discoveries=10,
    ),
)
res.multistart_info.n_optimizations
10
How to set a custom exploration sample#
If you want to start the multistart algorithm with a custom exploration sample, you can do so by passing a sequence of parameters to the sample option. Note that the sequence elements must be of the same type as your parameters.
To generate a sample of 100 random parameters and run them through the multistart algorithm, do:
rng = np.random.default_rng(12345)
sample = [x0 + rng.uniform(-1, 1, size=len(x0)) for _ in range(100)]
res = om.minimize(
    alpine,
    x0=x0,
    algorithm="scipy_neldermead",
    bounds=bounds,
    algo_options=algo_options,
    multistart=om.MultistartOptions(sample=sample),
)
How to run multistart in parallel#
The multistart algorithm can be run in parallel by setting the n_cores option to a value greater than 1. This will run the algorithm in batches. By default, the batch size is set to n_cores, but can be controlled by setting the batch_size option. The default batch evaluator is joblib, but can be controlled by setting the batch_evaluator option to "pathos" or a custom callable.
To run the multistart algorithm in parallel, do:
res = om.minimize(
    alpine,
    x0=x0,
    algorithm="scipy_lbfgsb",
    bounds=bounds,
    algo_options=algo_options,
    multistart=om.MultistartOptions(n_cores=2),
)
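If you want more explicit control over the batching, you can also set the batch_size and batch_evaluator options mentioned above. A sketch with illustrative values:

# Sketch: explicit batching configuration; the values are illustrative.
parallel_options = om.MultistartOptions(
    n_cores=2,
    batch_size=4,  # defaults to n_cores
    batch_evaluator="joblib",  # or "pathos", or a custom callable
)

res = om.minimize(
    alpine,
    x0=x0,
    algorithm="scipy_lbfgsb",
    bounds=bounds,
    algo_options=algo_options,
    multistart=parallel_options,
)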
What to do if you do not have bounds#
Multistart requires finite bounds on the parameters. If your optimization problem is not bounded, you can set soft lower and upper bounds. These bounds will only be used to draw the exploration sample, and will not be used to constrain the local optimizations.
To set soft bounds, do:
res = om.minimize(
    alpine,
    x0=x0,
    algorithm="scipy_lbfgsb",
    bounds=om.Bounds(soft_lower=np.full_like(x0, -3), soft_upper=np.full_like(x0, 8)),
    multistart=True,
)
Understanding multistart results#
When multistart is activated, the optimization result object corresponds to the local optimization that found the best objective function value. The result object has the additional attribute multistart_info, where all of the additional information is stored. It has the following attributes:
local_optima: A list with the results from all local optimizations that were performed.
start_parameters: A list with the start parameters from those optimizations.
exploration_sample: A list with parameter vectors at which the objective function was evaluated in an initial exploration phase.
exploration_results: The corresponding objective values.
n_optimizations: The number of local optimizations that were run.
To illustrate the multistart results, let us consider the optimization of the simple fun objective function from above.
res = om.minimize(
    fun,
    x0=x0,
    algorithm="scipy_lbfgsb",
    bounds=bounds,
    algo_options=algo_options,
    multistart=om.MultistartOptions(n_samples=100, convergence_max_discoveries=2),
)
Start parameters#
The start parameters are the parameter vectors from which the local optimizations were started. Since the default value of convergence_max_discoveries is 2, and both local optimizations were successful, the start parameters list has two entries.
res.multistart_info.start_parameters
[array([-4., -3., -2., -1., 0., 1., 2.]),
array([-1.83029633, 2.92909109, 0.46945234, 0.33607932, -1.61351236,
-0.79107171, 2.68553459])]
Local Optima#
The local optima are the results from the local optimizations. Since in this example only two local optimizations were run, the local optima list has two elements, each of which is an optimization result object.
len(res.multistart_info.local_optima)
2
Exploration sample#
The exploration sample is a list of parameter vectors at which the objective function was evaluated. Above, we chose a random exploration sample of 100 parameter vectors.
np.vstack(res.multistart_info.exploration_sample).shape
(100, 7)
Exploration results#
The exploration results are the objective function values at the exploration sample.
len(res.multistart_info.exploration_results)
100
Number of local optimizations#
res.multistart_info.n_optimizations
2