Internal optimizers for optimagic#

optimagic provides a large collection of optimization algorithms that can be used by passing the algorithm name as algorithm into maximize or minimize. Advanced users can also use optimagic with their own algorithm, as long as it conforms to the internal optimizer interface.

The advantages of using your algorithm with optimagic over using it directly are:

  • You can collect the optimizer history and create criterion_plots and params_plots.

  • You can use flexible formats for your start parameters (e.g. nested dicts or namedtuples).

  • optimagic turns unconstrained optimizers into constrained ones.

  • You can use logging.

  • You get great error handling for exceptions in the criterion function or gradient.

  • You get a parallelized and customizable numerical gradient if you don’t have a closed form gradient.

  • You can compare your optimizer with all the other optimagic optimizers on our benchmark sets.

All of this functionality is achieved by transforming a more complicated user provided problem into a simpler problem and then calling “internal optimizers” to solve the transformed problem.
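
For example, a user can pass a nested dict of start parameters, while the internal optimizer only ever sees a flat numpy array (a minimal sketch, assuming the scipy_lbfgsb wrapper is available):

```python
import optimagic as om


def fun(params):
    # The objective receives params in the same nested format the user chose.
    return params["a"] ** 2 + (params["b"]["c"] - 1.0) ** 2


res = om.minimize(
    fun=fun,
    params={"a": 2.0, "b": {"c": 5.0}},
    algorithm="scipy_lbfgsb",
)
```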

Functions and classes for internal optimizers#

The functions and classes below are everything you need to know to add an optimizer to optimagic. To see them in action, look at this guide.

mark.minimizer

The mark.minimizer decorator is used to provide algorithm specific information to optimagic. This information is used in the algorithm selection tool, for better error handling and for processing of the user provided optimization problem.

optimagic.mark.minimizer(name: str, solver_type: optimagic.typing.AggregationLevel, is_available: bool, is_global: bool, needs_jac: bool, needs_hess: bool, supports_parallelism: bool, supports_bounds: bool, supports_linear_constraints: bool, supports_nonlinear_constraints: bool, disable_history: bool = False) → Callable[[optimagic.mark.AlgorithmSubclass], optimagic.mark.AlgorithmSubclass][source]#

Mark an algorithm as an optimagic minimizer and add AlgoInfo.

Parameters
  • name – The name of the algorithm as a string. Used in error messages, warnings and the OptimizeResult.

  • solver_type – The type of optimization problem the algorithm solves. Used to distinguish between scalar, least-squares and likelihood optimizers. Can take the values AggregationLevel.SCALAR, AggregationLevel.LEAST_SQUARES and AggregationLevel.LIKELIHOOD.

  • is_available – Whether the algorithm is installed.

  • is_global – Whether the algorithm is a global optimizer.

  • needs_jac – Whether the algorithm needs some kind of first derivative. This needs to be True if the algorithm uses jac or fun_and_jac.

  • needs_hess – Whether the algorithm needs some kind of second derivative. This is not yet implemented and will be False for all currently wrapped algorithms.

  • supports_parallelism – Whether the algorithm supports parallelism. This needs to be True if the algorithm previously took n_cores and/or batch_evaluator as arguments.

  • supports_bounds – Whether the algorithm supports bounds. This needs to be True if the algorithm previously took lower_bounds and/or upper_bounds as arguments.

  • supports_linear_constraints – Whether the algorithm supports linear constraints. This is not yet implemented and will be False for all currently wrapped algorithms.

  • supports_nonlinear_constraints – Whether the algorithm supports nonlinear constraints. This needs to be True if the algorithm previously took nonlinear_constraints as an argument.

  • disable_history – Whether the algorithm should disable history collection.
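
A minimal sketch of how the decorator is attached to an algorithm class (the algorithm name, the option fields, and the class name are made up for illustration; the decorator arguments mirror the signature above, and the class itself is explained further below):

```python
from dataclasses import dataclass

import optimagic as om
from optimagic.optimization.algorithm import Algorithm
from optimagic.typing import AggregationLevel


@om.mark.minimizer(
    name="my_gradient_descent",
    solver_type=AggregationLevel.SCALAR,
    is_available=True,
    is_global=False,
    needs_jac=True,
    needs_hess=False,
    supports_parallelism=False,
    supports_bounds=True,
    supports_linear_constraints=False,
    supports_nonlinear_constraints=False,
)
@dataclass(frozen=True)
class MyGradientDescent(Algorithm):
    learning_rate: float = 0.01
    stopping_maxiter: int = 1_000

    def _solve_internal_problem(self, problem, x0):
        # Sketched only; see the examples further below for a possible body.
        ...
```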

InternalOptimizationProblem

The InternalOptimizationProblem is optimagic’s internal representation of objective functions, derivatives, bounds, constraints, and more. This representation is already pretty close to what most algorithms expect (e.g. parameters and bounds are flat numpy arrays, no matter which format the user provided).

class optimagic.optimization.internal_optimization_problem.InternalOptimizationProblem[source]#
fun(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]) → float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]][source]#

Evaluate the objective function at x.

Parameters

x – The parameter vector at which to evaluate the objective function.

Returns

The function value at x. This is a scalar for scalar problems and an array for least squares or likelihood problems.

jac(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]) → numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]][source]#

Evaluate the first derivative at x.

Parameters

x – The parameter vector at which to evaluate the first derivative.

Returns

The first derivative at x. This is a 1d array for scalar problems (the gradient) and a 2d array for least squares or likelihood problems (the Jacobian).

fun_and_jac(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]) → tuple[float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]][source]#

Simultaneously evaluate the objective function and its first derivative.

See .fun and .jac for details.

batch_fun(x_list: list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]], n_cores: int, batch_size: int | None = None) → list[float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]][source]#

Parallelized batch version of .fun.

Parameters
  • x_list – A list of parameter vectors at which to evaluate the objective function.

  • n_cores – The number of cores to use for the parallel evaluation.

  • batch_size – Batch size that can be used by some algorithms to simulate the behavior under parallelization on more cores than are actually available. Only used by criterion_plots and benchmark plots.

Returns

A list of function values at the points in x_list. See .fun for details.

batch_jac(x_list: list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]], n_cores: int, batch_size: int | None = None) → list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]][source]#

Parallelized batch version of .jac.

Parameters
  • x_list – A list of parameter vectors at which to evaluate the first derivative.

  • n_cores – The number of cores to use for the parallel evaluation.

  • batch_size – Batch size that can be used by some algorithms to simulate the behavior under parallelization on more cores than are actually available. Only used by criterion_plots and benchmark plots.

Returns

A list of first derivatives at the points in x_list. See .jac for details.

batch_fun_and_jac(x_list: list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]], n_cores: int, batch_size: int | None = None) → list[tuple[float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]]][source]#

Parallelized batch version of .fun_and_jac.

Parameters
  • x_list – A list of parameter vectors at which to evaluate the objective function and its first derivative.

  • n_cores – The number of cores to use for the parallel evaluation.

  • batch_size – Batch size that can be used by some algorithms to simulate the behavior under parallelization on more cores than are actually available. Only used by criterion_plots and benchmark plots.

Returns

A list of tuples containing the function value and the first derivative at the points in x_list. See .fun_and_jac for details.

property bounds: optimagic.optimization.internal_optimization_problem.InternalBounds#

Bounds of the optimization problem.

property nonlinear_constraints: list[dict[str, Any]] | None#

Internal representation of nonlinear constraints.

Compared to the user provided constraints, we have done the following transformations:

1. The constraint a <= g(x) <= b is transformed to h(x) >= 0, where h(x) is defined as

   • h(x) = g(x), if a == 0 and b == inf
   • h(x) = g(x) - a, if a != 0 and b == inf
   • h(x) = (g(x) - a, -g(x) + b) >= 0, if a != 0 and b != inf.

2. The equality constraint g(x) = v is transformed to h(x) >= 0, where h(x) = (g(x) - v, -g(x) + v).

3. Vector constraints are transformed to a list of scalar constraints. g(x) = (g1(x), g2(x), …) >= 0 is transformed to (g1(x) >= 0, g2(x) >= 0, …).

4. The constraint function (defined on a selection of user-facing parameters) is transformed to be evaluated on the internal parameters.
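
As a small illustration of transformation 1 (this only shows the mathematical reformulation, not the internal dict format):

```python
import numpy as np


def g(x):
    # User-facing constraint: 1 <= g(x) <= 3, i.e. a = 1 and b = 3 (both finite).
    return x[0] + x[1]


def h(x):
    # Internal form: h(x) = (g(x) - a, -g(x) + b) >= 0
    return np.array([g(x) - 1.0, -g(x) + 3.0])
```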

property direction: optimagic.typing.Direction#

Direction of the optimization problem.

property history: optimagic.optimization.history.History#

History container for the optimization problem.

property logger: Optional[optimagic.logging.logger.LogStore[Any, Any]]#

Logger for the optimization problem.
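
A hedged sketch of how an algorithm could consume the problem object inside _solve_internal_problem (only fun_and_jac and bounds are part of the interface documented above; that the bounds object exposes lower and upper arrays or None is an assumption, and convergence checks are omitted for brevity):

```python
import numpy as np


def _gradient_descent(problem, x0, learning_rate=0.01, maxiter=1_000):
    # Projected gradient descent on the internal (flat) parameter vector.
    lower = -np.inf if problem.bounds.lower is None else problem.bounds.lower
    upper = np.inf if problem.bounds.upper is None else problem.bounds.upper
    x = x0
    for _ in range(maxiter):
        _, grad = problem.fun_and_jac(x)
        x = np.clip(x - learning_rate * grad, lower, upper)
    return x
```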

InternalOptimizeResult

This is what you need to create from the output of a wrapped algorithm.

class optimagic.optimization.algorithm.InternalOptimizeResult(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], fun: float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], success: bool | None = None, message: str | None = None, status: int | None = None, n_fun_evals: int | None = None, n_jac_evals: int | None = None, n_hess_evals: int | None = None, n_iterations: int | None = None, jac: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]] | None = None, hess: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]] | None = None, hess_inv: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]] | None = None, max_constraint_violation: float | None = None, info: dict[str, Any] | None = None, history: optimagic.optimization.history.History | None = None, multistart_info: dict[str, Any] | None = None)[source]#

Internal representation of the result of an optimization problem.

Parameters
  • x – The optimal parameters.

  • fun – The function value at the optimal parameters.

  • success – Whether the optimization was successful.

  • message – A message from the optimizer.

  • status – The status of the optimization.

  • n_fun_evals – The number of function evaluations.

  • n_jac_evals – The number of gradient or Jacobian evaluations.

  • n_hess_evals – The number of Hessian evaluations.

  • n_iterations – The number of iterations.

  • jac – The Jacobian of the objective function at the optimal parameters.

  • hess – The Hessian of the objective function at the optimal parameters.

  • hess_inv – The inverse of the Hessian of the objective function at the optimal parameters.

  • max_constraint_violation – The maximum constraint violation.

  • info – Additional information from the optimizer.
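
For example, the result processing of a wrapped algorithm could look roughly like this (raw stands for whatever result object the wrapped library returns; its attribute names are hypothetical):

```python
import numpy as np

from optimagic.optimization.algorithm import InternalOptimizeResult


def _process_raw_result(raw):
    # Map the library specific result object to the internal result format.
    return InternalOptimizeResult(
        x=np.asarray(raw.x),
        fun=float(raw.fval),
        success=raw.converged,
        message=raw.message,
        n_fun_evals=raw.n_evaluations,
        n_iterations=raw.n_iterations,
    )
```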

Algorithm

class optimagic.optimization.algorithm.Algorithm[source]#

Base class for all optimization algorithms in optimagic.

To add an optimizer to optimagic you need to subclass Algorithm and override the _solve_internal_problem method.

with_option(**kwargs: Any) → Self[source]#

Create a modified copy with the given options.

with_stopping(**kwargs: Any) → Self[source]#

Create a modified copy with the given stopping options.

with_convergence(**kwargs: Any) → Self[source]#

Create a modified copy with the given convergence options.

solve_internal_problem(problem: optimagic.optimization.internal_optimization_problem.InternalOptimizationProblem, x0: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], step_id: int) → optimagic.optimization.algorithm.InternalOptimizeResult[source]#

Solve the internal optimization problem.

This method is called internally by minimize or maximize to solve the internal optimization problem and process the results.

property algo_info: optimagic.optimization.algorithm.AlgoInfo#

Information about the algorithm.
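
Continuing the hypothetical MyGradientDescent sketch from above, the copy constructors could be used as follows (that with_stopping and with_convergence add the stopping_ and convergence_ prefixes to the given names is an assumption; look at existing wrappers if in doubt):

```python
algo = MyGradientDescent(learning_rate=0.05)

# with_option returns a modified copy; the original instance is unchanged.
algo = algo.with_option(learning_rate=0.1)

# Assumption: with_stopping(maxiter=...) sets the stopping_maxiter field.
algo = algo.with_stopping(maxiter=500)
```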

Naming conventions for algorithm specific arguments#

To make switching between different algorithms as simple as possible, we align the names of commonly used convergence and stopping criteria. We also align the default values for stopping and convergence criteria as much as possible.

You can find the harmonized names and values here.

To align the names of other tuning parameters as much as possible with what is already there, simply have a look at the optimizers we have already wrapped. For example, if you are wrapping a bfgs or lbfgs algorithm from some library, look at all existing wrappers of bfgs algorithms and use the same names for the same options.
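
As a hedged illustration of the spirit of these conventions (the concrete names and defaults shown here are examples, not the authoritative list from the page linked above, and the mark.minimizer decorator is omitted for brevity):

```python
from dataclasses import dataclass

from optimagic.optimization.algorithm import Algorithm


@dataclass(frozen=True)
class MyWrappedBfgs(Algorithm):
    # Common stopping and convergence criteria reuse the harmonized names ...
    stopping_maxiter: int = 1_000
    convergence_ftol_rel: float = 1e-8
    # ... while other tuning parameters reuse the names of existing wrappers
    # of similar algorithms.
    norm: float = float("inf")

    def _solve_internal_problem(self, problem, x0):
        ...
```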

Algorithms that parallelize#

Algorithms that evaluate the objective function or derivatives in parallel should only do so via InternalOptimizationProblem.batch_fun, InternalOptimizationProblem.batch_jac or InternalOptimizationProblem.batch_fun_and_jac.
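
For example, a population based algorithm could evaluate a whole generation like this (a sketch; population and n_cores would come from the algorithm's own options):

```python
import numpy as np


def _evaluate_generation(problem, population, n_cores):
    # population is a 2d array in which each row is one parameter vector.
    x_list = list(population)
    fvals = problem.batch_fun(x_list, n_cores=n_cores)
    return np.array(fvals)
```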

If you parallelize in any other way, the automatic history collection will stop working.

In that case, call om.mark.minimizer with disable_history=True. You can then either do your own history collection and add that history to the InternalOptimizeResult, or users have to rely on logging.

Nonlinear constraints#

(to be written)