Internal optimizers for optimagic#

optimagic provides a large collection of optimization algorithms that can be used by passing the algorithm name as algorithm into maximize or minimize. Advanced users can also use optimagic with their own algorithm, as long as it conforms to the internal optimizer interface.

The advantages of using your algorithm with optimagic over using it directly are:

  • You can collect the optimizer history and create criterion_plots and params_plots.

  • You can use flexible formats for your start parameters (e.g. nested dicts or namedtuples).

  • optimagic turns unconstrained optimizers into constrained ones.

  • You can use logging.

  • You get great error handling for exceptions in the criterion function or gradient.

  • You get a parallelized and customizable numerical gradient if you don’t have a closed form gradient.

  • You can compare your optimizer with all the other optimagic optimizers on our benchmark sets.

All of this functionality is achieved by transforming a more complicated user provided problem into a simpler problem and then calling “internal optimizers” to solve the transformed problem.
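
For example, a user can pass a nested dict of start parameters, while the internal optimizer only ever sees a flat numpy array (a minimal sketch, assuming the scipy_lbfgsb wrapper is available):

```python
import optimagic as om


def fun(params):
    # The objective receives params in the same nested format the user chose.
    return params["a"] ** 2 + (params["b"]["c"] - 1.0) ** 2


res = om.minimize(
    fun=fun,
    params={"a": 2.0, "b": {"c": 5.0}},
    algorithm="scipy_lbfgsb",
)
```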

Functions and classes for internal optimizers#

The functions and classes below are everything you need to know to add an optimizer to optimagic. To see them in action, look at this guide.

mark.minimizer

The mark.minimizer decorator is used to provide algorithm specific information to optimagic. This information is used in the algorithm selection tool, for better error handling and for processing of the user provided optimization problem.

optimagic.mark.minimizer(name: str, solver_type: optimagic.typing.AggregationLevel, is_available: bool, is_global: bool, needs_jac: bool, needs_hess: bool, supports_parallelism: bool, supports_bounds: bool, supports_linear_constraints: bool, supports_nonlinear_constraints: bool, disable_history: bool = False) → Callable[[optimagic.mark.AlgorithmSubclass], optimagic.mark.AlgorithmSubclass][source]#

Mark an algorithm as an optimagic minimizer and add AlgoInfo.

Parameters
  • name – The name of the algorithm as a string. Used in error messages, warnings and the OptimizeResult.

  • solver_type – The type of optimization problem the algorithm solves. Used to distinguish between scalar, least-squares and likelihood optimizers. Can take the values AggregationLevel.SCALAR, AggregationLevel.LEAST_SQUARES and AggregationLevel.LIKELIHOOD.

  • is_available – Whether the algorithm is installed.

  • is_global – Whether the algorithm is a global optimizer.

  • needs_jac – Whether the algorithm needs some kind of first derivative. This needs to be True if the algorithm uses jac or fun_and_jac.

  • needs_hess – Whether the algorithm needs some kind of second derivative. This is not yet implemented and will be False for all currently wrapped algorithms.

  • supports_parallelism – Whether the algorithm supports parallelism. This needs to be True if the algorithm previously took n_cores and/or batch_evaluator as arguments.

  • supports_bounds – Whether the algorithm supports bounds. This needs to be True if the algorithm previously took lower_bounds and/or upper_bounds as arguments.

  • supports_linear_constraints – Whether the algorithm supports linear constraints. This is not yet implemented and will be False for all currently wrapped algorithms.

  • supports_nonlinear_constraints – Whether the algorithm supports nonlinear constraints. This needs to be True if the algorithm previously took nonlinear_constraints as an argument.

  • disable_history – Whether the algorithm should disable history collection.
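
A minimal sketch of how the decorator is attached to an algorithm class (the algorithm name, the option fields, and the class name are made up for illustration; the decorator arguments mirror the signature above, and the class itself is explained further below):

```python
from dataclasses import dataclass

import optimagic as om
from optimagic.optimization.algorithm import Algorithm
from optimagic.typing import AggregationLevel


@om.mark.minimizer(
    name="my_gradient_descent",
    solver_type=AggregationLevel.SCALAR,
    is_available=True,
    is_global=False,
    needs_jac=True,
    needs_hess=False,
    supports_parallelism=False,
    supports_bounds=True,
    supports_linear_constraints=False,
    supports_nonlinear_constraints=False,
)
@dataclass(frozen=True)
class MyGradientDescent(Algorithm):
    learning_rate: float = 0.01
    stopping_maxiter: int = 1_000

    def _solve_internal_problem(self, problem, x0):
        # Sketched only; see the examples further below for a possible body.
        ...
```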

InternalOptimizationProblem

The InternalOptimizationProblem is optimagic’s internal representation of objective functions, derivatives, bounds, constraints, and more. This representation is already pretty close to what most algorithms expect (e.g. parameters and bounds are flat numpy arrays, no matter which format the user provided).

class optimagic.optimization.internal_optimization_problem.InternalOptimizationProblem[source]#
fun(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]) → float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]][source]#

Evaluate the objective function at x.

Parameters

x – The parameter vector at which to evaluate the objective function.

Returns

The function value at x. This is a scalar for scalar problems and an array for least squares or likelihood problems.

jac(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]) → numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]][source]#

Evaluate the first derivative at x.

Parameters

x – The parameter vector at which to evaluate the first derivative.

Returns

The first derivative at x. This is a 1d array for scalar problems (the gradient) and a 2d array for least squares or likelihood problems (the Jacobian).

fun_and_jac(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]) → tuple[float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]][source]#

Simultaneously evaluate the objective function and its first derivative.

See .fun and .jac for details.

batch_fun(x_list: list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]], n_cores: int, batch_size: int | None = None) → list[float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]][source]#

Parallelized batch version of .fun.

Parameters
  • x_list – A list of parameter vectors at which to evaluate the objective function.

  • n_cores – The number of cores to use for the parallel evaluation.

  • batch_size – Batch size that can be used by some algorithms to simulate the behavior under parallelization on more cores than are actually available. Only used by criterion_plots and benchmark plots.

Returns

A list of function values at the points in x_list. See .fun for details.

batch_jac(x_list: list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]], n_cores: int, batch_size: int | None = None) → list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]][source]#

Parallelized batch version of .jac.

Parameters
  • x_list – A list of parameter vectors at which to evaluate the first derivative.

  • n_cores – The number of cores to use for the parallel evaluation.

  • batch_size – Batch size that can be used by some algorithms to simulate the behavior under parallelization on more cores than are actually available. Only used by criterion_plots and benchmark plots.

Returns

A list of first derivatives at the points in x_list. See .jac for details.

batch_fun_and_jac(x_list: list[numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]], n_cores: int, batch_size: int | None = None) → list[tuple[float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]]]][source]#

Parallelized batch version of .fun_and_jac.

Parameters
  • x_list – A list of parameter vectors at which to evaluate the objective function and its first derivative.

  • n_cores – The number of cores to use for the parallel evaluation.

  • batch_size – Batch size that can be used by some algorithms to simulate the behavior under parallelization on more cores than are actually available. Only used by criterion_plots and benchmark plots.

Returns

A list of tuples containing the function value and the first derivative at the points in x_list. See .fun_and_jac for details.

property bounds: optimagic.optimization.internal_optimization_problem.InternalBounds#

Bounds of the optimization problem.

property nonlinear_constraints: list[dict[str, Any]] | None#

Internal representation of nonlinear constraints.

Compared to the user provided constraints, we have done the following transformations:

1. The constraint a <= g(x) <= b is transformed to h(x) >= 0, where h(x) is defined as

   • h(x) = g(x), if a == 0 and b == inf
   • h(x) = g(x) - a, if a != 0 and b == inf
   • h(x) = (g(x) - a, -g(x) + b) >= 0, if a != 0 and b != inf.

2. The equality constraint g(x) = v is transformed to h(x) >= 0, where h(x) = (g(x) - v, -g(x) + v).

3. Vector constraints are transformed to a list of scalar constraints. g(x) = (g1(x), g2(x), …) >= 0 is transformed to (g1(x) >= 0, g2(x) >= 0, …).

4. The constraint function (defined on a selection of user-facing parameters) is transformed to be evaluated on the internal parameters.
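
As a small illustration of transformation 1 (this only shows the mathematical reformulation, not the internal dict format):

```python
import numpy as np


def g(x):
    # User-facing constraint: 1 <= g(x) <= 3, i.e. a = 1 and b = 3 (both finite).
    return x[0] + x[1]


def h(x):
    # Internal form: h(x) = (g(x) - a, -g(x) + b) >= 0
    return np.array([g(x) - 1.0, -g(x) + 3.0])
```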

property direction: optimagic.typing.Direction#

Direction of the optimization problem.

property history: optimagic.optimization.history.History#

History container for the optimization problem.

property logger: Optional[optimagic.logging.logger.LogStore[Any, Any]]#

Logger for the optimization problem.
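
A hedged sketch of how an algorithm could consume the problem object inside _solve_internal_problem (only fun_and_jac and bounds are part of the interface documented above; that the bounds object exposes lower and upper arrays or None is an assumption, and convergence checks are omitted for brevity):

```python
import numpy as np


def _gradient_descent(problem, x0, learning_rate=0.01, maxiter=1_000):
    # Projected gradient descent on the internal (flat) parameter vector.
    lower = -np.inf if problem.bounds.lower is None else problem.bounds.lower
    upper = np.inf if problem.bounds.upper is None else problem.bounds.upper
    x = x0
    for _ in range(maxiter):
        _, grad = problem.fun_and_jac(x)
        x = np.clip(x - learning_rate * grad, lower, upper)
    return x
```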

InternalOptimizeResult

This is what you need to create from the output of a wrapped algorithm.

class optimagic.optimization.algorithm.InternalOptimizeResult(x: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], fun: float | numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], success: bool | None = None, message: str | None = None, status: int | None = None, n_fun_evals: int | None = None, n_jac_evals: int | None = None, n_hess_evals: int | None = None, n_iterations: int | None = None, jac: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]] | None = None, hess: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]] | None = None, hess_inv: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]] | None = None, max_constraint_violation: float | None = None, info: dict[str, Any] | None = None, history: optimagic.optimization.history.History | None = None, multistart_info: dict[str, Any] | None = None)[source]#

Internal representation of the result of an optimization problem.

Parameters
  • x – The optimal parameters.

  • fun – The function value at the optimal parameters.

  • success – Whether the optimization was successful.

  • message – A message from the optimizer.

  • status – The status of the optimization.

  • n_fun_evals – The number of function evaluations.

  • n_jac_evals – The number of gradient or Jacobian evaluations.

  • n_hess_evals – The number of Hessian evaluations.

  • n_iterations – The number of iterations.

  • jac – The Jacobian of the objective function at the optimal parameters.

  • hess – The Hessian of the objective function at the optimal parameters.

  • hess_inv – The inverse of the Hessian of the objective function at the optimal parameters.

  • max_constraint_violation – The maximum constraint violation.

  • info – Additional information from the optimizer.
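
For example, the result processing of a wrapped algorithm could look roughly like this (raw stands for whatever result object the wrapped library returns; its attribute names are hypothetical):

```python
import numpy as np

from optimagic.optimization.algorithm import InternalOptimizeResult


def _process_raw_result(raw):
    # Map the library specific result object to the internal result format.
    return InternalOptimizeResult(
        x=np.asarray(raw.x),
        fun=float(raw.fval),
        success=raw.converged,
        message=raw.message,
        n_fun_evals=raw.n_evaluations,
        n_iterations=raw.n_iterations,
    )
```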

Algorithm

class optimagic.optimization.algorithm.Algorithm[source]#

Base class for all optimization algorithms in optimagic.

To add an optimizer to optimagic you need to subclass Algorithm and override the _solve_internal_problem method.

with_option(**kwargs: Any) → Self[source]#

Create a modified copy with the given options.

with_stopping(**kwargs: Any) → Self[source]#

Create a modified copy with the given stopping options.

with_convergence(**kwargs: Any) → Self[source]#

Create a modified copy with the given convergence options.

solve_internal_problem(problem: optimagic.optimization.internal_optimization_problem.InternalOptimizationProblem, x0: numpy.ndarray[tuple[int, ...], numpy.dtype[numpy.float64]], step_id: int) → optimagic.optimization.algorithm.InternalOptimizeResult[source]#

Solve the internal optimization problem.

This method is called internally by minimize or maximize to solve the internal optimization problem and process the results.

property algo_info: optimagic.optimization.algorithm.AlgoInfo#

Information about the algorithm.
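
Continuing the hypothetical MyGradientDescent sketch from above, the copy constructors could be used as follows (that with_stopping and with_convergence add the stopping_ and convergence_ prefixes to the given names is an assumption; look at existing wrappers if in doubt):

```python
algo = MyGradientDescent(learning_rate=0.05)

# with_option returns a modified copy; the original instance is unchanged.
algo = algo.with_option(learning_rate=0.1)

# Assumption: with_stopping(maxiter=...) sets the stopping_maxiter field.
algo = algo.with_stopping(maxiter=500)
```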

Naming conventions for algorithm specific arguments#

To make switching between different algorithms as simple as possible, we align the names of commonly used convergence and stopping criteria. We also align the default values for stopping and convergence criteria as much as possible.

You can find the harmonized names and values here.

To align the names of other tuning parameters as much as possible with what is already there, simply have a look at the optimizers we have already wrapped. For example, if you are wrapping a bfgs or lbfgs algorithm from some library, look at all existing wrappers of bfgs algorithms and use the same names for the same options.
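
As a hedged illustration of the spirit of these conventions (the concrete names and defaults shown here are examples, not the authoritative list from the page linked above, and the mark.minimizer decorator is omitted for brevity):

```python
from dataclasses import dataclass

from optimagic.optimization.algorithm import Algorithm


@dataclass(frozen=True)
class MyWrappedBfgs(Algorithm):
    # Common stopping and convergence criteria reuse the harmonized names ...
    stopping_maxiter: int = 1_000
    convergence_ftol_rel: float = 1e-8
    # ... while other tuning parameters reuse the names of existing wrappers
    # of similar algorithms.
    norm: float = float("inf")

    def _solve_internal_problem(self, problem, x0):
        ...
```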

Algorithms that parallelize#

Algorithms that evaluate the objective function or derivatives in parallel should only do so via InternalOptimizationProblem.batch_fun, InternalOptimizationProblem.batch_jac or InternalOptimizationProblem.batch_fun_and_jac.
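
For example, a population based algorithm could evaluate a whole generation like this (a sketch; population and n_cores would come from the algorithm's own options):

```python
import numpy as np


def _evaluate_generation(problem, population, n_cores):
    # population is a 2d array in which each row is one parameter vector.
    x_list = list(population)
    fvals = problem.batch_fun(x_list, n_cores=n_cores)
    return np.array(fvals)
```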

If you parallelize in any other way, the automatic history collection will stop working.

In that case, call om.mark.minimizer with disable_history=True. You can then either do your own history collection and add that history to the InternalOptimizeResult, or users have to rely on logging.

Nonlinear constraints#

(to be written)