# Minimisation¤

In addition to the following, note that the Optax library offers an extensive collection of minimisers via first-order gradient methods -- as are in widespread use for neural networks. If you would like to use these through the Optimistix API then an `optimistix.OptaxMinimiser`

wrapper is provided.

####
`optimistix.minimise(fn: Union[Callable[[~Y, Any], tuple[Array, ~Aux]], Callable[[~Y, Any], Array]], solver: AbstractMinimiser, y0: ~Y, args: PyTree[Any] = None, options: Optional[dict[str, Any]] = None, *, has_aux: bool = False, max_steps: Optional[int] = 256, adjoint: AbstractAdjoint = ImplicitAdjoint(linear_solver=AutoLinearSolver(well_posed=None)), throw: bool = True, tags: frozenset[object] = frozenset()) -> Solution[~Y, ~Aux]`

¤

Minimise a function.

This minimises a nonlinear function `fn(y, args)`

which returns a scalar value.

**Arguments:**

`fn`

: The objective function. This should take two arguments:`fn(y, args)`

and return a scalar.`solver`

: The minimiser solver to use. This should be an`optimistix.AbstractMinimiser`

.`y0`

: An initial guess for what`y`

may be.`args`

: Passed as the`args`

of`fn(y, args)`

.`options`

: Individual solvers may accept additional runtime arguments. See each individual solver's documentation for more details.`has_aux`

: If`True`

, then`fn`

may return a pair, where the first element is its function value, and the second is just auxiliary data. Keyword only argument.`max_steps`

: The maximum number of steps the solver can take. Keyword only argument.`adjoint`

: The adjoint method used to compute gradients through the fixed-point solve. Keyword only argument.`throw`

: How to report any failures. (E.g. an iterative solver running out of steps, or encountering divergent iterates.) If`True`

then a failure will raise an error. If`False`

then the returned solution object will have a`result`

field indicating whether any failures occured. (See`optimistix.Solution`

.) Keyword only argument.`tags`

: Lineax tags describing the any structure of the Hessian of`fn`

with respect to`y`

. Used with`optimistix.ImplicitAdjoint`

to implement the implicit function theorem as efficiently as possible. Keyword only argument.

**Returns:**

An `optimistix.Solution`

object.

`optimistix.minimise`

supports any of the following minimisers.

`optimistix.AbstractMinimiser`

####
```
optimistix.AbstractMinimiser
```

¤

Abstract base class for all minimisers.

#####
`init(self, fn: Callable[[~Y, Any], tuple[~Out, ~Aux]], y: ~Y, args: PyTree, options: dict[str, Any], f_struct: PyTree[jax.ShapeDtypeStruct], aux_struct: PyTree[jax.ShapeDtypeStruct], tags: frozenset[object]) -> ~SolverState`

`abstractmethod`

¤

Perform all initial computation needed to initialise the solver state.

For example, the `optimistix.Chord`

method computes the Jacobian `df/dy`

with respect to the initial guess `y`

, and then uses it throughout the
computation.

**Arguments:**

`fn`

: The function to iterate over. This is expected to take two argumetns`fn(y, args)`

and return a pytree of arrays in the first element, and any auxiliary data in the second argument.`y`

: The value of`y`

at the current (first) iteration.`args`

: Passed as the`args`

of`fn(y, args)`

.`options`

: Individual solvers may accept additional runtime arguments. See each individual solver's documentation for more details.`f_struct`

: A pytree of`jax.ShapeDtypeStruct`

s of the same shape as the output of`fn`

. This is used to initialise any information in the state which may rely on the pytree structure, array shapes, or dtype of the output of`fn`

.`aux_struct`

: A pytree of`jax.ShapeDtypeStruct`

s of the same shape as the auxiliary data returned by`fn`

.`tags`

: exact meaning depends on whether this is a fixed point, root find, least squares, or minimisation problem; see their relevant entry points.

**Returns:**

A PyTree representing the initial state of the solver.

#####
`step(self, fn: Callable[[~Y, Any], tuple[~Out, ~Aux]], y: ~Y, args: PyTree, options: dict[str, Any], state: ~SolverState, tags: frozenset[object]) -> tuple[~Y, ~SolverState, ~Aux]`

`abstractmethod`

¤

Perform one step of the iterative solve.

**Arguments:**

`fn`

: The function to iterate over. This is expected to take two argumetns`fn(y, args)`

and return a pytree of arrays in the first element, and any auxiliary data in the second argument.`y`

: The value of`y`

at the current (first) iteration.`args`

: Passed as the`args`

of`fn(y, args)`

.`options`

: Individual solvers may accept additional runtime arguments. See each individual solver's documentation for more details.`state`

: A pytree representing the state of a solver. The shape of this pytree is solver-dependent.`tags`

: exact meaning depends on whether this is a fixed point, root find, least squares, or minimisation problem; see their relevant entry points.

**Returns:**

A 3-tuple containing the new `y`

value in the first element, the next solver
state in the second element, and the aux output of `fn(y, args)`

in the third
element.

#####
`terminate(self, fn: Callable[[~Y, Any], tuple[~Out, ~Aux]], y: ~Y, args: PyTree, options: dict[str, Any], state: ~SolverState, tags: frozenset[object]) -> tuple[Array, RESULTS]`

`abstractmethod`

¤

Determine whether or not to stop the iterative solve.

**Arguments:**

`fn`

: The function to iterate over. This is expected to take two argumetns`fn(y, args)`

and return a pytree of arrays in the first element, and any auxiliary data in the second argument.`y`

: The value of`y`

at the current iteration.`args`

: Passed as the`args`

of`fn(y, args)`

.`options`

: Individual solvers may accept additional runtime arguments. See each individual solver's documentation for more details.`state`

: A pytree representing the state of a solver. The shape of this pytree is solver-dependent.`tags`

: exact meaning depends on whether this is a fixed point, root find, least squares, or minimisation problem; see their relevant entry points.

**Returns:**

A 2-tuple containing a bool indicating whether or not to stop iterating in the
first element, and an `optimistix.RESULTS`

object in the second element.

#####
`postprocess(self, fn: Callable[[~Y, Any], tuple[~Out, ~Aux]], y: ~Y, aux: ~Aux, args: PyTree, options: dict[str, Any], state: ~SolverState, tags: frozenset[object], result: RESULTS) -> tuple[~Y, ~Aux, dict[str, Any]]`

`abstractmethod`

¤

Any final postprocessing to perform on the result of the solve.

**Arguments:**

`fn`

: The function to iterate over. This is expected to take two argumetns`fn(y, args)`

and return a pytree of arrays in the first element, and any auxiliary data in the second argument.`y`

: The value of`y`

at the last iteration.`aux`

: The auxiliary output at the last iteration.`args`

: Passed as the`args`

of`fn(y, args)`

.`options`

: Individual solvers may accept additional runtime arguments. See each individual solver's documentation for more details.`state`

: A pytree representing the final state of a solver. The shape of this pytree is solver-dependent.`tags`

: exact meaning depends on whether this is a fixed point, root find, least squares, or minimisation problem; see their relevant entry points.`result`

: as returned by the final call to`terminate`

.

**Returns:**

A 3-tuple of:

`final_y`

: the final`y`

to return as the solution of the solve.`final_aux`

: the final`aux`

to return as the auxiliary output of the solve.`stats`

: any additional information to place in the`sol.stats`

dictionary.

Info

Most solvers will not need to use this, so that this method may be defined as:

```
def postprocess(self, fn, y, aux, args, options, state, tags, result):
return y, aux, {}
```

#### ¤

##### ¤

##### ¤

##### ¤

##### ¤

`optimistix.AbstractGradientDescent`

####
```
optimistix.AbstractGradientDescent (AbstractMinimiser)
```

¤

The gradient descent method for unconstrained minimisation.

At every step, this algorithm performs a line search along the steepest descent
direction. You should subclass this to provide it with a particular choice of line
search. (E.g. `optimistix.GradientDescent`

uses a simple learning rate step.)

Subclasses must provide the following abstract attributes, with the following types:

`rtol: float`

`atol: float`

`norm: Callable[[PyTree], Scalar]`

`descent: AbstractDescent`

`search: AbstractSearch`

#### ¤

####
```
optimistix.GradientDescent (AbstractGradientDescent)
```

¤

Classic gradient descent with a learning rate `learning_rate`

.

#####
`__init__(self, learning_rate: float, rtol: float, atol: float, norm: Callable[[PyTree], Array] = <function max_norm>)`

¤

**Arguments:**

`learning_rate`

: Specifies a constant learning rate to use at each step.`rtol`

: Relative tolerance for terminating the solve.`atol`

: Absolute tolerance for terminating the solve.`norm`

: The norm used to determine the difference between two iterates in the convergence criteria. Should be any function`PyTree -> Scalar`

. Optimistix includes three built-in norms:`optimistix.max_norm`

,`optimistix.rms_norm`

, and`optimistix.two_norm`

.

`optimistix.AbstractBFGS`

####
```
optimistix.AbstractBFGS (AbstractMinimiser)
```

¤

Abstract BFGS (Broyden--Fletcher--Goldfarb--Shanno) minimisation algorithm.

This is a quasi-Newton optimisation algorithm, whose defining feature is the way it progressively builds up a Hessian approximation using multiple steps of gradient information.

This abstract version may be subclassed to choose alternative descent and searches.

#### ¤

####
```
optimistix.BFGS (AbstractBFGS)
```

¤

BFGS (Broyden--Fletcher--Goldfarb--Shanno) minimisation algorithm.

This is a quasi-Newton optimisation algorithm, whose defining feature is the way it progressively builds up a Hessian approximation using multiple steps of gradient information.

#####
`__init__(self, rtol: float, atol: float, norm: Callable[[PyTree], Array] = <function max_norm>, use_inverse: bool = True)`

¤

**Arguments:**

`rtol`

: Relative tolerance for terminating the solve.`atol`

: Absolute tolerance for terminating the solve.`norm`

: The norm used to determine the difference between two iterates in the convergence criteria. Should be any function`PyTree -> Scalar`

. Optimistix includes three built-in norms:`optimistix.max_norm`

,`optimistix.rms_norm`

, and`optimistix.two_norm`

.`use_inverse`

: The BFGS algorithm involves computing matrix-vector products of the form`B^{-1} g`

, where`B`

is an approximation to the Hessian of the function to be minimised. This means we can either (a) store the approximate Hessian`B`

, and do a linear solve on every step, or (b) store the approximate Hessian inverse`B^{-1}`

, and do a matrix-vector product on every step. Option (a) is generally cheaper for sparse Hessians (as the inverse may be dense). Option (b) is generally cheaper for dense Hessians (as matrix-vector products are cheaper than linear solves). The default is (b), denoted via`use_inverse=True`

. Note that this is incompatible with line search methods like`optimistix.ClassicalTrustRegion`

, which use the Hessian approximation`B`

as part of their own computations.

####
```
optimistix.OptaxMinimiser (AbstractMinimiser)
```

¤

A wrapper to use Optax first-order gradient-based optimisers with
`optimistix.minimise`

.

#####
`__init__(self, optim, rtol: float, atol: float, norm: Callable[[PyTree], Array] = <function max_norm>, verbose: frozenset[str] = frozenset())`

¤

**Arguments:**

`optim`

: The Optax optimiser to use.`rtol`

: Relative tolerance for terminating the solve. Keyword only argument.`atol`

: Absolute tolerance for terminating the solve. Keyword only argument.`norm`

: The norm used to determine the difference between two iterates in the convergence criteria. Should be any function`PyTree -> Scalar`

. Optimistix includes three built-in norms:`optimistix.max_norm`

,`optimistix.rms_norm`

, and`optimistix.two_norm`

. Keyword only argument.`verbose`

: Whether to print out extra information about how the solve is proceeding. Should be a frozenset of strings, specifying what information to print out. Valid entries are`step`

,`loss`

,`y`

. For example`verbose=frozenset({"step", "loss"})`

.

`optim`

in `optimistix.OptaxMinimiser`

is an instance of an Optax minimiser. For example, correct usage is `optimistix.OptaxMinimiser(optax.adam(...), ...)`

, not `optimistix.OptaxMinimiser(optax.adam, ...)`

.

####
```
optimistix.NonlinearCG (AbstractGradientDescent)
```

¤

The nonlinear conjugate gradient method.

#####
`__init__(self, rtol: float, atol: float, norm: Callable[[PyTree[Array]], Array] = <function max_norm>, method: Callable[[~Y, ~Y, ~Y], Array] = <function polak_ribiere>, search: AbstractSearch[~Y, FunctionInfo.EvalGrad, FunctionInfo.Eval, Any] = BacktrackingArmijo(decrease_factor=0.5, slope=0.1, step_init=1.0))`

¤

**Arguments:**

`rtol`

: Relative tolerance for terminating solve.`atol`

: Absolute tolerance for terminating solve.`norm`

: The norm used to determine the difference between two iterates in the convergence criteria. Should be any function`PyTree -> Scalar`

. Optimistix includes three built-in norms:`optimistix.max_norm`

,`optimistix.rms_norm`

, and`optimistix.two_norm`

.`method`

: The function which computes`beta`

in`NonlinearCG`

. Defaults to`polak_ribiere`

. Optimistix includes four built-in methods:`optimistix.polak_ribiere`

,`optimistix.fletcher_reeves`

,`optimistix.hestenes_stiefel`

, and`optimistix.dai_yuan`

, but any function`(Y, Y, Y) -> Scalar`

will work.`search`

: The (line) search to use at each step.

`optimistix.NonlinearCG`

supports several different methods for computing its β parameter. If you are trying multiple solvers to see which works best on your problem, then you may wish to try all four versions of nonlinear CG. These can each be passed as `NonlinearCG(..., method=...)`

.

####
`optimistix.polak_ribiere(grad_vector: ~Y, grad_prev: ~Y, y_diff_prev: ~Y) -> Array`

¤

The Polak--Ribière formula for β. Used with `optimistix.NonlinearCG`

and
`optimistix.NonlinearCGDescent`

.

####
`optimistix.fletcher_reeves(grad: ~Y, grad_prev: ~Y, y_diff_prev: ~Y) -> Array`

¤

The Fletcher--Reeves formula for β. Used with `optimistix.NonlinearCG`

and
`optimistix.NonlinearCGDescent`

.

####
`optimistix.hestenes_stiefel(grad: ~Y, grad_prev: ~Y, y_diff_prev: ~Y) -> Array`

¤

The Hestenes--Stiefel formula for β. Used with `optimistix.NonlinearCG`

and
`optimistix.NonlinearCGDescent`

.

####
`optimistix.dai_yuan(grad: ~Y, grad_prev: ~Y, y_diff_prev: ~Y) -> Array`

¤

The Dai--Yuan formula for β. Used with `optimistix.NonlinearCG`

and
`optimistix.NonlinearCGDescent`

.

####
```
optimistix.BestSoFarMinimiser (AbstractMinimiser)
```

¤

Wraps another minimiser, to return the best-so-far value. That is, it makes a
copy of the best `y`

seen, and returns that.

#####
`__init__(self, solver: AbstractMinimiser[~Y, tuple[Array, ~Aux], Any])`

¤

**Arguments:**

`solver`

: the minimiser to wrap.