Descents

optimistix.AbstractDescent

The abstract base class for descents. A descent consumes a scalar (e.g. a step size), and returns the diff to take at point y, so that y + diff is the next iterate in a nonlinear optimisation problem.

See the documentation on searches and descents for more information.
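
As a rough sketch of that contract (the function names here are hypothetical stand-ins, not the library's internal API, which also threads solver state and PyTree structure through):

```python
import jax.numpy as jnp

def steepest_descent_diff(step_size, grad):
    # Hypothetical descent: consume a scalar step size, return the diff.
    return -step_size * grad

def run(grad_fn, y, step_size=0.1, steps=100):
    # A solver proposes a step size (e.g. from a line search), asks the
    # descent for the corresponding diff, and moves to y + diff.
    for _ in range(steps):
        y = y + steepest_descent_diff(step_size, grad_fn(y))
    return y

y_min = run(lambda y: 2.0 * (y - 1.0), jnp.zeros(3))  # minimises |y - 1|^2, so y_min ≈ [1, 1, 1]
```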


optimistix.SteepestDescent (AbstractDescent)

The descent direction given by locally following the gradient.

__init__(self, norm: Optional[Callable[[PyTree], Array]] = None)

Arguments:

  • norm: If passed, then normalise the gradient using this norm. (The returned step will have length step_size with respect to this norm.) Optimistix includes three built-in norms: optimistix.max_norm, optimistix.rms_norm, and optimistix.two_norm.

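As a usage sketch, assuming that optimistix.GradientDescent is the ready-made solver pairing this descent with a fixed learning-rate search:

```python
import jax.numpy as jnp
import optimistix as optx

def loss(y, args):
    return jnp.sum((y - 1.0) ** 2)

solver = optx.GradientDescent(learning_rate=0.1, rtol=1e-6, atol=1e-6)
sol = optx.minimise(loss, solver, jnp.zeros(3))
# sol.value ≈ [1., 1., 1.]
```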

optimistix.NonlinearCGDescent (AbstractDescent)

The nonlinear conjugate gradient step.

__init__(self, method: Callable[[~Y, ~Y, ~Y], Array])

Arguments:

  • method: A callable method(vector, vector_prev, diff_prev) describing how to calculate the beta parameter of nonlinear CG. Nonlinear CG scales the previous search direction by beta and subtracts the gradient to find the next search direction (see the sketch below). This parameter is the nonlinear analogue of the β_n parameter described e.g. on Wikipedia for the linear case. Defaults to polak_ribiere. Optimistix includes four built-in methods: optimistix.polak_ribiere, optimistix.fletcher_reeves, optimistix.hestenes_stiefel, and optimistix.dai_yuan.
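
A rough schematic of how beta enters the update, written for flat arrays (the built-in methods operate on arbitrary PyTrees; the helpers below are local stand-ins, not the library functions):

```python
import jax.numpy as jnp

def polak_ribiere_beta(vector, vector_prev, diff_prev):
    # β_PR = <g_k, g_k - g_{k-1}> / <g_{k-1}, g_{k-1}>, for gradients g.
    return jnp.dot(vector, vector - vector_prev) / jnp.dot(vector_prev, vector_prev)

def nonlinear_cg_diff(step_size, vector, vector_prev, diff_prev):
    # Scale the previous search direction by beta and subtract the gradient.
    beta = polak_ribiere_beta(vector, vector_prev, diff_prev)
    return step_size * (beta * diff_prev - vector)
```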

optimistix.NewtonDescent (AbstractDescent)

Newton descent direction.

Given a quadratic bowl x -> x^T Hess x -- a local quadratic approximation to the target function -- this corresponds to moving in the direction of the bottom of the bowl. (Which is not the same as steepest descent.)

This is done by solving the linear system Hess x = grad, i.e. computing Hess^{-1} grad.

__init__(self, norm: Optional[Callable[[PyTree], Array]] = None, linear_solver: AbstractLinearSolver = AutoLinearSolver(well_posed=None))

Arguments:

  • norm: If passed, then normalise the gradient using this norm. (The returned step will have length step_size with respect to this norm.) Optimistix includes three built-in norms: optimistix.max_norm, optimistix.rms_norm, and optimistix.two_norm.
  • linear_solver: The linear solver used to compute the Newton step.
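
A minimal sketch of the step this computes, for a dense Hessian and a flat array (the descent itself goes through the configurable linear_solver rather than jnp.linalg.solve):

```python
import jax
import jax.numpy as jnp

def newton_diff(f, y, step_size):
    grad = jax.grad(f)(y)
    hess = jax.hessian(f)(y)
    # Solve Hess x = grad, i.e. compute Hess^{-1} grad, and move towards the
    # bottom of the local quadratic bowl.
    return -step_size * jnp.linalg.solve(hess, grad)
```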

optimistix.DampedNewtonDescent (AbstractDescent)

The damped Newton (Levenberg--Marquardt) descent.

That is: gradient descent is about moving in the direction of -grad. (Quasi-)Newton descent is about moving in the direction of -Hess^{-1} grad. Damped Newton interpolates between these two regimes, by moving in the direction of -(Hess + λI)^{-1} grad.

The value λ is often referred to as the "Levenberg--Marquardt" parameter, and in this version it is handled directly, as λ = 1/step_size. Larger step sizes correspond to Newton directions; smaller step sizes correspond to gradient directions. (Smaller step sizes additionally shrink the overall step, hence the term "damping".) This is because a line search expects the step to be smaller as the step size decreases.

__init__(self, linear_solver: AbstractLinearSolver = AutoLinearSolver(well_posed=None))

Arguments:

  • linear_solver: The linear solver used to compute the Newton step.
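
A minimal sketch of the damped step, with λ = 1/step_size as described above (dense arrays only; the real descent uses the configurable linear_solver):

```python
import jax.numpy as jnp

def damped_newton_diff(step_size, grad, hess):
    lam = 1.0 / step_size  # the Levenberg--Marquardt parameter
    # Large step_size => small λ => close to the Newton direction.
    # Small step_size => large λ => approximately -step_size * grad, i.e. a
    # short step in the gradient direction.
    return -jnp.linalg.solve(hess + lam * jnp.eye(grad.shape[0]), grad)
```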

optimistix.IndirectDampedNewtonDescent (AbstractDescent)

The indirect damped Newton (Levenberg--Marquardt) trust-region descent.

If the above line just looks like technical word soup, here's what's going on:

Gradient descent is about moving in the direction of -grad. (Quasi-)Newton descent is about moving in the direction of -Hess^{-1} grad. Damped Newton interpolates between these two regimes, by moving in the direction of -(Hess + λI)^{-1} grad.

This can be derived as the dual problem of a trust region method, see Conn, Gould, Toint: "Trust-Region Methods" section 7.3. λ is interpreted as a Lagrange multiplier. This involves solving a one-dimensional root-finding problem for λ at each descent.

__init__(self, lambda_0: Union[Array, numpy.ndarray, numpy.bool_, numpy.number, bool, int, float, complex] = 1.0, linear_solver: AbstractLinearSolver = AutoLinearSolver(well_posed=False), root_finder: AbstractRootFinder = Newton(rtol=0.01, atol=0.01, norm=optimistix.max_norm, kappa=0.01, linear_solver=AutoLinearSolver(well_posed=None), cauchy_termination=True), trust_region_norm: Callable[[PyTree], Array] = optimistix.two_norm)

Arguments:

  • lambda_0: The initial value of the Levenberg--Marquardt parameter used in the root-find to hit the trust-region radius. If IndirectDampedNewtonDescent is failing, this value may need to be increased.
  • linear_solver: The linear solver used to compute the Newton step.
  • root_finder: The root finder used to find the Levenberg--Marquardt parameter which hits the trust-region radius.
  • trust_region_norm: The norm used to determine the trust-region shape.
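
A sketch of the one-dimensional problem being root-found, for dense arrays and the Euclidean norm (the descent itself uses the configurable root_finder, linear_solver and trust_region_norm):

```python
import jax.numpy as jnp

def trust_region_residual(lam, grad, hess, radius):
    # The damped Newton step for this value of λ...
    step = -jnp.linalg.solve(hess + lam * jnp.eye(grad.shape[0]), grad)
    # ...should land exactly on the trust-region boundary; the descent finds
    # the λ (starting from lambda_0) at which this residual is zero.
    return jnp.linalg.norm(step) - radius
```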

optimistix.DoglegDescent (AbstractDescent)

The Dogleg descent step, which switches between the Cauchy and the Newton descent directions.

__init__(self, linear_solver: AbstractLinearSolver = AutoLinearSolver(well_posed=None), root_finder: AbstractRootFinder[Array, Array, NoneType, Any] = Bisection(rtol=0.001, atol=0.001, flip='detect', expand_if_necessary=False), trust_region_norm: Callable[[PyTree], Array] = optimistix.two_norm)

Arguments:

  • linear_solver: The linear solver used to compute the Newton step.
  • root_finder: The root finder used to find the point where the trust-region intersects the dogleg path. This is ignored if trust_region_norm=optimistix.two_norm, for which there is an analytic formula instead.
  • trust_region_norm: The norm used to determine the trust-region shape.
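
A sketch of the dogleg path itself, for dense arrays, a positive-definite Hessian and the two-norm (eager Python control flow; the real descent works on PyTrees and supports other trust-region norms via the root finder):

```python
import jax.numpy as jnp

def dogleg_step(grad, hess, radius):
    newton = -jnp.linalg.solve(hess, grad)
    if jnp.linalg.norm(newton) <= radius:
        return newton  # the Newton point already lies inside the trust region
    # Cauchy point: the minimiser of the quadratic model along -grad.
    cauchy = -(jnp.dot(grad, grad) / jnp.dot(grad, hess @ grad)) * grad
    if jnp.linalg.norm(cauchy) >= radius:
        return radius * cauchy / jnp.linalg.norm(cauchy)  # truncated Cauchy step
    # Otherwise walk from the Cauchy point towards the Newton point and stop
    # where the path crosses the trust-region boundary (analytic for the
    # two-norm; other norms go through the root finder).
    d = newton - cauchy
    a, b, c = jnp.dot(d, d), jnp.dot(cauchy, d), jnp.dot(cauchy, cauchy) - radius**2
    t = (-b + jnp.sqrt(b**2 - a * c)) / a
    return cauchy + t * d
```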