# Descents

#### `optimistix.AbstractDescent`

The abstract base class for descents. A descent consumes a scalar (e.g. a step
size), and returns the `diff` to take at point `y`, so that `y + diff` is the next
iterate in a nonlinear optimisation problem.

See this documentation for more information.
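
As a minimal, concrete illustration of this contract (this is not Optimistix's actual class interface; the `steepest_descent_diff` helper and the quadratic `loss` below are hypothetical), here is the simplest possible descent, in which the `diff` is just the negative gradient scaled by the step size:

```python
import jax
import jax.numpy as jnp

def loss(y):
    # Hypothetical objective: a simple quadratic bowl.
    return jnp.sum((y - 1.0) ** 2)

def steepest_descent_diff(step_size, y):
    # The simplest descent: follow the negative gradient, scaled by the step size.
    return -step_size * jax.grad(loss)(y)

y = jnp.array([0.0, 0.0])
for _ in range(100):
    diff = steepest_descent_diff(0.1, y)
    y = y + diff  # `y + diff` is the next iterate.
```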


#### `optimistix.SteepestDescent (AbstractDescent)`

The descent direction given by locally following the gradient.

#### `optimistix.NonlinearCGDescent (AbstractDescent)`

The nonlinear conjugate gradient step.

#### `optimistix.NewtonDescent (AbstractDescent)`

Newton descent direction.

Given a quadratic bowl `x -> x^T Hess x` -- a local quadratic approximation to the
target function -- this corresponds to moving in the direction of the bottom of the
bowl. (Which is *not* the same as steepest descent.)

This is done by solving a linear system of the form `Hess^{-1} grad`.
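
A minimal sketch of the idea in plain JAX, rather than Optimistix's internals (the `loss` below is a hypothetical example):

```python
import jax
import jax.numpy as jnp

def loss(y):
    # Hypothetical objective with a non-trivial Hessian.
    A = jnp.array([[3.0, 1.0], [1.0, 2.0]])
    return jnp.sum((A @ y) * y) + jnp.sum(y)

y = jnp.array([1.0, -1.0])
grad = jax.grad(loss)(y)
hess = jax.hessian(loss)(y)
# Newton direction: solve Hess @ diff = -grad rather than forming Hess^{-1} explicitly.
diff = jnp.linalg.solve(hess, -grad)
y_new = y + diff
```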

#### `optimistix.DampedNewtonDescent (AbstractDescent)`

The damped Newton (Levenberg--Marquardt) descent.

That is: gradient descent is about moving in the direction of `-grad`. (Quasi-)Newton
descent is about moving in the direction of `-Hess^{-1} grad`. Damped Newton
interpolates between these two regimes, by moving in the direction of
`-(Hess + λI)^{-1} grad`.

The value λ is often referred to as the "Levenberg--Marquardt parameter", and in this
case is handled directly, as λ = 1/step_size. Larger step sizes correspond to Newton
directions; smaller step sizes correspond to gradient directions. (A smaller step size
additionally reduces the size of the step taken, hence the term "damping".) This is
because a line search expects the step to be smaller as the step size decreases.
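
A minimal sketch of this interpolation in plain JAX (the `loss` and the specific step sizes are hypothetical; this is not Optimistix's internal implementation):

```python
import jax
import jax.numpy as jnp

def loss(y):
    # Hypothetical objective.
    return jnp.sum((y - 1.0) ** 2) + jnp.sum(y ** 4)

y = jnp.array([2.0, -2.0])
grad = jax.grad(loss)(y)
hess = jax.hessian(loss)(y)

def damped_newton_diff(step_size, grad, hess):
    # λ = 1/step_size: large step sizes ~ Newton, small step sizes ~ gradient descent.
    lam = 1.0 / step_size
    return -jnp.linalg.solve(hess + lam * jnp.eye(grad.shape[0]), grad)

newton_like = damped_newton_diff(1e3, grad, hess)     # ≈ -Hess^{-1} grad
gradient_like = damped_newton_diff(1e-3, grad, hess)  # ≈ -step_size * grad
```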

#### `optimistix.IndirectDampedNewtonDescent (AbstractDescent)`

The indirect damped Newton (Levenberg--Marquardt) trust-region descent.

If the above line just looks like technical word soup, here's what's going on:

Gradient descent is about moving in the direction of `-grad`. (Quasi-)Newton descent
is about moving in the direction of `-Hess^{-1} grad`. Damped Newton interpolates
between these two regimes, by moving in the direction of `-(Hess + λI)^{-1} grad`.

This can be derived as the dual problem of a trust-region method; see Conn, Gould,
Toint, "Trust-Region Methods", section 7.3, where λ is interpreted as a Lagrange
multiplier. This involves solving a one-dimensional root-finding problem for λ at
each descent.
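
A minimal sketch of that one-dimensional root-find, in plain JAX with simple bisection (the trust-region radius, the bisection bounds, and the `loss` are hypothetical; Optimistix's actual implementation differs):

```python
import jax
import jax.numpy as jnp

def loss(y):
    # Hypothetical objective.
    return jnp.sum((y - 1.0) ** 2) + jnp.sum(y ** 4)

y = jnp.array([2.0, -2.0])
grad = jax.grad(loss)(y)
hess = jax.hessian(loss)(y)
trust_radius = 0.5  # Hypothetical trust-region radius.

def step_norm(lam):
    # Norm of the damped Newton step -(Hess + λI)^{-1} grad.
    diff = -jnp.linalg.solve(hess + lam * jnp.eye(grad.shape[0]), grad)
    return jnp.linalg.norm(diff)

# One-dimensional root-find: choose λ so that ||step(λ)|| = trust_radius.
lower, upper = 0.0, 1e6
for _ in range(50):
    mid = 0.5 * (lower + upper)
    # step_norm is decreasing in λ, so increase λ when the step is too long.
    lower, upper = (lower, mid) if step_norm(mid) < trust_radius else (mid, upper)
lam = 0.5 * (lower + upper)
diff = -jnp.linalg.solve(hess + lam * jnp.eye(grad.shape[0]), grad)
```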

#### `optimistix.DoglegDescent (AbstractDescent)`

The Dogleg descent step, which switches between the Cauchy and the Newton descent directions.
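
A minimal sketch of the classic dogleg switch in plain JAX (the trust-region radius and `loss` are hypothetical, and the actual `DoglegDescent` is more general than this Euclidean-norm version):

```python
import jax
import jax.numpy as jnp

def loss(y):
    # Hypothetical objective.
    return jnp.sum((y - 1.0) ** 2) + jnp.sum(y ** 4)

y = jnp.array([2.0, -2.0])
grad = jax.grad(loss)(y)
hess = jax.hessian(loss)(y)
radius = 0.5  # Hypothetical trust-region radius.

# Cauchy point: minimum of the local quadratic model along -grad.
cauchy = -((grad @ grad) / (grad @ hess @ grad)) * grad
# Newton point: minimum of the local quadratic model.
newton = -jnp.linalg.solve(hess, grad)

if jnp.linalg.norm(newton) <= radius:
    diff = newton  # Newton step fits inside the trust region: take it.
elif jnp.linalg.norm(cauchy) >= radius:
    # Even the Cauchy step is too long: truncate it to the boundary.
    diff = radius * cauchy / jnp.linalg.norm(cauchy)
else:
    # Otherwise walk from the Cauchy point towards the Newton point until
    # the trust-region boundary is hit.
    d = newton - cauchy
    a, b, c = d @ d, 2.0 * (cauchy @ d), cauchy @ cauchy - radius**2
    t = (-b + jnp.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)
    diff = cauchy + t * d
```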