SDE solvers¤
See also How to choose a solver.
Term structure
The type of solver chosen determines how the terms
argument of diffeqsolve
should be laid out.
Most solvers handle both ODEs and SDEs in the same way, and expect a single term. So for an ODE you would pass terms=ODETerm(vector_field)
, and for an SDE you would pass terms=MultiTerm(ODETerm(drift), ControlTerm(diffusion, brownian_motion))
. For example:
drift = lambda t, y, args: -y
diffusion = lambda t, y, args: y[..., None]
bm = UnsafeBrownianPath(shape=(1,), key=...)
terms = MultiTerm(ODETerm(drift), ControlTerm(diffusion, bm))
diffeqsolve(terms, solver=Euler(), ...)
For any individual solver then this is documented below, and is also available programatically under <solver>.term_structure
.
For advanced users, note that we typically accept any AbstractTerm
for the diffusion, so it could be a custom one that implements more-efficient behaviour for the structure of your diffusion matrix.
Explicit Runge--Kutta (ERK) methods¤
These solvers can be used to solve SDEs just as well as they can be used to solve ODEs.
diffrax.Euler (AbstractItoSolver)
¤
Euler's method.
1st order explicit Runge--Kutta method. Does not support adaptive step sizing. Uses 1 stage. Uses 1st order local linear interpolation for dense/ts output.
When used to solve SDEs, converges to the Itô solution.
diffrax.Heun (AbstractERK, AbstractStratonovichSolver)
¤
Heun's method.
2nd order explicit Runge--Kutta method. Has an embedded Euler method for adaptive step sizing. Uses 2 stages. Uses 2nd-order Hermite interpolation for dense/ts output.
Also sometimes known as either the "improved Euler method", "modified Euler method" or "explicit trapezoidal rule".
Should not be confused with Heun's third order method, which is a different (higher order) method occasionally also just referred to as "Heun's method".
When used to solve SDEs, converges to the Stratonovich solution.
diffrax.Midpoint (AbstractERK, AbstractStratonovichSolver)
¤
Midpoint method.
2nd order explicit Runge--Kutta method. Has an embedded Euler method for adaptive step sizing. Uses 2 stages. Uses 2nd order Hermite interpolation for dense/ts output.
Also sometimes known as the "modified Euler method".
When used to solve SDEs, converges to the Stratonovich solution.
diffrax.Ralston (AbstractERK, AbstractStratonovichSolver)
¤
Ralston's method.
2nd order explicit Runge--Kutta method. Has an embedded Euler method for adaptive step sizing. Uses 2 stages. Uses 2nd order Hermite interpolation for dense output.
When used to solve SDEs, converges to the Stratonovich solution.
Info
In addition to the solvers above, then most higher-order ODE solvers can actually also be used as SDE solvers. They will typically converge to the Stratonovich solution. In practice this is computationally wasteful as they will not obtain more accurate solutions when applied to SDEs.
SDE-only solvers¤
Term structure
These solvers are SDE-specific. For these, terms
must specifically be of the form MultiTerm(ODETerm(...), SomeOtherTerm(...))
(Typically SomeOTherTerm
will be a ControlTerm
representing the drift and diffusion specifically.
diffrax.EulerHeun (AbstractStratonovichSolver)
¤
Euler-Heun method.
Uses a 1st order local linear interpolation scheme for dense/ts output.
This should be called with terms=MultiTerm(drift_term, diffusion_term)
, where the
drift is an ODETerm
.
Used to solve SDEs, and converges to the Stratonovich solution.
diffrax.ItoMilstein (AbstractItoSolver)
¤
Milstein's method; Itô version.
Used to solve SDEs, and converges to the Itô solution. Uses local linear interpolation for dense/ts output.
This should be called with terms=MultiTerm(drift_term, diffusion_term)
, where the
drift is an ODETerm
.
Warning
Requires commutative noise. Note that this commutativity condition is not checked.
diffrax.StratonovichMilstein (AbstractStratonovichSolver)
¤
Milstein's method; Stratonovich version.
Used to solve SDEs, and converges to the Stratonovich solution. Uses local linear interpolation for dense/ts output.
This should be called with terms=MultiTerm(drift_term, diffusion_term)
, where the
drift is an ODETerm
.
Warning
Requires commutative noise. Note that this commutativity condition is not checked.
Stochastic Runge--Kutta (SRK)¤
These are a particularly important class of SDE-only solvers.
diffrax.SEA (AbstractSRK, AbstractStratonovichSolver)
¤
Shifted Euler method for SDEs with additive noise.
Makes one evaluation of the drift and diffusion per step and has a strong order 1.
Compared to diffrax.Euler
, it has a better constant factor in the global
error, and an improved local error of \(O(h^2)\) instead of \(O(h^{1.5})\).
This solver is useful for solving additive-noise SDEs with as few drift and diffusion evaluations per step as possible.
Reference
This solver is based on equation (5.8) in
@article{foster2023high,
title={High order splitting methods for SDEs satisfying a commutativity
condition},
author={James Foster and Goncalo dos Reis and Calum Strange},
year={2023},
journal={arXiv:2210.17543},
}
diffrax.SRA1 (AbstractSRK, AbstractStratonovichSolver)
¤
The SRA1 method for additive-noise SDEs.
Makes two evaluations of the drift and diffusion per step and has a strong order 1.5.
See also diffrax.ShARK
, which is very similar.
Reference
@article{rossler2010runge
author = {Andreas R\"{o}\ss{}ler},
title = {Runge–Kutta Methods for the Strong Approximation of Solutions of
Stochastic Differential Equations},
journal = {SIAM Journal on Numerical Analysis},
volume = {48},
number = {3},
pages = {922--952},
year = {2010},
doi = {10.1137/09076636X},
diffrax.ShARK (AbstractSRK, AbstractStratonovichSolver)
¤
Shifted Additive-noise Runge-Kutta method for additive SDEs.
Makes two evaluations of the drift and diffusion per step and has a strong order 1.5.
This is the recommended choice for SDEs with additive noise.
See also diffrax.SRA1
, which is very similar.
Reference
This solver is based on equation (6.1) in
@article{foster2023high,
title={High order splitting methods for SDEs satisfying a commutativity
condition},
author={James Foster and Goncalo dos Reis and Calum Strange},
year={2023},
journal={arXiv:2210.17543},
}
diffrax.GeneralShARK (AbstractSRK, AbstractStratonovichSolver)
¤
ShARK method for Stratonovich SDEs.
As compared to diffrax.ShARK
this can handle any SDE, not only those with
additive noise.
Makes two evaluations of the drift and three evaluations of the diffusion per step. Has the following orders of convergence:
- 1.5 for SDEs with additive noise (but
diffrax.ShARK
is recommended instead) - 1.0 for Stratonovich SDEs with commutative noise
(
diffrax.SlowRK
is recommended instead) - 0.5 for Stratonovich SDEs with general noise.
For general Stratonovich SDEs this is equally precise as three steps of
diffrax.Heun
or a single step of diffrax.SPaRK
, while requiring
one fewer evaluation of the drift, so this is the recommended choice for general
SDEs with an expensive drift vector field. If embedded error estimation is needed
(e.g. for adaptive time-stepping) then diffrax.SPaRK
is recommended instead.
Reference
This solver is based on equation (6.1) from
@misc{foster2023high,
title={High order splitting methods for SDEs satisfying
a commutativity condition},
author={James Foster and Goncalo dos Reis and Calum Strange},
year={2023},
eprint={2210.17543},
archivePrefix={arXiv},
primaryClass={math.NA}
diffrax.SlowRK (AbstractSRK, AbstractStratonovichSolver)
¤
SLOW-RK method for commutative-noise Stratonovich SDEs.
Makes two evaluations of the drift and five evaluations of the diffusion per step. Applied to SDEs with commutative noise, it converges strongly with order 1.5. Can be used for SDEs with non-commutative noise, but then it only converges strongly with order 0.5.
This solver is an excellent choice for Stratonovich SDEs with commutative noise.
For non-commutative Stratonovich SDEs, consider using diffrax.GeneralShARK
or diffrax.SPaRK
instead.
Reference
This solver is based on equation (6.2) from
@article{foster2023high,
title={High order splitting methods for SDEs satisfying a commutativity
condition},
author={James Foster and Goncalo dos Reis and Calum Strange},
year={2023},
journal={arXiv:2210.17543},
}
diffrax.SPaRK (AbstractSRK, AbstractStratonovichSolver)
¤
The Splitting Path Runge-Kutta method.
It uses three evaluations of the drift and diffusion per step, and has the following strong orders of convergence:
- 1.5 for SDEs with additive noise (but
diffrax.ShARK
is recommended instead) - 1.0 for Stratonovich SDEs with commutative noise
(
diffrax.SlowRK
is recommended instead) - 0.5 for Stratonovich SDEs with general noise.
For general Stratonovich SDEs this is equally precise as three steps of
diffrax.Heun
or a single step of diffrax.GeneralShARK
. Unlike those,
this method has an embedded error estimate, so it is the recommended choice for
adaptive time-stepping. Otherwise, diffrax.GeneralShARK
is more efficient.
Reference
This solver is based on Definition 1.6 from
@misc{foster2023convergence,
title={On the convergence of adaptive approximations
for stochastic differential equations},
author={James Foster},
year={2023},
archivePrefix={arXiv},
primaryClass={math.NA}
}
Reversible methods¤
These are reversible in the same way as when applied to ODEs. See here.
diffrax.ReversibleHeun (AbstractAdaptiveSolver, AbstractStratonovichSolver)
¤
Reversible Heun method.
Algebraically reversible 2nd order method. Has an embedded 1st order method for adaptive step sizing. Uses 1st order local linear interpolation for dense/ts output.
When used to solve SDEs, converges to the Stratonovich solution.
Reference
@article{kidger2021efficient,
author={Kidger, Patrick and Foster, James and Li, Xuechen and Lyons, Terry},
title={{E}fficient and {A}ccurate {G}radients for {N}eural {SDE}s},
year={2021},
journal={Advances in Neural Information Processing Systems}
}
Wrapper solvers¤
diffrax.HalfSolver (AbstractAdaptiveSolver, AbstractWrappedSolver)
¤
Wraps another solver, trading cost in order to provide error estimates. (That is, it means the solver can be used with an adaptive step size controller, regardless of whether the underlying solver supports adaptive step sizing.)
For every step of the wrapped solver, it does this by also making two half-steps, and comparing the results between the full step and the two half steps. Hence the name "HalfSolver".
As such each step costs 3 times the computational cost of the wrapped solver, whilst producing results that are roughly twice as accurate, in addition to producing error estimates.
Tip
Many solvers already provide error estimates, making HalfSolver
primarily
useful when using a solver that doesn't provide error estimates -- e.g.
diffrax.Euler
. Such solvers are most common when solving SDEs.
__init__(self, solver: AbstractSolver[~_SolverState])
¤
Arguments:
solver
: The solver to wrap.
Underdamped Langevin solvers¤
These solvers are specifically designed for the Underdamped Langevin diffusion (ULD), which takes the form
where \(x(t), v(t) \in \mathbb{R}^d\) represent the position and velocity, \(w\) is a Brownian motion in \(\mathbb{R}^d\), \(f: \mathbb{R}^d \rightarrow \mathbb{R}\) is a potential function, and \(\gamma , u \in \mathbb{R}^{d \times d}\) are diagonal matrices governing the friction and the damping of the system.
They are more precise for this diffusion than the general-purpose solvers above, but cannot be used for any other SDEs. They only accept special terms as described in the Underdamped Langevin terms section. For an example of their usage, see the Underdamped Langevin example.
diffrax.ALIGN (AbstractFosterLangevinSRK)
¤
The Adaptive Langevin via Interpolated Gradients and Noise method
designed by James Foster. This is a second order solver for the
Underdamped Langevin Diffusion, and accepts terms of the form
MultiTerm(UnderdampedLangevinDriftTerm, UnderdampedLangevinDiffusionTerm)
.
Uses two evaluations of the vector
field per step, but is FSAL, so in practice it only requires one.
Reference
This is a modification of the Strang-Splitting method from Definition 4.2 of
@misc{foster2021shiftedode,
title={The shifted ODE method for underdamped Langevin MCMC},
author={James Foster and Terry Lyons and Harald Oberhauser},
year={2021},
eprint={2101.03446},
archivePrefix={arXiv},
primaryClass={math.NA},
url={https://arxiv.org/abs/2101.03446},
}
__init__(self, taylor_threshold: float = 0.1)
¤
Arguments:
taylor_threshold
: If the producth*gamma
is less than this, then the Taylor expansion will be used to compute the coefficients. Otherwise they will be computed directly. When using float32, the empirically optimal value is 0.1, and for float64 about 0.01.
diffrax.ShOULD (AbstractFosterLangevinSRK)
¤
The Shifted-ODE Runge-Kutta Three method
designed by James Foster. This is a third order solver for the
Underdamped Langevin Diffusion, the terms of the form
MultiTerm(UnderdampedLangevinDriftTerm, UnderdampedLangevinDiffusionTerm)
.
Uses three evaluations of the vector
field per step, but is FSAL, so in practice it only requires two.
Reference
This solver is based on Definition 7.1 from
@misc{foster2021shiftedode,
title={The shifted ODE method for underdamped Langevin MCMC},
author={James Foster and Terry Lyons and Harald Oberhauser},
year={2021},
eprint={2101.03446},
archivePrefix={arXiv},
primaryClass={math.NA},
url={https://arxiv.org/abs/2101.03446},
}
__init__(self, taylor_threshold: Union[float, int] = 0.1)
¤
Arguments:
taylor_threshold
: If the producth*gamma
is less than this, then the Taylor expansion will be used to compute the coefficients. Otherwise they will be computed directly. When using float32, the empirically optimal value is 0.1, and for float64 about 0.01.
diffrax.QUICSORT (AbstractFosterLangevinSRK)
¤
The QUadrature Inspired and Contractive Shifted ODE with Runge-Kutta Three
method by James Foster and Daire O'Kane. This is a third order solver for the
Underdamped Langevin Diffusion, and accepts terms of the form
MultiTerm(UnderdampedLangevinDriftTerm, UnderdampedLangevinDiffusionTerm)
.
Uses two evaluations of the vector field per step.
Reference
This is a variant of the SORT method from Definition 1.2 of
@misc{foster2021shiftedode,
title={The shifted ODE method for underdamped Langevin MCMC},
author={James Foster and Terry Lyons and Harald Oberhauser},
year={2021},
eprint={2101.03446},
archivePrefix={arXiv},
primaryClass={math.NA},
url={https://arxiv.org/abs/2101.03446},
}
with the modifications inspired by the UBU method from
@misc{chada2024unbiased,
title={Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients},
author={Neil K. Chada and Benedict Leimkuhler and Daniel Paulin
and Peter A. Whalley},
year={2024},
eprint={2311.05025},
archivePrefix={arXiv},
primaryClass={stat.CO},
url={https://arxiv.org/abs/2311.05025},
}
__init__(self, taylor_threshold: float = 0.1)
¤
Arguments:
taylor_threshold
: If the producth*gamma
is less than this, then the Taylor expansion will be used to compute the coefficients. Otherwise they will be computed directly. When using float32, the empirically optimal value is 0.1, and for float64 about 0.01.