```
using Turing
@model function gdemo(x)
~ InverseGamma(2, 3)
s² ~ Normal(0, sqrt(s²))
m
for i in eachindex(x)
~ Normal(m, sqrt(s²))
x[i] end
end
```

`gdemo (generic function with 2 methods)`

After defining a statistical model, in addition to sampling from its distributions, one may be interested in finding the parameter values that maximise for instance the posterior distribution density function or the likelihood. This is called mode estimation. Turing provides support for two mode estimation techniques, maximum likelihood estimation (MLE) and maximum a posterior (MAP) estimation.

To demonstrate mode estimation, let us load Turing and declare a model:

```
using Turing
@model function gdemo(x)
s² ~ InverseGamma(2, 3)
m ~ Normal(0, sqrt(s²))
for i in eachindex(x)
x[i] ~ Normal(m, sqrt(s²))
end
end
```

`gdemo (generic function with 2 methods)`

Once the model is defined, we can construct a model instance as we normally would:

`DynamicPPL.Model{typeof(gdemo), (:x,), (), (), Tuple{Vector{Float64}}, Tuple{}, DynamicPPL.DefaultContext}(Main.Notebook.gdemo, (x = [1.5, 2.0],), NamedTuple(), DynamicPPL.DefaultContext())`

Finding the maximum aposteriori or maximum likelihood parameters is as simple as

```
# Generate a MLE estimate.
mle_estimate = maximum_likelihood(model)
# Generate a MAP estimate.
map_estimate = maximum_a_posteriori(model)
```

```
ModeResult with maximized lp of -4.62
[0.9074074074048499, 1.1666666666681416]
```

The estimates are returned as instances of the `ModeResult`

type. It has the fields `values`

for the parameter values found and `lp`

for the log probability at the optimum, as well as `f`

for the objective function and `optim_result`

for more detailed results of the optimisation procedure.

```
mle_estimate.values = [0.062499999999288104, 1.749999999999744]
mle_estimate.lp = -0.06528834416956464
```

Under the hood `maximum_likelihood`

and `maximum_a_posteriori`

use the Optimization.jl package, which provides a unified interface to many other optimisation packages. By default Turing typically uses the LBFGS method from Optim.jl to find the mode estimate, but we can easily change that:

```
using OptimizationOptimJL: NelderMead
@show maximum_likelihood(model, NelderMead())
using OptimizationNLopt: NLopt.LD_TNEWTON_PRECOND_RESTART
@show maximum_likelihood(model, LD_TNEWTON_PRECOND_RESTART());
```

```
maximum_likelihood(model, NelderMead()) = [0.062500666047239, 1.7499991015123628]
maximum_likelihood(model, LD_TNEWTON_PRECOND_RESTART()) = [0.06249999999999884, 1.7499999999999971]
```

The above are just two examples, Optimization.jl supports many more.

We can also help the optimisation by giving it a starting point we know is close to the final solution, or by specifying an automatic differentiation method

```
using ADTypes: AutoReverseDiff
import ReverseDiff
maximum_likelihood(
model, NelderMead(); initial_params=[0.1, 2], adtype=AutoReverseDiff()
)
```

```
ModeResult with maximized lp of -0.07
[0.062494553692639856, 1.7500042095865365]
```

When providing values to arguments like `initial_params`

the parameters are typically specified in the order in which they appear in the code of the model, so in this case first `s²`

then `m`

. More precisely it’s the order returned by `Turing.Inference.getparams(model, Turing.VarInfo(model))`

.

We can also do constrained optimisation, by providing either intervals within which the parameters must stay, or costraint functions that they need to respect. For instance, here’s how one can find the MLE with the constraint that the variance must be less than 0.01 and the mean must be between -1 and 1.:

```
ModeResult with maximized lp of -59.73
[0.009999999998999467, 0.9999999999589781]
```

The arguments for lower (`lb`

) and upper (`ub`

) bounds follow the arguments of `Optimization.OptimizationProblem`

, as do other parameters for providing constraints, such as `cons`

. Any extraneous keyword arguments given to `maximum_likelihood`

or `maximum_a_posteriori`

are passed to `Optimization.solve`

. Some often useful ones are `maxiters`

for controlling the maximum number of iterations and `abstol`

and `reltol`

for the absolute and relative convergence tolerances:

```
ModeResult with maximized lp of -0.66
[0.23158175948002974, 1.6998581911219297]
```

We can check whether the optimisation converged using the `optim_result`

field of the result:

```
badly_converged_mle.optim_result = retcode: Failure
u: [-1.4628222951067338, 1.6998581911219297]
Final objective value: 0.655794502892618
```

For more details, such as a full list of possible arguments, we encourage the reader to read the docstring of the function `Turing.Optimisation.estimate_mode`

, which is what `maximum_likelihood`

and `maximum_a_posteriori`

call, and the documentation of Optimization.jl.

Turing extends several methods from `StatsBase`

that can be used to analyze your mode estimation results. Methods implemented include `vcov`

, `informationmatrix`

, `coeftable`

, `params`

, and `coef`

, among others.

For example, let’s examine our ML estimate from above using `coeftable`

:

Coef. | Std. Error | z | Pr(> | z | ) | |
---|---|---|---|---|---|---|

s² | 0.0625 | 0.0625 | 1.0 | 0.317311 | -0.0599977 | 0.184998 |

m | 1.75 | 0.176777 | 9.89949 | 4.18383e-23 | 1.40352 | 2.09648 |

Standard errors are calculated from the Fisher information matrix (inverse Hessian of the log likelihood or log joint). Note that standard errors calculated in this way may not always be appropriate for MAP estimates, so please be cautious in interpreting them.

You can begin sampling your chain from an MLE/MAP estimate by extracting the vector of parameter values and providing it to the `sample`

function with the keyword `initial_params`

. For example, here is how to sample from the full posterior using the MAP estimate as the starting point: