Changelog
0.43.0
DynamicPPL 0.40 and VarNamedTuple
DynamicPPL v0.40 includes a major overhaul of Turing’s internal data structures. Most notably, cases where we might once have used Dict{VarName} or NamedTuple have all been replaced with a single data structure, called VarNamedTuple.
This provides substantial benefits in terms of robustness and performance.
However, it does place some constraints on Turing models, and introduces some breaking changes to the user interface. Specifically, the types of containers that can include random variables are now more limited: if x[i] ~ dist is a random variable, then x must obey the following criteria:
- They must be `AbstractArray`s. `Dict`s and other containers are currently unsupported (we have an issue to track this). If you really need this functionality, please open an issue and let us know; we can try to make it a priority.

  ```julia
  @model function f()
      # Allowed
      x = Array{Float64}(undef, 1)
      x[1] ~ Normal()
      # Forbidden
      x = Dict{Int,Float64}()
      x[1] ~ Normal()
  end
  ```

- They must not be resized between calls to `~`. The following is forbidden (you should initialise `x` to the correct size before the loop):

  ```julia
  x = Float64[]
  for i in 1:10
      push!(x, 0.0)
      x[i] ~ Normal()
  end
  ```
However, please note that this only applies to containers that contain random variables on the left-hand side of tilde-statements. In general, there are no restrictions on containers of observed data, containers that are not used in tilde-statements, or containers that are themselves random variables (e.g. x ~ MvNormal(...)).
Likewise, arrays of random variables should ideally have a constant size from iteration to iteration. That means a model like this will fail sometimes (but see below):
```julia
n ~ Poisson(2.0)
x = Vector{Float64}(undef, n)
for i in 1:n
    x[i] ~ Normal()
end
```

Technically speaking: inference (e.g. MCMC sampling) on this model will still work, but if you want to use `returned` or `predict`, both of the following conditions must hold: (1) you must use FlexiChains.jl; (2) all elements of `x` must be random variables, i.e., you cannot have a mixture of `x[i]`'s being random variables and observed.
VarNamedTuple and @vnt are now re-exported from Turing directly. There is a docs page explaining how to use and create VarNamedTuples, which can be found here.
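For a quick taste, here is a hedged sketch of creating and indexing a VarNamedTuple; it is based on the `@vnt` examples later in this changelog, so please consult the docs page for the authoritative syntax:

```julia
using Turing  # re-exports VarNamedTuple and @vnt

# Build a VarNamedTuple holding values for the elements of `x`.
vnt = @vnt begin
    @template x = zeros(2)
    x[1] := 1.0
    x[2] := 2.0
end

# Entries are looked up by VarName rather than by plain Symbol.
vnt[@varname(x[1])]
```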
Conditioning and fixing
When providing conditioned or fixed variables to Turing models, we recommend that you use a VarNamedTuple to do so. The main benefit of this is that it correctly captures the structure of arrays of random variables. In the past, this used to depend on whether you specified variables exactly as they were seen in the model: for example, if the model had x ~ MvNormal(zeros(2), I), and you conditioned separately on x[1] and x[2], the conditioned variables would be silently ignored. There were also similar inconsistencies with colons in variable names.
With a VarNamedTuple-based approach, both should be equivalent: you can do
```julia
vnt1 = @vnt begin
    @template x = zeros(2)
    x[1] := 1.0
    x[2] := 2.0
end
cond_model = model() | vnt1
```

or

```julia
vnt2 = @vnt begin
    x := [1.0, 2.0]
end
cond_model = model() | vnt2
```

and both should work correctly.
Optimisation interface
Turing.jl’s optimisation interface has been completely overhauled in this release. The aim of this has been to provide users with a more consistent and principled way of specifying constraints.
The crux of the issue is that Optimization.jl expects vectorised inputs, whereas Turing models are more high-level: they have named variables which may be scalars, vectors, or in general anything. Prior to this version, Turing’s interface required the user to provide the vectorised inputs ‘raw’, which is both unintuitive and error-prone (especially when considering that optimisation may run in linked or unlinked space).
Going forward, initial parameters for optimisation are specified using AbstractInitStrategy (for more information about this, please see the docs on MCMC sampling). If specific parameters are provided (via InitFromParams), these must be in model space (i.e. untransformed). This directly mimics the interface for MCMC sampling that has been in place since v0.41.
Furthermore, lower and upper bounds (if desired) can be specified as VarNamedTuples using the lb and ub keyword arguments. Bounds are always provided in model space; Turing will handle the transformation of these bounds to linked space if necessary. Constraints are respected when creating initial parameters for optimisation: if the AbstractInitStrategy provided is incompatible with the constraints (for example InitFromParams((; x = 2.0)) but x is constrained to be between [0, 1]), an error will be raised.
Here is a (very simplified) example of the new interface:
```julia
using Turing

@model f() = x ~ Beta(2, 2)

maximum_a_posteriori(
    f();
    # All of the following are in unlinked space.
    # We use NamedTuples here for simplicity, but you can use
    # VarNamedTuple or Dict{<:VarName} as well (internally they
    # will be converted to VarNamedTuple).
    initial_params=InitFromParams((; x=0.3)),
    lb=(; x=0.1),
    ub=(; x=0.4),
)
```

For more information, please see the docstring of `estimate_mode`.
Note that in some cases, the translation of bounds to linked space may not be well-defined. This is especially true for distributions where the samples have elements that are not independent (for example, Dirichlet, or LKJCholesky). In these cases, Turing will raise an error if bounds are provided. Users who wish to perform optimisation with such constraints should directly use LogDensityFunction and Optimization.jl. Documentation on this matter will be forthcoming.
Other changes to the optimisation interface
- `estimate_mode`, `maximum_a_posteriori`, and `maximum_likelihood` now accept an optional `rng` first argument for reproducible initialisation.
- New keyword argument `link::Bool=true` controls whether to optimise in linked (transformed) space.
- New keyword argument `check_constraints_at_runtime::Bool=true` enables runtime constraint checking during model evaluation.
- Generic (non-box) constraints via `cons`, `lcons`, and `ucons` are no longer supported. Users who need these should use `LogDensityFunction` and Optimization.jl directly.
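A hedged sketch of how these keyword arguments compose (the model and values here are illustrative, not taken from the source):

```julia
using Turing, Random

# A trivial model with a positively-constrained variable.
@model g() = x ~ LogNormal()

# rng as optional first argument gives reproducible initialisation;
# link=true optimises in linked (unconstrained) space, and
# check_constraints_at_runtime=true checks constraints during evaluation.
maximum_likelihood(
    Xoshiro(468),
    g();
    link=true,
    check_constraints_at_runtime=true,
)
```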
ModeResult changes
The return type from an optimisation procedure, ModeResult, has been substantially reworked:
- `ModeResult.params` is now a `VarNamedTuple` (previously an `AbstractDict{<:VarName}`). Parameters can be accessed via e.g. `m.params[@varname(x)]`.
- The `values::NamedArray` field has been removed. Use `vector_names_and_params(m)` (newly exported) to obtain `(Vector{VarName}, Vector{values})`.
- `Base.get(m::ModeResult, ...)` has been removed; use `m.params[@varname(x)]` instead.
- `StatsBase.coef` now returns a plain `Vector` (not a `NamedArray`).
- `StatsBase.coefnames` now returns a `Vector{VarName}` (not strings or symbols).
- `StatsBase.informationmatrix`: the `hessian_function` keyword argument has been replaced by `adtype::ADTypes.AbstractADType` (default `AutoForwardDiff()`). Hessian computation uses DifferentiationInterface under the hood.
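Putting the new accessors together, a sketch of post-optimisation usage (assuming a trivial toy model):

```julia
using Turing
import StatsBase

@model h() = x ~ Normal()
m = maximum_a_posteriori(h())

m.params[@varname(x)]                   # look up a single parameter
vns, vals = vector_names_and_params(m)  # (Vector{VarName}, Vector of values)
StatsBase.coef(m)                       # plain Vector
StatsBase.coefnames(m)                  # Vector{VarName}
```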
IS sampler
The IS sampler has been removed (its behaviour was in fact exactly the same as Prior). To see an example of importance sampling (via Prior() and then subsequent reweighting), see e.g. this issue.
MH sampler
The interface of the MH sampler is slightly different. It no longer accepts AdvancedMH proposals, and is now more flexible: you can specify proposals for individual VarNames (not just top-level symbols), and any unspecified VarNames will be drawn from the prior, instead of being silently ignored. It is also faster than before (by around 30% on simple models).
Additional changes:
- A new type `LinkedRW` allows specifying random-walk proposals in linked (unconstrained) space, e.g. `MH(@varname(x) => LinkedRW(cov_matrix))`.
- Callable (conditional) proposals now receive a `VarNamedTuple` of the full parameter state, rather than a single scalar. For example:

  ```julia
  # Old
  MH(:m => x -> Normal(x, 1))
  # New
  MH(@varname(m) => (vnt -> Normal(vnt[@varname(m)], 1)))
  ```

- MH now reports whether each proposal was `accepted` in the chain stats.
- At the start of sampling, MH logs `@info` messages showing which proposal is used for each variable (disable with `verbose=false`). This helps detect misspecified proposals.
- MH validates initial parameters against the proposal distribution; if they have zero or NaN probability, a clear error is thrown.
Gibbs sampler
Both the compilation and runtime of Gibbs sampling should now be significantly faster. (This is largely due to the underlying changes in DynamicPPL’s data structures.)
HMC / NUTS
HMC-family samplers now check for discrete variables before sampling begins. If a model contains discrete variables (e.g. x ~ Categorical(...)) and an HMC sampler is used, an ArgumentError is thrown immediately. Previously, this would silently proceed.
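For example, a model like the following now fails fast under NUTS; a Gibbs combination remains the usual workaround (a sketch with an arbitrary toy model):

```julia
using Turing

@model function mixed()
    k ~ Categorical([0.5, 0.5])   # discrete: not differentiable
    x ~ Normal(k, 1.0)
end

# sample(mixed(), NUTS(), 100)   # now throws an ArgumentError immediately

# Workaround: sample the discrete variable with a compatible sampler.
sample(mixed(), Gibbs(:k => PG(10), :x => NUTS()), 100)
```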
GibbsConditional
When defining a conditional posterior, instead of being provided with a Dict of values, the function must now take a VarNamedTuple containing the values. Note that indexing into a VarNamedTuple is very similar to indexing into a Dict; however, it is more flexible since you can use syntax such as x[1:2] even if x[1] and x[2] are separate variables in the model.
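As a sketch (the conjugate-update formula below is standard textbook material; the exact `GibbsConditional` constructor call is shown only as a comment, since its signature is documented in its docstring, not here):

```julia
using Turing

# Conditional posterior for the mean `m` of Normal(m, sqrt(s2)) data,
# with a Normal(0, 1) prior on `m`. The function now receives a
# VarNamedTuple of conditioned values instead of a Dict:
function cond_m(vnt)
    s2 = vnt[@varname(s2)]   # indexing works much like a Dict
    n, ybar = 10, 0.3        # stand-ins for the observed data
    var = 1 / (1 / 1.0 + n / s2)
    return Normal(var * n * ybar / s2, sqrt(var))
end

# Then use it as a Gibbs component, e.g.
# Gibbs(:m => GibbsConditional(:m, cond_m), :s2 => MH())
```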
filldist and arraydist
These two convenience functions are now imported and re-exported from DynamicPPL, rather than DistributionsAD.jl. They are now just wrappers around Distributions.product_distribution, instead of the specialised implementations that were in DistributionsAD.jl. DistributionsAD.jl is for all intents and purposes deprecated: it is no longer a dependency in the Turing stack.
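Behaviour is unchanged for typical usage; a sketch of the equivalence, under the assumption stated above that these are now thin wrappers:

```julia
using Turing  # filldist and arraydist are re-exported from DynamicPPL

d1 = filldist(Normal(), 3)  # iid product of three Normals
d2 = arraydist([Normal(0, 1), Normal(1, 2), Normal(2, 3)])

# Both behave like Distributions.product_distribution:
logpdf(d1, zeros(3)) ≈ 3 * logpdf(Normal(), 0.0)
```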
0.42.9
Improve handling of model evaluator functions with Libtask.
This means that when running SMC or PG on a model with keyword arguments, you no longer need to use @might_produce (see patch notes of v0.42.5 for more details on this).
It also means that submodels with observations inside will now be reliably handled by the SMC/PG samplers, which was not the case before (the observations were only picked up if the submodel was inlined by the Julia compiler, which could lead to correctness issues).
0.42.8
Add support for TensorBoardLogger.jl via AbstractMCMC.mcmc_callback. See the AbstractMCMC documentation for more details.
0.42.7
Avoid reevaluating the model on MCMC iterations where the transition is not saved to the chain (e.g. in initial burn-in, or when using thinning). Also avoid each component sampler of Gibbs unnecessarily evaluating the model once per iteration.
0.42.6
Fixed a bug in SMC and PG where results were not always stored correctly in Libtask traces (due to incorrect objectid checks).
0.42.5
SMC and PG can now be used for models with keyword arguments, albeit with one requirement: the user must mark the model function as being able to produce. For example, if the model is
```julia
@model foo(x; y) = a ~ Normal(x, y)
```

then before sampling from this with SMC or PG, you will have to run

```julia
using Turing
@might_produce(foo)
```

0.42.4
Fixes a typo that caused NUTS to perform one less adaptation step than in versions prior to 0.41.
0.42.3
Removes some dead code.
0.42.2
InitFromParams(mode_estimate), where mode_estimate was obtained from an optimisation on a Turing model, now accepts a second optional argument which provides a fallback initialisation strategy if some parameters are missing from mode_estimate.
This also means that you can now invoke returned(model, mode_estimate) to calculate a model’s return values given the parameters in mode_estimate.
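A sketch of both points together (toy model; names are illustrative):

```julia
using Turing

@model function f()
    x ~ Normal()
    y ~ Normal(x)
end
model = f()
mode_estimate = maximum_a_posteriori(model)

# New: fall back to the prior for any parameters missing from the estimate.
init = InitFromParams(mode_estimate, InitFromPrior())
chain = sample(model, NUTS(), 100; initial_params=init)

# Also now possible: compute the model's return values at the mode.
returned(model, mode_estimate)
```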
0.42.1
Avoid passing a full VarInfo to check_model, which allows more models to be checked safely for validity.
0.42.0
DynamicPPL 0.39
Turing.jl v0.42 brings with it all the underlying changes in DynamicPPL 0.39. Please see the DynamicPPL changelog for full details; in here we summarise only the changes that are most pertinent to end-users of Turing.jl.
Thread safety opt-in
Turing.jl has supported threaded tilde-statements for a while now, as long as said tilde-statements are observations (i.e., likelihood terms). For example:
```julia
@model function f(y)
    x ~ Normal()
    Threads.@threads for i in eachindex(y)
        y[i] ~ Normal(x)
    end
end
```

Models where tilde-statements or `@addlogprob!` are used in parallel require what we call 'threadsafe evaluation'. In previous releases of Turing.jl, threadsafe evaluation was enabled whenever Julia was launched with more than one thread. However, this is an imprecise way of determining whether threadsafe evaluation is really needed: it caused performance degradation for models that did not actually need threadsafe evaluation, and generally led to ill-defined behaviour in various parts of the Turing codebase.
In Turing.jl v0.42, threadsafe evaluation is now opt-in. To enable threadsafe evaluation, after defining a model, you now need to call setthreadsafe(model, true) (note that this is not a mutating function, it returns a new model):
```julia
y = randn(100)
model = f(y)
model = setthreadsafe(model, true)
```

You only need to do this if your model uses tilde-statements or `@addlogprob!` in parallel. You do not need to do this if:
- your model has other kinds of parallelism, but no tilde-statements inside the parallelised sections;
- or you are using `MCMCThreads()` or `MCMCDistributed()` to sample multiple chains in parallel, but your model itself does not use parallelism.
If your model does include parallelised tilde-statements or @addlogprob! calls, and you evaluate it/sample from it without setting setthreadsafe(model, true), then you may get statistically incorrect results without any warnings or errors.
Faster performance
Many operations in DynamicPPL have been substantially sped up. You should find that anything that uses LogDensityFunction (i.e., HMC/NUTS samplers, optimisation) is faster in this release. Prior sampling should also be much faster than before.
predict improvements
If you have a model that requires threadsafe evaluation (i.e., parallel observations), you can now use this with predict. Carrying on from the previous example, you can do:
```julia
model = setthreadsafe(f(y), true)
chain = sample(model, NUTS(), 1000)
pdn_model = f(fill(missing, length(y)))
pdn_model = setthreadsafe(pdn_model, true) # set threadsafe
predictions = predict(pdn_model, chain)    # generate new predictions in parallel
```

Log-density names in chains
When sampling from a Turing model, the resulting MCMCChains.Chains object now contains the log-joint, log-prior, and log-likelihood under the names :logjoint, :logprior, and :loglikelihood respectively. Previously, :logjoint would be stored under the name :lp.
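In other words, assuming `chain` came from `sample(model, sampler, N)`:

```julia
# New names:
chain[:logjoint]        # previously chain[:lp]
chain[:logprior]
chain[:loglikelihood]
```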
Log-evidence in chains
When sampling using MCMCChains, the chain object will no longer have its chain.logevidence field set. Instead, you can calculate this yourself from the log-likelihoods stored in the chain. For SMC samplers, the log-evidence of the entire trajectory is stored in chain[:logevidence] (which is the same for every particle in the ‘chain’).
Turing.Inference.Transition
Turing.Inference.Transition(model, vi[, stats]) has been removed; you can directly replace this with DynamicPPL.ParamsWithStats(vi, model[, stats]).
AdvancedVI 0.6
Turing.jl v0.42 updates AdvancedVI.jl compatibility to 0.6 (we skipped the breaking 0.5 update as it did not introduce new features). AdvancedVI.jl@0.6 introduces major structural changes, including breaking changes to the interface, as well as multiple new features. The summary below covers only the changes that affect end-users of Turing; for a more comprehensive list of changes, please refer to the changelogs in AdvancedVI.
Breaking changes
A new level of interface for defining variational algorithms was introduced in AdvancedVI v0.5. As a result, the function `Turing.vi` now receives a keyword argument `algorithm`. The object `algorithm <: AdvancedVI.AbstractVariationalAlgorithm` should now contain all the algorithm-specific configuration. Accordingly, keyword arguments of `vi` that were algorithm-specific, such as `objective`, `operator`, `averager` and so on, have been moved to fields of the relevant `<: AdvancedVI.AbstractVariationalAlgorithm` structs.
In addition, the outputs have also changed. Previously, `vi` returned both the last iterate of the algorithm, `q`, and the iterate average, `q_avg`. Now, for algorithms that run parameter averaging, only `q_avg` is returned. As a result, the number of returned values has been reduced from 4 to 3.
For example,
```julia
q, q_avg, info, state = vi(
    model, q, n_iters; objective=RepGradELBO(10), operator=AdvancedVI.ClipScale()
)
```

is now

```julia
q_avg, info, state = vi(
    model,
    q,
    n_iters;
    algorithm=KLMinRepGradDescent(adtype; n_samples=10, operator=AdvancedVI.ClipScale()),
)
```

Similarly,

```julia
vi(
    model,
    q,
    n_iters;
    objective=RepGradELBO(10; entropy=AdvancedVI.ClosedFormEntropyZeroGradient()),
    operator=AdvancedVI.ProximalLocationScaleEntropy(),
)
```

is now

```julia
vi(model, q, n_iters; algorithm=KLMinRepGradProxDescent(adtype; n_samples=10))
```

Lastly, to obtain the last iterate `q` of `KLMinRepGradDescent`, which is not returned in the new interface, simply select the averaging strategy to be `AdvancedVI.NoAveraging()`. That is,

```julia
q, info, state = vi(
    model,
    q,
    n_iters;
    algorithm=KLMinRepGradDescent(
        adtype;
        n_samples=10,
        operator=AdvancedVI.ClipScale(),
        averager=AdvancedVI.NoAveraging(),
    ),
)
```

Additionally,
- The default hyperparameters of `DoG` and `DoWG` have been altered.
- The deprecated `AdvancedVI`@0.2-era interface is now removed.
- `estimate_objective` now always returns the value to be minimized by the optimization algorithm. For example, for ELBO maximization algorithms, `estimate_objective` will return the negative ELBO. This is a breaking change from the previous behavior, where the ELBO was returned.
- The default initial values for `q_meanfield_gaussian`, `q_fullrank_gaussian`, and `q_locationscale` have changed. Specifically, the default initial value for the scale matrix has been changed from `I` to `0.6*I`.
- When using algorithms that expect to operate in unconstrained spaces, the user is now explicitly expected to provide a `Bijectors.TransformedDistribution` wrapping an unconstrained distribution. (Refer to the docstring of `vi`.)
New Features
AdvancedVI@0.6 adds numerous new features including the following new VI algorithms:
- `KLMinWassFwdBwd`: Also known as "Wasserstein variational inference," this algorithm minimizes the KL divergence under the Wasserstein-2 metric.
- `KLMinNaturalGradDescent`: Also known as "online variational Newton," this is the canonical "black-box" natural gradient variational inference algorithm, which minimizes the KL divergence via mirror descent with the KL divergence as the Bregman divergence.
- `KLMinSqrtNaturalGradDescent`: A recent variant of `KLMinNaturalGradDescent` that operates in the Cholesky-factor parameterization of Gaussians instead of precision matrices.
- `FisherMinBatchMatch`: This algorithm, called "batch-and-match," minimizes the 2nd-order Fisher divergence via a proximal point-type algorithm.
Any of the new algorithms above can readily be used by simply swapping the `algorithm` keyword argument of `vi`. For example, to use batch-and-match:

```julia
vi(model, q, n_iters; algorithm=FisherMinBatchMatch())
```

External sampler interface
The interface for defining an external sampler has been reworked. In general, implementations of external samplers should now no longer need to depend on Turing. This is because the interface functions required have been shifted upstream to AbstractMCMC.jl.
In particular, you now only need to define the following functions:
- `AbstractMCMC.step(rng::Random.AbstractRNG, model::AbstractMCMC.LogDensityModel, ::MySampler; kwargs...)` (and also a method with `state`, and the corresponding `step_warmup` methods if needed)
- `AbstractMCMC.getparams(::MySamplerState)` -> `Vector{<:Real}`
- `AbstractMCMC.getstats(::MySamplerState)` -> `NamedTuple`
- `AbstractMCMC.requires_unconstrained_space(::MySampler)` -> `Bool` (default `true`)
This means that you only need to depend on AbstractMCMC.jl. As long as the above functions are defined correctly, Turing will be able to use your external sampler.
The Turing.Inference.isgibbscomponent(::MySampler) interface function still exists, but in this version the default has been changed to true, so you should not need to overload this.
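A skeletal sketch of what an external sampler package might now contain; the random-walk-free body below is an illustrative stub, not a real sampler, and only the AbstractMCMC.jl (plus LogDensityProblems.jl) dependency is assumed:

```julia
using AbstractMCMC, Random
import LogDensityProblems

struct MySampler <: AbstractMCMC.AbstractSampler
    step_size::Float64
end

struct MySamplerState
    params::Vector{Float64}
    logp::Float64
end

function AbstractMCMC.step(
    rng::Random.AbstractRNG,
    model::AbstractMCMC.LogDensityModel,
    spl::MySampler;
    initial_params=nothing,
    kwargs...,
)
    dim = LogDensityProblems.dimension(model.logdensity)
    params = initial_params === nothing ? zeros(dim) : initial_params
    logp = LogDensityProblems.logdensity(model.logdensity, params)
    state = MySamplerState(params, logp)
    return state, state
end

# (a second `step` method taking a `state` argument would go here)

# The accessor methods Turing needs:
AbstractMCMC.getparams(state::MySamplerState) = state.params
AbstractMCMC.getstats(state::MySamplerState) = (; logp=state.logp)
AbstractMCMC.requires_unconstrained_space(::MySampler) = true
```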
Optimisation interface
The Optim.jl interface has been removed (so you cannot call Optim.optimize directly on Turing models). You can use the maximum_likelihood or maximum_a_posteriori functions with an Optim.jl solver instead (via Optimization.jl: see https://docs.sciml.ai/Optimization/stable/optimization_packages/optim/ for documentation of the available solvers).
Internal changes
The constructors of OptimLogDensity have been replaced with a single constructor, OptimLogDensity(::DynamicPPL.LogDensityFunction).
0.41.4
Fixed a bug where the check_model=false keyword argument would not be respected when sampling with multiple threads or cores.
0.41.3
Fixed NUTS not correctly specifying the number of adaptation steps when calling AdvancedHMC.initialize! (this bug led to mass matrix adaptation not actually happening).
0.41.2
Add GibbsConditional, a “sampler” that can be used to provide analytically known conditional posteriors in a Gibbs sampler.
In Gibbs sampling, some variables are sampled with a component sampler, while holding other variables conditioned to their current values. Usually one e.g. takes turns sampling one variable with HMC and the other with a particle sampler. However, sometimes the posterior distribution of one variable is known analytically, given the conditioned values of other variables. GibbsConditional provides a way to implement these analytically known conditional posteriors and use them as component samplers for Gibbs. See the docstring of GibbsConditional for details.
Note that GibbsConditional used to exist in Turing.jl until v0.36, at which point it was removed when the whole Gibbs sampler was rewritten. This release reintroduces the same functionality, though with a slightly different interface.
0.41.1
The ModeResult struct returned by maximum_a_posteriori and maximum_likelihood can now be wrapped in InitFromParams(). This makes it easier to use the parameters in downstream code, e.g. when specifying initial parameters for MCMC sampling. For example:
```julia
@model function f()
    # ...
end
model = f()
opt_result = maximum_a_posteriori(model)
sample(model, NUTS(), 1000; initial_params=InitFromParams(opt_result))
```

If you need to access the dictionary of parameters, it is stored in `opt_result.params`. Note, however, that this field may change in future breaking releases, as Turing's optimisation interface is slated for an overhaul in the near future.
0.41.0
DynamicPPL 0.38
Turing.jl v0.41 brings with it all the underlying changes in DynamicPPL 0.38. Please see the DynamicPPL changelog for full details: in this section we only describe the changes that will directly affect end-users of Turing.jl.
Performance
A number of functions such as returned and predict will have substantially better performance in this release.
ProductNamedTupleDistribution
Distributions.ProductNamedTupleDistribution can now be used on the right-hand side of ~ in Turing models.
Initial parameters
Initial parameters for MCMC sampling must now be specified in a different form. You still need to use the initial_params keyword argument to sample, but the allowed values are different. For almost all samplers in Turing.jl (except Emcee) this should now be a DynamicPPL.AbstractInitStrategy.
There are three kinds of initialisation strategies provided out of the box with Turing.jl (they are exported, so you can use them directly with `using Turing`):

- `InitFromPrior()`: Sample from the prior distribution. This is the default for most samplers in Turing.jl (if you don't specify `initial_params`).
- `InitFromUniform(a, b)`: Sample uniformly from `[a, b]` in linked space. This is the default for Hamiltonian samplers. If `a` and `b` are not specified, it defaults to `[-2, 2]`, which preserves the behaviour in previous versions (and mimics that of Stan).
- `InitFromParams(p)`: Explicitly provide a set of initial parameters. Note: `p` must be either a `NamedTuple` or an `AbstractDict{<:VarName}`; it can no longer be a `Vector`. Parameters must be provided in unlinked space, even if the sampler later performs linking.
  - For this release of Turing.jl, you can also provide a `NamedTuple` or `AbstractDict{<:VarName}` directly, and this will be automatically wrapped in `InitFromParams` for you. This is an intermediate measure for backwards compatibility, and will eventually be removed.
This change is made because Vectors are semantically ambiguous. It is not clear which element of the vector corresponds to which variable in the model, nor is it clear whether the parameters are in linked or unlinked space. Previously, both of these would depend on the internal structure of the VarInfo, which is an implementation detail. In contrast, the behaviour of AbstractDicts and NamedTuples is invariant to the ordering of variables and it is also easier for readers to understand which variable is being set to which value.
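Concretely, the NamedTuple and Dict forms below are equivalent and order-independent (toy model):

```julia
using Turing

@model f() = x ~ Normal()

sample(f(), NUTS(), 100; initial_params=InitFromParams((; x=1.0)))
sample(f(), NUTS(), 100; initial_params=InitFromParams(Dict(@varname(x) => 1.0)))
```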
If you were previously using `varinfo[:]` to extract a vector of initial parameters, you can now use `Dict(k => varinfo[k] for k in keys(varinfo))` to extract a `Dict` of initial parameters.
For more details about initialisation you can also refer to the main TuringLang docs, and/or the DynamicPPL API docs.
resume_from and loadstate
The resume_from keyword argument to sample is now removed. Instead of sample(...; resume_from=chain) you can use sample(...; initial_state=loadstate(chain)) which is entirely equivalent. loadstate is exported from Turing now instead of in DynamicPPL.
Note that loadstate only works for MCMCChains.Chains. For FlexiChains users please consult the FlexiChains docs directly where this functionality is described in detail.
pointwise_logdensities
pointwise_logdensities(model, chn), pointwise_loglikelihoods(...), and pointwise_prior_logdensities(...) now return an MCMCChains.Chains object if chn is itself an MCMCChains.Chains object. The old behaviour of returning an OrderedDict is still available: you just need to pass OrderedDict as the third argument, i.e., pointwise_logdensities(model, chn, OrderedDict).
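A sketch of both forms (toy model):

```julia
using Turing
using OrderedCollections: OrderedDict

@model function f(y)
    x ~ Normal()
    y ~ Normal(x)
end
chn = sample(f(1.0), NUTS(), 100)

pointwise_logdensities(f(1.0), chn)               # MCMCChains.Chains
pointwise_logdensities(f(1.0), chn, OrderedDict)  # old OrderedDict behaviour
```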
Initial step in MCMC sampling
HMC and NUTS samplers no longer take an extra single step before starting the chain. This means that if you do not discard any samples at the start, the first sample will be the initial parameters (which may be user-provided).
Note that if the initial sample is included, the corresponding sampler statistics will be missing. Due to a technical limitation of MCMCChains.jl, this causes all indexing into MCMCChains to return Union{Float64, Missing} or similar. If you want the old behaviour, you can discard the first sample (e.g. using discard_initial=1).
0.40.5
Bump Optimization.jl compatibility to include v5.
0.40.4
Fixes a bug where initial_state was not respected for NUTS if resume_from was not also specified.
0.40.3
This patch makes the resume_from keyword argument work correctly when sampling multiple chains.
In the process this also fixes a method ambiguity caused by a bugfix in DynamicPPL 0.37.2.
This patch means that if you are using RepeatSampler() to sample from a model, and you want to obtain MCMCChains.Chains from it, you need to specify sample(...; chain_type=MCMCChains.Chains). This only applies if the sampler itself is a RepeatSampler; it doesn’t apply if you are using RepeatSampler within another sampler like Gibbs.
0.40.2
sample(model, NUTS(), N; verbose=false) now suppresses the ‘initial step size’ message.
0.40.1
Extra release to trigger Documenter.jl build (when 0.40.0 was released GitHub was having an outage). There are no code changes.
0.40.0
Breaking changes
DynamicPPL 0.37
Turing.jl v0.40 updates DynamicPPL compatibility to 0.37. The summary of the changes provided here is intended for end-users of Turing. If you are a package developer, or would otherwise like to understand these changes in-depth, please see the DynamicPPL changelog.
- `@submodel` is now completely removed; please use `to_submodel` instead.
- Prior and likelihood calculations are now completely separated in Turing. Previously, the log-density used to be accumulated in a single field, and thus there was no clear way to separate prior and likelihood components.
  - `@addlogprob! f`, where `f` is a float, now adds to the likelihood by default.
  - You can instead use `@addlogprob! (; logprior=x, loglikelihood=y)` to control which log-density component to add to.
  - This means that usage of `PriorContext` and `LikelihoodContext` is no longer needed, and these have now been removed.
- The special `__context__` variable has been removed. If you still need to access the evaluation context, it is now available as `__model__.context`.
Log-density in chains
When sampling from a Turing model, the resulting MCMCChains.Chains object now contains not only the log-joint (accessible via chain[:lp]) but also the log-prior and log-likelihood (chain[:logprior] and chain[:loglikelihood] respectively).
These values now correspond to the log density of the sampled variables exactly as per the model definition / user parameterisation and thus will ignore any linking (transformation to unconstrained space). For example, if the model is @model f() = x ~ LogNormal(), chain[:lp] would always contain the value of logpdf(LogNormal(), x) for each sampled value of x. Previously these values could be incorrect if linking had occurred: some samplers would return logpdf(Normal(), log(x)) i.e. the log-density with respect to the transformed distribution.
Gibbs sampler
When using Turing’s Gibbs sampler, e.g. Gibbs(:x => MH(), :y => HMC(0.1, 20)), the conditioned variables (for example y during the MH step, or x during the HMC step) are treated as true observations. Thus the log-density associated with them is added to the likelihood. Previously these would effectively be added to the prior (in the sense that if LikelihoodContext was used they would be ignored). This is unlikely to affect users but we mention it here to be explicit. This change only affects the log probabilities as the Gibbs component samplers see them; the resulting chain will include the usual log prior, likelihood, and joint, as described above.
Particle Gibbs
Previously, only ‘true’ observations (i.e., x ~ dist where x is a model argument or conditioned upon) would trigger resampling of particles. Specifically, there were two cases where resampling would not be triggered:
- Calls to `@addlogprob!`
- Gibbs-conditioned variables: e.g. `y` in `Gibbs(:x => PG(20), :y => MH())`
Turing 0.40 changes this such that both of the above cause resampling. (The second case follows from the changes to the Gibbs sampler, see above.)
This release also fixes a bug where, if the model ended with one of these statements, their contribution to the particle weight would be ignored, leading to incorrect results.
The changes above also mean that certain models that previously worked with PG-within-Gibbs may now error. Specifically this is likely to happen when the dimension of the model is variable. For example:
```julia
@model function f()
    x ~ Bernoulli()
    if x
        y1 ~ Normal()
    else
        y1 ~ Normal()
        y2 ~ Normal()
    end
    # (some likelihood term...)
end
sample(f(), Gibbs(:x => PG(20), (:y1, :y2) => MH()), 100)
```

This sampler now cannot be used for this model because, depending on which branch is taken, the number of observations will be different. To use PG-within-Gibbs, the number of observations that the PG component sampler sees must be constant. Thus, for example, this will still work if `x`, `y1`, and `y2` are grouped together under the PG component sampler.
If you absolutely require the old behaviour, we recommend using Turing.jl v0.39, but also thinking very carefully about what the expected behaviour of the model is, and checking that Turing is sampling from it correctly (note that the behaviour on v0.39 may in general be incorrect because of the fact that Gibbs-conditioned variables did not trigger resampling). We would also welcome any GitHub issues highlighting such problems. Our support for dynamic models is incomplete and is liable to undergo further changes.
Other changes
- Sampling using `Prior()` should now be about twice as fast, because we now avoid evaluating the model twice on every iteration.
- `Turing.Inference.Transition` now has different fields. If `t isa Turing.Inference.Transition`, `t.stat` is always a `NamedTuple`, not `nothing` (if it genuinely has no information, then it's an empty `NamedTuple`). Furthermore, `t.lp` has now been split up into `t.logprior` and `t.loglikelihood` (see also the 'Log-density in chains' section above).
0.39.10
Added a compatibility entry for DataStructures v0.19.
0.39.9
Revert a bug introduced in 0.39.5 in the external sampler interface. For Turing 0.39, external samplers should define

```julia
Turing.Inference.getparams(::DynamicPPL.Model, ::MySamplerTransition)
```

rather than

```julia
AbstractMCMC.getparams(::DynamicPPL.Model, ::MySamplerState)
```

to obtain a vector of parameters from the model. Note that this may change in future breaking releases.
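As a sketch, an external sampler package might implement this as follows (here `MySamplerTransition` and its `params` field are hypothetical names, not part of any real package):

```julia
# Hypothetical transition type produced by the external sampler.
struct MySamplerTransition
    params::Vector{Float64}
end

# Turing calls this to extract a parameter vector from each transition.
Turing.Inference.getparams(::DynamicPPL.Model, t::MySamplerTransition) = t.params
```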
0.39.8
MCMCChains.jl doesn’t understand vector- or matrix-valued variables, and in Turing we split up such values into their individual components. This patch carries out some internal refactoring to avoid splitting up VarNames until absolutely necessary. There are no user-facing changes in this patch.
0.39.7
Update compatibility to AdvancedPS 0.7 and Libtask 0.9.
These new libraries provide significant speedups for particle MCMC methods.
0.39.6
Bumped compatibility of AbstractPPL to include 0.13.
0.39.5
Fixed a bug where sampling with `externalsampler` would not set the log probability density inside the resulting chain. Note that there are still potentially bugs with the log-Jacobian term not being correctly included. A fix is being worked on.
0.39.4
Bumped compatibility of AbstractPPL to include 0.12.
0.39.3
Improved the performance of `Turing.Inference.getparams` when called with an untyped `VarInfo` as the second argument, by first converting to a typed `VarInfo`. This makes, for example, the post-sampling `Chains` construction for `Prior()` run much faster.
0.39.2
Fixed a bug in the support of `OrderedLogistic` (by changing the minimum from 0 to 1).
0.39.1
No changes from 0.39.0 — this patch is released just to re-trigger a Documenter.jl run.
0.39.0
Update to the AdvancedVI interface
Turing’s variational inference interface was updated to match version 0.4 of AdvancedVI.jl.
AdvancedVI v0.4 introduces various new features:
- location-scale families with dense scale matrices,
- parameter-free stochastic optimization algorithms like `DoG` and `DoWG`,
- proximal operators for stable optimization,
- the sticking-the-landing control variate for faster convergence, and
- the score gradient estimator for non-differentiable targets.
Please see the Turing API documentation, and AdvancedVI’s documentation, for more details.
Removal of Turing.Essential
The `Turing.Essential` module has been removed. Anything exported from there can be imported from either `Turing` or `DynamicPPL`.
@addlogprob!
The `@addlogprob!` macro is now exported from Turing, making it officially part of the public interface.
0.38.6
Added compatibility with AdvancedHMC 0.8.
0.38.5
Added compatibility with ForwardDiff v1.
0.38.4
The minimum Julia version was increased to 1.10.2 (from 1.10.0). On versions before 1.10.2, `sample()` took an excessively long time to run (probably due to compilation).
0.38.3
`getparams(::Model, ::AbstractVarInfo)` now returns an empty `Float64[]` if the `VarInfo` contains no parameters.
0.38.2
Bump compat for MCMCChains to 7. By default, summary statistics and quantiles for chains are no longer printed; to access these you should use `describe(chain)`.
0.38.1
The method `Bijectors.bijector(::DynamicPPL.Model)` was moved to DynamicPPL.jl.
0.38.0
DynamicPPL version
DynamicPPL compatibility has been bumped to 0.36. This brings with it a number of changes: the ones most likely to affect you are submodel prefixing and conditioning. Variables in submodels are now represented correctly with field accessors. For example:
```julia
using Turing
@model inner() = x ~ Normal()
@model outer() = a ~ to_submodel(inner())
```

`keys(VarInfo(outer()))` now returns `[@varname(a.x)]` instead of `[@varname(var"a.x")]`.

Furthermore, you can now condition either on the outer model, like `outer() | (@varname(a.x) => 1.0)`, or on the inner model, like `inner() | (@varname(x) => 1.0)`. If you use the conditioned inner model as a submodel, the conditioning will still apply correctly.
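A sketch of the second case (conditioning the inner model and then embedding it as a submodel; the names follow the example above):

```julia
using Turing

@model inner() = x ~ Normal()

# Condition the inner model on x, then embed it as a submodel.
# The conditioning carries through, so sampling from outer()
# treats a.x as observed rather than as a random variable.
conditioned_inner = inner() | (@varname(x) => 1.0)
@model outer() = a ~ to_submodel(conditioned_inner)
```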
Please see the DynamicPPL release notes for fuller details.
Gibbs sampler
Turing’s Gibbs sampler now allows for more complex VarNames, such as `x[1]` or `x.a`, to be used. For example, you can now do this:

```julia
@model function f()
    x = Vector{Float64}(undef, 2)
    x[1] ~ Normal()
    return x[2] ~ Normal()
end
sample(f(), Gibbs(@varname(x[1]) => MH(), @varname(x[2]) => MH()), 100)
```

Performance for the cases which previously worked (i.e. VarNames like `x` which consist of only a single symbol) is unaffected, and VarNames with only field accessors (e.g. `x.a`) should be equally fast. It is possible that VarNames with indexing (e.g. `x[1]`) may be slower (although this is still an improvement over not working at all!). If you find any cases where you think the performance is worse than it should be, please do file an issue.
0.37.1
`maximum_a_posteriori` and `maximum_likelihood` now perform sanity checks on the model before running the optimisation. To disable this, set the keyword argument `check_model=false`.
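For example (a sketch; the model here is purely illustrative):

```julia
using Turing

@model function demo(y)
    m ~ Normal(0, 1)
    return y ~ Normal(m, 1)
end

model = demo(1.5)
maximum_a_posteriori(model)                     # sanity checks run (default)
maximum_a_posteriori(model; check_model=false)  # sanity checks skipped
```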
0.37.0
Breaking changes
Gibbs constructors
0.37 removes the old Gibbs constructors deprecated in 0.36.
Remove Zygote support
Zygote is no longer officially supported as an automatic differentiation backend, and AutoZygote is no longer exported. You can continue to use Zygote by importing AutoZygote from ADTypes and it may well continue to work, but it is no longer tested and no effort will be expended to fix it if something breaks.
Mooncake is the recommended replacement for Zygote.
DynamicPPL 0.35
Turing.jl v0.37 uses DynamicPPL v0.35, which brings with it several breaking changes:
- The right hand side of `.~` must from now on be a univariate distribution.
- Indexing `VarInfo` objects by samplers has been removed completely.
- The order in which nested submodel prefixes are applied has been reversed.
- The arguments for the constructor of `LogDensityFunction` have changed. `LogDensityFunction` also now satisfies the `LogDensityProblems` interface, without needing a wrapper object.
For more details about all of the above, see the changelog of DynamicPPL here.
Export list
Turing.jl’s export list has been cleaned up a fair bit. This affects what is imported into your namespace when you do an unqualified using Turing. You may need to import things more explicitly than before.
- The `DynamicPPL` and `AbstractMCMC` modules are no longer exported. You will need to `import DynamicPPL` or `using DynamicPPL: DynamicPPL` (likewise for `AbstractMCMC`) yourself, which in turn means that they have to be made available in your project environment.
- `@logprob_str` and `@prob_str` have been removed following a long deprecation period.
- We no longer re-export everything from `Bijectors` and `Libtask`. To get around this, add `using Bijectors` or `using Libtask` at the top of your script (but we recommend using more selective imports).
- We no longer export `Bijectors.ordered`. If you were using `ordered`, note that even Bijectors itself does not (currently) export it; you will have to import it manually with `using Bijectors: ordered`.

On the other hand, we have added a few more exports:

- `DynamicPPL.returned` and `DynamicPPL.prefix` are exported (for use with submodels).
- `LinearAlgebra.I` is exported for convenience.
0.36.0
Breaking changes
0.36.0 introduces a new Gibbs sampler. It’s been included in several previous releases as `Turing.Experimental.Gibbs`, but now takes over the old Gibbs sampler, which has been removed completely.

The new Gibbs sampler currently supports the same user-facing interface as the old one, but the old constructors have been deprecated and will be removed in the future. Also, given that the internals have been completely rewritten in a very different manner, there may be accidental breakage that we haven’t anticipated. Please report any you find.

`GibbsConditional` has also been removed. It was never very user-facing, but it was exported, so technically this is breaking.

The old `Gibbs` constructor relied on being called with several subsamplers, and each of the constructors of the subsamplers would take as arguments the symbols for the variables that they are to sample, e.g. `Gibbs(HMC(:x), MH(:y))`. This constructor has been deprecated and will be removed in the future. The new constructor works by mapping symbols, `VarName`s, or iterables thereof to samplers, e.g. `Gibbs(:x => HMC(), :y => MH())`, `Gibbs(@varname(x) => HMC(), @varname(y) => MH())`, or `Gibbs((:x, :y) => NUTS(), :z => MH())`. This allows more granular specification of which sampler to use for which variable.

Likewise, the old constructor for calling one subsampler more often than another, `Gibbs((HMC(0.01, 4, :x), 2), (MH(:y), 1))`, has been deprecated. The new way to do this is to use `RepeatSampler`, also introduced in this version: `Gibbs(@varname(x) => RepeatSampler(HMC(0.01, 4), 2), @varname(y) => MH())`.
0.35.0
Breaking changes
Julia 1.10 is now the minimum required version for Turing.
Tapir.jl has been removed and replaced with its successor, Mooncake.jl. You can use Mooncake.jl by passing `adtype=AutoMooncake(; config=nothing)` to the relevant samplers.
Support for Tracker.jl as an AD backend has been removed.
0.33.0
Breaking changes
The following exported functions have been removed:

- `constrained_space`
- `get_parameter_bounds`
- `optim_objective`
- `optim_function`
- `optim_problem`

The same functionality is now offered by the new exported functions:

- `maximum_likelihood`
- `maximum_a_posteriori`
0.30.5
- `essential/ad.jl` is removed; `ForwardDiff` and `ReverseDiff` integrations via `LogDensityProblemsAD` are moved to `DynamicPPL` and live in corresponding package extensions.
- `LogDensityProblemsAD.ADgradient(ℓ::DynamicPPL.LogDensityFunction)` (i.e. the single-argument method) is moved to the `Inference` module. It will create `ADgradient` using the `adtype` information stored in the `context` field of `ℓ`.
- The `getADbackend` function is renamed to `getADType`; the interface is preserved, but packages that previously used `getADbackend` should be updated to use `getADType`.
- `TuringTag` for ForwardDiff is also removed; `DynamicPPLTag` is now defined in the `DynamicPPL` package and should serve the same purpose.
0.30.0
- `ADTypes.jl` replaced Turing’s global AD backend. Users should now specify the desired `ADType` directly in sampler constructors, e.g. `HMC(0.1, 10; adtype=AutoForwardDiff(; chunksize))` or `HMC(0.1, 10; adtype=AutoReverseDiff(false))` (`false` indicates not to use a compiled tape).
- Interface functions such as `ADBackend`, `setadbackend`, `setadsafe`, `setchunksize`, and `setrdcache` are deprecated and will be removed in a future release.
- Removed the outdated `verifygrad` function.
- Updated to a newer version of `LogDensityProblemsAD` (v1.7).
0.12.0
- The interface for defining new distributions with constrained support and making them compatible with `Turing` has changed. To make a custom distribution type `CustomDistribution` compatible with `Turing`, the user needs to define the method `bijector(d::CustomDistribution)` that returns an instance of type `Bijector` implementing the `Bijectors.Bijector` API.
- `~` is now thread-safe when used for observations, but not yet for assumptions (non-observed model parameters).
- There were some performance improvements in the automatic differentiation (AD) of functions in `DistributionsAD` and `Bijectors`, leading to speeds closer to, and sometimes faster than, Stan’s.
- An `HMC` initialization bug was fixed. `HMC` initialization in Turing is now consistent with Stan’s.
- Sampling from the prior is now possible using `sample`.
- `psample` is now deprecated in favour of `sample(model, sampler, parallel_method, n_samples, n_chains)`, where `parallel_method` can be either `MCMCThreads()` or `MCMCDistributed()`. `MCMCThreads` will use your available threads to sample each chain (ensure that you have the environment variable `JULIA_NUM_THREADS` set to the number of threads you want to use), and `MCMCDistributed` will dispatch chain sampling to each available process (you can add processes with `addprocs()`).
- Turing now uses `AdvancedMH.jl` v0.5, which mostly provides behind-the-scenes restructuring.
- Custom expressions and macros can be interpolated in the `@model` definition with `$`; it is possible to use `@.` also for assumptions (non-observed model parameters) and observations.
- The macros `@varinfo`, `@logpdf`, and `@sampler` are removed. Instead, one can access the internal variables `_varinfo`, `_model`, `_sampler`, and `_context` in the `@model` definition.
- Additional constructors for `SMC` and `PG` make it easier to choose the resampling method and threshold.
0.11.0
- Removed some extraneous imports and dependencies (#1182)
- Minor backend changes to `sample` and `psample`, which now use functions defined upstream in AbstractMCMC.jl (#1187)
- Fix for an AD-related crash (#1202)
- StatsBase compat update to 0.33 (#1185)
- Bugfix for ReverseDiff caching and memoization (#1208)
- BREAKING: `VecBinomialLogit` is now removed. Also `BernoulliLogit` is added (#1214)
- Bugfix for cases where dynamic models were breaking with HMC methods (#1217)
- Updates to allow AdvancedHMC 0.2.23 (#1218)
- Add more informative error messages for SMC (#900)
0.10.1
- Fix bug where arrays with mixed integers, floats, and missing values were not being passed to the `MCMCChains.Chains` constructor properly #1180.
0.10.0
- Update elliptical slice sampling to use EllipticalSliceSampling.jl on the backend. #1145. Nothing should change from a front-end perspective; you can still call `sample(model, ESS(), 1000)`.
- Added default progress loggers in #1149.
- The symbols used to define the AD backend have changed to be the lowercase form of the package name used for AD. `forward_diff` is now `forwarddiff`, `reverse_diff` is now `tracker`, and `zygote` and `reversediff` are newly supported (see below). `forward_diff` and `reverse_diff` are deprecated and slated to be removed.
- Turing now has experimental support for Zygote.jl (#783) and ReverseDiff.jl (#1170) AD backends. Both backends are experimental, so please report any bugs you find. Zygote does not allow mutation within your model, so please be aware of this issue. You can enable Zygote with `Turing.setadbackend(:zygote)` and ReverseDiff with `Turing.setadbackend(:reversediff)`, though to use either you must import the package with `using Zygote` or `using ReverseDiff`. `for` loops are not recommended for ReverseDiff or Zygote; see the performance tips for more information.
- Fix MH indexing bug #1135.
- Fix MH array sampling #1167.
- Fix bug in VI where the bijectors were being inverted incorrectly #1168.
- The Gibbs sampler handles state better by passing `Transition` structs to the local samplers (#1169 and #1166).
0.4.0-alpha
- Fix compatibility with Julia 0.6 [#341, #330, #293]
- Support of Stan interface [#343, #326]
- Fix Binomial distribution for gradients. [#311]
- Stochastic gradient Hamiltonian Monte Carlo [#201]; Stochastic gradient Langevin dynamics [#27]
- More particle MCMC family samplers: PIMH & PMMH [#364, #369]
- Disable adaptive resampling for CSMC [#357]
- Fix resampler for SMC [#338]
- Interactive particle MCMC [#334]
- Add type alias CSMC for PG [#333]
- Fix progress meter [#317]
0.3
- NUTS implementation #188
- HMC: Transforms of ϵ for each variable #67 (replace with introducing mass matrix)
- Finish: Sampler (internal) interface design #107
- Substantially improve performance of HMC and Gibbs #7
- Vectorising likelihood computations #117 #255
- Remove obsolete `randoc`, `randc`? #156
- Support truncated distribution. #87
- Refactoring code: Unify VarInfo, Trace, TaskLocalStorage #96
- Refactoring code: Better gradient interface #97
0.2
- Gibbs sampler ([#73])
- HMC for constrained variables ([#66]; no support for varying dimensions)
- Added support for `Mamba.Chain` ([#90]): describe, plot, etc.
- New interface design ([#55], [#104])
- Bugfixes and general improvements (e.g. `VarInfo` [#96])
0.1.0
- Initial support for Hamiltonian Monte Carlo (no support for discrete/constrained variables)
- Require Julia 0.5
- Bugfixes and general improvements
0.0.1-0.0.4
The initial releases of Turing.
- Particle MCMC, SMC, IS
- Implemented copying for Julia Task
- Implemented copy-on-write data structure `TArray` for Tasks