Often, there are quantities in models that we might be interested in viewing the values of, but which are not random variables in the model that are explicitly drawn from a distribution.
As a motivating example, the most natural parameterisation for a model might not be the most computationally feasible. Consider the following (efficiently reparametrized) implementation of Neal’s funnel (Neal, 2003):
usingTuringsetprogress!(false)@modelfunctionNeal()# Raw draws y_raw ~Normal(0, 1) x_raw ~arraydist([Normal(0, 1) for i in1:9])# Transform: y =3* y_raw x =exp.(y ./2) .* x_rawreturnnothingend
[ Info: [Turing]: progress logging is disabled globally
Neal (generic function with 2 methods)
In this case, the random variables exposed in the chain (x_raw, y_raw) are not in a helpful form — what we’re after are the deterministically transformed variables x and y.
There are two ways to track these extra quantities in Turing.jl.
Using := (during inference)
The first way is to use the := operator, which behaves exactly like = except that the values of the variables on its left-hand side are automatically added to the chain returned by the sampler. For example:
@modelfunctionNeal_coloneq()# Raw draws y_raw ~Normal(0, 1) x_raw ~arraydist([Normal(0, 1) for i in1:9])# Transform: y :=3* y_raw x :=exp.(y ./2) .* x_rawendsample(Neal_coloneq(), NUTS(), 1000)
Alternatively, one can specify the extra quantities as part of the model function’s return statement:
@modelfunctionNeal_return()# Raw draws y_raw ~Normal(0, 1) x_raw ~arraydist([Normal(0, 1) for i in1:9])# Transform and return as a NamedTuple y =3* y_raw x =exp.(y ./2) .* x_rawreturn (x=x, y=y)endchain =sample(Neal_return(), NUTS(), 1000)
There are some pros and cons of using returned, as opposed to :=.
One drawback is that naively using returned can lead to unnecessary computation during inference. This is because during the sampling process, the return values are also calculated (since they are part of the model function), but then thrown away. So, if the extra quantities are expensive to compute, this can be a problem.
To avoid this, you will essentially have to create two different models, one for inference and one for post-inference. The simplest way of doing this is to add a parameter to the model argument:
@modelfunctionNeal_coloneq_optional(track::Bool)# Raw draws y_raw ~Normal(0, 1) x_raw ~arraydist([Normal(0, 1) for i in1:9])if track y =3* y_raw x =exp.(y ./2) .* x_rawreturn (x=x, y=y)elsereturnnothingendendchain =sample(Neal_coloneq_optional(false), NUTS(), 1000)
Note that for the returned call to work, the Neal_with_extras() model must have the same variable names as stored in chain. This means the submodel Neal() must not be prefixed, i.e. to_submodel() must be passed a second parameter false.