Frequently Asked Questions
Why is this variable being treated as random instead of observed?
This is a common source of confusion. In Turing.jl, you can only condition or fix expressions that explicitly appear on the left-hand side (LHS) of a `~` statement.
For example, if your model contains:

```julia
x ~ filldist(Normal(), 2)
```

you cannot directly condition on `x[2]` using `condition(model, @varname(x[2]) => 1.0)`, because `x[2]` never appears on the LHS of a `~` statement; only `x` as a whole does.
However, there is an important exception: when you use the broadcasting operator `.~` with a univariate distribution, each element is treated as a separate draw from that distribution, allowing you to condition on individual elements:
```julia
@model function f1()
    x = Vector{Float64}(undef, 3)
    x .~ Normal()  # Each element is a separate draw
    return x
end

m1 = f1() | (@varname(x[1]) => 1.0)
sample(m1, NUTS(), 100)  # This works!
```
In contrast, you cannot condition on parts of a multivariate distribution because it represents a single distribution over the entire vector:
```julia
using LinearAlgebra  # for the identity matrix I

@model function f2()
    x = Vector{Float64}(undef, 3)
    x ~ MvNormal(zeros(3), I)  # Single multivariate distribution
    return x
end

m2 = f2() | (@varname(x[1]) => 1.0)
sample(m2, NUTS(), 100)  # This doesn't work!
```
The key insight is that `filldist` creates a single distribution (not N independent distributions), which is why you cannot condition on its individual elements. The distinction is not just about what appears on the LHS of `~`, but whether you are dealing with separate distributions (`.~` with a univariate distribution) or a single distribution over multiple values (`~` with a multivariate distribution or `filldist`).
To understand more about how Turing determines whether a variable is treated as random or observed, see:
- Core Functionality - basic explanation of the `~` notation and conditioning
Can I use parallelism / threads in my model?
Yes, but with important caveats! There are two types of parallelism to consider:
1. Parallel Sampling (Multiple Chains)
Turing.jl fully supports sampling multiple chains in parallel:
- Multithreaded sampling: Use `MCMCThreads()` to run one chain per thread
- Distributed sampling: Use `MCMCDistributed()` for distributed computing
See the Core Functionality guide for examples.
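As a minimal sketch of multichain sampling (the toy model `demo` here is hypothetical, and Julia must be started with multiple threads, e.g. `julia --threads=4`):

```julia
using Turing

# Hypothetical toy model, for illustration only.
@model function demo(y)
    mu ~ Normal(0, 1)
    y ~ Normal(mu, 1)
end

# Sample 4 chains in parallel, one chain per thread.
chains = sample(demo(1.5), NUTS(), MCMCThreads(), 1000, 4)
```

Because each chain is independent, this kind of parallelism is safe regardless of what the model body does.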
2. Threading Within Models
Using threads inside your model (e.g., `Threads.@threads`) requires more care:
```julia
@model function f(y)
    x = Vector{Float64}(undef, length(y))
    Threads.@threads for i in eachindex(y)
        x[i] ~ Normal()      # UNSAFE: `assume` statements in @threads can crash!
        y[i] ~ Normal(x[i])  # `observe` statements are okay
    end
end
```
Important limitations:
- Observe statements: Generally safe to use in threaded loops
- Assume statements (sampling statements): Often crash unpredictably or produce incorrect results
- AD backend compatibility: Many AD backends don’t support threading. Check the multithreaded column in ADTests for compatibility
For safe parallelism within models, consider vectorized operations instead of explicit threading.
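As one illustrative sketch (not the only option), the threaded loop above could instead be written with vectorized statements:

```julia
using Turing, LinearAlgebra

@model function f_vec(y)
    n = length(y)
    x ~ MvNormal(zeros(n), I)  # all latent draws in one multivariate statement
    y ~ MvNormal(x, I)         # all observations in one statement
end
```

Note that, per the first question above, writing `x` as a single `MvNormal` means you can no longer condition on individual elements of `x`; use `x .~ Normal()` instead if that flexibility matters.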
How do I check the type stability of my Turing model?
Type stability is crucial for performance. Check out:
- Performance Tips - includes specific advice on type stability
- Use `DynamicPPL.DebugUtils.model_warntype` to check the type stability of your model
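For the second tip, a sketch of how the check might look (the model here is hypothetical):

```julia
using Turing, DynamicPPL

@model function g(y)
    s ~ Exponential(1)
    y ~ Normal(0, s)
end

# Prints the inferred types of the model body, analogous to @code_warntype;
# `Any` (or red) entries indicate type instabilities worth fixing.
DynamicPPL.DebugUtils.model_warntype(g(1.0))
```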
How do I debug my Turing model?
For debugging both statistical and syntactical issues:
- Troubleshooting Guide - common errors and their solutions
- For more advanced debugging, DynamicPPL provides the `DynamicPPL.DebugUtils` module for inspecting model internals
What are the main differences between Turing, BUGS, and Stan syntax?
Key syntactic differences include:
- Parameter blocks: Stan requires explicit `data`, `parameters`, and `model` blocks. In Turing, everything is defined within the `@model` macro
- Variable declarations: Stan requires upfront type declarations in parameter blocks. Turing infers types from the sampling statements
- Transformed data: Stan has a `transformed data` block for preprocessing. In Turing, data transformations should be done before defining the model
- Generated quantities: Stan has a `generated quantities` block. In Turing, use the approach described in Tracking Extra Quantities
Example comparison:
```stan
// Stan
data {
  real y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 1);
  sigma ~ normal(0, 1);
  y ~ normal(mu, sigma);
}
```
```julia
# Turing
@model function my_model(y)
    mu ~ Normal(0, 1)
    sigma ~ truncated(Normal(0, 1); lower=0)
    y ~ Normal(mu, sigma)
end
```
Which automatic differentiation backend should I use?
The choice of AD backend can significantly impact performance. See:
- Automatic Differentiation Guide - comprehensive comparison of ForwardDiff, Mooncake, ReverseDiff, and other backends
- Performance Tips - quick guide on choosing backends
- AD Backend Benchmarks - performance comparisons across various models
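In recent Turing versions, the backend is selected via the sampler's `adtype` keyword; a sketch (the model is hypothetical, and `AutoReverseDiff` requires ReverseDiff to be installed in the environment):

```julia
using Turing  # re-exports AutoForwardDiff / AutoReverseDiff from ADTypes

@model function h(y)
    mu ~ Normal(0, 1)
    y ~ Normal(mu, 1)
end

# ForwardDiff: the default; usually best for models with few parameters.
chain_fd = sample(h(0.5), NUTS(; adtype=AutoForwardDiff()), 100)

# ReverseDiff: often faster for models with many parameters.
chain_rd = sample(h(0.5), NUTS(; adtype=AutoReverseDiff()), 100)
```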
I changed one line of my model and now it’s so much slower; why?
Small changes can have big performance impacts. Common culprits include:
- Type instability introduced by the change
- Switching from vectorized to scalar operations (or vice versa)
- Inadvertently causing AD backend incompatibilities
- Breaking assumptions that allowed compiler optimizations
See our Performance Tips and Troubleshooting Guide for debugging performance regressions.