Frequently Asked Questions
Why is this variable being treated as random instead of observed?
This is a common source of confusion. In Turing.jl, you can only condition on or fix expressions that explicitly appear on the left-hand side (LHS) of a `~` statement.
For example, if your model contains:

```julia
x ~ filldist(Normal(), 2)
```

you cannot directly condition on `x[2]` using `condition(model, @varname(x[2]) => 1.0)`, because `x[2]` never appears on the LHS of a `~` statement; only `x` as a whole does.
However, there is an important exception: when you use the broadcasting operator `.~` with a univariate distribution, each element is treated as a separate draw from that distribution, allowing you to condition on individual elements:
```julia
@model function f1()
    x = Vector{Float64}(undef, 3)
    x .~ Normal()  # Each element is a separate draw
end

m1 = f1() | (@varname(x[1]) => 1.0)
sample(m1, NUTS(), 100)  # This works!
```
In contrast, you cannot condition on parts of a multivariate distribution because it represents a single distribution over the entire vector:
```julia
@model function f2()
    x = Vector{Float64}(undef, 3)
    x ~ MvNormal(zeros(3), I)  # Single multivariate distribution
end

m2 = f2() | (@varname(x[1]) => 1.0)
sample(m2, NUTS(), 100)  # This doesn't work!
```
The key insight is that `filldist` creates a single distribution (not N independent distributions), which is why you cannot condition on individual elements. The distinction is not just about what appears on the LHS of `~`, but whether you are dealing with separate distributions (`.~` with a univariate distribution) or a single distribution over multiple values (`~` with a multivariate distribution or `filldist`).
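Conditioning on the entire vector does still work in this situation, since `x` as a whole appears on the LHS. A minimal sketch (the model name `f3` and the values are illustrative):

```julia
using Turing

@model function f3()
    x ~ filldist(Normal(), 2)  # One distribution over the whole vector
end

# Conditioning on the full vector is fine: `x` itself is on the LHS of `~`
m3 = f3() | (@varname(x) => [1.0, -0.5])
```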
To understand more about how Turing determines whether a variable is treated as random or observed, see:

- Core Functionality - basic explanation of the `~` notation and conditioning
Can I use parallelism / threads in my model?
Yes, but with important caveats! There are two types of parallelism to consider:
1. Parallel Sampling (Multiple Chains)
Turing.jl fully supports sampling multiple chains in parallel:

- **Multithreaded sampling**: Use `MCMCThreads()` to run one chain per thread
- **Distributed sampling**: Use `MCMCDistributed()` for distributed computing
See the Core Functionality guide for examples.
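For instance, a minimal sketch (the `coin` model and data are illustrative; multithreaded sampling assumes Julia was started with more than one thread):

```julia
using Turing

@model function coin(y)
    p ~ Beta(1, 1)
    y .~ Bernoulli(p)
end

model = coin([1, 0, 1, 1])

# Draw 4 chains of 1000 samples each, one chain per available thread
chains = sample(model, NUTS(), MCMCThreads(), 1000, 4)
```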
2. Threading Within Models
Using threads inside your model (e.g., `Threads.@threads`) requires more care:
```julia
@model function f(y)
    x = Vector{Float64}(undef, length(y))
    Threads.@threads for i in eachindex(y)
        x[i] ~ Normal()      # UNSAFE: `assume` statements in @threads can crash!
        y[i] ~ Normal(x[i])  # `observe` statements are okay
    end
end
```
Important limitations:

- **Observe statements**: Generally safe to use in threaded loops
- **Assume statements** (sampling statements): Often crash unpredictably or produce incorrect results
- **AD backend compatibility**: Many AD backends don't support threading. Check the multithreaded column in ADTests for compatibility
For safe parallelism within models, consider vectorized operations instead of explicit threading.
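As one possible rewrite of the threaded loop above (a sketch; the vectorized form assumes the per-element draws are independent):

```julia
using Turing, LinearAlgebra

@model function g(y)
    # Vectorized `assume`: one independent draw per element, no explicit loop
    x ~ filldist(Normal(), length(y))
    # Vectorized `observe`: a single multivariate statement over all of y
    y ~ MvNormal(x, I)
end
```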
How do I check the type stability of my Turing model?
Type stability is crucial for performance. Check out:

- Performance Tips - includes specific advice on type stability
- Use `DynamicPPL.DebugUtils.model_warntype` to check type stability of your model
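A minimal sketch of the second tip (hedged: the exact `model_warntype` signature may vary across DynamicPPL versions, and the `demo` model is illustrative):

```julia
using Turing, DynamicPPL

@model function demo(y)
    s ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s))
    y ~ Normal(m, sqrt(s))
end

# Prints `@code_warntype`-style output for the model's evaluator;
# `Any`/`Union` annotations flag potential type instabilities
DynamicPPL.DebugUtils.model_warntype(demo(1.5))
```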
How do I debug my Turing model?
For debugging both statistical and syntactical issues:

- Troubleshooting Guide - common errors and their solutions
- For more advanced debugging, DynamicPPL provides the `DynamicPPL.DebugUtils` module for inspecting model internals
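For example (a sketch; `check_model` is one `DebugUtils`-backed entry point, and its behaviour may differ between DynamicPPL versions), you can run an automated check over a model:

```julia
using Turing, DynamicPPL

@model function buggy(y)
    x ~ Normal()
    x ~ Normal()  # `x` appears twice on the LHS: a common modelling mistake
    y ~ Normal(x)
end

# Evaluates the model once and reports issues such as repeated variable names
DynamicPPL.check_model(buggy(0.3))
```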
What are the main differences between Turing, BUGS, and Stan syntax?
Key syntactic differences include:
- **Parameter blocks**: Stan requires explicit `data`, `parameters`, and `model` blocks. In Turing, everything is defined within the `@model` macro
- **Variable declarations**: Stan requires upfront type declarations in parameter blocks. Turing infers types from the sampling statements
- **Transformed data**: Stan has a `transformed data` block for preprocessing. In Turing, data transformations should be done before defining the model
- **Generated quantities**: Stan has a `generated quantities` block. In Turing, use the approach described in Tracking Extra Quantities
Example comparison:
```stan
// Stan
data {
  real y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 1);
  sigma ~ normal(0, 1);
  y ~ normal(mu, sigma);
}
```

```julia
# Turing
@model function my_model(y)
    mu ~ Normal(0, 1)
    sigma ~ truncated(Normal(0, 1); lower=0)
    y ~ Normal(mu, sigma)
end
```
Which automatic differentiation backend should I use?
The choice of AD backend can significantly impact performance. See:

- Automatic Differentiation Guide - comprehensive comparison of ForwardDiff, Mooncake, ReverseDiff, and other backends
- Performance Tips - quick guide on choosing backends
- AD Backend Benchmarks - performance comparisons across various models
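As a quick sketch (the model is illustrative; `AutoReverseDiff` comes from ADTypes.jl and is re-exported by Turing), the backend is selected per sampler via the `adtype` keyword:

```julia
using Turing

@model function demo(y)
    m ~ Normal(0, 1)
    y ~ Normal(m, 1)
end

# ForwardDiff is the default; a reverse-mode backend can pay off
# for models with many parameters
chain = sample(demo(0.5), NUTS(; adtype=AutoReverseDiff()), 1000)
```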
I changed one line of my model and now it’s so much slower; why?
Small changes can have big performance impacts. Common culprits include:

- Type instability introduced by the change
- Switching from vectorized to scalar operations (or vice versa)
- Inadvertently causing AD backend incompatibilities
- Breaking assumptions that allowed compiler optimizations
See our Performance Tips and Troubleshooting Guide for debugging performance regressions.