Example: Logistic Regression with Random Effects
We will use the Seeds for demonstration. This example concerns the proportion of seeds that germinated on each of 21 plates. Here, we transform the data into a NamedTuple:
data = (
r = [10, 23, 23, 26, 17, 5, 53, 55, 32, 46, 10, 8, 10, 8, 23, 0, 3, 22, 15, 32, 3],
n = [39, 62, 81, 51, 39, 6, 74, 72, 51, 79, 13, 16, 30, 28, 45, 4, 12, 41, 30, 51, 7],
x1 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
x2 = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
N = 21,
)where r[i] is the number of germinated seeds and n[i] is the total number of the seeds on the $i$-th plate. Let $p_i$ be the probability of germination on the $i$-th plate. Then, the model is defined by:
\[\begin{aligned} b_i &\sim \text{Normal}(0, \tau) \\ \text{logit}(p_i) &= \alpha_0 + \alpha_1 x_{1 i} + \alpha_2 x_{2i} + \alpha_{12} x_{1i} x_{2i} + b_{i} \\ r_i &\sim \text{Binomial}(p_i, n_i) \end{aligned}\]
where $x_{1i}$ and $x_{2i}$ are the seed type and root extract of the $i$-th plate. The original BUGS program for the model is:
model
{
for( i in 1 : N ) {
r[i] ~ dbin(p[i],n[i])
b[i] ~ dnorm(0.0,tau)
logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
alpha12 * x1[i] * x2[i] + b[i]
}
alpha0 ~ dnorm(0.0, 1.0E-6)
alpha1 ~ dnorm(0.0, 1.0E-6)
alpha2 ~ dnorm(0.0, 1.0E-6)
alpha12 ~ dnorm(0.0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sigma <- 1 / sqrt(tau)
}Modeling Language
Writing a Model in BUGS
Language References:
Implementations in C++ and R:
- JAGS and its user manual
- Nimble
Language Syntax:
Writing a Model in Julia
We provide a macro which allows users to write down model definitions using Julia:
model_def = @bugs begin
for i in 1:N
r[i] ~ dbin(p[i], n[i])
b[i] ~ dnorm(0.0, tau)
p[i] = logistic(alpha0 + alpha1 * x1[i] + alpha2 * x2[i] + alpha12 * x1[i] * x2[i] + b[i])
end
alpha0 ~ dnorm(0.0, 1.0E-6)
alpha1 ~ dnorm(0.0, 1.0E-6)
alpha2 ~ dnorm(0.0, 1.0E-6)
alpha12 ~ dnorm(0.0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sigma = 1 / sqrt(tau)
endBUGS syntax carries over almost one-to-one to Julia, with minor exceptions. Modifications required are minor: curly braces are replaced with begin ... end blocks, and for loops do not require parentheses. In addition, Julia uses f(x) = ... as a shorthand for function definition, so BUGS' link function syntax is disallowed. Instead, user can call the inverse function of the link functions on the RHS expressions.
Support for Legacy BUGS Programs
The @bugs macro also works with original (R-like) BUGS syntax:
model_def = @bugs("""
model{
for( i in 1 : N ) {
r[i] ~ dbin(p[i],n[i])
b[i] ~ dnorm(0.0,tau)
logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
alpha12 * x1[i] * x2[i] + b[i]
}
alpha0 ~ dnorm(0.0,1.0E-6)
alpha1 ~ dnorm(0.0,1.0E-6)
alpha2 ~ dnorm(0.0,1.0E-6)
alpha12 ~ dnorm(0.0,1.0E-6)
tau ~ dgamma(0.001,0.001)
sigma <- 1 / sqrt(tau)
}
""", true, true)By default, @bugs will translate R-style variable names like a.b.c to a_b_c, user can pass false as the second argument to disable this. User can also pass true as the third argument if model { } enclosure is not present in the BUGS program. We still encourage users to write new programs using the Julia-native syntax, because of better debuggability and perks like syntax highlighting.
Basic Workflow
Compilation
Model definition and data are the two necessary inputs for compilation, with optional initializations. The compile function creates a BUGSModel that implements the LogDensityProblems.jl interface.
compile(model_def::Expr, data::NamedTuple)And with initializations:
compile(model_def::Expr, data::NamedTuple, initializations::NamedTuple)Using the model definition and data we defined earlier, we can compile the model:
model = compile(model_def, data)BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):
Model parameters:
alpha2
b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
tau
alpha12
alpha1
alpha0
Variable sizes and types:
b: size = (21,), type = Vector{Float64}
p: size = (21,), type = Vector{Float64}
n: size = (21,), type = Vector{Int64}
alpha2: type = Float64
sigma: type = Float64
alpha0: type = Float64
alpha12: type = Float64
N: type = Int64
tau: type = Float64
alpha1: type = Float64
r: size = (21,), type = Vector{Int64}
x1: size = (21,), type = Vector{Int64}
x2: size = (21,), type = Vector{Int64}Parameter values will be sampled from the prior distributions in the original space.
We can provide initializations:
initializations = (alpha = 1, beta = 1)compile(model_def, data, initializations)BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):
Model parameters:
alpha2
b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
tau
alpha12
alpha1
alpha0
Variable sizes and types:
b: size = (21,), type = Vector{Float64}
p: size = (21,), type = Vector{Float64}
n: size = (21,), type = Vector{Int64}
alpha2: type = Float64
sigma: type = Float64
alpha0: type = Float64
alpha12: type = Float64
N: type = Int64
tau: type = Float64
alpha1: type = Float64
r: size = (21,), type = Vector{Int64}
x1: size = (21,), type = Vector{Int64}
x2: size = (21,), type = Vector{Int64}
We can also initialize parameters after compilation:
initialize!(model, initializations)BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):
Model parameters:
alpha2
b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
tau
alpha12
alpha1
alpha0
Variable sizes and types:
b: size = (21,), type = Vector{Float64}
p: size = (21,), type = Vector{Float64}
n: size = (21,), type = Vector{Int64}
alpha2: type = Float64
sigma: type = Float64
alpha0: type = Float64
alpha12: type = Float64
N: type = Int64
tau: type = Float64
alpha1: type = Float64
r: size = (21,), type = Vector{Int64}
x1: size = (21,), type = Vector{Int64}
x2: size = (21,), type = Vector{Int64}
initialize! also accepts a flat vector. In this case, the vector should have the same length as the number of parameters, but values can be in transformed space:
initialize!(model, rand(26))BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):
Model parameters:
alpha2
b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
tau
alpha12
alpha1
alpha0
Variable sizes and types:
b: size = (21,), type = Vector{Float64}
p: size = (21,), type = Vector{Float64}
n: size = (21,), type = Vector{Int64}
alpha2: type = Float64
sigma: type = Float64
alpha0: type = Float64
alpha12: type = Float64
N: type = Int64
tau: type = Float64
alpha1: type = Float64
r: size = (21,), type = Vector{Int64}
x1: size = (21,), type = Vector{Int64}
x2: size = (21,), type = Vector{Int64}
Inference
For gradient-based inference, compile your model with an AD backend using the adtype parameter (see Automatic Differentiation for details). We use AdvancedHMC.jl:
# Compile with gradient support
model = compile(model_def, data; adtype=AutoMooncake(; config=nothing))
n_samples, n_adapts = 2000, 1000
D = LogDensityProblems.dimension(model); initial_θ = rand(D)
samples_and_stats = AbstractMCMC.sample(
model,
NUTS(0.8),
n_samples;
chain_type = Chains,
n_adapts = n_adapts,
init_params = initial_θ,
discard_initial = n_adapts,
progress = false
)
describe(samples_and_stats)[ Info: Found initial step size 0.2
Chains MCMC chain (2000×40×1 Array{Real, 3}):
Iterations = 1001:1:3000
Number of chains = 1
Samples per chain = 2000
parameters = tau, alpha12, alpha2, alpha1, alpha0, b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1], sigma
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, is_adapt
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ⋯
Symbol Float64 Float64 Float64 Real Float64 Float64 ⋯
tau 65.1060 167.7543 17.7757 96.3865 109.3445 1.0006 ⋯
alpha12 -0.8275 0.4353 0.0150 837.3169 857.0329 1.0004 ⋯
alpha2 1.3558 0.2841 0.0113 656.6138 619.6581 1.0017 ⋯
alpha1 0.0809 0.3093 0.0104 892.9790 952.0699 1.0002 ⋯
alpha0 -0.5507 0.1851 0.0066 786.9813 1037.4046 1.0007 ⋯
b[21] -0.0387 0.2705 0.0054 2609.8557 1064.1403 1.0006 ⋯
b[20] 0.2075 0.2599 0.0129 394.6012 1144.1288 1.0060 ⋯
b[19] -0.0109 0.2449 0.0060 1855.8132 863.7161 0.9997 ⋯
b[18] 0.0366 0.2380 0.0067 1409.7654 950.8010 0.9998 ⋯
b[17] -0.1989 0.2804 0.0133 504.5971 1100.0883 0.9995 ⋯
b[16] -0.1284 0.3054 0.0113 946.8679 606.9837 0.9995 ⋯
b[15] 0.2289 0.2712 0.0158 314.7916 871.6855 1.0004 ⋯
b[14] -0.1294 0.2541 0.0069 1418.0832 855.6476 0.9999 ⋯
b[13] -0.0635 0.2496 0.0071 1398.0850 881.2851 1.0001 ⋯
b[12] 0.1160 0.2744 0.0111 795.2156 928.0545 1.0014 ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
1 column and 12 rows omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
tau 2.7877 7.8893 13.9702 34.5906 602.5314
alpha12 -1.6868 -1.0998 -0.8134 -0.5538 0.0251
alpha2 0.8312 1.1721 1.3396 1.5251 1.9595
alpha1 -0.5674 -0.1189 0.0897 0.2906 0.6572
alpha0 -0.9183 -0.6665 -0.5513 -0.4346 -0.1830
b[21] -0.6051 -0.1809 -0.0296 0.1100 0.4902
b[20] -0.1829 0.0268 0.1601 0.3455 0.8346
b[19] -0.5153 -0.1428 -0.0079 0.1215 0.5020
b[18] -0.4280 -0.0917 0.0182 0.1604 0.5755
b[17] -0.8448 -0.3494 -0.1521 -0.0120 0.2520
b[16] -0.8641 -0.2663 -0.0765 0.0361 0.4103
b[15] -0.1713 0.0284 0.1861 0.3821 0.8862
b[14] -0.6806 -0.2780 -0.0953 0.0181 0.3436
b[13] -0.5740 -0.1986 -0.0469 0.0674 0.4101
b[12] -0.3405 -0.0513 0.0780 0.2453 0.7474
⋮ ⋮ ⋮ ⋮ ⋮ ⋮
12 rows omittedThis is consistent with the result in the OpenBUGS seeds example.
Next Steps
- Automatic Differentiation - AD backends and configuration
- Evaluation Modes - Different log density computation modes
- Auto-Marginalization - Gradient-based inference with discrete variables
- Parallel Sampling - Multi-threaded and distributed sampling
More Examples
We have transcribed all the examples from the first volume of the BUGS Examples (original and transcribed). All programs and data are included, and can be compiled using the steps described in the tutorial above.
More worked Julia scripts are available in the examples directory, including SIR, Gaussian process, and Bayesian neural network examples using Mooncake-backed gradient sampling.