Example: Logistic Regression with Random Effects
We will use the Seeds for demonstration. This example concerns the proportion of seeds that germinated on each of 21 plates. Here, we transform the data into a NamedTuple:
data = (
r = [10, 23, 23, 26, 17, 5, 53, 55, 32, 46, 10, 8, 10, 8, 23, 0, 3, 22, 15, 32, 3],
n = [39, 62, 81, 51, 39, 6, 74, 72, 51, 79, 13, 16, 30, 28, 45, 4, 12, 41, 30, 51, 7],
x1 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
x2 = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
N = 21,
)where r[i] is the number of germinated seeds and n[i] is the total number of the seeds on the $i$-th plate. Let $p_i$ be the probability of germination on the $i$-th plate. Then, the model is defined by:
\[\begin{aligned} b_i &\sim \text{Normal}(0, \tau) \\ \text{logit}(p_i) &= \alpha_0 + \alpha_1 x_{1 i} + \alpha_2 x_{2i} + \alpha_{12} x_{1i} x_{2i} + b_{i} \\ r_i &\sim \text{Binomial}(p_i, n_i) \end{aligned}\]
where $x_{1i}$ and $x_{2i}$ are the seed type and root extract of the $i$-th plate. The original BUGS program for the model is:
model
{
for( i in 1 : N ) {
r[i] ~ dbin(p[i],n[i])
b[i] ~ dnorm(0.0,tau)
logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
alpha12 * x1[i] * x2[i] + b[i]
}
alpha0 ~ dnorm(0.0, 1.0E-6)
alpha1 ~ dnorm(0.0, 1.0E-6)
alpha2 ~ dnorm(0.0, 1.0E-6)
alpha12 ~ dnorm(0.0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sigma <- 1 / sqrt(tau)
}Modeling Language
Writing a Model in BUGS
Language References:
Implementations in C++ and R:
- JAGS and its user manual
- Nimble
Language Syntax:
Writing a Model in Julia
We provide a macro which allows users to write down model definitions using Julia:
seeds = @bugs begin
for i in 1:N
r[i] ~ dbin(p[i], n[i])
b[i] ~ dnorm(0.0, tau)
p[i] = logistic(alpha0 + alpha1 * x1[i] + alpha2 * x2[i] + alpha12 * x1[i] * x2[i] + b[i])
end
alpha0 ~ dnorm(0.0, 1.0E-6)
alpha1 ~ dnorm(0.0, 1.0E-6)
alpha2 ~ dnorm(0.0, 1.0E-6)
alpha12 ~ dnorm(0.0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sigma = 1 / sqrt(tau)
endBUGS syntax carries over almost one-to-one to Julia, with minor exceptions. Modifications required are minor: curly braces are replaced with begin ... end blocks, and for loops do not require parentheses. In addition, Julia uses f(x) = ... as a shorthand for function definition, so BUGS' link function syntax is disallowed. Instead, user can call the inverse function of the link functions on the RHS expressions.
Support for Legacy BUGS Programs
The @bugs macro also works with original (R-like) BUGS syntax:
seeds = @bugs("""
model{
for( i in 1 : N ) {
r[i] ~ dbin(p[i],n[i])
b[i] ~ dnorm(0.0,tau)
logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
alpha12 * x1[i] * x2[i] + b[i]
}
alpha0 ~ dnorm(0.0,1.0E-6)
alpha1 ~ dnorm(0.0,1.0E-6)
alpha2 ~ dnorm(0.0,1.0E-6)
alpha12 ~ dnorm(0.0,1.0E-6)
tau ~ dgamma(0.001,0.001)
sigma <- 1 / sqrt(tau)
}
""", true, true)By default, @bugs will translate R-style variable names like a.b.c to a_b_c, user can pass false as the second argument to disable this. User can also pass true as the third argument if model { } enclosure is not present in the BUGS program. We still encourage users to write new programs using the Julia-native syntax, because of better debuggability and perks like syntax highlighting.
Basic Workflow
Compilation
The model definition returned by @bugs is callable: call it with the data (a NamedTuple) to construct a BUGSModel, which implements the LogDensityProblems.jl interface.
model = seeds(data)BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):
Model parameters:
alpha2
b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
tau
alpha12
alpha1
alpha0
Variable sizes and types:
b: size = (21,), type = Vector{Float64}
p: size = (21,), type = Vector{Float64}
n: size = (21,), type = Vector{Int64}
alpha2: type = Float64
sigma: type = Float64
alpha0: type = Float64
alpha12: type = Float64
N: type = Int64
tau: type = Float64
alpha1: type = Float64
r: size = (21,), type = Vector{Int64}
x1: size = (21,), type = Vector{Int64}
x2: size = (21,), type = Vector{Int64}Parameter values will be sampled from the prior distributions in the original space.
Calling the model definition is the supported way to construct a BUGSModel. It wraps compile, which is kept for backward compatibility (compile(seeds, data) does the same thing) but is now mostly an internal function — you shouldn't need to call it directly.
We can provide initial parameter values after construction with initialize!:
initializations = (alpha = 1, beta = 1)
initialize!(model, initializations)BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):
Model parameters:
alpha2
b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
tau
alpha12
alpha1
alpha0
Variable sizes and types:
b: size = (21,), type = Vector{Float64}
p: size = (21,), type = Vector{Float64}
n: size = (21,), type = Vector{Int64}
alpha2: type = Float64
sigma: type = Float64
alpha0: type = Float64
alpha12: type = Float64
N: type = Int64
tau: type = Float64
alpha1: type = Float64
r: size = (21,), type = Vector{Int64}
x1: size = (21,), type = Vector{Int64}
x2: size = (21,), type = Vector{Int64}
initialize! also accepts a flat vector. In this case, the vector should have the same length as the number of parameters, but values can be in transformed space:
initialize!(model, rand(26))BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):
Model parameters:
alpha2
b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
tau
alpha12
alpha1
alpha0
Variable sizes and types:
b: size = (21,), type = Vector{Float64}
p: size = (21,), type = Vector{Float64}
n: size = (21,), type = Vector{Int64}
alpha2: type = Float64
sigma: type = Float64
alpha0: type = Float64
alpha12: type = Float64
N: type = Int64
tau: type = Float64
alpha1: type = Float64
r: size = (21,), type = Vector{Int64}
x1: size = (21,), type = Vector{Int64}
x2: size = (21,), type = Vector{Int64}
Inference
For gradient-based inference, construct your model with an AD backend using the adtype keyword (see Automatic Differentiation for details). We use AdvancedHMC.jl:
# Construct with gradient support
model = seeds(data; adtype=AutoMooncake(; config=nothing))
n_samples, n_adapts = 2000, 1000
D = LogDensityProblems.dimension(model); initial_θ = rand(D)
samples_and_stats = AbstractMCMC.sample(
model,
NUTS(0.8),
n_samples;
chain_type = VNChain,
n_adapts = n_adapts,
init_params = initial_θ,
discard_initial = n_adapts,
progress = false
)
summarystats(samples_and_stats)╭─FlexiSummary (9 statistics) ─────────────────────────────────────────────────╮
│ iter collapsed │
│ chain collapsed │
│ ↓ stat = [mean, std, mcse, ess_bulk, ess_tail, rhat, q5, q50, q95] │
│ │
│ Parameters (27) ── VarName │
│ Float64 tau, alpha12, alpha2, alpha1, alpha0, b[21], b[20], b[19], b[18], │
│ b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], │
│ b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1], sigma │
│ │
│ Extras (13) │
│ Float64 lp, n_steps, is_accept, acceptance_rate, log_density, │
│ hamiltonian_energy, hamiltonian_energy_error, │
│ max_hamiltonian_energy_error, tree_depth, numerical_error, │
│ step_size, nom_step_size, is_adapt │
│ │
│ Summary │
│ param mean std mcse ess_bulk ess_tail rhat … │
│ tau 33.1915 72.3027 8.1478 79.4857 122.3714 1.0015 … │
│ alpha12 -0.8091 0.4456 0.0171 683.7492 855.1632 1.0038 … │
│ alpha2 1.3435 0.2794 0.0105 715.5997 776.6079 0.9996 … │
│ alpha1 0.0711 0.3229 0.0122 712.9055 685.6317 1.0024 … │
│ alpha0 -0.5462 0.1924 0.0073 702.8978 775.6920 1.0009 … │
│ b[21] -0.0439 0.3042 0.0064 2597.8107 1234.6588 0.9999 … │
│ b[20] 0.2246 0.2604 0.0115 537.2186 1115.9889 1.0030 … │
│ b[19] -0.0169 0.2644 0.0062 1863.7940 826.2107 1.0002 … │
│ b[18] 0.0481 0.2447 0.0060 1746.7883 1187.1722 1.0017 … │
│ b[17] -0.2190 0.3124 0.0143 523.4137 1175.5720 1.0002 … │
│ b[16] -0.1418 0.3448 0.0112 1288.4200 702.0555 0.9998 … │
│ b[15] 0.2455 0.2802 0.0127 488.6792 1046.8212 1.0008 … │
│ b[14] -0.1524 0.2714 0.0081 1170.2926 1053.1875 0.9996 … │
│ b[13] -0.0699 0.2558 0.0062 1772.5975 1296.2529 1.0001 … │
│ b[12] 0.1226 0.2843 0.0095 1078.2502 946.2127 0.9999 … │
│ b[11] 0.0908 0.3025 0.0067 2134.9145 1032.0687 1.0008 … │
│ b[10] -0.2628 0.2472 0.0117 476.8998 803.7148 1.0018 … │
│ b[9] -0.1310 0.2489 0.0079 1060.5611 993.8418 1.0004 … │
│ b[8] 0.1919 0.2445 0.0089 796.2951 1184.0594 1.0008 … │
│ b[7] 0.0738 0.2246 0.0066 1186.9220 1072.1458 0.9998 … │
│ b[6] 0.0851 0.2998 0.0078 1533.0696 1151.8788 1.0036 … │
│ b[5] 0.1140 0.2295 0.0065 1249.3697 1004.2909 0.9998 … │
│ b[4] 0.2807 0.2500 0.0133 342.3071 786.5474 1.0011 … │
│ b[3] -0.2131 0.2367 0.0114 466.7766 1027.3604 1.0008 … │
│ b[2] 0.0042 0.2215 0.0057 1583.0487 1228.6866 1.0006 … │
│ b[1] -0.2013 0.2641 0.0118 557.6229 837.7226 1.0009 … │
│ sigma 0.2982 0.1395 0.0144 79.4857 122.3714 1.0042 … │
╰──────────────────────────────────────────────────────────────────────────────╯This is consistent with the result in the OpenBUGS seeds example.
Here chain_type = VNChain collects the samples into a FlexiChains.FlexiChain keyed by variable name. MCMCChains is also still supported: load it and pass chain_type = MCMCChains.Chains instead (or convert an existing chain with MCMCChains.Chains(samples_and_stats)).
Next Steps
- Automatic Differentiation - AD backends and configuration
- Evaluation Modes - Different log density computation modes
- Auto-Marginalization - Gradient-based inference with discrete variables
- Parallel Sampling - Multi-threaded and distributed sampling
More Examples
We have transcribed all the examples from the first volume of the BUGS Examples (original and transcribed). All programs and data are included, and can be compiled using the steps described in the tutorial above.
More worked Julia scripts are available in the examples directory, including SIR, Gaussian process, and Bayesian neural network examples using Mooncake-backed gradient sampling.