Example: Logistic Regression with Random Effects

We will use the Seeds for demonstration. This example concerns the proportion of seeds that germinated on each of 21 plates. Here, we transform the data into a NamedTuple:

data = (
    r = [10, 23, 23, 26, 17, 5, 53, 55, 32, 46, 10, 8, 10, 8, 23, 0, 3, 22, 15, 32, 3],
    n = [39, 62, 81, 51, 39, 6, 74, 72, 51, 79, 13, 16, 30, 28, 45, 4, 12, 41, 30, 51, 7],
    x1 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    x2 = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    N = 21,
)

where r[i] is the number of germinated seeds and n[i] is the total number of the seeds on the $i$-th plate. Let $p_i$ be the probability of germination on the $i$-th plate. Then, the model is defined by:

\[\begin{aligned} b_i &\sim \text{Normal}(0, \tau) \\ \text{logit}(p_i) &= \alpha_0 + \alpha_1 x_{1 i} + \alpha_2 x_{2i} + \alpha_{12} x_{1i} x_{2i} + b_{i} \\ r_i &\sim \text{Binomial}(p_i, n_i) \end{aligned}\]

where $x_{1i}$ and $x_{2i}$ are the seed type and root extract of the $i$-th plate. The original BUGS program for the model is:

model
{
    for( i in 1 : N ) {
        r[i] ~ dbin(p[i],n[i])
        b[i] ~ dnorm(0.0,tau)
        logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
        alpha12 * x1[i] * x2[i] + b[i]
    }
    alpha0 ~ dnorm(0.0, 1.0E-6)
    alpha1 ~ dnorm(0.0, 1.0E-6)
    alpha2 ~ dnorm(0.0, 1.0E-6)
    alpha12 ~ dnorm(0.0, 1.0E-6)
    tau ~ dgamma(0.001, 0.001)
    sigma <- 1 / sqrt(tau)
}

Modeling Language

Writing a Model in BUGS

Language References:

Implementations in C++ and R:

JAGS and its user manual
Nimble

Language Syntax:

Writing a Model in Julia

We provide a macro which allows users to write down model definitions using Julia:

seeds = @bugs begin
    for i in 1:N
        r[i] ~ dbin(p[i], n[i])
        b[i] ~ dnorm(0.0, tau)
        p[i] = logistic(alpha0 + alpha1 * x1[i] + alpha2 * x2[i] + alpha12 * x1[i] * x2[i] + b[i])
    end
    alpha0 ~ dnorm(0.0, 1.0E-6)
    alpha1 ~ dnorm(0.0, 1.0E-6)
    alpha2 ~ dnorm(0.0, 1.0E-6)
    alpha12 ~ dnorm(0.0, 1.0E-6)
    tau ~ dgamma(0.001, 0.001)
    sigma = 1 / sqrt(tau)
end

BUGS syntax carries over almost one-to-one to Julia, with minor exceptions. Modifications required are minor: curly braces are replaced with begin ... end blocks, and for loops do not require parentheses. In addition, Julia uses f(x) = ... as a shorthand for function definition, so BUGS' link function syntax is disallowed. Instead, user can call the inverse function of the link functions on the RHS expressions.

Support for Legacy BUGS Programs

The @bugs macro also works with original (R-like) BUGS syntax:

seeds = @bugs("""
model{
    for( i in 1 : N ) {
        r[i] ~ dbin(p[i],n[i])
        b[i] ~ dnorm(0.0,tau)
        logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
        alpha12 * x1[i] * x2[i] + b[i]
    }
    alpha0 ~ dnorm(0.0,1.0E-6)
    alpha1 ~ dnorm(0.0,1.0E-6)
    alpha2 ~ dnorm(0.0,1.0E-6)
    alpha12 ~ dnorm(0.0,1.0E-6)
    tau ~ dgamma(0.001,0.001)
    sigma <- 1 / sqrt(tau)
}
""", true, true)

By default, @bugs will translate R-style variable names like a.b.c to a_b_c, user can pass false as the second argument to disable this. User can also pass true as the third argument if model { } enclosure is not present in the BUGS program. We still encourage users to write new programs using the Julia-native syntax, because of better debuggability and perks like syntax highlighting.

Basic Workflow

Compilation

The model definition returned by @bugs is callable: call it with the data (a NamedTuple) to construct a BUGSModel, which implements the LogDensityProblems.jl interface.

model = seeds(data)

BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):

  Model parameters:
    alpha2
    b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
    tau
    alpha12
    alpha1
    alpha0

  Variable sizes and types:
    b: size = (21,), type = Vector{Float64}
    p: size = (21,), type = Vector{Float64}
    n: size = (21,), type = Vector{Int64}
    alpha2: type = Float64
    sigma: type = Float64
    alpha0: type = Float64
    alpha12: type = Float64
    N: type = Int64
    tau: type = Float64
    alpha1: type = Float64
    r: size = (21,), type = Vector{Int64}
    x1: size = (21,), type = Vector{Int64}
    x2: size = (21,), type = Vector{Int64}

Parameter values will be sampled from the prior distributions in the original space.

`compile` is now mostly internal

Calling the model definition is the supported way to construct a BUGSModel. It wraps compile, which is kept for backward compatibility (compile(seeds, data) does the same thing) but is now mostly an internal function — you shouldn't need to call it directly.

We can provide initial parameter values after construction with initialize!:

initializations = (alpha = 1, beta = 1)
initialize!(model, initializations)

BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):

  Model parameters:
    alpha2
    b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
    tau
    alpha12
    alpha1
    alpha0

  Variable sizes and types:
    b: size = (21,), type = Vector{Float64}
    p: size = (21,), type = Vector{Float64}
    n: size = (21,), type = Vector{Int64}
    alpha2: type = Float64
    sigma: type = Float64
    alpha0: type = Float64
    alpha12: type = Float64
    N: type = Int64
    tau: type = Float64
    alpha1: type = Float64
    r: size = (21,), type = Vector{Int64}
    x1: size = (21,), type = Vector{Int64}
    x2: size = (21,), type = Vector{Int64}

initialize! also accepts a flat vector. In this case, the vector should have the same length as the number of parameters, but values can be in transformed space:

initialize!(model, rand(26))

BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):

  Model parameters:
    alpha2
    b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
    tau
    alpha12
    alpha1
    alpha0

  Variable sizes and types:
    b: size = (21,), type = Vector{Float64}
    p: size = (21,), type = Vector{Float64}
    n: size = (21,), type = Vector{Int64}
    alpha2: type = Float64
    sigma: type = Float64
    alpha0: type = Float64
    alpha12: type = Float64
    N: type = Int64
    tau: type = Float64
    alpha1: type = Float64
    r: size = (21,), type = Vector{Int64}
    x1: size = (21,), type = Vector{Int64}
    x2: size = (21,), type = Vector{Int64}

Inference

For gradient-based inference, construct your model with an AD backend using the adtype keyword (see Automatic Differentiation for details). We use AdvancedHMC.jl:

# Construct with gradient support
model = seeds(data; adtype=AutoMooncake(; config=nothing))

n_samples, n_adapts = 2000, 1000

D = LogDensityProblems.dimension(model); initial_θ = rand(D)

samples_and_stats = AbstractMCMC.sample(
                        model,
                        NUTS(0.8),
                        n_samples;
                        chain_type = VNChain,
                        n_adapts = n_adapts,
                        init_params = initial_θ,
                        discard_initial = n_adapts,
                        progress = false
                    )
summarystats(samples_and_stats)

╭─FlexiSummary (9 statistics) ─────────────────────────────────────────────────╮
│   iter    collapsed                                                          │
│   chain   collapsed                                                          │
│ ↓ stat  = [mean, std, mcse, ess_bulk, ess_tail, rhat, q5, q50, q95]          │
│                                                                              │
│ Parameters (27) ── VarName                                                   │
│  Float64  tau, alpha12, alpha2, alpha1, alpha0, b[21], b[20], b[19], b[18],  │
│           b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9],      │
│           b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1], sigma              │
│                                                                              │
│ Extras (13)                                                                  │
│  Float64  lp, n_steps, is_accept, acceptance_rate, log_density,              │
│           hamiltonian_energy, hamiltonian_energy_error,                      │
│           max_hamiltonian_energy_error, tree_depth, numerical_error,         │
│           step_size, nom_step_size, is_adapt                                 │
│                                                                              │
│ Summary                                                                      │
│     param     mean      std    mcse   ess_bulk   ess_tail    rhat  …         │
│       tau  33.1915  72.3027  8.1478    79.4857   122.3714  1.0015  …         │
│   alpha12  -0.8091   0.4456  0.0171   683.7492   855.1632  1.0038  …         │
│    alpha2   1.3435   0.2794  0.0105   715.5997   776.6079  0.9996  …         │
│    alpha1   0.0711   0.3229  0.0122   712.9055   685.6317  1.0024  …         │
│    alpha0  -0.5462   0.1924  0.0073   702.8978   775.6920  1.0009  …         │
│     b[21]  -0.0439   0.3042  0.0064  2597.8107  1234.6588  0.9999  …         │
│     b[20]   0.2246   0.2604  0.0115   537.2186  1115.9889  1.0030  …         │
│     b[19]  -0.0169   0.2644  0.0062  1863.7940   826.2107  1.0002  …         │
│     b[18]   0.0481   0.2447  0.0060  1746.7883  1187.1722  1.0017  …         │
│     b[17]  -0.2190   0.3124  0.0143   523.4137  1175.5720  1.0002  …         │
│     b[16]  -0.1418   0.3448  0.0112  1288.4200   702.0555  0.9998  …         │
│     b[15]   0.2455   0.2802  0.0127   488.6792  1046.8212  1.0008  …         │
│     b[14]  -0.1524   0.2714  0.0081  1170.2926  1053.1875  0.9996  …         │
│     b[13]  -0.0699   0.2558  0.0062  1772.5975  1296.2529  1.0001  …         │
│     b[12]   0.1226   0.2843  0.0095  1078.2502   946.2127  0.9999  …         │
│     b[11]   0.0908   0.3025  0.0067  2134.9145  1032.0687  1.0008  …         │
│     b[10]  -0.2628   0.2472  0.0117   476.8998   803.7148  1.0018  …         │
│      b[9]  -0.1310   0.2489  0.0079  1060.5611   993.8418  1.0004  …         │
│      b[8]   0.1919   0.2445  0.0089   796.2951  1184.0594  1.0008  …         │
│      b[7]   0.0738   0.2246  0.0066  1186.9220  1072.1458  0.9998  …         │
│      b[6]   0.0851   0.2998  0.0078  1533.0696  1151.8788  1.0036  …         │
│      b[5]   0.1140   0.2295  0.0065  1249.3697  1004.2909  0.9998  …         │
│      b[4]   0.2807   0.2500  0.0133   342.3071   786.5474  1.0011  …         │
│      b[3]  -0.2131   0.2367  0.0114   466.7766  1027.3604  1.0008  …         │
│      b[2]   0.0042   0.2215  0.0057  1583.0487  1228.6866  1.0006  …         │
│      b[1]  -0.2013   0.2641  0.0118   557.6229   837.7226  1.0009  …         │
│     sigma   0.2982   0.1395  0.0144    79.4857   122.3714  1.0042  …         │
╰──────────────────────────────────────────────────────────────────────────────╯

This is consistent with the result in the OpenBUGS seeds example.

Here chain_type = VNChain collects the samples into a FlexiChains.FlexiChain keyed by variable name. MCMCChains is also still supported: load it and pass chain_type = MCMCChains.Chains instead (or convert an existing chain with MCMCChains.Chains(samples_and_stats)).

Next Steps

Automatic Differentiation - AD backends and configuration
Evaluation Modes - Different log density computation modes
Auto-Marginalization - Gradient-based inference with discrete variables
Parallel Sampling - Multi-threaded and distributed sampling

More Examples

We have transcribed all the examples from the first volume of the BUGS Examples (original and transcribed). All programs and data are included, and can be compiled using the steps described in the tutorial above.

More worked Julia scripts are available in the examples directory, including SIR, Gaussian process, and Bayesian neural network examples using Mooncake-backed gradient sampling.