Example: Logistic Regression with Random Effects

We will use the Seeds for demonstration. This example concerns the proportion of seeds that germinated on each of 21 plates. Here, we transform the data into a NamedTuple:

data = (
    r = [10, 23, 23, 26, 17, 5, 53, 55, 32, 46, 10, 8, 10, 8, 23, 0, 3, 22, 15, 32, 3],
    n = [39, 62, 81, 51, 39, 6, 74, 72, 51, 79, 13, 16, 30, 28, 45, 4, 12, 41, 30, 51, 7],
    x1 = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    x2 = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    N = 21,
)

where r[i] is the number of germinated seeds and n[i] is the total number of the seeds on the $i$-th plate. Let $p_i$ be the probability of germination on the $i$-th plate. Then, the model is defined by:

\[\begin{aligned} b_i &\sim \text{Normal}(0, \tau) \\ \text{logit}(p_i) &= \alpha_0 + \alpha_1 x_{1 i} + \alpha_2 x_{2i} + \alpha_{12} x_{1i} x_{2i} + b_{i} \\ r_i &\sim \text{Binomial}(p_i, n_i) \end{aligned}\]

where $x_{1i}$ and $x_{2i}$ are the seed type and root extract of the $i$-th plate. The original BUGS program for the model is:

model
{
    for( i in 1 : N ) {
        r[i] ~ dbin(p[i],n[i])
        b[i] ~ dnorm(0.0,tau)
        logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
        alpha12 * x1[i] * x2[i] + b[i]
    }
    alpha0 ~ dnorm(0.0, 1.0E-6)
    alpha1 ~ dnorm(0.0, 1.0E-6)
    alpha2 ~ dnorm(0.0, 1.0E-6)
    alpha12 ~ dnorm(0.0, 1.0E-6)
    tau ~ dgamma(0.001, 0.001)
    sigma <- 1 / sqrt(tau)
}

Modeling Language

Writing a Model in BUGS

Language References:

Implementations in C++ and R:

JAGS and its user manual
Nimble

Language Syntax:

Writing a Model in Julia

We provide a macro which allows users to write down model definitions using Julia:

model_def = @bugs begin
    for i in 1:N
        r[i] ~ dbin(p[i], n[i])
        b[i] ~ dnorm(0.0, tau)
        p[i] = logistic(alpha0 + alpha1 * x1[i] + alpha2 * x2[i] + alpha12 * x1[i] * x2[i] + b[i])
    end
    alpha0 ~ dnorm(0.0, 1.0E-6)
    alpha1 ~ dnorm(0.0, 1.0E-6)
    alpha2 ~ dnorm(0.0, 1.0E-6)
    alpha12 ~ dnorm(0.0, 1.0E-6)
    tau ~ dgamma(0.001, 0.001)
    sigma = 1 / sqrt(tau)
end

BUGS syntax carries over almost one-to-one to Julia, with minor exceptions. Modifications required are minor: curly braces are replaced with begin ... end blocks, and for loops do not require parentheses. In addition, Julia uses f(x) = ... as a shorthand for function definition, so BUGS' link function syntax is disallowed. Instead, user can call the inverse function of the link functions on the RHS expressions.

Support for Legacy BUGS Programs

The @bugs macro also works with original (R-like) BUGS syntax:

model_def = @bugs("""
model{
    for( i in 1 : N ) {
        r[i] ~ dbin(p[i],n[i])
        b[i] ~ dnorm(0.0,tau)
        logit(p[i]) <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
        alpha12 * x1[i] * x2[i] + b[i]
    }
    alpha0 ~ dnorm(0.0,1.0E-6)
    alpha1 ~ dnorm(0.0,1.0E-6)
    alpha2 ~ dnorm(0.0,1.0E-6)
    alpha12 ~ dnorm(0.0,1.0E-6)
    tau ~ dgamma(0.001,0.001)
    sigma <- 1 / sqrt(tau)
}
""", true, true)

By default, @bugs will translate R-style variable names like a.b.c to a_b_c, user can pass false as the second argument to disable this. User can also pass true as the third argument if model { } enclosure is not present in the BUGS program. We still encourage users to write new programs using the Julia-native syntax, because of better debuggability and perks like syntax highlighting.

Basic Workflow

Compilation

Model definition and data are the two necessary inputs for compilation, with optional initializations. The compile function creates a BUGSModel that implements the LogDensityProblems.jl interface.

compile(model_def::Expr, data::NamedTuple)

And with initializations:

compile(model_def::Expr, data::NamedTuple, initializations::NamedTuple)

Using the model definition and data we defined earlier, we can compile the model:

model = compile(model_def, data)

BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):

  Model parameters:
    alpha2
    b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
    tau
    alpha12
    alpha1
    alpha0

  Variable sizes and types:
    b: size = (21,), type = Vector{Float64}
    p: size = (21,), type = Vector{Float64}
    n: size = (21,), type = Vector{Int64}
    alpha2: type = Float64
    sigma: type = Float64
    alpha0: type = Float64
    alpha12: type = Float64
    N: type = Int64
    tau: type = Float64
    alpha1: type = Float64
    r: size = (21,), type = Vector{Int64}
    x1: size = (21,), type = Vector{Int64}
    x2: size = (21,), type = Vector{Int64}

Parameter values will be sampled from the prior distributions in the original space.

We can provide initializations:

initializations = (alpha = 1, beta = 1)

compile(model_def, data, initializations)

BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):

  Model parameters:
    alpha2
    b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
    tau
    alpha12
    alpha1
    alpha0

  Variable sizes and types:
    b: size = (21,), type = Vector{Float64}
    p: size = (21,), type = Vector{Float64}
    n: size = (21,), type = Vector{Int64}
    alpha2: type = Float64
    sigma: type = Float64
    alpha0: type = Float64
    alpha12: type = Float64
    N: type = Int64
    tau: type = Float64
    alpha1: type = Float64
    r: size = (21,), type = Vector{Int64}
    x1: size = (21,), type = Vector{Int64}
    x2: size = (21,), type = Vector{Int64}

We can also initialize parameters after compilation:

initialize!(model, initializations)

BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):

  Model parameters:
    alpha2
    b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
    tau
    alpha12
    alpha1
    alpha0

  Variable sizes and types:
    b: size = (21,), type = Vector{Float64}
    p: size = (21,), type = Vector{Float64}
    n: size = (21,), type = Vector{Int64}
    alpha2: type = Float64
    sigma: type = Float64
    alpha0: type = Float64
    alpha12: type = Float64
    N: type = Int64
    tau: type = Float64
    alpha1: type = Float64
    r: size = (21,), type = Vector{Int64}
    x1: size = (21,), type = Vector{Int64}
    x2: size = (21,), type = Vector{Int64}

initialize! also accepts a flat vector. In this case, the vector should have the same length as the number of parameters, but values can be in transformed space:

initialize!(model, rand(26))

BUGSModel (parameters are in transformed (unconstrained) space, with dimension 26):

  Model parameters:
    alpha2
    b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1]
    tau
    alpha12
    alpha1
    alpha0

  Variable sizes and types:
    b: size = (21,), type = Vector{Float64}
    p: size = (21,), type = Vector{Float64}
    n: size = (21,), type = Vector{Int64}
    alpha2: type = Float64
    sigma: type = Float64
    alpha0: type = Float64
    alpha12: type = Float64
    N: type = Int64
    tau: type = Float64
    alpha1: type = Float64
    r: size = (21,), type = Vector{Int64}
    x1: size = (21,), type = Vector{Int64}
    x2: size = (21,), type = Vector{Int64}

Inference

For gradient-based inference, compile your model with an AD backend using the adtype parameter (see Automatic Differentiation for details). We use AdvancedHMC.jl:

# Compile with gradient support
model = compile(model_def, data; adtype=AutoMooncake(; config=nothing))

n_samples, n_adapts = 2000, 1000

D = LogDensityProblems.dimension(model); initial_θ = rand(D)

samples_and_stats = AbstractMCMC.sample(
                        model,
                        NUTS(0.8),
                        n_samples;
                        chain_type = Chains,
                        n_adapts = n_adapts,
                        init_params = initial_θ,
                        discard_initial = n_adapts,
                        progress = false
                    )
describe(samples_and_stats)

[ Info: Found initial step size 0.2
Chains MCMC chain (2000×40×1 Array{Real, 3}):

Iterations        = 1001:1:3000
Number of chains  = 1
Samples per chain = 2000
parameters        = tau, alpha12, alpha2, alpha1, alpha0, b[21], b[20], b[19], b[18], b[17], b[16], b[15], b[14], b[13], b[12], b[11], b[10], b[9], b[8], b[7], b[6], b[5], b[4], b[3], b[2], b[1], sigma
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, is_adapt

Summary Statistics

  parameters      mean        std      mcse    ess_bulk    ess_tail      rhat  ⋯
      Symbol   Float64    Float64   Float64        Real     Float64   Float64  ⋯

         tau   62.3601   152.8742   22.8922     61.4302     66.7005    1.0111  ⋯
     alpha12   -0.8070     0.4286    0.0140    931.7291   1170.0948    0.9996  ⋯
      alpha2    1.3539     0.2669    0.0101    720.4786    928.8923    1.0051  ⋯
      alpha1    0.0754     0.3069    0.0106    832.6237   1131.5541    0.9995  ⋯
      alpha0   -0.5544     0.1946    0.0074    706.0751    932.0235    0.9999  ⋯
       b[21]   -0.0395     0.2890    0.0058   2675.8867   1193.8510    1.0031  ⋯
       b[20]    0.2208     0.2633    0.0152    307.3472   1338.9801    0.9998  ⋯
       b[19]   -0.0108     0.2489    0.0065   1676.6893   1199.9068    1.0058  ⋯
       b[18]    0.0395     0.2342    0.0061   1712.8524   1362.4402    1.0028  ⋯
       b[17]   -0.2306     0.3332    0.0168    447.4625    758.1427    1.0025  ⋯
       b[16]   -0.1387     0.3344    0.0098   1382.8769    931.1751    1.0019  ⋯
       b[15]    0.2373     0.2813    0.0158    326.4564    973.9521    1.0015  ⋯
       b[14]   -0.1352     0.2596    0.0094    916.4998   1160.1026    1.0017  ⋯
       b[13]   -0.0660     0.2462    0.0053   2269.1072   1335.0560    1.0079  ⋯
       b[12]    0.1314     0.2751    0.0097    917.6152   1199.5996    1.0039  ⋯
           ⋮         ⋮          ⋮         ⋮           ⋮           ⋮         ⋮  ⋱

                                                    1 column and 12 rows omitted

Quantiles

  parameters      2.5%     25.0%     50.0%     75.0%      97.5%
      Symbol   Float64   Float64   Float64   Float64    Float64

         tau    2.7157    7.0635   12.1115   31.9643   599.4445
     alpha12   -1.6727   -1.0762   -0.8275   -0.5245     0.0325
      alpha2    0.8438    1.1876    1.3474    1.5160     1.9032
      alpha1   -0.5436   -0.1247    0.0846    0.2731     0.6698
      alpha0   -0.9475   -0.6747   -0.5534   -0.4304    -0.1712
       b[21]   -0.6850   -0.1926   -0.0203    0.1084     0.5436
       b[20]   -0.1935    0.0271    0.1783    0.3818     0.7980
       b[19]   -0.5418   -0.1361   -0.0077    0.1261     0.4857
       b[18]   -0.4132   -0.0886    0.0247    0.1740     0.5198
       b[17]   -1.0454   -0.4003   -0.1516   -0.0107     0.2726
       b[16]   -0.9247   -0.3005   -0.0752    0.0468     0.4193
       b[15]   -0.2213    0.0322    0.1848    0.4155     0.8738
       b[14]   -0.7068   -0.2866   -0.0968    0.0231     0.3288
       b[13]   -0.5996   -0.2079   -0.0372    0.0765     0.3920
       b[12]   -0.3544   -0.0367    0.0887    0.2744     0.7668
           ⋮         ⋮         ⋮         ⋮         ⋮          ⋮

                                                  12 rows omitted

This is consistent with the result in the OpenBUGS seeds example.

Next Steps

Automatic Differentiation - AD backends and configuration
Evaluation Modes - Different log density computation modes
Auto-Marginalization - Gradient-based inference with discrete variables
Parallel Sampling - Multi-threaded and distributed sampling

More Examples

We have transcribed all the examples from the first volume of the BUGS Examples (original and transcribed). All programs and data are included, and can be compiled using the steps described in the tutorial above.

More worked Julia scripts are available in the examples directory, including SIR, Gaussian process, and Bayesian neural network examples using Mooncake-backed gradient sampling.