Introduction: Coin Flipping
This is the first of a series of guided tutorials on the Turing language. In this tutorial, we will use Bayesian inference to estimate the probability that a coin flip will result in heads, given a series of observations.
Setup
First, let us load some packages that we need to simulate a coin flip:
using Distributions
using Random
Random.seed!(12); # Set seed for reproducibility
and to visualize our results.
using StatsPlots
Note that Turing is not loaded here — we do not use it in this example. Next, we configure the data generating model. Let us set the true probability that a coin flip turns up heads
p_true = 0.5;
and set the number of coin flips we will show our model.
N = 100;
We simulate N coin flips by drawing N random samples from the Bernoulli distribution with success probability p_true. The draws are collected in a variable called data:
data = rand(Bernoulli(p_true), N);
Here are the first five coin flips:
data[1:5]
5-element Vector{Bool}:
1
0
0
0
1
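As a quick sanity check, we can compute the empirical fraction of heads in the simulated data; it should be close to p_true.
# Empirical fraction of heads in the simulated data (should be close to `p_true`).
mean(data)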
Coin Flipping Without Turing
The following example illustrates the effect of updating our beliefs with every piece of new evidence we observe.
Assume that we are unsure about the probability of heads in a coin flip. To get an intuitive understanding of what “updating our beliefs” means, we will visualize the probability of heads in a coin flip after each new observation.
We begin by specifying a prior belief about the distribution of heads and tails in a coin toss. Here we choose a Beta distribution as the prior distribution for the probability of heads. Before any coin flip is observed, we assume a uniform distribution \(\operatorname{U}(0, 1) = \operatorname{Beta}(1, 1)\) over the probability of heads, i.e., initially every value of the probability is equally likely.
prior_belief = Beta(1, 1);
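Since \(\operatorname{Beta}(1, 1)\) is the uniform distribution on \([0, 1]\), the prior density is constant, as a quick check confirms:
# The Beta(1, 1) prior assigns the same density (1.0) to every value in [0, 1].
pdf(prior_belief, 0.25), pdf(prior_belief, 0.75)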
With our priors set and our data at hand, we can perform Bayesian inference.
This is a fairly simple process. We expose one additional coin flip to our model in each iteration, such that the first iteration only sees the first coin flip, while the last iteration sees all the coin flips. In each iteration we update our belief to a new Beta distribution that accounts for the proportion of heads and tails observed so far. The update is particularly simple because our prior distribution is a conjugate prior. Note that a closed-form expression for the posterior (implemented in the updated_belief function below) is not available in general and usually does not exist for more interesting models.
function updated_belief(prior_belief::Beta, data::AbstractArray{Bool})
    # Count the number of heads and tails.
    heads = sum(data)
    tails = length(data) - heads

    # Update our prior belief in closed form (this is possible because we use a conjugate prior).
    return Beta(prior_belief.α + heads, prior_belief.β + tails)
end
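For instance, after observing only the first ten coin flips, the updated belief is a Beta distribution whose parameters reflect the counts of heads and tails seen so far:
# Updated belief after the first ten observations.
updated_belief(prior_belief, data[1:10])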
# Show updated belief for increasing number of observations
@gif for n in 0:N
    plot(
        updated_belief(prior_belief, data[1:n]);
        size=(500, 250),
        title="Updated belief after $n observations",
        xlabel="probability of heads",
        ylabel="",
        legend=nothing,
        xlim=(0, 1),
        fill=0,
        α=0.3,
        w=3,
    )
    vline!([p_true])
end
The animation above shows that with increasing evidence our belief about the probability of heads in a coin flip slowly adjusts towards the true value. The orange line in the animation represents the true probability of seeing heads on a single coin flip, while the mode of the distribution shows what the model believes the probability of a heads is given the evidence it has seen.
For the mathematically inclined, the \(\operatorname{Beta}\) distribution is updated by adding each coin flip to the parameters \(\alpha\) and \(\beta\) of the distribution. Initially, the parameters are defined as \(\alpha = 1\) and \(\beta = 1\). Over time, with more and more coin flips, \(\alpha\) and \(\beta\) will be approximately equal to each other as we are equally likely to flip a heads or a tails.
The mean of the \(\operatorname{Beta}(\alpha, \beta)\) distribution is
\[\operatorname{E}[X] = \dfrac{\alpha}{\alpha+\beta}.\]
This implies that the plot of the distribution will become centered around 0.5 for a large enough number of coin flips, as we expect \(\alpha \approx \beta\).
The variance of the \(\operatorname{Beta}(\alpha, \beta)\) distribution is
\[\operatorname{var}[X] = \dfrac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}.\]
Thus the variance of the distribution will approach 0 with more and more samples, as the denominator grows faster than the numerator. More samples means less variance: the distribution will reflect less uncertainty about the probability of heads, and the plot will become more tightly centered around 0.5 for a large enough number of coin flips.
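We can verify this behaviour numerically with the mean and var functions from Distributions by evaluating the updated belief for an increasing number of observations:
# Posterior mean and variance for an increasing number of observations:
# the mean settles near the empirical fraction of heads while the variance shrinks towards 0.
for n in (10, 50, 100)
    posterior = updated_belief(prior_belief, data[1:n])
    println("n = $n: mean = $(mean(posterior)), variance = $(var(posterior))")
end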
Coin Flipping With Turing
We now move away from the closed-form expression above. We use Turing to specify the same model and to approximate the posterior distribution with samples. To do so, we first need to load Turing.
using Turing
Additionally, we load MCMCChains, a library for analyzing and visualizing the samples with which we approximate the posterior distribution.
using MCMCChains
First, we define the coin-flip model using Turing.
# Unconditioned coinflip model with `N` observations.
@model function coinflip(; N::Int)
    # Our prior belief about the probability of heads in a coin toss.
    p ~ Beta(1, 1)

    # Heads or tails of a coin are drawn from `N` independent and identically
    # distributed Bernoulli distributions with success rate `p`.
    y ~ filldist(Bernoulli(p), N)

    return y
end;
In the Turing model, the prior distribution of the variable p, the probability of heads in a coin toss, and the distribution of the observations y are specified on the right-hand side of the ~ expressions. The @model macro modifies the body of the Julia function coinflip and, e.g., replaces the ~ statements with internal function calls that are used for sampling.
Here we defined a model that is not conditioned on any specific observations, as this allows us to easily obtain samples of both p and y with
rand(coinflip(; N))
(p = 0.9520583115441003, y = Bool[1, 1, 1, 1, 1, 1, 1, 1, 1, 1 … 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
The model can be conditioned on some observations with |. See the documentation of the condition syntax in DynamicPPL.jl for more details. In the conditioned model below, the observations y are fixed to data.
coinflip(y::AbstractVector{<:Real}) = coinflip(; N=length(y)) | (; y)
model = coinflip(data);
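The helper method is only a convenience; conditioning the unconditioned model directly with | produces the same conditioned model:
# Equivalent to the helper method defined above.
coinflip(; N=length(data)) | (; y=data)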
After defining the model, we can approximate the posterior distribution by drawing samples from the distribution. In this example, we use a Hamiltonian Monte Carlo sampler to draw these samples. Other tutorials give more information on the samplers available in Turing and discuss their use for different models.
sampler = NUTS();
We approximate the posterior distribution with 2000 samples:
chain = sample(model, sampler, 2_000; progress=false);
┌ Info: Found initial step size
└   ϵ = 0.4
The sample function and common keyword arguments are explained more extensively in the documentation of AbstractMCMC.jl.
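Before plotting, it can be useful to inspect the chain numerically. The summarystats function from MCMCChains reports, among other quantities, the posterior mean of p, which we can compare with the mean of the closed-form posterior:
# Summary statistics of the sampled chain (posterior mean, standard deviation, ESS, ...).
summarystats(chain)
# Mean of the closed-form posterior, for comparison.
mean(updated_belief(prior_belief, data))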
After finishing the sampling process, we can visually compare the closed-form posterior distribution with the approximation obtained with Turing.
histogram(chain)
Now we can build our plot:
# Visualize a blue density plot of the approximate posterior distribution using HMC (see Chain 1 in the legend).
density(chain; xlim=(0, 1), legend=:best, w=2, c=:blue)
# Visualize a green density plot of the posterior distribution in closed-form.
plot!(
    0:0.01:1,
    pdf.(updated_belief(prior_belief, data), 0:0.01:1);
    xlabel="probability of heads",
    ylabel="",
    title="",
    xlim=(0, 1),
    label="Closed-form",
    fill=0,
    α=0.3,
    w=3,
    c=:lightgreen,
)
# Visualize the true probability of heads in red.
vline!([p_true]; label="True probability", c=:red)
As we can see, the samples obtained with Turing closely approximate the true posterior distribution. Hopefully this tutorial has provided an easy-to-follow, yet informative introduction to Turing’s simpler applications. More advanced usage is demonstrated in other tutorials.