Motivation

When executing a DynamicPPL model, it is very often necessary to store information about the random variables in the model. Consider, for example:

using DynamicPPL, Distributions

@model function f1()
    x ~ Normal(0, 1)
    y ~ Normal(x, 1)
    z ~ Normal(y, 1)
    return nothing
end

When we evaluate the model, we want to know the values of x, y, and z (even if the return value of the function is nothing; the return value is typically used for other purposes). Sometimes we might want to store other information about them, like their distributions. This information can be extracted and used by inference algorithms, for example.

In general, this is done by storing some kind of mapping from AbstractPPL.VarNames to values. This is the job of VarNamedTuples, or VNT for short. Before we go into that, let's take a brief look back at earlier approaches that don't quite work; this will motivate the design of VNT.

Historical approaches

Julia contains two main data structures for key-value maps: NamedTuples and Dicts.

NamedTuple

NamedTuples are extremely lightweight data structures and are completely type stable. For example, in the above, we could store the values in a NamedTuple like so:

(x=0.5, y=1.2, z=-0.3)

(or whatever the sampled values are). The issue with this, however, is that the keys of NamedTuples are Symbols. This is fine for variables that are single identifiers (i.e., VarNames with an optic of AbstractPPL.Iden, which we'll call 'identity VarNames').

But in a model like the following, we have the VarNames x[1], x[2], x[3], and y.a. These cannot be properly converted into Symbols without loss of information, and this makes the output NamedTuple unsuitable for subsequent use.

@model function f2()
    x = Vector{Float64}(undef, 3)
    for i in eachindex(x)
        x[i] ~ Normal()
    end

    y = (; a=1.0)
    return y.a ~ Normal()
end
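
To see the loss of information concretely: AbstractPPL can convert a VarName to a Symbol, but the result is a flat identifier from which the original structure cannot be recovered. A small illustration:

Symbol(@varname(x[1]))  # Symbol("x[1]"): the indexing structure is flattened away
Symbol(@varname(y.a))   # Symbol("y.a"): likewise for the property access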

In the past, DynamicPPL had a data structure called SimpleVarInfo that stored variables in a NamedTuple (converting non-identity VarNames to Symbols where necessary, with the attendant loss of information). It was very fast, but unsuitable for general use.

Dict

In order to be completely general, we could use a Dict{VarName, Any}. This allows us to store arbitrary VarNames as keys, and arbitrary values. Again, there was a SimpleVarInfo implementation that used this approach (specifically an OrderedDict).

Dict(
    @varname(x[1]) => 0.5,
    @varname(x[2]) => -1.2,
    @varname(x[3]) => 1.7,
    @varname(y.a) => 0.3,
)
Dict{VarName, Float64} with 4 entries:
  x[2] => -1.2
  x[3] => 1.7
  x[1] => 0.5
  y.a  => 0.3

The main issue with Dicts is performance. In general, there is substantial overhead in hashing and looking up keys in a Dict, compared to NamedTuple field access, which is just direct memory access. Furthermore, accessing values is type-unstable (unless all the values can be made to share the same concrete type, which happens to be the case above, but is unrealistic in general).
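
To illustrate the type-instability, here is a minimal sketch: with heterogeneous values the Dict's value type must be Any, so the compiler cannot infer what a lookup returns.

d_any = Dict{VarName,Any}(
    @varname(x) => 0.5,         # a Float64 ...
    @varname(y) => [1.0, 2.0],  # ... and a Vector{Float64}
)
# The compiler only knows that this access is ::Any; tools like
# @code_warntype would flag it accordingly.
d_any[@varname(x)]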

Metadata and VarNamedVector

In an attempt to get around the performance issue, DynamicPPL used to have data structures called Metadata and VarNamedVector. These essentially attempted to remove as much overhead as possible from Dicts by flattening their elements (i.e. variable values) and storing vectorised forms of them in contiguous vectors. Furthermore, they were often wrapped in NamedTuples to gain some type stability. This approach was somewhat successful and was used for many years; it is also described in the Turing.jl paper in ACM Transactions on Probabilistic Machine Learning.
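
The core idea can be sketched roughly as follows. Note that this is a simplified illustration of the flattening strategy, not the actual Metadata or VarNamedVector API:

# All values are vectorised and concatenated into one contiguous vector ...
vals = [0.5, -1.2, 1.7, 0.3]
# ... together with a record of which range of indices belongs to which
# variable (wrapping this in a NamedTuple keyed by symbol regains some
# type stability).
ranges = (x=1:3, y=4:4)
# Reading a variable back is a slice of the flat vector, not a hash lookup
# per element.
x_val = vals[ranges.x]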

While this fixed many of the performance problems with Dict, at its core it was still a mapping of VarNames to values, and therefore suffered from a lack of what we call 'constructiveness'.

Constructiveness

This is a very subtle issue, but one that caused a number of headaches in DynamicPPL. As we will soon see, it is the main motivation behind VNT; even so, VNT does not fully solve it.

Consider the following model:

@model function f3()
    x = Vector{Float64}(undef, 2)
    x[1] ~ Normal()
    return x[2] ~ Normal()
end
f3 (generic function with 2 methods)

If at the end of the model we were to ask what the value of x is, we would have no way of knowing. Say we are using a Dict to store the values; we would have something like

d = Dict(@varname(x[1]) => 0.5, @varname(x[2]) => -1.2)
Dict{VarName{:x, Index{Tuple{Int64}, @NamedTuple{}, Iden}}, Float64} with 2 entries:
  x[2] => -1.2
  x[1] => 0.5

but attempting to get d[@varname(x)] would throw a KeyError, since there is no such key in the Dict. In fact, we cannot even access x[1:2], despite both indices obviously being defined.

haskey(d, @varname(x)), haskey(d, @varname(x[1:2]))
(false, false)

What's the problem with this?

One issue caused by a lack of constructiveness arises when reading information back from a Chains object. MCMCChains.jl, for example, breaks up each variable into its constituent scalar components. Thus, even if you have a multivariate distribution

using LinearAlgebra

@model function f4()
    x ~ MvNormal(zeros(2), I)
    return x
end

sampling from this model would give a chain with keys x[1] and x[2]. When calling functions such as returned or predict on the chain, we have to somehow 'reconstruct' the value of x from its components, so that when executing the model we can use the actual value of x.
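
Concretely, the chain hands us only the scalar components, and 'reconstruction' means assembling them back into the vector that the model expects. An illustrative sketch:

# What we get back from the chain: scalar components keyed individually.
chain_vals = Dict(@varname(x[1]) => 0.5, @varname(x[2]) => -1.2)
# What model execution needs: the full vector x.
x_val = [chain_vals[@varname(x[1])], chain_vals[@varname(x[2])]]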

Info

Some workarounds exist for this (see this PR, and AbstractPPLDistributionsExt).

using AbstractPPL

no_dist = AbstractPPL.hasvalue(d, @varname(x))
with_dist = AbstractPPL.hasvalue(d, @varname(x), MvNormal(zeros(2), I))

(no_dist, with_dist)
(false, true)
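
The companion getvalue function (also taking the distribution as an argument; this likewise relies on AbstractPPLDistributionsExt) can then assemble the full value from its components:

AbstractPPL.getvalue(d, @varname(x), MvNormal(zeros(2), I))  # [0.5, -1.2]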

Another problem arose when conditioning on values. In such cases, you had to condition on x in full, rather than on its components:

model = f4() | Dict(@varname(x) => [1.0, -1.0])  # This would work.
model = f4() | Dict(@varname(x[1]) => 1.0, @varname(x[2]) => -1.0)  # This would not.

(This code block isn't run, because with the current version of DynamicPPL and VNT, both versions will work.)

Finally, we are unable to properly support different indexing schemes. For example, x[1] and x[1,1] refer to the same element if x is a Matrix, but the dictionary will treat them as different VarNames. Fundamentally, the issue is that the indexing semantics inside a Dict are not the same as the indexing semantics inside the model itself.
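
A quick illustration of the mismatch:

A = zeros(2, 2)
A[1] == A[1, 1]                      # true: Julia treats these as the same element
@varname(x[1]) == @varname(x[1, 1])  # false: a Dict treats these as distinct keys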

Desiderata

In summary, we want a data structure that:

  1. Can store arbitrary VarNames as keys, with arbitrary values.

  2. Is performant, and type-stable where possible. In particular, property access should always be type-stable, and indexing should be type-stable as long as the indexed values are homogeneous in type.

  3. Is constructive.

VarNamedTuple solves these issues in general, apart from a few very niche edge cases that will be discussed at the very end.