API: Turing.Variational

Turing.Variational.VIResult - Type
VIResult(ldf, q, info, state)
  • ldf: A DynamicPPL.LogDensityFunction corresponding to the target model (the original model can be accessed as ldf.model). If the VI process was run in unconstrained space, this LogDensityFunction will also be in unconstrained space.
  • q: The variational distribution returned by the algorithm. As above, this will typically also be in unconstrained space.
  • info: Information generated while running the algorithm.
  • state: The collection of states used by the algorithm. This can be used to resume from a previous call to vi.

Base.rand - Method
Base.rand(rng::Random.AbstractRNG, res::VIResult, sz...)

Draw a sample, or array of samples, from the variational distribution q in res. Each sample is a DynamicPPL.VarNamedTuple containing raw parameter values.

Turing.Variational.q_fullrank_gaussian - Method
q_fullrank_gaussian(
    [rng::Random.AbstractRNG,]
    ldf::DynamicPPL.LogDensityFunction;
    location::Union{Nothing,<:AbstractVector} = nothing,
    scale::Union{Nothing,<:LowerTriangular} = nothing,
    kwargs...
)

Find a numerically non-degenerate Gaussian q whose scale has full-rank factors (traditionally called the "full-rank family") for approximating the target ldf::LogDensityFunction.

If scale is set to nothing, the default is a zero-mean Gaussian with a LowerTriangular scale matrix (resulting in a covariance with "full-rank" factors) no larger than 0.6*I (covariance of 0.6^2*I). With this scale, each coordinate of a sample from the initial variational approximation falls in the range (-2, 2) with about 99.9% probability, which mimics the behavior of the Turing.InitFromUniform() strategy. Whether or not the default is used, the scale may be adjusted via q_initialize_scale so that the log-densities of the model are finite over the samples from q.
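
The 99.9% figure can be checked for a single coordinate. As a worked calculation, assume the default zero location and a scale of exactly 0.6, so that each coordinate is $z_i = 0.6\,u_i$ with $u_i \sim \mathcal{N}(0, 1)$, and let $\Phi$ denote the standard normal CDF (notation introduced here for illustration):

```latex
\Pr(-2 < z_i < 2)
  = \Pr\left(|u_i| < \tfrac{2}{0.6}\right)
  = 2\,\Phi(3.33\ldots) - 1
  \approx 0.9991
```

This is a per-coordinate statement. With $d$ independent coordinates, the probability that all of them fall in (-2, 2) at once is about $0.9991^d$, which is lower in high dimensions. A smaller scale only increases these probabilities.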

Arguments

  • ldf: The target log-density function.

Keyword Arguments

  • location: The location parameter of the initialization. If nothing, a vector of zeros is used.
  • scale: The scale parameter of the initialization. If nothing, the default scale described above (at most 0.6*I) is used.

The remaining keyword arguments are passed to q_locationscale.

Returns

  • An AdvancedVI.MvLocationScale distribution matching the support of ldf.

Turing.Variational.q_initialize_scale - Method
q_initialize_scale(
    rng::Random.AbstractRNG,
    ldf::DynamicPPL.LogDensityFunction,
    location::AbstractVector,
    scale::AbstractMatrix,
    basedist::Distributions.UnivariateDistribution;
    num_samples::Int = 10,
    num_max_trials::Int = 10,
    reduce_factor::Real = one(eltype(scale)) / 2
)

Given an initial location-scale distribution q formed by location, scale, and basedist, shrink scale until the average log-density of ldf over samples from q is finite. If the log-densities are still not finite after num_max_trials trials, throw an error.

For reference, a location-scale distribution $q$ formed by location, scale, and basedist is a distribution whose sampling process $z \sim q$ can be represented as follows, where d is the length of location:

u = rand(basedist, d)
z = scale * u + location
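
As a minimal sketch of this sampling process, the snippet below draws from a two-dimensional location-scale distribution using only the standard library. It assumes a standard normal base distribution, so that `randn` can stand in for `rand(basedist, d)`. The helper name `sample_locationscale` is hypothetical and is not part of Turing or AdvancedVI.

```julia
using LinearAlgebra
using Random

# Hypothetical helper: draw one sample z = scale * u + location,
# where u has i.i.d. standard normal entries (i.e. basedist = Normal()).
function sample_locationscale(rng::AbstractRNG, location::AbstractVector, scale::AbstractMatrix)
    d = length(location)
    u = randn(rng, d)            # stands in for rand(basedist, d)
    return scale * u + location
end

rng = Random.MersenneTwister(1)
location = [1.0, -1.0]
scale = LowerTriangular([0.6 0.0; 0.3 0.6])

z = sample_locationscale(rng, location, scale)

# With a standard normal base distribution, z has mean `location`
# and covariance scale * scale'.
covariance = Matrix(scale) * Matrix(scale)'
```

Because the base distribution only enters through `u`, swapping in a different univariate base distribution changes the shape of the tails but not the roles of `location` and `scale`.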

Arguments

  • ldf: The target log-density function.
  • location: The location parameter of the initialization.
  • scale: The scale parameter of the initialization.
  • basedist: The base distribution of the location-scale family.

Keyword Arguments

  • num_samples: Number of samples used to compute the average log-density at each trial.
  • num_max_trials: Maximum number of trials before an error is thrown.
  • reduce_factor: Factor for shrinking the scale. After n trials, the scale is scale*reduce_factor^n.

Returns

  • scale_adj: The adjusted scale matrix matching the type of scale.
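
The shrinking procedure described above can be sketched as follows. This is an illustration under stated assumptions, not Turing's actual implementation: `shrink_scale` and `toy_logdensity` are hypothetical names, the base distribution is fixed to a standard normal, and the toy target is finite only on a small box so that a large initial scale has to be reduced.

```julia
using LinearAlgebra
using Random
using Statistics

# Hypothetical toy target: finite only where every coordinate lies in (0, 2).
toy_logdensity(z) = all(x -> 0 < x < 2, z) ? -sum(abs2, z) / 2 : -Inf

# Hypothetical sketch: multiply `scale` by `reduce_factor` until the average
# log-density over `num_samples` draws from q is finite.
function shrink_scale(rng, logdensity, location, scale;
                      num_samples=10, num_max_trials=10, reduce_factor=0.5)
    d = length(location)
    for _ in 1:num_max_trials
        draws = [scale * randn(rng, d) + location for _ in 1:num_samples]
        if isfinite(mean(logdensity, draws))
            return scale
        end
        scale = reduce_factor * scale
    end
    error("log-density is still not finite after $num_max_trials trials")
end

rng = Random.MersenneTwister(1)
location = [1.0, 1.0]
scale_adj = shrink_scale(rng, toy_logdensity, location, Diagonal(fill(8.0, 2)))
```

In this sketch, multiplying a `Diagonal` by a scalar returns a `Diagonal`, so the adjusted scale keeps the type of the input, in the spirit of the return value documented above.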

Turing.Variational.q_locationscale - Method
q_locationscale(
    [rng::Random.AbstractRNG,]
    ldf::DynamicPPL.LogDensityFunction;
    location::Union{Nothing,<:AbstractVector} = nothing,
    scale::Union{Nothing,<:Diagonal,<:LowerTriangular} = nothing,
    meanfield::Bool = true,
    basedist::Distributions.UnivariateDistribution = Normal(),
    kwargs...
)

Find a numerically non-degenerate variational distribution q for approximating the target LogDensityFunction within the location-scale variational family formed by the type of scale and basedist.

The distribution can be specified manually by setting location, scale, and basedist. Otherwise, a zero-mean Gaussian with scale 0.6*I (covariance of 0.6^2*I) is chosen by default. With this scale, each coordinate of a sample from the initial variational approximation falls in the range (-2, 2) with about 99.9% probability, which mimics the behavior of the Turing.InitFromUniform() strategy.

Whether or not the default is used, the scale may be adjusted via q_initialize_scale so that the log-densities of the model are finite over the samples from q. If meanfield is set to true, the scale of q is restricted to be a diagonal matrix and only the diagonal of scale is used.

For reference, a location-scale distribution $q$ formed by location, scale, and basedist is a distribution whose sampling process $z \sim q$ can be represented as follows, where d is the length of location:

u = rand(basedist, d)
z = scale * u + location

Arguments

  • ldf: The target log-density function.

Keyword Arguments

  • location: The location parameter of the initialization. If nothing, a vector of zeros is used.
  • scale: The scale parameter of the initialization. If nothing, the default scale described above (0.6*I) is used.
  • meanfield: Whether to use the mean-field approximation. If true, scale is converted into a Diagonal matrix. Otherwise, it is converted into a LowerTriangular matrix.
  • basedist: The base distribution of the location-scale family.

The remaining keywords are passed to q_initialize_scale.

Returns

  • An AdvancedVI.MvLocationScale distribution matching the support of ldf.
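
To make the role of `meanfield` concrete, the sketch below builds the two kinds of scale matrix by hand with the `LinearAlgebra` standard library. It only illustrates the matrix structure and does not call `q_locationscale`; all variable names and the numeric entries are illustrative assumptions.

```julia
using LinearAlgebra

d = 3

# Mean-field: a Diagonal scale, so the coordinates of z are independent.
scale_meanfield = Diagonal(fill(0.6, d))

# Full-rank: a LowerTriangular scale, so off-diagonal entries can encode correlations.
scale_fullrank = LowerTriangular([0.6 0.0 0.0;
                                  0.1 0.6 0.0;
                                  0.2 0.3 0.6])

# Keeping only the diagonal of a full-rank scale gives a mean-field scale.
scale_restricted = Diagonal(diag(scale_fullrank))

# For a standard normal base distribution the covariance is scale * scale'.
cov_meanfield = Matrix(scale_meanfield) * Matrix(scale_meanfield)'   # 0.6^2 * I
cov_fullrank = Matrix(scale_fullrank) * Matrix(scale_fullrank)'
```

The mean-field scale stores d numbers, while the full-rank scale stores d(d+1)/2, which is the usual trade-off between cost and the ability to represent correlations.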

Turing.Variational.q_meanfield_gaussian - Method
q_meanfield_gaussian(
    [rng::Random.AbstractRNG,]
    ldf::DynamicPPL.LogDensityFunction;
    location::Union{Nothing,<:AbstractVector} = nothing,
    scale::Union{Nothing,<:Diagonal} = nothing,
    kwargs...
)

Find a numerically non-degenerate mean-field Gaussian q for approximating the target ldf::LogDensityFunction.

If scale is set to nothing, the default is a zero-mean Gaussian with a Diagonal scale matrix (the "mean-field" approximation) no larger than 0.6*I (covariance of 0.6^2*I). With this scale, each coordinate of a sample from the initial variational approximation falls in the range (-2, 2) with about 99.9% probability, which mimics the behavior of the Turing.InitFromUniform() strategy. Whether or not the default is used, the scale may be adjusted via q_initialize_scale so that the log-densities of the model are finite over the samples from q.

Arguments

  • ldf: The target log-density function.

Keyword Arguments

  • location: The location parameter of the initialization. If nothing, a vector of zeros is used.
  • scale: The scale parameter of the initialization. If nothing, the default scale described above (at most 0.6*I) is used.

The remaining keyword arguments are passed to q_locationscale.

Returns

  • An AdvancedVI.MvLocationScale distribution matching the support of ldf.

Turing.Variational.vi - Method
vi(
    [rng::Random.AbstractRNG,]
    model::DynamicPPL.Model,
    family,
    max_iter::Int;
    adtype::ADTypes.AbstractADType=DEFAULT_ADTYPE,
    algorithm::AdvancedVI.AbstractVariationalAlgorithm = KLMinRepGradProxDescent(
        adtype; n_samples=10
    ),
    unconstrained::Bool=requires_unconstrained_space(algorithm),
    fix_transforms::Bool=false,
    show_progress::Bool = Turing.PROGRESS[],
    kwargs...
)

Approximate the target model with the variational inference algorithm specified by algorithm, using the variational family specified by family. This is a thin wrapper around AdvancedVI.optimize.

The default algorithm, KLMinRepGradProxDescent (see the AdvancedVI documentation), assumes that family returns an AdvancedVI.MvLocationScale, which is the case if family is q_fullrank_gaussian or q_meanfield_gaussian. For other variational families, refer to the AdvancedVI documentation to determine the best algorithm and other options.

Arguments

  • model: The target DynamicPPL.Model.
  • family: A function that generates the initial variational approximation. Existing choices in Turing are q_locationscale, q_meanfield_gaussian, and q_fullrank_gaussian.
  • max_iter: Maximum number of steps.
  • Any additional arguments are passed on to AdvancedVI.optimize.

Keyword Arguments

  • adtype: Automatic differentiation backend to be applied to the log-density. The default value for algorithm also uses this backend for differentiating the variational objective.
  • algorithm: Variational inference algorithm. The default is KLMinRepGradProxDescent; refer to the AdvancedVI docs for all the options.
  • show_progress: Whether to show the progress bar.
  • unconstrained: Whether to transform the posterior to be unconstrained for running the variational inference algorithm. The default value depends on the chosen algorithm (most algorithms require unconstrained space).
  • fix_transforms: Whether to precompute the transforms needed to convert model parameters to (possibly unconstrained) vectors. This can lead to performance improvements, but if any transforms depend on model parameters, setting fix_transforms=true can silently yield incorrect results.
  • Any additional keyword arguments are passed on both to family and to AdvancedVI.optimize.

See the docs of AdvancedVI.optimize for additional keyword arguments.

Returns

A VIResult object; see its docstring for details.
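
A typical call might look like the sketch below. It follows the signatures documented on this page (`vi`, `q_meanfield_gaussian`, and `rand` on a `VIResult`) but has not been checked against a specific Turing release, so treat it as illustrative. The `coinflip` model and its data are assumptions made up for this example.

```julia
using Random
using Turing
using Turing.Variational: vi, q_meanfield_gaussian

# Illustrative model (an assumption of this example): coin flips with unknown bias p.
@model function coinflip(y)
    p ~ Beta(1, 1)
    for i in eachindex(y)
        y[i] ~ Bernoulli(p)
    end
end

rng = Random.MersenneTwister(1)
model = coinflip([1, 0, 1, 1, 0, 1, 1, 1])

# Positional arguments follow the signature above: model, family, max_iter.
result = vi(rng, model, q_meanfield_gaussian, 1_000; show_progress=false)

# Draw 100 samples from the fitted variational distribution.
draws = rand(rng, result, 100)
```

Here `family` is passed as a function rather than as a distribution, matching the description of the `family` argument above; `vi` calls it to build the initial approximation before optimizing.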
