Evaluator preparation and AD

AbstractPPL provides a small interface for preparing callables and asking a prepared evaluator for values and derivatives. prepare binds a callable to a sample input that establishes the expected input shape and type; value_and_gradient!! and value_and_jacobian!! then return the value and derivative together.

The !! suffix signals that the returned gradient or Jacobian may alias internal cache buffers of the prepared evaluator. The next call to value_and_gradient!! (or value_and_jacobian!!) may overwrite that buffer in place, so a previously-returned reference will silently change. Copy before holding on to a result:

val, grad = value_and_gradient!!(prepared, x1)
saved = copy(grad)                       # safe to keep
val2, grad2 = value_and_gradient!!(prepared, x2)
# `grad` may now reflect `x2`; `saved` still reflects `x1`

Backends that always allocate fresh output (e.g. ForwardDiff.gradient) do not actually alias, but consumers should not rely on that — write to the contract, not the implementation.
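To see why the copy matters, here is a minimal self-contained sketch of an evaluator that reuses a single gradient buffer. ToyPrepared and toy_value_and_gradient!! are hypothetical stand-ins written for this illustration; they are not part of AbstractPPL and say nothing about how a real backend lays out its cache.

```julia
# Hypothetical stand-in for a prepared evaluator that owns one gradient buffer.
struct ToyPrepared
    buffer::Vector{Float64}
end

# Computes f(x) = -0.5 * sum(abs2, x) and writes its gradient (-x) into the
# shared buffer, returning that same buffer on every call.
function toy_value_and_gradient!!(p::ToyPrepared, x::AbstractVector{<:Real})
    p.buffer .= .-x
    return (-0.5 * sum(abs2, x), p.buffer)
end

toy = ToyPrepared(zeros(3))
val1, grad1 = toy_value_and_gradient!!(toy, [1.0, 2.0, 3.0])
saved = copy(grad1)                  # independent copy of the first gradient
val2, grad2 = toy_value_and_gradient!!(toy, [4.0, 5.0, 6.0])

grad1 === grad2                      # true: both names refer to the same buffer
grad1 == [-4.0, -5.0, -6.0]          # true: the first result was overwritten
saved == [-1.0, -2.0, -3.0]          # true: the copy is unaffected
```

A backend that allocates fresh output would make grad1 === grad2 false, which is exactly the behavior callers should not depend on.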

Quick start

using AbstractPPL
using AbstractPPL: prepare, value_and_gradient!!
using AbstractPPL.Evaluators: Prepared, VectorEvaluator, NamedTupleEvaluator
using ADTypes: AutoForwardDiff
using ForwardDiff: ForwardDiff

function AbstractPPL.prepare(adtype::AutoForwardDiff, f, x::AbstractVector{<:Real})
    return Prepared(adtype, VectorEvaluator(f, length(x)))
end

function AbstractPPL.value_and_gradient!!(
    p::Prepared{<:AutoForwardDiff}, x::AbstractVector{<:Real}
)
    return (p(x), ForwardDiff.gradient(p.evaluator.f, x))
end

mvnormal_logp(x) = -0.5 * sum(abs2, x)  # standard normal log density (up to constant)
prepared = prepare(AutoForwardDiff(), mvnormal_logp, zeros(3))
value_and_gradient!!(prepared, [1.0, 2.0, 3.0])

(-7.0, [-1.0, -2.0, -3.0])

Two input styles

Vector inputs

When the callable accepts a flat vector, pass a sample vector whose length matches the expected input:

prepared([1.0, 2.0, 3.0])

-7.0

For vector-valued callables, use value_and_jacobian!!. The returned Jacobian has shape (length(value), length(x)). The same backend extension that defines value_and_gradient!! typically also defines value_and_jacobian!! on the same Prepared type — they are separate generic functions, so the two methods coexist without conflict and the caller picks whichever applies to their function:

using AbstractPPL: value_and_jacobian!!

function AbstractPPL.value_and_jacobian!!(
    p::Prepared{<:AutoForwardDiff}, x::AbstractVector{<:Real}
)
    return (p(x), ForwardDiff.jacobian(p.evaluator.f, x))
end

vecfun(x) = [x[1] * x[2], x[2] + x[3]]
prepared_vec = prepare(AutoForwardDiff(), vecfun, zeros(3))
value_and_jacobian!!(prepared_vec, [2.0, 3.0, 4.0])

([6.0, 7.0], [3.0 2.0 0.0; 0.0 1.0 1.0])
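The Jacobian above can be cross-checked without any AD package. The sketch below uses central finite differences; fd_jacobian is a hypothetical helper introduced here, not an AbstractPPL or ForwardDiff function, and it only approximates what an AD backend computes exactly.

```julia
# Hypothetical helper: central finite-difference Jacobian, one column per input.
function fd_jacobian(f, x::AbstractVector{<:Real}; h::Float64=1e-6)
    y = f(x)
    J = zeros(length(y), length(x))   # shape (length(value), length(x))
    for j in eachindex(x)
        step = zeros(length(x))
        step[j] = h
        J[:, j] = (f(x .+ step) .- f(x .- step)) ./ (2h)
    end
    return J
end

vecfun(x) = [x[1] * x[2], x[2] + x[3]]
J = fd_jacobian(vecfun, [2.0, 3.0, 4.0])

size(J)                                               # (2, 3)
isapprox(J, [3.0 2.0 0.0; 0.0 1.0 1.0]; atol=1e-6)    # true
```

Row i of the result holds the partial derivatives of output i, which matches the (length(value), length(x)) convention stated above.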

NamedTuple inputs

When the callable accepts a NamedTuple, pass a sample NamedTuple whose field names and value types match the expected input. The prototype's leaves must be Real, Complex, AbstractArray (recursively), Tuple, or NamedTuple. An extension can define a prepare overload that wraps the function in a NamedTupleEvaluator:

function AbstractPPL.prepare(adtype::AutoForwardDiff, f, values::NamedTuple)
    return Prepared(adtype, NamedTupleEvaluator(f, values))
end

ntfun(v::NamedTuple) = v.a^2 + sum(abs2, v.b)
prepared_nt = prepare(AutoForwardDiff(), ntfun, (a=0.0, b=zeros(2)))
prepared_nt((a=1.0, b=[2.0, 3.0]))

14.0
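In plain Julia terms, "same field names and value types" means the two NamedTuples have the same concrete type. The sketch below only illustrates that notion; how strictly NamedTupleEvaluator compares a call-time input against the prototype is not specified here, so treat the comparison as an assumption rather than the library's actual check.

```julia
prototype = (a=0.0, b=zeros(2))
input = (a=1.0, b=[2.0, 3.0])
other = (a=1, b=[2.0, 3.0])           # `a` is an Int here

keys(input) == keys(prototype)        # true: same field names, same order
typeof(input) == typeof(prototype)    # true: same field names and value types
typeof(other) == typeof(prototype)    # false: the type of `a` differs
```

Note that array lengths are not part of the type, so a type comparison alone says nothing about whether b has the expected number of elements.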

AD backends

Automatic differentiation packages extend the interface by implementing value_and_gradient!! and value_and_jacobian!! for specific cache types stored in prepared.cache:

prepared = prepare(adtype, problem, prototype)  # returns Prepared{AD,E,Cache}
value_and_gradient!!(prepared, x)               # may return aliased cache buffer
value_and_jacobian!!(prepared, x)

Prepared has three fields: adtype, evaluator (the user-facing callable), and cache (backend-specific pre-allocated state such as ForwardDiff configs or Mooncake tapes). Backend extensions dispatch on the cache type:

function AbstractPPL.prepare(
    adtype::MyADType, problem, x::AbstractVector{<:Real}; check_dims::Bool=true
)
    f = ...        # extract callable from problem
    cache = MyCache(f, x)
    return Prepared(adtype, VectorEvaluator{check_dims}(f, length(x)), cache)
end

function AbstractPPL.value_and_gradient!!(
    p::Prepared{<:AbstractADType,<:VectorEvaluator,<:MyCache}, x::AbstractVector{<:Real}
)
    # use p.cache to avoid allocations
    return ...
end

Pass check_dims=false in your prepare implementation to construct a VectorEvaluator{false}, which skips the per-call length check. This is an opt-in trust mode — the caller takes responsibility for length(x). The typical use is inside a backend's value_and_gradient!!, where the AD library invokes the inner callable many times with same-length dual arrays derived from a single user-supplied x; re-validating on each invocation would be redundant work in the hot path.
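One way to picture the check_dims switch is a Bool type parameter that selects between a checked and an unchecked call method at dispatch time. The sketch below is a hypothetical illustration: ToyVectorEvaluator is not AbstractPPL's VectorEvaluator, and the choice of DimensionMismatch as the error type is an assumption made for the example.

```julia
# Hypothetical evaluator whose Bool type parameter decides whether the input
# length is validated on each call.
struct ToyVectorEvaluator{CheckDims,F}
    f::F
    dim::Int
end

ToyVectorEvaluator{CheckDims}(f, dim::Int) where {CheckDims} =
    ToyVectorEvaluator{CheckDims,typeof(f)}(f, dim)

# Checked variant: reject inputs whose length differs from the prepared one.
function (e::ToyVectorEvaluator{true})(x::AbstractVector)
    length(x) == e.dim ||
        throw(DimensionMismatch("expected length $(e.dim), got $(length(x))"))
    return e.f(x)
end

# Trusted variant: forward straight to the callable.
(e::ToyVectorEvaluator{false})(x::AbstractVector) = e.f(x)

checked = ToyVectorEvaluator{true}(sum, 3)
trusted = ToyVectorEvaluator{false}(sum, 3)

checked([1.0, 2.0, 3.0])     # 6.0
trusted([1.0, 2.0])          # 3.0: the wrong length goes unnoticed
# checked([1.0, 2.0])        # would throw DimensionMismatch
```

Because the flag lives in the type, the unchecked method contains no branch at all, which is why the trusted mode suits a hot path where the length is already guaranteed.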

Hessian (order=2)

Pass order=2 to prepare to build a Hessian-capable evaluator. The returned object supports value_gradient_and_hessian!!, which returns (value, gradient, hessian) in a single call. order=2 requires problem to be scalar-valued; a vector-valued problem throws at preparation time.

using AbstractPPL: prepare, value_gradient_and_hessian!!
using ADTypes: AutoForwardDiff
using ForwardDiff, DifferentiationInterface

quadratic(x) = sum(abs2, x)
prepared = prepare(AutoForwardDiff(), quadratic, zeros(3); order=2)
val, grad, hess = value_gradient_and_hessian!!(prepared, [1.0, 2.0, 3.0])
# val == 14.0
# grad == [2.0, 4.0, 6.0]
# hess == [2 0 0; 0 2 0; 0 0 2]
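As a sanity check on the triple shown above, the value, gradient, and Hessian can be combined into a second-order Taylor model, which is exact for a quadratic function. The sketch below hard-codes the documented results instead of calling an AD backend, so it runs with the standard library alone; taylor2 is a hypothetical helper name.

```julia
using LinearAlgebra: dot

quadratic(x) = sum(abs2, x)

# Values documented above for x = [1.0, 2.0, 3.0].
x = [1.0, 2.0, 3.0]
val = 14.0
grad = [2.0, 4.0, 6.0]
hess = [2.0 0.0 0.0; 0.0 2.0 0.0; 0.0 0.0 2.0]

# Second-order Taylor model around x, evaluated at x + d.
taylor2(d) = val + dot(grad, d) + 0.5 * dot(d, hess * d)

d = [0.5, -1.0, 0.25]
isapprox(taylor2(d), quadratic(x .+ d))    # true: the model is exact here
```

For a non-quadratic function the same model would only be a local approximation, so the agreement shown here is specific to this example.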

Both context= and check_dims= apply to evaluators prepared with order=2, with the same semantics as for order=1. The !! aliasing contract also extends: the returned gradient and Hessian may alias internal cache buffers of prepared, so copy them before retaining them past the next call. NamedTuple inputs are not supported at order=2.

For DifferentiationInterface, adtype can be either a single backend (letting DI pick its own Hessian strategy) or a DifferentiationInterface.SecondOrder(outer, inner) composition that selects the outer differentiator and the inner gradient backend independently — typically forward-over-reverse:

using DifferentiationInterface: SecondOrder
using ADTypes: AutoForwardDiff, AutoReverseDiff

adtype = SecondOrder(AutoForwardDiff(), AutoReverseDiff())
prepared = prepare(adtype, quadratic, zeros(3); order=2)

SecondOrder <: AbstractADType, so the same prepare(adtype, problem, x; order=2) entry handles it.

Calling value_gradient_and_hessian!! on an evaluator prepared with order=1 throws an ArgumentError; re-prepare with order=2 instead. The reverse is allowed: value_and_gradient!! on an evaluator prepared with order=2 returns (value, gradient) without paying the Hessian cost, because prepare builds dedicated gradient machinery alongside the Hessian machinery. value_and_jacobian!! is rejected because order=2 requires a scalar-valued problem.

Constant context arguments

When the underlying callable naturally takes the form f(x, context...) — where everything after x is constant state — pass context as a tuple to the vector form of prepare. AD differentiates only w.r.t. x; every value in context is treated as inactive:

# `adtype` is assumed to be an AD backend type whose extension is loaded
affine(x, scale, offset) = scale * sum(x) + offset
prepared = prepare(adtype, affine, zeros(3); context=(2.0, 1.0))
val, grad = value_and_gradient!!(prepared, [1.0, 2.0, 3.0])
# val == 2.0 * 6.0 + 1.0 == 13.0; grad == [2.0, 2.0, 2.0]

prepared(x) evaluates f(x, context...), and context=() (the default) preserves the unary f(x) shape.
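The relationship between f(x, context...) and the unary call can be pictured as a closure over the constant arguments. The sketch below is a plain-Julia illustration, not the library's implementation; with_context is a hypothetical helper name.

```julia
# Hypothetical helper: fix the trailing arguments so that only x remains free.
with_context(f, context::Tuple) = x -> f(x, context...)

affine(x, scale, offset) = scale * sum(x) + offset

g = with_context(affine, (2.0, 1.0))
g([1.0, 2.0, 3.0])                    # 13.0, i.e. affine(x, 2.0, 1.0)

u = with_context(sum, ())             # empty context keeps the unary shape
u([1.0, 2.0, 3.0])                    # 6.0
```

Since only x remains a free argument of the closure, differentiating it with respect to x leaves scale and offset untouched, which mirrors the "inactive context" rule described above.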

Without an AD backend

The two-argument form prepare(problem, x) is available without any AD package. By default it wraps problem in a VectorEvaluator{check_dims} (or NamedTupleEvaluator{check_dims} for the NamedTuple form), giving you a callable that runs the per-call shape check before forwarding to problem. Downstream code that only needs primal evaluation (e.g. log-density only, no gradient) can call prepare(...) uniformly without knowing whether an AD backend is loaded:

sumsimple(x) = sum(x)
p = prepare(sumsimple, zeros(3))   # `VectorEvaluator{true}(sumsimple, 3)`
p([1.0, 2.0, 3.0])

6.0

API reference

AbstractPPL.Evaluators.prepare — Function

prepare(problem, values::NamedTuple; check_dims::Bool=true)
prepare(problem, x::AbstractVector{<:Real}; check_dims::Bool=true, context::Tuple=())
prepare(adtype, problem, x::AbstractVector{<:Real}; check_dims::Bool=true, context::Tuple=(), order::Int=1)

Prepare a callable evaluator for problem.

Use the two-argument form with a NamedTuple when the evaluator works with named inputs, or with a vector when it works with vector inputs. The three-argument form, contributed by AD-backend extensions, additionally prepares gradient, Jacobian, or Hessian machinery for vector inputs.

check_dims (default true) controls whether the returned evaluator validates the input shape on each call. Pass check_dims=false to skip the per-call check, e.g. inside an AD backend's hot path where the input shape is already guaranteed.

The vector-input forms accept a context::Tuple of constant arguments threaded through to problem: the prepared evaluator computes problem(x, context...), and AD backends differentiate only with respect to x. context=() (the default) preserves the unary problem(x) contract.

order selects the derivative order to prepare for on the AD-aware form. The default order=1 prepares gradient (scalar output) or Jacobian (vector output) machinery. order=2 prepares the Hessian machinery used by value_gradient_and_hessian!! and requires problem to be scalar-valued; vector-valued problems throw during preparation.

The three-argument AD-aware form may invoke problem once during preparation to detect output arity (scalar vs vector) and select the appropriate derivative machinery. Avoid prepare calls when problem has side effects that should fire only on user-driven evaluations.
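The side-effect caveat can be made concrete with a callable that counts its own invocations. In the sketch below, CountingProblem and toy_prepare are hypothetical; toy_prepare only mimics the idea of one probing call that inspects the output type, and is not how any AbstractPPL extension is known to be written.

```julia
# Callable that counts how often it is invoked.
mutable struct CountingProblem
    calls::Int
end
function (p::CountingProblem)(x::AbstractVector{<:Real})
    p.calls += 1
    return sum(abs2, x)
end

# Hypothetical preparation step: one probing call decides between gradient and
# Jacobian machinery based on the output type.
function toy_prepare(problem, x::AbstractVector{<:Real})
    y = problem(x)
    return y isa Number ? :gradient : :jacobian
end

problem = CountingProblem(0)
mode = toy_prepare(problem, zeros(3))

mode             # :gradient
problem.calls    # 1: the side effect fired before any user-driven evaluation
```

If the counter stood for logging, random-number consumption, or I/O, that first call would already have happened by the time the caller evaluates anything.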

AbstractPPL.Evaluators.value_and_gradient!! — Function

value_and_gradient!!(prepared, x::AbstractVector{<:Real})

Return (value, gradient) for a scalar-valued evaluator, potentially reusing internal cache buffers of prepared. The returned gradient may alias prepared's internal storage; copy if you need to retain it past the next call.

AbstractPPL.Evaluators.value_and_jacobian!! — Function

value_and_jacobian!!(prepared, x::AbstractVector{<:Real})

Return (value::AbstractVector, jacobian::AbstractMatrix) for a vector-valued evaluator, potentially reusing internal cache buffers. The returned arrays may alias prepared's internal storage; copy if needed. The Jacobian has shape (length(value), length(x)).

AbstractPPL.Evaluators.value_gradient_and_hessian!! — Function

value_gradient_and_hessian!!(prepared, x::AbstractVector{<:Real})

Return (value, gradient::AbstractVector, hessian::AbstractMatrix) for a scalar-valued evaluator prepared with order=2, potentially reusing internal cache buffers. The returned gradient and Hessian may alias prepared's internal storage; copy if you need to retain them past the next call. The Hessian has shape (length(x), length(x)).
