Gen Documentation

Gen Documentation

Assignments

An assignment is a map from addresses of random choices to their values. Assignments are represented using the abstract type Assignment. Assignments have the following methods:

Gen.has_valueFunction.
has_value(assmt::Assignment, addr)

Return true if there is a value at the given address.

source
Gen.get_valueFunction.
value = get_value(assmt::Assignment, addr)

Return the value at the given address in the assignment, or throw a KeyError if no value exists. A syntactic sugar is Base.getindex:

value = assmt[addr]
source
Gen.get_subassmtFunction.
subassmt = get_subassmt(assmt::Assignment, addr)

Return the sub-assignment containing all choices whose address is prefixed by addr.

It is an error if the assignment contains a value at the given address. If there are no choices whose address is prefixed by addr then return an EmptyAssignment.

source
key_subassmt_iterable = get_values_shallow(assmt::Assignment)

Return an iterable collection of tuples (key, subassmt::Assignment) for each top-level key associated with a value.

source
key_subassmt_iterable = get_subassmts_shallow(assmt::Assignment)

Return an iterable collection of tuples (key, subassmt::Assignment) for each top-level key that has a non-empty sub-assignment.

source
Gen.to_arrayFunction.
arr::Vector{T} = to_array(assmt::Assignment, ::Type{T}) where {T}

Populate an array with values of choices in the given assignment.

It is an error if each of the values cannot be coerced into a value of the given type.

Implementation

To support to_array, a concrete subtype T <: Assignment should implement the following method:

n::Int = _fill_array!(assmt::T, arr::Vector{V}, start_idx::Int) where {V}

Populate arr with values from the given assignment, starting at start_idx, and return the number of elements in arr that were populated.

source
Gen.from_arrayFunction.
assmt::Assignment = from_array(proto_assmt::Assignment, arr::Vector)

Return an assignment with the same address structure as a prototype assignment, but with values read off from the given array.

The order in which addresses are populated is determined by the prototype assignment. It is an error if the number of choices in the prototype assignment is not equal to the length the array.

Implementation

To support from_array, a concrete subtype T <: Assignment should implement the following method:

(n::Int, assmt::T) = _from_array(proto_assmt::T, arr::Vector{V}, start_idx::Int) where {V}

Return an assignment with the same address structure as a prototype assignment, but with values read off from arr, starting at position start_idx, and the number of elements read from arr.

source
Gen.pairFunction.
assmt = pair(assmt1::Assignment, assmt2::Assignment, key1::Symbol, key2::Symbol)

Return an assignment that contains assmt1 as a sub-assignment under key1 and assmt2 as a sub-assignment under key2.

source
Gen.unpairFunction.
(assmt1, assmt2) = unpair(assmt::Assignment, key1::Symbol, key2::Symbol)

Return the two sub-assignments at key1 and key2, one or both of which may be empty.

It is an error if there are any top-level values, or any non-empty top-level sub-assignments at keys other than key1 and key2.

source
Base.mergeFunction.
assmt = Base.merge(assmt1::Assignment, assmt2::Assignment)

Merge two assignments.

It is an error if the assignments both have values at the same address, or if one assignment has a value at an address that is the prefix of the address of a value in the other assignment.

source

TODO: change get_assmt to assmt TODO: simplify other method names

Address Schemata

An address schema provides information about the set of addresses of random choices in an assignment.

Address schemata are associated with the type of an assignment:

schema = get_address_schema(::Type{T}) where {T <: Assignment}

Return the (top-level) address schema for the given assignment.

source

The remainder if this section describes some concrete types that subtype Assignment.

Dynamic Assignment

A DynamicAssignment is mutable, and can contain arbitrary values for its keys.

Static Assignment

A StaticAssignment is a immutable and contains only symbols as its keys for leaf nodes and for internal nodes. A StaticAssignment has type parametersR and T that are tuples of Symbols that are the keys of the leaf nodes and internal nodes respectively, so that code can be generated that is specialized to the particular set of keys in the trie:

struct StaticAssignment{R,S,T,U} <: Assignment
    leaf_nodes::NamedTuple{R,S}
    internal_nodes::NamedTuple{T,U}
end 

A StaticAssignment with leaf symbols :a and :b and internal key :c can be constructed using syntax like:

trie = StaticAssignment((a=1, b=2), (c=inner_trie,))

TODO: use generated functions in a lot more places, e.g. get_subassmt

TODO: document static variants of getters:

Other Concrete Assignment Types

Address Selections

TODO: document AddressSet API TODO: consider changing names of method in AddressSet API

Traces and Generative Functions

A trace is a record of an execution of a generative function. There is no abstract type representing all traces. Generative functions implement the generative function interface, which is a set of methods that involve the execution traces and probabilistic behavior of generative functions. In the mathematical description of the interface methods, we denote arguments to a function by $x$, complete assignments of values to addresses of random choices (containing all the random choices made during some execution) by $t$ and partial assignments by either $u$ or $v$. We denote a trace of a generative function by the tuple $(x, t)$. We say that two assignments $u$ and $t$ agree when they assign addresses that appear in both assignments to the same values (they can different or even disjoint sets of addresses and still agree). A generative function is associated with a family of probability distributions $P(t; x)$ on assignments $t$, parameterized by arguments $x$, and a second family of distributions $Q(t; u, x)$ on assignments $t$ parameterized by partial assignment $u$ and arguments $x$. $Q$ is called the internal proposal family of the generative function, and satisfies that if $u$ and $t$ agree then $P(t; x) > 0$ if and only if $Q(t; x, u) > 0$, and that $Q(t; x, u) > 0$ implies that $u$ and $t$ agree. See the Gen technical report for additional details.

Generative functions may also use non-addressable random choices, denoted $r$. Unlike regular (addressable) random choices, non-addressable random choices do not have addresses, and the value of non-addressable random choices is not exposed through the generative function interface. However, the state of non-addressable random choices is maintained in the trace. A trace that contains non-addressable random choices is denoted $(x, t, r)$. Non-addressable random choices manifest to the user of the interface as stochasticity in weights returned by generative function interface methods. The behavior of non-addressable random choices is defined by an additional pair of families of distributions associated with the generative function, denoted $Q(r; x, t)$ and $P(r; x, t)$, which are defined for $P(t; x) > 0$, and which satisfy $Q(r; x, t) > 0$ if and only if $P(r; x, t) > 0$. For each generative function below, we describe its semantics first in the basic setting where there is no non-addressable random choices, and then in the more general setting that may include non-addressable random choices.

Gen.initializeFunction.
(trace::U, weight) = initialize(gen_fn::GenerativeFunction{T,U}, args::Tuple,
                                assmt::Assignment)

Return a trace of a generative function that is consistent with the given assignment.

Basic case

Given arguments $x$ (args) and assignment $u$ (assmt), sample $t \sim Q(\cdot; u, x)$ and return the trace $(x, t)$ (trace). Also return the weight (weight):

\[\frac{P(t; x)}{Q(t; u, x)}\]

General case

Identical to the basic case, except that we also sample $r \sim Q(\cdot; x, t)$, the trace is $(x, t, r)$ and the weight is:

\[\frac{P(t; x)}{Q(t; u, x)} \cdot \frac{P(r; x, t)}{Q(r; x, t)}\]
source
Gen.projectFunction.
weight = project(trace::U, selection::AddressSet)

Estimate the probability that the selected choices take the values they do in a trace.

Basic case

Given a trace $(x, t)$ (trace) and a set of addresses $A$ (selection), let $u$ denote the restriction of $t$ to $A$. Return the weight (weight):

\[\frac{P(t; x)}{Q(t; u, x)}\]

General case

Identical to the basic case except that the previous trace is $(x, t, r)$ and the weight is:

\[\frac{P(t; x)}{Q(t; u, x)} \cdot \frac{P(r; x, t)}{Q(r; x, t)}\]
source
Gen.proposeFunction.
(assmt, weight, retval) = propose(gen_fn::GenerativeFunction, args::Tuple)

Sample an assignment and compute the probability of proposing that assignment.

Basic case

Given arguments (args), sample $t' \sim P(\cdot; x)$, and return $t$ (assmt) and the weight (weight) $P(t; x)$.

General case

Identical to the basic case, except that we also sample $r \sim P(\cdot; x, t)$, and the weight is:

\[P(t; x) \cdot \frac{P(r; x, t)}{Q(r; x, t)}\]
source
Gen.assessFunction.
(weight, retval) = assess(gen_fn::GenerativeFunction, args::Tuple, assmt::Assignment)

Return the probability of proposing an assignment

Basic case

Given arguments $x$ (args) and an assignment $t$ (assmt) such that $P(t; x) > 0$, return the weight (weight) $P(t; x)$. It is an error if $P(t; x) = 0$.

General case

Identical to the basic case except that we also sample $r \sim Q(\cdot; x, t)$, and the weight is:

\[P(t; x) \cdot \frac{P(r; x, t)}{Q(r; x, t)}\]
source
Gen.force_updateFunction.
(new_trace, weight, discard, retdiff) = force_update(args::Tuple, argdiff, trace,
                                                     assmt::Assignment)

Update a trace by changing the arguments and/or providing new values for some existing random choice(s) and values for any newly introduced random choice(s).

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and an assignment $u$ (assmt), return a new trace $(x', t')$ (new_trace) that is consistent with $u$. The values of choices in $t'$ are deterministically copied either from $t$ or from $u$ (with $u$ taking precedence). All choices in $u$ must appear in $t'$. Also return an assignment $v$ (discard) containing the choices in $t$ that were overwritten by values from $u$, and any choices in $t$ whose address does not appear in $t'$. Also return the weight (weight):

\[\frac{P(t'; x')}{P(t; x)}\]

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, the new trace is $(x', t', r')$ where $r' \sim Q(\cdot; x', t')$, and the weight is:

\[\frac{P(t'; x')}{P(t; x)} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}\]
source
Gen.fix_updateFunction.
(new_trace, weight, discard, retdiff) = fix_update(args::Tuple, argdiff, trace,
                                                   assmt::Assignment)

Update a trace, by changing the arguments and/or providing new values for some existing random choice(s).

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and an assignment $u$ (assmt), return a new trace $(x', t')$ (new_trace) that is consistent with $u$. Let $u + t$ denote the merge of $u$ and $t$ (with $u$ taking precedence). Sample $t' \sim Q(\cdot; u + t, x)$. All addresses in $u$ must appear in $t$ and in $t'$. Also return an assignment $v$ (discard) containing the values from $t$ for addresses in $u$. Also return the weight (weight):

\[\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; v + t', x)}{Q(t'; u + t, x')}\]

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, the new trace is $(x', t', r')$ where $r' \sim Q(\cdot; x', t')$, and the weight is:

\[\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; v + t', x)}{Q(t'; u + t, x')} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}\]
source
Gen.free_updateFunction.
(new_trace, weight, retdiff) = free_update(args::Tuple, argdiff, trace,
                                           selection::AddressSet)

Update a trace by changing the arguments and/or randomly sampling new values for selected random choices.

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and a set of addresses $A$ (selection), return a new trace $(x', t')$ (new_trace) such that $t'$ agrees with $t$ on all addresses not in $A$ ($t$ and $t'$ may have different sets of addresses). Let $u$ denote the restriction of $t$ to the complement of $A$. Sample $t' \sim Q(\cdot; u, x')$. Return the new trace $(x', t')$ (new_trace) and the weight (weight):

\[\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; u', x)}{Q(t'; u, x')}\]

where $u'$ is the restriction of $t'$ to the complement of $A$.

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, the new trace is $(x', t', r')$ where $r' \sim Q(\cdot; x', t')$, and the weight is:

\[\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; u', x)}{Q(t'; u, x')} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}\]
source
Gen.extendFunction.
(new_trace, weight, retdiff) = extend(args::Tuple, argdiff, trace, assmt::Assignment)

Extend a trace with new random choices by changing the arguments.

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and an assignment $u$ (assmt) that shares no addresses with $t$, return a new trace $(x', t')$ (new_trace) such that $t'$ agrees with $t$ on all addresses in $t$ and $t'$ agrees with $u$ on all addresses in $u$. Sample $t' \sim Q(\cdot; t + u, x')$. Also return the weight (weight):

\[\frac{P(t'; x')}{P(t; x) Q(t'; t + u, x')}\]

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, and we also sample $r' \sim Q(\cdot; t', x)$, the new trace is $(x', t', r')$, and the weight is:

\[\frac{P(t'; x')}{P(t; x) Q(t'; t + u, x')} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}\]
source
Gen.backprop_paramsFunction.
arg_grads = backprop_params(trace, retgrad)

Increment gradient accumulators for parameters by the gradient of the log-probability of the trace.

Basic case

Given a previous trace $(x, t)$ (trace) and a gradient with respect to the return value $∇_y J$ (retgrad), return the following gradient (arg_grads) with respect to the arguments $x$:

\[∇_x \left( \log P(t; x) + J \right)\]

Also increment the gradient accumulators for the static parameters $Θ$ of the function by:

\[∇_Θ \left( \log P(t; x) + J \right)\]

General case

Not yet formalized.

source
Gen.backprop_traceFunction.
(arg_grads, choice_values, choice_grads) = backprop_trace(trace, selection::AddressSet,
                                                          retgrad)

Basic case

Given a previous trace $(x, t)$ (trace) and a gradient with respect to the return value $∇_y J$ (retgrad), return the following gradient (arg_grads) with respect to the arguments $x$:

\[∇_x \left( \log P(t; x) + J \right)\]

Also given a set of addresses $A$ (selection) that are continuous-valued random choices, return the folowing gradient (choice_grads) with respect to the values of these choices:

\[∇_A \left( \log P(t; x) + J \right)\]

Also return the assignment (choice_values) that is the restriction of $t$ to $A$.

General case

Not yet formalized.

source
Gen.get_assmtFunction.
get_assmt(trace)

Return a value implementing the assignment interface

source
Gen.get_argsFunction.
get_args(trace)

Return the argument tuple for a given execution.

source
Gen.get_retvalFunction.
get_retval(trace)

Return the return value of the given execution.

source
Gen.get_scoreFunction.
get_score(trace)

Basic case

Return $P(t; x)$

General case

Return $P(r, t; x) / Q(r; tx, t)$

source

TODO: document has_argument_grads

Distributions

Probability distributions are singleton types whose supertype is Distribution{T}, where T indicates the data type of the random sample.

abstract type Distribution{T} end

By convention, distributions have a global constant lower-case name for the singleton value. For example:

struct Bernoulli <: Distribution{Bool} end
const bernoulli = Bernoulli()

Distributions must implement two methods, random and logpdf.

random returns a random sample from the distribution:

x::Bool = random(bernoulli, 0.5)
x::Bool = random(Bernoulli(), 0.5)

logpdf returns the log probability (density) of the distribution at a given value:

logpdf(bernoulli, false, 0.5)
logpdf(Bernoulli(), false, 0.5)

Distribution values are also callable, which is a syntactic sugar with the same behavior of calling random:

bernoulli(0.5) # identical to random(bernoulli, 0.5) and random(Bernoulli(), 0.5)

Gradients of Distributions

Distributions may also implement logpdf_grad, which returns the gradient of the log probability (density) with respect to the random sample and the parameters, as a tuple:

(grad_sample, grad_mu, grad_std) = logpdf_grad(normal, 1.324, 0.0, 1.0)

The partial derivative of the log probability (density) with respect to the random sample, or one of the parameters, might not always exist. Distributions indicate which partial derivatives exist using the methods has_output_grad and has_argument_grads:

has_output_grad(::Normal) = true
has_argument_grads(::Normal) = (true, true)

If a particular partial derivative does not exist, that field of the tuple returned by logpdf_grad should be nothing.

Built-In Distributions

Gen.bernoulliConstant.
bernoulli(prob_true::Real)

Samples a Bool value which is true with given probability

source
Gen.normalConstant.
normal(mu::Real, std::Real)

Samples a Float64 value from a normal distribution.

source
Gen.mvnormalConstant.
mvnormal(mu::AbstractVector{T}, cov::AbstractMatrix{U}} where {T<:Real,U<:Real}

Samples a Vector{Float64} value from a multivariate normal distribution.

source
Gen.gammaConstant.
gamma(shape::Real, scale::Real)

Sample a Float64 from a gamma distribution.

source
Gen.inv_gammaConstant.
inv_gamma(shape::Real, scale::Real)

Sample a Float64 from a inverse gamma distribution.

source
Gen.betaConstant.
beta(alpha::Real, beta::Real)

Sample a Float64 from a beta distribution.

source
Gen.categoricalConstant.
categorical(probs::AbstractArray{U, 1}) where {U <: Real}

Given a vector of probabilities probs where sum(probs) = 1, sample an Int i from the set {1, 2, .., length(probs)} with probability probs[i].

source
Gen.uniformConstant.
uniform(low::Real, high::Real)

Sample a Float64 from the uniform distribution on the interval [low, high].

source
Gen.uniform_discreteConstant.
uniform_discrete(low::Integer, high::Integer)

Sample an Int from the uniform distribution on the set {low, low + 1, ..., high-1, high}.

source
Gen.poissonConstant.
poisson(lambda::Real)

Sample an Int from the Poisson distribution with rate lambda.

source

Trie

Modeling DSLs

Dynamic DSL

TODO: remove the @ad return value differentiation flag