Gen Documentation

# Gen Documentation

## Assignments

An assignment is a map from addresses of random choices to their values. Assignments are represented using the abstract type Assignment. Assignments have the following methods:

has_value(assmt::Assignment, addr)

Return true if there is a value at the given address.

source
value = get_value(assmt::Assignment, addr)

Return the value at the given address in the assignment, or throw a KeyError if no value exists. A syntactic sugar is Base.getindex:

value = assmt[addr]
source
subassmt = get_subassmt(assmt::Assignment, addr)

Return the sub-assignment containing all choices whose address is prefixed by addr.

It is an error if the assignment contains a value at the given address. If there are no choices whose address is prefixed by addr then return an EmptyAssignment.

source
key_subassmt_iterable = get_values_shallow(assmt::Assignment)

Return an iterable collection of tuples (key, subassmt::Assignment) for each top-level key associated with a value.

source
key_subassmt_iterable = get_subassmts_shallow(assmt::Assignment)

Return an iterable collection of tuples (key, subassmt::Assignment) for each top-level key that has a non-empty sub-assignment.

source
arr::Vector{T} = to_array(assmt::Assignment, ::Type{T}) where {T}

Populate an array with values of choices in the given assignment.

It is an error if each of the values cannot be coerced into a value of the given type.

Implementation

To support to_array, a concrete subtype T <: Assignment should implement the following method:

n::Int = _fill_array!(assmt::T, arr::Vector{V}, start_idx::Int) where {V}

Populate arr with values from the given assignment, starting at start_idx, and return the number of elements in arr that were populated.

source
assmt::Assignment = from_array(proto_assmt::Assignment, arr::Vector)

Return an assignment with the same address structure as a prototype assignment, but with values read off from the given array.

The order in which addresses are populated is determined by the prototype assignment. It is an error if the number of choices in the prototype assignment is not equal to the length the array.

Implementation

To support from_array, a concrete subtype T <: Assignment should implement the following method:

(n::Int, assmt::T) = _from_array(proto_assmt::T, arr::Vector{V}, start_idx::Int) where {V}

Return an assignment with the same address structure as a prototype assignment, but with values read off from arr, starting at position start_idx, and the number of elements read from arr.

source
assmt = pair(assmt1::Assignment, assmt2::Assignment, key1::Symbol, key2::Symbol)

Return an assignment that contains assmt1 as a sub-assignment under key1 and assmt2 as a sub-assignment under key2.

source
(assmt1, assmt2) = unpair(assmt::Assignment, key1::Symbol, key2::Symbol)

Return the two sub-assignments at key1 and key2, one or both of which may be empty.

It is an error if there are any top-level values, or any non-empty top-level sub-assignments at keys other than key1 and key2.

source
assmt = Base.merge(assmt1::Assignment, assmt2::Assignment)

Merge two assignments.

It is an error if the assignments both have values at the same address, or if one assignment has a value at an address that is the prefix of the address of a value in the other assignment.

source

TODO: change get_assmt to assmt TODO: simplify other method names

Address schemata are associated with the type of an assignment:

schema = get_address_schema(::Type{T}) where {T <: Assignment}

Return the (top-level) address schema for the given assignment.

source

The remainder if this section describes some concrete types that subtype Assignment.

### Dynamic Assignment

A DynamicAssignment is mutable, and can contain arbitrary values for its keys.

• DynamicAssignment()

• set_value! (with syntactic sugar Base.setindex!), will cause any previous value or sub-assignment at this addr to be deleted. it is an error if there is already a value present at some prefix of addr.

• set_subassmt!, will cause any previous value or sub-assignment at this addr to be deleted. it is an error if there is already a value present at some prefix of addr.

### Static Assignment

A StaticAssignment is a immutable and contains only symbols as its keys for leaf nodes and for internal nodes. A StaticAssignment has type parametersR and T that are tuples of Symbols that are the keys of the leaf nodes and internal nodes respectively, so that code can be generated that is specialized to the particular set of keys in the trie:

struct StaticAssignment{R,S,T,U} <: Assignment
leaf_nodes::NamedTuple{R,S}
internal_nodes::NamedTuple{T,U}
end 

A StaticAssignment with leaf symbols :a and :b and internal key :c can be constructed using syntax like:

trie = StaticAssignment((a=1, b=2), (c=inner_trie,))

TODO: use generated functions in a lot more places, e.g. get_subassmt

TODO: document static variants of getters:

• static_get_subassmt(assmt, ::Val{key}): throws a key error if the key isn't in the static address schema (get_subassmt would return an EmptyAssignment)

• NOTE: static_has_value(assmt, ::Val{key}) appears in the Static IR, but this an internal implementation detail, and not part of the 'static assignment interface'.

### Other Concrete Assignment Types

• EmptyAssignment

• InternalVectorAssignment (TODO rename to DeepVectorAssignment)

• ShallowVectorAssignment (TODO not yet implemented)

• Assignments produced from GFTraces

• Assignments produced

TODO: document AddressSet API TODO: consider changing names of method in AddressSet API

## Traces and Generative Functions

A trace is a record of an execution of a generative function. There is no abstract type representing all traces. Generative functions implement the generative function interface, which is a set of methods that involve the execution traces and probabilistic behavior of generative functions. In the mathematical description of the interface methods, we denote arguments to a function by $x$, complete assignments of values to addresses of random choices (containing all the random choices made during some execution) by $t$ and partial assignments by either $u$ or $v$. We denote a trace of a generative function by the tuple $(x, t)$. We say that two assignments $u$ and $t$ agree when they assign addresses that appear in both assignments to the same values (they can different or even disjoint sets of addresses and still agree). A generative function is associated with a family of probability distributions $P(t; x)$ on assignments $t$, parameterized by arguments $x$, and a second family of distributions $Q(t; u, x)$ on assignments $t$ parameterized by partial assignment $u$ and arguments $x$. $Q$ is called the internal proposal family of the generative function, and satisfies that if $u$ and $t$ agree then $P(t; x) > 0$ if and only if $Q(t; x, u) > 0$, and that $Q(t; x, u) > 0$ implies that $u$ and $t$ agree. See the Gen technical report for additional details.

Generative functions may also use non-addressable random choices, denoted $r$. Unlike regular (addressable) random choices, non-addressable random choices do not have addresses, and the value of non-addressable random choices is not exposed through the generative function interface. However, the state of non-addressable random choices is maintained in the trace. A trace that contains non-addressable random choices is denoted $(x, t, r)$. Non-addressable random choices manifest to the user of the interface as stochasticity in weights returned by generative function interface methods. The behavior of non-addressable random choices is defined by an additional pair of families of distributions associated with the generative function, denoted $Q(r; x, t)$ and $P(r; x, t)$, which are defined for $P(t; x) > 0$, and which satisfy $Q(r; x, t) > 0$ if and only if $P(r; x, t) > 0$. For each generative function below, we describe its semantics first in the basic setting where there is no non-addressable random choices, and then in the more general setting that may include non-addressable random choices.

(trace::U, weight) = initialize(gen_fn::GenerativeFunction{T,U}, args::Tuple,
assmt::Assignment)

Return a trace of a generative function that is consistent with the given assignment.

Basic case

Given arguments $x$ (args) and assignment $u$ (assmt), sample $t \sim Q(\cdot; u, x)$ and return the trace $(x, t)$ (trace). Also return the weight (weight):

$\frac{P(t; x)}{Q(t; u, x)}$

General case

Identical to the basic case, except that we also sample $r \sim Q(\cdot; x, t)$, the trace is $(x, t, r)$ and the weight is:

$\frac{P(t; x)}{Q(t; u, x)} \cdot \frac{P(r; x, t)}{Q(r; x, t)}$
source
weight = project(trace::U, selection::AddressSet)

Estimate the probability that the selected choices take the values they do in a trace.

Basic case

Given a trace $(x, t)$ (trace) and a set of addresses $A$ (selection), let $u$ denote the restriction of $t$ to $A$. Return the weight (weight):

$\frac{P(t; x)}{Q(t; u, x)}$

General case

Identical to the basic case except that the previous trace is $(x, t, r)$ and the weight is:

$\frac{P(t; x)}{Q(t; u, x)} \cdot \frac{P(r; x, t)}{Q(r; x, t)}$
source
(assmt, weight, retval) = propose(gen_fn::GenerativeFunction, args::Tuple)

Sample an assignment and compute the probability of proposing that assignment.

Basic case

Given arguments (args), sample $t' \sim P(\cdot; x)$, and return $t$ (assmt) and the weight (weight) $P(t; x)$.

General case

Identical to the basic case, except that we also sample $r \sim P(\cdot; x, t)$, and the weight is:

$P(t; x) \cdot \frac{P(r; x, t)}{Q(r; x, t)}$
source
(weight, retval) = assess(gen_fn::GenerativeFunction, args::Tuple, assmt::Assignment)

Return the probability of proposing an assignment

Basic case

Given arguments $x$ (args) and an assignment $t$ (assmt) such that $P(t; x) > 0$, return the weight (weight) $P(t; x)$. It is an error if $P(t; x) = 0$.

General case

Identical to the basic case except that we also sample $r \sim Q(\cdot; x, t)$, and the weight is:

$P(t; x) \cdot \frac{P(r; x, t)}{Q(r; x, t)}$
source
(new_trace, weight, discard, retdiff) = force_update(args::Tuple, argdiff, trace,
assmt::Assignment)

Update a trace by changing the arguments and/or providing new values for some existing random choice(s) and values for any newly introduced random choice(s).

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and an assignment $u$ (assmt), return a new trace $(x', t')$ (new_trace) that is consistent with $u$. The values of choices in $t'$ are deterministically copied either from $t$ or from $u$ (with $u$ taking precedence). All choices in $u$ must appear in $t'$. Also return an assignment $v$ (discard) containing the choices in $t$ that were overwritten by values from $u$, and any choices in $t$ whose address does not appear in $t'$. Also return the weight (weight):

$\frac{P(t'; x')}{P(t; x)}$

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, the new trace is $(x', t', r')$ where $r' \sim Q(\cdot; x', t')$, and the weight is:

$\frac{P(t'; x')}{P(t; x)} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}$
source
(new_trace, weight, discard, retdiff) = fix_update(args::Tuple, argdiff, trace,
assmt::Assignment)

Update a trace, by changing the arguments and/or providing new values for some existing random choice(s).

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and an assignment $u$ (assmt), return a new trace $(x', t')$ (new_trace) that is consistent with $u$. Let $u + t$ denote the merge of $u$ and $t$ (with $u$ taking precedence). Sample $t' \sim Q(\cdot; u + t, x)$. All addresses in $u$ must appear in $t$ and in $t'$. Also return an assignment $v$ (discard) containing the values from $t$ for addresses in $u$. Also return the weight (weight):

$\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; v + t', x)}{Q(t'; u + t, x')}$

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, the new trace is $(x', t', r')$ where $r' \sim Q(\cdot; x', t')$, and the weight is:

$\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; v + t', x)}{Q(t'; u + t, x')} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}$
source
(new_trace, weight, retdiff) = free_update(args::Tuple, argdiff, trace,
selection::AddressSet)

Update a trace by changing the arguments and/or randomly sampling new values for selected random choices.

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and a set of addresses $A$ (selection), return a new trace $(x', t')$ (new_trace) such that $t'$ agrees with $t$ on all addresses not in $A$ ($t$ and $t'$ may have different sets of addresses). Let $u$ denote the restriction of $t$ to the complement of $A$. Sample $t' \sim Q(\cdot; u, x')$. Return the new trace $(x', t')$ (new_trace) and the weight (weight):

$\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; u', x)}{Q(t'; u, x')}$

where $u'$ is the restriction of $t'$ to the complement of $A$.

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, the new trace is $(x', t', r')$ where $r' \sim Q(\cdot; x', t')$, and the weight is:

$\frac{P(t'; x')}{P(t; x)} \cdot \frac{Q(t; u', x)}{Q(t'; u, x')} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}$
source
(new_trace, weight, retdiff) = extend(args::Tuple, argdiff, trace, assmt::Assignment)

Extend a trace with new random choices by changing the arguments.

Basic case

Given a previous trace $(x, t)$ (trace), new arguments $x'$ (args), and an assignment $u$ (assmt) that shares no addresses with $t$, return a new trace $(x', t')$ (new_trace) such that $t'$ agrees with $t$ on all addresses in $t$ and $t'$ agrees with $u$ on all addresses in $u$. Sample $t' \sim Q(\cdot; t + u, x')$. Also return the weight (weight):

$\frac{P(t'; x')}{P(t; x) Q(t'; t + u, x')}$

General case

Identical to the basic case except that the previous trace is $(x, t, r)$, and we also sample $r' \sim Q(\cdot; t', x)$, the new trace is $(x', t', r')$, and the weight is:

$\frac{P(t'; x')}{P(t; x) Q(t'; t + u, x')} \cdot \frac{P(r'; x', t') Q(r; x, t)}{P(r; x, t) Q(r'; x', t')}$
source
arg_grads = backprop_params(trace, retgrad)

Increment gradient accumulators for parameters by the gradient of the log-probability of the trace.

Basic case

Given a previous trace $(x, t)$ (trace) and a gradient with respect to the return value $∇_y J$ (retgrad), return the following gradient (arg_grads) with respect to the arguments $x$:

$∇_x \left( \log P(t; x) + J \right)$

Also increment the gradient accumulators for the static parameters $Θ$ of the function by:

$∇_Θ \left( \log P(t; x) + J \right)$

General case

Not yet formalized.

source
(arg_grads, choice_values, choice_grads) = backprop_trace(trace, selection::AddressSet,
retgrad)

Basic case

Given a previous trace $(x, t)$ (trace) and a gradient with respect to the return value $∇_y J$ (retgrad), return the following gradient (arg_grads) with respect to the arguments $x$:

$∇_x \left( \log P(t; x) + J \right)$

Also given a set of addresses $A$ (selection) that are continuous-valued random choices, return the folowing gradient (choice_grads) with respect to the values of these choices:

$∇_A \left( \log P(t; x) + J \right)$

Also return the assignment (choice_values) that is the restriction of $t$ to $A$.

General case

Not yet formalized.

source
get_assmt(trace)

Return a value implementing the assignment interface

source
get_args(trace)

Return the argument tuple for a given execution.

source
get_retval(trace)

Return the return value of the given execution.

source
get_score(trace)

Basic case

Return $P(t; x)$

General case

Return $P(r, t; x) / Q(r; tx, t)$

source

TODO: document has_argument_grads

## Distributions

Probability distributions are singleton types whose supertype is Distribution{T}, where T indicates the data type of the random sample.

abstract type Distribution{T} end

By convention, distributions have a global constant lower-case name for the singleton value. For example:

struct Bernoulli <: Distribution{Bool} end
const bernoulli = Bernoulli()

Distributions must implement two methods, random and logpdf.

random returns a random sample from the distribution:

x::Bool = random(bernoulli, 0.5)
x::Bool = random(Bernoulli(), 0.5)

logpdf returns the log probability (density) of the distribution at a given value:

logpdf(bernoulli, false, 0.5)
logpdf(Bernoulli(), false, 0.5)

Distribution values are also callable, which is a syntactic sugar with the same behavior of calling random:

bernoulli(0.5) # identical to random(bernoulli, 0.5) and random(Bernoulli(), 0.5)

Distributions may also implement logpdf_grad, which returns the gradient of the log probability (density) with respect to the random sample and the parameters, as a tuple:

(grad_sample, grad_mu, grad_std) = logpdf_grad(normal, 1.324, 0.0, 1.0)

The partial derivative of the log probability (density) with respect to the random sample, or one of the parameters, might not always exist. Distributions indicate which partial derivatives exist using the methods has_output_grad and has_argument_grads:

has_output_grad(::Normal) = true
has_argument_grads(::Normal) = (true, true)

If a particular partial derivative does not exist, that field of the tuple returned by logpdf_grad should be nothing.

### Built-In Distributions

bernoulli(prob_true::Real)

Samples a Bool value which is true with given probability

source
normal(mu::Real, std::Real)

Samples a Float64 value from a normal distribution.

source
mvnormal(mu::AbstractVector{T}, cov::AbstractMatrix{U}} where {T<:Real,U<:Real}

Samples a Vector{Float64} value from a multivariate normal distribution.

source
gamma(shape::Real, scale::Real)

Sample a Float64 from a gamma distribution.

source
inv_gamma(shape::Real, scale::Real)

Sample a Float64 from a inverse gamma distribution.

source
beta(alpha::Real, beta::Real)

Sample a Float64 from a beta distribution.

source
categorical(probs::AbstractArray{U, 1}) where {U <: Real}

Given a vector of probabilities probs where sum(probs) = 1, sample an Int i from the set {1, 2, .., length(probs)} with probability probs[i].

source
uniform(low::Real, high::Real)

Sample a Float64 from the uniform distribution on the interval [low, high].

source
uniform_discrete(low::Integer, high::Integer)

Sample an Int from the uniform distribution on the set {low, low + 1, ..., high-1, high}.

source
poisson(lambda::Real)

Sample an Int from the Poisson distribution with rate lambda.

source

# Modeling DSLs

## Dynamic DSL

TODO: remove the @ad return value differentiation flag