In [ ]:
using Sigma
loadvis()
Often all we want is a single sample, or a set of samples, from a random variable $X$ conditioned on $Y$ being true. Other statistics, such as conditional expectation, conditional variance, and conditional probability, can then be estimated from such sets of samples.
In Sigma, to sample from $X$ unconditionally means to apply $X$, which is just a function, to a point $\omega$ drawn randomly from the sample space $\Omega$:
In [4]:
X = uniform(1,0.0,1.0)
ω = [rand()]
call(X,ω)
Out[4]:
This is achieved more conveniently using rand(X):
In [5]:
rand(X)
Out[5]:
Expectations of real-valued random variables can be found using sample_mean(X), which takes an additional keyword parameter nsamples (the number of samples used for the estimate).
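For example, a minimal sketch (assuming sample_mean accepts nsamples as a keyword argument, as described above):
In [ ]:
X = normal(0, 1)
sample_mean(X; nsamples = 1000)   # estimate of E[X]; should be close to 0
Conditional sampling follows the same pattern: rand(X, Y, n) draws n samples of X conditioned on Y being true.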
In [ ]:
X = normal(0,1)
Y = X > 0
samples = rand(X,Y,1000)
plot(x=samples,Geom.histogram)
rand(X,Y) is just a convenient synonym for cond_sample_bfs(X,Y), which draws exact samples. The alternative is cond_sample_mh(X,Y). We shall explain the difference between these two below; for many users it will suffice to know that cond_sample_bfs samples are exact, whereas cond_sample_mh uses a variant of the Metropolis Hastings algorithm to handle higher-dimensional problems, but as a result does not guarantee exact samples.
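As a sketch of the two calls (assuming both samplers share the same (X, Y, nsamples) argument order as rand above, which is an assumption here):
In [ ]:
samples_bfs = cond_sample_bfs(X, Y, 1000)   # exact conditional samples
samples_mh  = cond_sample_mh(X, Y, 1000)    # approximate, Metropolis Hastings based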
To understand the differences, and how Sigma generates these samples, we should first ask: what does it even mean to conditionally sample?
Recall that $\mathbb{P}:\Sigma \to [0,1]$ is our probability measure, i.e., a function which assigns probabilities to subsets of $\Omega$. When we sampled from $X$ using rand(X) above, we were really sampling an infinitesimally small (but not empty) subset of $\Omega$: a single point $\omega$. To conditionally sample we first construct a new probability measure $\mathbb{P}_{Y}$, which rescales $\mathbb{P}$ onto the region where $Y$ holds: $\mathbb{P}_{Y}(A) = \mathbb{P}(A \cap Y^{-1}(\text{true})) \, / \, \mathbb{P}(Y^{-1}(\text{true}))$.
To answer a conditional query, we then find subsets of $\Omega$ (abstractions, such as boxes) which cover the preimage $Y^{-1}(\text{true})$. Given a set of disjoint subsets, we can sample by first sampling an abstraction in proportion to its volume, and then sampling a point within it. Intuitively this is correct because larger regions carry more of the conditional probability mass and should be sampled correspondingly more often.
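This two-stage scheme can be sketched in plain Julia (hypothetical helper code, not part of the Sigma API; boxes are represented as (low, high) corner pairs):
In [ ]:
# Sample a point from a union of disjoint boxes: first pick a box with
# probability proportional to its volume, then sample uniformly within it.
boxes = [([0.0, 0.0], [0.5, 1.0]), ([0.5, 0.0], [1.0, 0.5])]
volume(b) = prod(b[2] .- b[1])
function sample_point(boxes)
    vols = map(volume, boxes)
    r = rand() * sum(vols)
    i = findfirst(c -> c >= r, cumsum(vols))   # volume-proportional box choice
    lo, hi = boxes[i]
    lo .+ rand(length(lo)) .* (hi .- lo)       # uniform point inside the chosen box
end
sample_point(boxes)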
A major problem with abstract Metropolis Hastings is that a poor choice of where to split an abstraction can lead to rejected proposals. Consider the following preimage plot. If we split down the middle along both dimensions to create four sub-boxes, and use a uniform transition probability of 1/4 for each sub-box, the tiny preimage region in the upper-left corner will receive the same amount of proposal mass as each of the three other regions, which are all fully satisfiable. The result is that the small box (or some subset of it) will be proposed approximately 1/4 of the time and almost always rejected.
This behaviour results from a combination of two causes:
In [3]:
X = uniform(0,1)
Y = uniform(0,1)
Z = (X > 0.5) | (Y < 0.5) | ((X ∈ [0.25, 0.26]) & (Y ∈ [0.75, 0.76]))
plot_preimage(Z)
Out[3]:
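Working the numbers from this example (an illustrative calculation, not from the original text): the upper-left quadrant contains only the $0.01 \times 0.01$ box, a preimage volume of $10^{-4}$, while each of the other three quadrants is fully satisfiable with volume $0.25$. A well-calibrated proposal would therefore select the upper-left quadrant with probability about $10^{-4}/0.75 \approx 1.3 \times 10^{-4}$, yet the uniform split proposes it with probability $1/4$: an over-weighting of roughly three orders of magnitude.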
The fundamental problem, found in all MCMC algorithms, is that our proposal distribution poorly reflects the distribution of interest. In our case it can be off by several (tens or hundreds of) orders of magnitude. We first address this with methods which are non-adaptive, that is, where the proposal distribution does not change. Later we consider adaptive methods, which carry additional constraints to ensure correctness.
Consider a subdivision of an abstraction into sub-abstractions $\tilde{A}$ and $\tilde{B}$: a good proposal distribution will assign mass to each according to the volume of the true preimage that the sub-abstraction subsumes. This introduces a circularity, since if we knew the volumes within each subregion there would be no need for Metropolis Hastings at all.
Rapidly-exploring random trees (RRTs) and their variants are sampling-based methods which work well for motion planning in high-dimensional spaces. The fact that they are able to fill high-dimensional spaces with a tree motivates their use in this case.
Basic algorithm:
1. Sample a point $x_{rand}$ uniformly from the space.
2. Find the node $x_{near}$ in the tree closest to $x_{rand}$.
3. Grow an edge from $x_{near}$ toward $x_{rand}$, up to some step size, provided the edge does not cross an infeasible region.
4. Add the new endpoint to the tree and repeat.
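A minimal sketch of this loop (hypothetical standalone code, not a Sigma API; collision checking is omitted):
In [ ]:
# Grow a tree over [0,1]^d by repeatedly stepping toward uniform samples.
dist(a, b) = sqrt(sum((a .- b) .^ 2))
function rrt(start, nsteps, stepsize)
    nodes = [start]
    for k in 1:nsteps
        x_rand = rand(length(start))                      # uniform target point
        _, i = findmin([dist(n, x_rand) for n in nodes])  # nearest tree node
        x_near = nodes[i]
        d = dist(x_near, x_rand)
        d == 0 && continue
        step = min(stepsize, d)                           # don't overshoot the target
        x_new = x_near .+ step .* (x_rand .- x_near) ./ d
        push!(nodes, x_new)   # a real planner would also check this edge for collisions
    end
    nodes
end
rrt([0.5, 0.5], 100, 0.05)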
The challenges in adapting this method to our needs are:
- RRT relies on being able to draw points uniformly over the space. The spaces used in motion planning have feasible regions large enough that simple rejection sampling suffices to get points. This property does not hold for us, where the feasible regions are likely tiny (see the sketch after this list).
- RRT is primarily concerned with expansion into continuous regions. If there are discontinuities in the state space, they are not of importance, because they mean the robot cannot possibly reach the target.
- When a new node is sampled, an edge is grown from the closest node in the tree, as close as possible to the new point. This requires knowing that the edge does not intersect an infeasible region. I'm not entirely sure how they check this; for us it could be done with an SMT solver, but that seems expensive. The goal is to have a fast approximation.
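To make the first challenge concrete, here is a small sketch (hypothetical standalone code) of rejection sampling against the tiny box from the preimage example; with feasible volume $10^{-4}$, roughly $10^4$ proposals are needed per accepted sample:
In [ ]:
# Propose uniformly over [0,1]^2 and keep only points in the feasible set.
feasible(x, y) = 0.25 <= x <= 0.26 && 0.75 <= y <= 0.76
function rejection_sample()
    ntries = 0
    while true
        ntries += 1
        x, y = rand(), rand()
        feasible(x, y) && return (x, y, ntries)
    end
end
rejection_sample()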
Heuristic to try:
When splitting a box $A$ into $A_1, \ldots, A_n$, for each $A_i$