Getting started with probabilistic programming in Gamble

This document introduces the basic ideas of probabilistic programming using Gamble, and provides lots of small examples as starting points for your own projects. You can tweak and re-run all the code blocks.

The primary author of this introduction is Sean Stromsten (sean.stromsten@baesystems.com), who welcomes bug reports and suggestions for improvement.

Prerequisites

This material assumes that you are familiar with elementary probability theory, that you know how to program in some language, and that you are at least a little bit familiar with the Scheme programming language. If your probability theory is rusty, there are nice reviews here and here. For a more leisurely discussion, with a focus on the Bayesian perspective and some modeling examples, click here. For help with Scheme, and, in particular, the Racket dialect spoken here, this is a good place to start.

[TOC]

What is probabilistic programming and why do we need it?

What is probabilistic programming?

Computer programs are a familiar and flexible way to represent processes, and running those programs repeatedly is a straightforward way to simulate those processes. Like any other formal mathematical model, a model in program form abstracts away (currently) unimportant details so that we can describe the essentials of a system, and derive properties that were not obvious without the model. A computer program is a model with the additional virtue of being executable.

A probabilistic (or stochastic) model is a model that includes random elements. Probabilistic models are incredibly useful, even though truly probabilistic systems are rare (even coin flips and dice rolls are deterministic, given enough information). The probabilistic parts are summaries of our ignorance of the many factors that we would have to know precisely to model (e.g.) coin flipping as deterministic.

Given that all popular programming languages include procedures for drawing (pseudo-)random numbers, programs can easily represent random processes, including complex, multi-step processes with both random and deterministic parts. We will call programs with random expressions "probabilistic programs" (PPs, for short). So,

Probabilistic programs are programs with random components.

(If this was all there was to it, of course, we wouldn't need new languages. We'll get to why we need something new in the next section.)

We can think of a complete program as a procedure (often one that takes no arguments). Running this procedure many times produces a sample from the probability distribution over executions that the program defines.

Before we start...

When you first open a new notebook, it's a Racket notebook. Racket is the language (or language-building toolkit) in which Gamble is defined. To make it possible to run Gamble programs, the first thing you will want to do is this:


In [1]:
(require gamble)

If you don't evaluate a (require gamble) expression first, you won't be able to evaluate any Gamble code.

The first Gamble program

With that bit of plumbing taken care of, consider the simplest probabilistic "process": a single coin toss. We can model it with a random variable that takes on the values $\verb+#t+$ and $\verb+#f+$ (Racket's Boolean values) with equal probability. We represent this "system" with the smallest possible Gamble program:


In [2]:
(flip)


Out[2]:
#t

If we run this program many times, it will return $\verb+#t+$ about half the time and $\verb+#f+$ the other half. The program below flips the coin 100 times and counts the $\verb+#t+$s:


In [3]:
(length 
 (filter (lambda (x) x)
         (repeat flip 100)))


Out[3]:
45

That's pretty much the "Hello, world!" of probabilistic programming. Here's the line-by-line breakdown: repeat executes the thunk (that's a procedure that takes no arguments) passed in as its first argument $n$ times, where $n$ is an integer value passed in as the second argument. filter runs a predicate (first argument) over a list (second argument), and returns just those elements of the list for which the predicate evaluates to #t. Even if you haven't seen it before, you can probably guess what length does.
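
If those pieces are new, here is the same machinery applied to a literal list (plain Racket, no randomness involved):

(filter (lambda (x) x) (list #t #f #t #f))  ; => '(#t #t)
(length (filter (lambda (x) x) (list #t #f #t #f)))  ; => 2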

flip is one of about 20 primitive distributions, which can be composed with each other and with other (pure) Racket constructs to make Gamble models. We'll get to a few of the most common ones in this introduction, and the rest are documented here (ref).

There is another primitive distribution very much like flip, except that it yields 0 or 1 instead of #t or #f.


In [4]:
(bernoulli)


Out[4]:
1

flip is useful for random variables used as conditions in if expressions, but counts and averages of many "flips" are easier with bernoulli.


In [5]:
(apply + (repeat bernoulli 100))


Out[5]:
40

Conditional distributions

Bigger computer programs are composed of smaller ones. So are Gamble programs representing complex joint probability distributions.

flip and bernoulli, above, are passed no arguments, and default to equal probability for both outcomes. If all variables in a joint distribution are produced by procedures of no arguments, there are no attachment points between variables, so no interesting joint distribution can be constructed. If we think in terms of cause and effect (a risky but useful heuristic), the zero-argument version of flip is an isolated event, without cause. It feels no pain. It never cries. The bread and butter of probabilistic programming is conditional distributions, which map parameters to distributions.

For example, flip can take a bias argument to make a "trick coin".


In [6]:
(flip 0.9)


Out[6]:
#f

This is important, so I will say it again: when procedures with parameters are (or contain) random choices, like flip, they represent conditional distributions--that is, functions from arguments (here, a coin's bias towards #t) to distributions (or samplers of distributions).

If we flip such a coin many times, it behaves just as you'd expect.


In [7]:
(length 
  (filter (lambda (x) x)
          (repeat (lambda () (flip 0.9)) 100)))


Out[7]:
88

Syntax aside: the expression is uglier than it was for the zero-argument version of flip, because repeat requires as its first argument a thunk that can be evaluated to produce a sample. (flip 0.9), without the enclosing lambda, is not a thunk, but, rather, just a single value--either #t or #f. We could use a few defines to make this less ugly:


In [8]:
(define (flip-mostly-heads-coin) (flip 0.9))

(define (is-true? x) x)

(length (filter is-true? (repeat flip-mostly-heads-coin 100)))


Out[8]:
90

Gluing several random choices together

Above, we reached in and set the bias parameter ourselves, but the bias itself could be a random choice. Here is a Gamble program that actually links several random variables, by sharing a bias variable coin-weight across the two flip expressions:


In [9]:
(define (two-weighted-flips)
  (let ([coin-weight (uniform 0 1)])
    (list (flip coin-weight) (flip coin-weight))))

(two-weighted-flips)


Out[9]:
(#t #t)

two-weighted-flips doesn't look like much, but it's actually a big step. It is through these linkages that information flows from one part of a model to another. Finding out the value of a coin flip, here, tells us something about the coin weight, and vice versa. This process of incorporating observations of some variables to (rationally) change beliefs about other, unobserved variables--inference--is where things get tricky.

Why we need something new

So far, we haven't done anything we couldn't have done in just about any existing programming language. But inference is where probabilistic programming in ordinary programming languages fails.

In short, sampling from the possible executions of the process doesn't solve the kinds of problems people usually want to solve. We usually know something about how the process executed, and want to know other things.

Some inferences are easy. For example, we can easily sample many values of two-weighted-flips, and look at (say) the proportion of heads on the second flip, or the proportion of the time the two flips are equal.


In [10]:
(require racket/list)

(length
 (filter (lambda (ht-list)
           (equal? (first ht-list)
                   (second ht-list)))
         (repeat two-weighted-flips 100)))


Out[10]:
75

The (require racket/list) is so I can use first and second, which I prefer to the antique Lisp forms car and cadr.

We can also easily sample from the model in which a "root" random variable--in this case, the coin weight--has been observed. This just requires a bit of "program surgery", replacing the random draw of the weight with assignment to a fixed value:


In [11]:
(define (two-mostly-heads-flips)
  (let ([coin-weight 0.9])
    (list (flip coin-weight) (flip coin-weight))))

(two-mostly-heads-flips)


Out[11]:
(#t #t)

Now let's consider a harder question: having observed the first flip's outcome as #t, in what proportion of samples (of the whole process) does the other flip also have value #t? We can't just fix the value of the first flip, because that severs its connection to the coin weight. In general, then, inference is hard, and we can't do it by simply evaluating a program written in a general-purpose programming language.

First, we consider two simple inference strategies, both of which are implemented in Gamble.

Inference strategy 1: reject samples that don't agree with the observation

Rejection sampling is dead simple, and often used as the basis for an operational semantics for probabilistic programming (ref). The idea is to generate many samples of all the random variables, from the original model--the "prior", in Bayesian lingo--and simply throw out those for which the sampled value(s) of the observed random variable(s) don't agree with the observed value(s). We could easily roll our own using filter, but rejection sampling is built in to Gamble, and provides a nice introduction to the general pattern of a query.
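
For intuition, here is what rolling our own might look like for the two-flips model, using only constructs we have already seen (a sketch; prior-sample and accepted-second-flips are my own names):

(define (prior-sample)
  (let ([coin-weight (uniform 0 1)])
    (list (flip coin-weight) (flip coin-weight))))

;; Keep only samples whose first flip is #t; collect the second flips.
(define (accepted-second-flips n)
  (map second
       (filter first (repeat prior-sample n))))

The built-in version below does this filtering for us, and generalizes to richer observations.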

Instead of editing the model, the normal Gamble inference workflow specifies a sampler object, defined by

  • a generative model (the program that specifies the process),
  • some observations, and
  • a query expression.

The class of the object itself specifies the sampling method.


In [12]:
(define two-wtd-cn-rej-sampler
  (rejection-sampler
   
    ;; Model.
    (define coin-weight (uniform 0 1))
    (define flip1 (flip coin-weight))
    (define flip2 (flip coin-weight))
   
    ;; Observations.
    (observe/fail flip1 #t)
   
    ;; Query ("return") expression.
    flip2))
   
   
(length (filter is-true? (repeat two-wtd-cn-rej-sampler 10000)))


Out[12]:
6729

It looks like observing the first flip to be #t tells us that the second is #t about two thirds of the time. As we will see in a bit, this happens to be one of the few special cases where we can verify that answer against a closed-form solution.

TODO: Show more idiomatic, vector-based sample generating/handling procedures.

Inference strategy 2: weight samples of other variables by the degree to which they "agree with" the observation

Rejection is a wasteful procedure, especially when the observed values occur in only a small fraction of samples from the prior. "Likelihood weighting" is a bit smarter; instead of actually drawing values for the observed random variables, many or most of which will be incompatible with the observed values, it weights samples of the other random variables by how probable the observed values are, given those sampled values.


In [13]:
(define two-wtd-cn-lw-sampler
  (importance-sampler
   
    ;; Model.
    (deflazy coin-weight (uniform 0 1))
    (deflazy flip1 (flip coin-weight))
    (deflazy flip2 (flip coin-weight))
   
    ;; Observations.
    (observe flip1 #t)
   
    ;; Query ("return") expression.
    flip2))


(generate-weighted-samples two-wtd-cn-lw-sampler 30)


Out[13]:
#((#f . 0.3531539688017279) (#t . 0.9679331307602328) (#t . 0.792056051955479) (#t . 0.4257894699844089) (#t . 0.8199624699894791) (#t . 0.4826797711191216) (#t . 0.4727898320044124) (#f . 0.2179661882894503) (#t . 0.6947156673997779) (#t . 0.6199723281791072) (#t . 0.7256733735418092) (#t . 0.30644810566241065) (#f . 0.5456204471385696) (#t . 0.9412373071018049) (#t . 0.1973055678046211) (#f . 0.48234610965661495) (#t . 0.5933201646922609) (#f . 0.6826089385390886) (#f . 0.15182641813994735) (#f . 0.4219504703687732) (#f . 0.5402296665980878) (#t . 0.4061045710625478) (#f . 0.687727989872783) (#f . 0.00043628716160257577) (#t . 0.9498467560783321) (#f . 0.03640703264918709) (#t . 0.30486322785996633) (#f . 0.25528477996104265) (#f . 0.8020867628124652) (#t . 0.8257530612770089))
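
To turn weighted samples into a single estimate, divide the total weight of the #t samples by the total weight of all samples. Here is a small sketch (weighted-proportion is my own name; it assumes, as the output above shows, that each sample is a (value . weight) pair):

(define (weighted-proportion pairs)
  (/ (for/sum ([p pairs] #:when (car p)) (cdr p))
     (for/sum ([p pairs]) (cdr p))))

(weighted-proportion
 (vector->list (generate-weighted-samples two-wtd-cn-lw-sampler 10000)))

With enough samples, this estimate should land near the two-thirds we saw with rejection sampling.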

Indexed sets of random variables

If we define a to have random value $v$, then subsequent references to a will have value $v$. Very often, we will have a set of random variables, each indexed by some identifier--say, the heights of a number of people--that are drawn similarly. Instead of giving each variable in this set a unique name and drawing a value for it with a define statement, we'd like to do something like a for loop over an array.

In Scheme, a natural generalization of a simple random variable to an indexed set is a function. An array is just a function from integers to values, but we need not limit ourselves to integer indices. We might try something like this:

(define (height person) (normal 69 3))

However, there is a problem: a function's body is evaluated each time the function is called, so several references to (height "Bob") will, in general, have different random values. This is not what we mean. We want there to be just one random value for (height "Bob"), for all references. What we want is the special Gamble syntax defmem, used like this:

(defmem (height person) (normal 69 3))

I like to think of this as drawing an infinite table assigning each possible input to a normal draw. Subsequent references are just table lookups. Of course, the implementation has to be clever and only draw values for table entries that are actually used, when they are first needed. The "mem" in defmem is for "memoize", which just means wrapping a function so that only the first instance of a call (with a particular argument) is actually evaluated, with subsequent calls looking up the value generated the first time. With deterministic procedures, memoization is a speed-up technique, but here it gets us from the wrong program meaning to the right one.
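
To see the difference directly, compare the memoized and unmemoized versions (a sketch; height-fn is my own name for the contrast):

(defmem (height person) (normal 69 3))
(= (height "Bob") (height "Bob"))        ; => #t: one draw per person, then lookups

(define (height-fn person) (normal 69 3))
(= (height-fn "Bob") (height-fn "Bob"))  ; almost surely #f: a fresh draw per call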

TODO: make this work, and then answer a question using weighted samples. Also, successfully switch to deflazy and observe (not observe/fail), and explain the differences between the two "observes"...

Maybe first do with observe/fail, then note that this is essentially rejection. Then introduce the notion of an observable context...

observe versus observe/fail

observe is better than observe/fail in just the same way that likelihood weighting is better than rejection sampling: rather than waiting to see what value has been drawn and then applying an observation, observe weights the rest of the draw by the probability of the observed value. However, observe has some restrictions on its use. First of all, it can't be used when a (possibly different) value for the observed random variable has already been drawn! By substituting deflazy for define, [somewhere], we tell the compiler[/interpreter?] not to draw that value immediately, but to wait, instead, until it is required to produce the query value or score a state [...]. A deflazy and an observe of the same variable can be combined into a weighting expression by the compiler. [...Ryan...].

Explore the space of values of unobserved random variables by a sequence of random proposals

The workhorse of inference, for hard problems, is the Metropolis-Hastings sampler. Why it works is beyond the scope of this introduction, but the intuition is not too hard.

  • The sampled and observed values define a score (probability or log-probability) for every state.
  • From any current state, simple random perturbations of values can propose a new state.
  • The score function can be used to decide whether to accept the proposed state or keep the old one.
  • As if by magic, this procedure eventually samples from the correct posterior distribution.

The "eventually" part means that it is usually a good idea to sample for a while (a "burn-in" period) before collecting any values. Also, because successive samples are far from independent, it can be space efficient to "thin" the sample sequence--that is, to skip some number of samples between recorded ones. These two delays are input to generate-samples by the optional, keyword-identified arguments #:burn and #:thin.


In [14]:
(require racket/vector)

(define two-wtd-cn-mh-sampler
  (mh-sampler
   
    ;; Model.
    (define coin-weight (uniform 0 1))
    (define flip1 (flip coin-weight))
    (define flip2 (flip coin-weight))
   
    ;; Observations.
    (observe/fail flip1 #t)
   
    ;; Query ("return") expression.
    flip2))


(time (vector-length 
 (vector-filter is-true? 
                (generate-samples two-wtd-cn-mh-sampler 100 #:burn 100 #:thin 100))))


cpu time: 259 real time: 260 gc time: 14
Out[14]:
68

In [15]:
(require racket/vector)

(define two-wtd-cn-mh-sampler2
  (mh-sampler
   
    ;; Model.
    (define coin-weight (uniform 0 1))
    (deflazy flip1 (flip coin-weight))
    (define flip2 (flip coin-weight))
   
    ;; Observations.
    (observe flip1 #t)
   
    ;; Query ("return") expression.
    flip2))


(time (vector-length 
 (vector-filter is-true? 
                (generate-samples two-wtd-cn-mh-sampler2 100 #:burn 100 #:thin 100))))


cpu time: 300 real time: 300 gc time: 9
Out[15]:
62

Beyond uniform coin weights

Suppose you know that the coin about to be flipped is weighted, so that it is more likely to land one way than the other. What you don't know is which side it favors, or to what degree. From your knowledge of the physics of flipping, though, extreme weights seem very unlikely. To represent this state of knowledge, you need a distribution over p(heads) that concentrates its mass near 0.5.

Beta distributions

There is a standard distribution family for just this case: the family of beta distributions. A beta distribution assigns a probability density to every number in the interval $[0,1]$, and integrates to one over that interval. A beta's shape is determined by two parameters, $\alpha$ and $\beta$, which act like "virtual observations". That is, a preponderance of pseudo-observed ones (larger $\alpha$) moves the probability mass to the right, while a preponderance of pseudo-observed zeros (larger $\beta$) moves it to the left. Here are a few beta distributions:


In [1]:
(require "c3_helpers.rkt"
         racket/list
         math/distributions)

$beta(1,1)$ is just a uniform distribution on $[0,1]$.


In [2]:
(define (plot-beta alpha beta)
  (let ([xs (range 0.0001 0.9999 0.001)])
    (line-c3 xs
             (map (lambda (x) (pdf (beta-dist alpha beta) x)) xs)
             #:xtickvalues '(0 1)
             #:showpoints #f)))

(plot-beta 1.5 2.5)


Out[2]:
[line plot of the beta(1.5, 2.5) probability density over (0, 1)]

In [9]:
(dist-pdf (beta-dist 1 1) 0.5)


Out[9]:
1.0

Like $beta(1,1)$, $beta(2,2)$ favors neither side of the interval, but, unlike $beta(1,1)$, it concentrates its probability mass near 0.5. In general, if the $\alpha$ and $\beta$ parameters are given equal values, larger values will concentrate more sharply on 0.5.
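For instance, compare $beta(2,2)$ (a sketch, using the same show-beta helper as the neighboring cells):


In [ ]:
(show-beta 2 2)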


In [62]:
(show-beta 10 10)


Out[62]:


When $\alpha$ is big, relative to $\beta$, the probability mass shifts towards 1.


In [66]:
(show-beta 4 1)


Out[66]:


Reversing the values of $\alpha$ and $\beta$ just flips the graph around the $x=0.5$ axis.

$\alpha$ and $\beta$ values below 1 concentrate probability mass near the extremes:


In [69]:
(show-beta 0.9 0.8)


Out[69]:


Assuming a moderate concentration of probability mass near 0.5, we can model five observed flips of the same weighted coin (plus a sixth, unobserved flip to predict) like so:


In [75]:
(define weighted-flips
  (rejection-sampler

   ;; Generative model
   (define wt (beta 5 5))
   (deflazy result1 (bernoulli wt))
   (deflazy result2 (bernoulli wt))
   (deflazy result3 (bernoulli wt))
   (deflazy result4 (bernoulli wt))
   (deflazy result5 (bernoulli wt))
   (deflazy result6 (bernoulli wt))

   ;; Observations
   (observe result1 1)
   (observe result2 1)
   (observe result3 1)
   (observe result4 1)
   (observe result5 1)

   ;; Query expression
   (list wt (exact->inexact result6))))

(sampler->mean weighted-flips 10000)


Out[75]:
(0.6665448617839016 0.6628000000000001)


Here we see that observing five 1s changes our best guess of the coin's weight to something like 0.67.

You probably noticed the new deflazy forms where the defines used to be. These are necessary because an observe must combine with the sampling expression it constrains, under the hood, whereas a define is evaluated immediately.

This happens to be one of those rare cases with a closed-form solution for the expected values of our query expressions. The beta distribution is conjugate to the Bernoulli distribution, which means that the posterior distribution, given all of these observations, is also a beta distribution.

Remember where I said that $\alpha$ and $\beta$ behave like pseudo-observations? Well, observations do too! To get the posterior beta, we just add the number of observed 1s to $\alpha$ to define $\alpha'$, and the number of observed 0s to $\beta$ to define $\beta'$. The posterior is then $beta(\alpha',\beta')$--here, $beta(10,5)$.

A quick look at Wikipedia confirms that the mean of this posterior is $10/(10+5) = 2/3 \approx 0.67$, and the posterior predictive probability that result6 comes up 1 is that same value.
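As a quick sanity check, we can sample directly from $beta(10,5)$ (a rejection-sampler with no observations just samples from the prior) and confirm that the mean is near $2/3$:


In [ ]:
(sampler->mean (rejection-sampler (beta 10 5)) 10000)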

[not budging enough after ten heads, leading to...]

[...more realistic, less mathematically convenient mixture distribution]

[...]

Beyond BNs: indexed random variables

There is a regular structure to the previous program that should be annoying to every programmer: all of those "deflazy resultN" lines look the same. All of the variables defined in those lines are drawn from the same distribution. Can't we put them in an indexed set, such as an array or hash? A popular Gamble idiom (borrowed from Church, but cleaned up) does something a tiny bit more abstract; we use defmem to draw a random function from objects of whatever type (integers, below) to draws from (bernoulli wt). Conceptually, the entire (infinite) table of values is drawn at the time of definition, but the implementation is lazy; the value for a particular input to the random function is drawn the first time that input is seen, and memoized.

[What would be the most natural way to say "draw me a function at random from the set of functions that assign to each object in the range a draw from dist"?]


In [70]:
;; Imports "range" ("for" is already part of racket/base).
(require racket/list)

(define weighted-flips
  (rejection-sampler

   ;; Generative model
   (define wt (beta 5 5))
   (defmem (result flip-num)
     (bernoulli wt))

   ;; Observations
   (for ([x (range 1 6)])
     (observe (result x) 1))

   ;; Query expression
   (list wt (exact->inexact (result 6)))))

(sampler->mean weighted-flips 10000)


Out[70]:
(0.6661860628194309 0.6676000000000001)


Metropolis-Hastings sampling also gets the right answer: (Hooray!)

(define weighted-flips-mh
  (mh-sampler

   ;; Generative model
   (define wt (beta 5 5))
   (defmem (result flip-num)
     (bernoulli wt))

   ;; Observations
   (for ([x (range 1 6)])
     (observe (result x) 1))

   ;; Query expression
   (list wt (exact->inexact (result 6)))))

(sampler->mean weighted-flips-mh 1000 #:burn 1000 #:thin 50)

[We have to track this down and fix it: adaptive-drift-proposal: scale increased out of range]





When easy methods fail

So what? You don't need the help of probability theory or computers to answer questions about a single coin flip, or even the long-run average of many coin flips. If you have a little bit of experience with such things, you can probably even answer simple conditional questions like "what's the probability that the first of two flips was heads, given that at least one flip was heads?"

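Here is one way that last question might look as code: a minimal sketch that rejects (by retrying) any pair of flips with no heads, and reuses sampler->mean to average many trials.


In [ ]:
;; Estimate p(first flip heads | at least one heads) by brute-force
;; rejection: re-run any trial in which both flips came up tails.
(define (first-heads-given-some-heads)
  (define first-heads (flip))
  (define second-heads (flip))
  (if (or first-heads second-heads)
      (if first-heads 1 0)                 ; keep the run; record the first flip
      (first-heads-given-some-heads)))     ; otherwise reject the run and retry

(sampler->mean first-heads-given-some-heads 10000)  ; should approach 2/3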

In coin flipping, our ignorance of, or inability to measure precisely enough, the factors that determine the outcome is nearly complete; but for many other systems (e.g., the motion of a projectile), we have pretty good deterministic models and measurements. The probabilistic parts are there to capture minor deviations from our ideal model caused by such nuisances as a changeable atmosphere and imperfect sensors. Random elements sweep under the rug all the factors that either aren't worth modeling, because they complicate the model without much improving predictions, or about which we can never know very much.

Intuition and case enumeration fail

When processes get just a little more complex than a pair of coin flips, though, unaided human minds either

  • fail to compute or intuit an answer:
    • Q: If anywhere from one to six dice are selected, with equal probability, and the maximum across all dice, in one roll, is 5, what is the probability that the number of dice chosen is two?
    • A: I need a computer for this (see the sketch just after this list). Or at least lots of paper and time.
  • intuit wrong answers:
    • Q: Are you having a hot streak at the roulette table?
    • A: Yeah, Baby! (wrong)
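Here is a sketch of the dice question as a Gamble program. It uses the enumerate and fail forms introduced later in this document, and assumes that (discrete-uniform n) draws uniformly from 0 through n-1 (so we add 1 to turn a draw into a die face or a dice count):


In [ ]:
(enumerate
 ;; Pick how many dice to roll: 1 through 6, with equal probability.
 (define n (+ 1 (discrete-uniform 6)))
 ;; Roll each die once.
 (define rolls (for/list ([i (in-range n)]) (+ 1 (discrete-uniform 6))))
 ;; Condition on the maximum across all dice being 5.
 (unless (= (apply max rolls) 5) (fail))
 ;; Query: the number of dice chosen.
 n)

The probability the resulting distribution assigns to 2 answers the question.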

Mathematical analysis fails

Sometimes we can use the calculus of probability to answer questions our intuition can't. For instance, suppose ....

Numerical approximations fail

We can also use built-in programming language functions to predict the outcomes of more complicated processes, such as flipping a coin 100 times and outputting 1 if heads occurs at least 70 times. What is the probability of this event? If you said "about 0.00004", give yourself a gold star (or give one to the software that told you the answer).
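If you'd like to check that number, the exact tail probability is a short computation in plain Racket; binomial, from the math/number-theory library, is the "n choose k" function:


In [ ]:
(require math/number-theory)
;; P(at least 70 heads in 100 fair flips): sum the binomial tail with
;; exact rational arithmetic, then convert to a float for readability.
(exact->inexact
 (/ (for/sum ([k (in-range 70 101)]) (binomial 100 k))
    (expt 2 100)))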

Questions like the one above (the probability of 70+ heads in 100 fair coin flips) can be answered quickly, to a high degree of precision, by closed-form math or by fast, accurate numerical methods. Queries over more complex models, however, have no such simple off-the-shelf solution methods. Note that modeling the system under consideration--call it "s1"--is not the same thing as designing a system ("s2") to answer questions about s1. s1 is usually what we care about, and s2 is what we have to muck with to get answers about s1.

In these cases, writing down a model of the system you care about is only half (or a tenth) of the battle. You also need to specify a method for getting an answer.

Simulation is useful when mathematical analysis runs out of steam.

Inference: why we need probabilistic programming

Making a model of how frequently diseases occur, and how often they produce various symptoms, is a fine exercise in formalization: write down a step-by-step description of the process, and then, if the math is too hard (or you are too lazy), draw samples to estimate how often it plays out this way or that. Note, though, that so far we haven't done anything we couldn't have done in just about any existing programming language.

Why do we need anything new?

Mere simulation of a process is not enough, because sampling from the possible executions of the process doesn't solve the kinds of problems people usually want to solve. Usually, we observe the values of some random variables in the model, and want to infer the probable values of some others. For example, a doctor observes a patient's symptoms and needs to infer the patient's probable disease status. The doctor knows that the process played out in some way such that particular symptoms occurred, and wants to know something else about how it played out--what unobservable disease happened to cause the symptoms.

Some inferences are easy...

The "Asia" model is typical of an intuitive, causal style of modeling. It defines a joint probability distribution over risk factors, diseases, and symptoms (that is, it defines the probability of every distinct set of values for those variables) by breaking the process in to small pieces, and it does so in a certain direction. The pieces correspond to hypothesized "causal mechanisms", about which you can read more here (link). For instance, disease probabilities depend directly on risk factors, because risk factors cause diseases. Similarly, symptom probabilities depend directly on diseases, because diseases cause symptoms. Risk factors also cause symptoms, but only indirectly, via diseases.

A model of this form makes it easy to make one kind of conditional inference: forward, from causes to effects (note: the Bayes net formalism doesn't require a causal ordering, and arrows should not always be interpreted as causal. What makes inference easy, in any Bayes net, is query variables that are descendants of observed ones, whether those descendants are effects or not).

Suppose we want to know the probability that Bob has tuberculosis, given that he has visited Asia. All we have to do is modify the model above to fix the value of $\verb+visit-to-asia+$ to $\verb+#t+$.

Simple rejection sampling implements the meaning of conditioning

The simplest approximation is this: run the program many times, and throw out those runs that don't produce Bob's symptoms. The proportion of retained runs in which tuberculosis is $\verb+#t+$ will be, on average, $p(tuberculosis|cough,x-ray)$.
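Concretely, the unconditioned joint model, a 1000-sample run, and its element-wise means might look like the following sketch. The structure mirrors the asia-w-visit cell shown later; the 0.01 prior probability of a visit to Asia is an assumption.


In [ ]:
;; A sketch of the unconditioned "Asia" model as a no-arg thunk.
(define (asia)
    (define visit-to-asia (flip 0.01))   ; 0.01 visit prior is assumed
    (define smoker (flip 0.5))
    (define tuberculosis (if visit-to-asia (flip 0.05) (flip 0.01)))
    (define lung-cancer (if smoker (flip 0.1) (flip 0.01)))
    (define bronchitis (if smoker (flip 0.6) (flip 0.3)))
    (define tb-or-c (or tuberculosis lung-cancer))
    (define x-ray-abnormal (if tb-or-c (flip 0.98) (flip 0.05)))
    (define dyspnea (cond [(and tb-or-c bronchitis) (flip 0.9)]
                          [tb-or-c (flip 0.7)]
                          [bronchitis (flip 0.8)]
                          [else (flip 0.1)]))
    (map (lambda (x) (if x 1 0)) (list dyspnea tuberculosis lung-cancer)))

(define asia-sample-means (sampler->mean asia 1000))
asia-sample-means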

Here we have

  • defined a program to
    • draw a sample from the joint distribution, via a set of internal defines,
    • define an output value (the (map (lambda ...) ...) expression), based on that sample;
  • run that program 1000 times; and
  • displayed proportions of samples for which the three variables in the output expression take on the value #t.

$\verb+asia+$ is a no-arg procedure (thunk) that draws a value for each random variable. sampler->mean is a convenience function that runs a sampling thunk a number of times and takes the element-wise means of the samples.

Digression: visualizing results

One thing you will want to do quite often is look at probability distributions represented as sets of samples. Histograms (and their smooth cousins, density plots) are very useful for this. For details, see the plot library, but the example below gives an idea of the basic syntax:
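For instance (a sketch; the beta(2,2) draws are just stand-in data for whatever samples you want to inspect):


In [ ]:
(require plot/pict)
;; Draw 1000 samples from beta(2,2) and eyeball their distribution
;; with the plot library's density renderer.
(plot (density (for/list ([i (in-range 1000)]) (beta 2 2)))
      #:x-label "value"
      #:y-label "estimated density")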

For some other small, special models, posterior distributions have nice closed forms. However, in the general case, these expressions involve very large sums and/or unsolvable integrals, and the only feasible approaches are approximate. The next few sections discuss a few Monte Carlo approximations. These are approximations that represent the posterior distributions of interest by sets of samples drawn from those distributions. The first--rejection sampling--is a direct embodiment of the meaning of conditioning. It is impractical for real problems, but useful as a conceptual aid, and as a check on the correctness of implementations of more sophisticated algorithms. The second is the (currently) most popular all-purpose sampling method.

Unfortunately, in many more complicated models nearly all runs will be rejected, because the observations are very improbable (in the case of continuous-valued observations, infinitely so). So, while simple rejection sampling is an intuitive way to think about the meaning of conditional inference, and sometimes a useful check on the correctness of implementations of other inference methods, it is rarely used in practice.

[Likelihood weighting]

Metropolis-Hastings sampling is the Swiss army knife of inference

We solved the "Asia" diagnosis problem above using "enum". We can also solve it approximately, using a general-purpose sampling algorithm:


In [73]:
(require plot/pict
         racket/draw)
(plot (discrete-histogram `(#(dyspnea ,(list-ref asia-sample-means 0))
                            #(tuberculosis ,(list-ref asia-sample-means 1))
                            #(lung-cancer ,(list-ref asia-sample-means 2))))
      #:x-label "problems"
      #:y-label "probabilities of problems")



In [27]:
(require plot/pict)
; A plotting convenience function
(define (show-beta alpha beta) 
  (plot (function 
         (lambda (x) (dist-pdf (beta-dist alpha beta) x)) 
         0.0001 
         0.9999 
         #:y-min 0
         #:y-max 4)))

(show-beta 1 1)



In [17]:
(define (asia-w-visit)
    (define visit-to-asia #t)
    (define smoker (flip 0.5))
    (define tuberculosis (if visit-to-asia (flip 0.05) (flip 0.01)))
    (define lung-cancer (if smoker (flip 0.1) (flip 0.01)))
    (define bronchitis (if smoker (flip 0.6) (flip 0.3)))
    (define tb-or-c (or tuberculosis lung-cancer))
    (define x-ray-abnormal (if tb-or-c (flip 0.98) (flip 0.05)))
    (define dyspnea (cond [(and tb-or-c bronchitis) (flip 0.9)]
                          [tb-or-c (flip 0.7)]
                          [bronchitis (flip 0.8)]
                          [else (flip 0.1)]))
    (map (lambda (x) (if x 1 0)) (list dyspnea tuberculosis lung-cancer)))

(define asia-w-visit-sample-means (sampler->mean asia-w-visit 1000))

(printf "dyspnea: ~a\ntuberculosis: ~a\nlung cancer: ~a\n"
        (list-ref asia-w-visit-sample-means 0)
        (list-ref asia-w-visit-sample-means 1)
        (list-ref asia-w-visit-sample-means 2))


dyspnea: 58/125
tuberculosis: 53/1000
lung cancer: 29/500


In [5]:
(require plot/pict)
(plot (list (discrete-histogram `(#(dyspnea ,(list-ref asia-w-visit-sample-means 0))
                                  #(tuberculosis ,(list-ref asia-w-visit-sample-means 1))
                                  #(lung-cancer ,(list-ref asia-w-visit-sample-means 2)))
                                    #:skip 2.5 
                                    #:x-min 0
                                    #:label "Visit known")
            (discrete-histogram `(#(dyspnea ,(list-ref asia-sample-means 0))
                                  #(tuberculosis ,(list-ref asia-sample-means 1))
                                  #(lung-cancer ,(list-ref asia-sample-means 2)))
                                    #:skip 2.5 
                                    #:x-min 1
                                    #:label "Nothing known"
                                    #:color 2
                                    #:line-color 2))
          #:x-label "Problems"
          #:y-label "Problem probabilities")


Out[5]:


...but inference is usually hard

Editing the model is not a very nice way to incorporate evidence, and not the way we'll usually do it. In fact, editing the program this way doesn't work at all for the obviously useful inference: the probability that Bob has tuberculosis given that he has some symptoms--say, an abnormal x-ray (though having recently made a trip to Asia might also be useful information--more on that later).

We will write the probability of disease status $d$, given symptoms $s$, as $p(d|s)$. We have defined a model in terms of disease probabilities, $p(d)$, and symptom probabilities given diseases, $p(s|d)$. The program represents a sequential generative process, where the conditions--the variables to the right of the "|"s--are sampled before the variables to the left. We can see now why editing the program fails: the program runs top to bottom (or inside to outside, for nested expressions), so fixing (rather than drawing) symptoms, which happen "later" in the generative process than diseases, won't affect the frequency with which diseases are drawn. This raises two questions: (1) How do we answer the doctor's "backwards" question using this model? (2) Why didn't we just make the model go the other way? The short, vague answer to (2) is that sometimes we do, but that there are often good reasons not to. Let's postpone the details of when and why to make non-causally ordered models.

Mathematically, the answer to (1) is "simple": by the definition of conditional probability

$$p(d|s) = p(d,s)/p(s)$$

But neither of these probabilities is provided directly by the model. Instead, we have to consider all the possible values of all the other variables. For the moment, let's summarize such a value set as "$n$" (for "nuisance").

$$p(d|s) = \frac{\sum_n p(d,s,n)}{\sum_n p(s,n)}$$

The model tells us just how to decompose those joint probabilities into products of local conditionals.

All of this leads to slogan two:

Probabilistic programs are written in programming languages that support conditioning.

For simple, mathematically convenient models, there are closed-form solutions to some inference problems. But such models are not the ones of most concern to us: when domain experts construct models of systems, they aren't thinking about mathematical convenience.

There are lots of other inference strategies and algorithms, several of which we will discuss. But we discuss them only briefly, because one of the purposes of a probabilistic programming language is to make it possible to think about models without at the same time thinking (much) about inference. This is not just for the benefit of the language's users (the modelers): implementing inference algorithms case by case is tiresome, repetitive, and error prone, so concentrating implementation and debugging effort into reusable form benefits implementers as well.

This is where probabilistic programming becomes clear as a new field, and not just an application of existing programming languages. Probabilistic programming aims to separate the inference problem from the model-specification problem, so that modelers need not know about inference algorithms in order to get answers to their questions, and so that inference methods and code can be reused across models, and efforts to validate and improve that code can be consolidated and conserved.

We need probabilistic programming to allow us to think about modeling without at the same time having to think (much) about inference.

For small, discrete-valued problems like the "Asia" BN, there are clever enumeration algorithms for exact inference. Approximate algorithms are more generally applicable. We consider several inference algorithms, and show how to use them to specify observations and query the "Asia" BN.

Inference algorithms

Inference by enumeration gives exact answers, when applicable

Instead of editing the model, the normal Gamble inference workflow specifies

  • a generative model (the program that specifies the process),
  • some observations, and
  • a query expression.

These three elements go into an inference method. Because the "Asia" Bayes net is small and discrete, we can solve it exactly, using enumerate. Enumerate considers all possible states of the non-observed variables, but does so in a clever way that is much more efficient than brute-force enumeration. [equivalent to VE?]

TODO: do something to "results", below, to get individual variable probabilities, as I do for the other inferences.
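Here is a sketch of those three elements for the abnormal-x-ray observation (the 0.01 visit prior is again an assumption); enumerate returns the exact conditional distribution, bound here to results:


In [ ]:
(define results
  (enumerate

   ;; Generative model (the 0.01 visit prior is assumed)
   (define visit-to-asia (flip 0.01))
   (define smoker (flip 0.5))
   (define tuberculosis (if visit-to-asia (flip 0.05) (flip 0.01)))
   (define lung-cancer (if smoker (flip 0.1) (flip 0.01)))
   (define bronchitis (if smoker (flip 0.6) (flip 0.3)))
   (define tb-or-c (or tuberculosis lung-cancer))
   (define x-ray-abnormal (if tb-or-c (flip 0.98) (flip 0.05)))
   (define dyspnea (cond [(and tb-or-c bronchitis) (flip 0.9)]
                         [tb-or-c (flip 0.7)]
                         [bronchitis (flip 0.8)]
                         [else (flip 0.1)]))

   ;; Observation: Bob's x-ray is abnormal.
   (unless x-ray-abnormal (fail))

   ;; Query expression
   (list dyspnea tuberculosis lung-cancer)))

results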

We can compare this with the original set of probabilities of problems to see that the probability that Bob has tuberculosis is indeed higher, now that we know he has visited the dangerous continent. It's even easier in pictures:

