In [1]:
from lea import *
In [2]:
# mandatory die example - initialize a fair six-sided die object
die = Lea.fromVals(1, 2, 3, 4, 5, 6)
In [3]:
# throw the die a few times
die.random(20)
Out[3]:
In [4]:
# mandatory coin toss example - states can be strings!
coin = Lea.fromVals('Head', 'Tail')
In [5]:
# toss the coin a few times
coin.random(10)
Out[5]:
In [6]:
# how about a Boolean variable - only True or False ?
rain = Lea.boolProb(5, 100)
In [7]:
# how often does it rain in Chennai ?
rain.random(10)
Out[7]:
In [8]:
# How about standard statistics ?
die.mean, die.mode, die.var, die.entropy
Out[8]:
Summary
Random variables are abstract objects. Drawing random samples is transparent: variable.random(times). Standard statistical metrics of the probability distribution are also part of the object.
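As a quick sanity check (a sketch, not part of the original session), the mean of a fair die is (1+2+3+4+5+6)/6 = 3.5, so the empirical average of a large random sample should come close:
# die.random(n) returns a tuple of n sampled values
samples = die.random(10000)
empirical_mean = sum(samples) / float(len(samples))  # should be close to die.mean == 3.5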
In [9]:
# Let's create two dice
die1 = die.clone()
die2 = die.clone()
In [10]:
# The sum of two throws of the die
dice = die1 + die2
In [11]:
dice
Out[11]:
In [12]:
dice.random(10)
Out[12]:
In [13]:
dice.mean
Out[13]:
In [14]:
dice.mode
Out[14]:
In [15]:
print(dice.histo())
Summary
Random variables are abstract objects. Methods are available for operating on them algebraically. The probability distribution, the methods for drawing random samples, and the statistical metrics are all transparently propagated to the derived variable.
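For instance, the probability of any particular sum can be read directly off the derived distribution (a sketch assuming Lea 2's p() accessor):
dice.p(7)   # 6/36 - seven is the most likely sum
dice.p(2)   # 1/36 - "snake eyes"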
"You just threw two dice. Can you guess the result ?"
"Here's a tip : the sum is less than 6"
In [16]:
## We can create a new distribution, conditioned on our state of knowledge : P(sum | sum <= 6)
conditionalDice = dice.given(dice <= 6)
In [17]:
## What is our best guess for the result of the throw ?
conditionalDice.mode
Out[17]:
In [18]:
## Conditioning can be done in many ways : suppose we know that the first die came up 3.
dice.given(die1 == 3)
Out[18]:
In [19]:
## Conditioning can be done in still more ways : suppose we know that **either** of the two dice came up 3
dice.given((die1 == 3) | (die2 == 3))
Out[19]:
Summary
Conditioning, which is the first step towards inference, is done automatically. A wide variety of conditions can be used: P(A | B) translates to a.given(b).
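Conditioning also works in the inferential direction: given an observation of the sum, we can query the distribution of one die (again, a sketch):
# if the sum is 4, die1 can only be 1, 2 or 3, each with probability 1/3
die1.given(dice == 4)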
An entomologist spots what might be a rare subspecies of beetle, due to the pattern on its back. In the rare subspecies, 98% have the pattern, i.e. P(pattern | species=rare) = 0.98. In the common subspecies, only 5% have the pattern, i.e. P(pattern | species=common) = 0.05. The rare subspecies accounts for only 0.1% of the population. How likely is a beetle with the pattern to be rare, i.e. what is P(species=rare | pattern) ?
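Before letting Lea do the work, here is the hand computation with Bayes' rule, for reference:
# P(rare | pattern) = P(pattern | rare) * P(rare) / P(pattern)
#                   = 0.98 * 0.001 / (0.98 * 0.001 + 0.05 * 0.999)
p_rare_given_pattern = 0.98 * 0.001 / (0.98 * 0.001 + 0.05 * 0.999)
# about 0.0192, i.e. roughly 2% - Lea should reproduce this below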
In [20]:
# Species is a random variable with states "common" and "rare", with probabilities determined by the population.
# Since there are only two states, the species states are, equivalently, "rare" and "not rare". Species can be a Boolean!
rare = Lea.boolProb(1,1000)
In [21]:
# Similarly, pattern is either "present" or "not present". It too is a Boolean, but its probability distribution
# is conditioned on "rare" or "not rare"
patternIfRare = Lea.boolProb(98, 100)
patternIfNotRare = Lea.boolProb(5, 100)
In [22]:
# Now, let's build the conditional probability table for P(pattern | species)
pattern = Lea.buildCPT((rare, patternIfRare), (~rare, patternIfNotRare))
In [23]:
# Sanity check : do we get what we put in ?
pattern.given(rare)
Out[23]:
In [24]:
# Finally, our moment of truth : Bayesian inference - what is P(rare | pattern) ?
rare.given(pattern)
Out[24]:
In [25]:
# And now, to show off : what is the probability of being rare and having the pattern ?
rare & pattern
Out[25]:
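A hand check via the chain rule: P(rare and pattern) = P(pattern | rare) * P(rare) = 0.98 * 0.001 = 0.00098, which is the probability the expression above should assign to True.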
In [26]:
# All possible joint outcomes, via the cartesian product of the two variables
Lea.cprod(rare, pattern)
Out[26]:
Summary
Lea contains all the basic ingredients of a probabilistic programming language, and it is an excellent way to learn the paradigms of probabilistic programming. Lea is currently limited to discrete random variables. For continuous random variables, and for use in live applications, a more mature and capable tool such as Stan or BayesPy should be used.