In [1]:
from discrete_probability import *
First we want to call init_printing in SymPy to get SymPy to output nice MathJax LaTeX.
In [2]:
import sympy
sympy.init_printing()
Now we'll define our simple example with data to use later.
In [3]:
E, H, S = variables = list(map(Variable, 'EHS'))  # list() so variables survives the unpacking under Python 3
e, e_ = E
h, h_ = H
s, s_ = S
data_header = [H, S, E]
data_samples = [
[True, False, True],
[True, False, True],
[False, True, False],
[False, False, True],
[True, False, False],
[True, False, True],
[False, False, False],
[True, False, True],
[True, False, True],
[False, False, True],
[True, False, True],
[True, True, True],
[True, False, True],
[True, True, True],
[True, False, True],
[True, False, True]]
data_assignments = data_to_assignments(data_header, data_samples)
A quick example of using SymPy with pyDiscreteProbability is enforcing simplified fractions for probabilities. For example, here is the table from the basic example.
In [4]:
P = Table(variables)
P.learn_from_complete_data(data_assignments)
P
Out[4]:
By using the SymPy number system we can let SymPy handle and preserve the original fractions instead of relying on floating point numbers.
In [5]:
P = Table(variables, number_system=sympy_number_system)
P.learn_from_complete_data(data_assignments)
P
Out[5]:
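The underlying mechanism is plain SymPy, independent of pyDiscreteProbability: Rational values auto-simplify and stay exact under arithmetic, where native floats would drift. A minimal sketch:

```python
import sympy

# Rationals reduce automatically and stay exact under arithmetic,
# which is what preserves the original fractions in the table above.
count = sympy.Rational(12, 16)
assert count == sympy.Rational(3, 4)           # auto-simplified

total = sum(sympy.Rational(1, 10) for _ in range(10))
assert total == 1                              # exactly one, no float drift
```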
The number system is preserved through operations as well.
In [6]:
P(E,H)
Out[6]:
In [7]:
P(E|S,H)
Out[7]:
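This preservation falls out of SymPy's arithmetic being closed over Rational: sums and quotients of exact fractions (the operations behind marginalization and conditioning) are again exact fractions. A small plain-SymPy sketch:

```python
import sympy

# Marginalizing sums table entries; conditioning divides by a marginal.
# With Rational entries both results are again exact Rationals.
p_joint = sympy.Rational(3, 16) + sympy.Rational(1, 16)  # 4/16 reduces to 1/4
p_cond = p_joint / sympy.Rational(1, 2)
assert isinstance(p_cond, sympy.Rational)
assert p_cond == sympy.Rational(1, 2)
```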
We can also use SymPy to store expressions in our probability table. First we'll define the symbols that will form our probability expressions.
In [8]:
x, y, z = sympy.symbols('x y z')
Now we'll create a simple joint marginal Table. We need to set it to ignore validity or else it will raise an error, because the Table is not "valid": the probabilities do not sum to one. We'll also tell it to use the SymPy number system; by default Table uses a float number system.
In [9]:
A, B, C = variables = list(map(Variable, 'ABC'))  # list() so variables survives the unpacking under Python 3
a, a_ = A
b, b_ = B
c, c_ = C
P = Table(variables, ignore_validity=True, number_system=sympy_number_system)
Now we'll fill our probability table with some simple expressions as a silly demonstration of using SymPy with the table.
In [10]:
P[a , b , c ] = 0.11
P[a , b , c_] = 0.05+z
P[a , b_, c ] = 0.07
P[a , b_, c_] = 0.21
P[a_, b , c ] = 0.32
P[a_, b , c_] = 0.04
P[a_, b_, c ] = sympy.sqrt(x)
P[a_, b_, c_] = y
P
Out[10]:
Notice that the table output preserves the expressions thanks to SymPy. Now we'll define some constraints so we can demonstrate SymPy's solver. The following two constraints are simply to fully define the system.
In [11]:
constraint_one = sympy.Eq(P(a, b), sympy.S('0.15'))
constraint_one
Out[11]:
In [12]:
constraint_two = sympy.Eq(x, sympy.S('2*y'))
constraint_two
Out[12]:
This constraint below, however, is fundamental. It is an expression of the second fundamental axiom of probability which states that the sum of probabilities of all possible world instantiations must be equal to one.
In [13]:
second_axiom = sympy.Eq(P.total_probability(), sympy.S('1'))
second_axiom
Out[13]:
Now we'll use SymPy's solver to find appropriate values for $x$ and $y$. I will admit that I couldn't get it to behave with the first fundamental axiom of probability, which requires that the individual world probabilities be greater than or equal to zero. Instead, the example is contrived to return positive solutions.
In [14]:
soln = sympy.solve([second_axiom, constraint_one, constraint_two])[0]
soln
Out[14]:
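One way to approximate the first axiom after the fact is to filter the solver's output, discarding any solution containing a negative value. The tiny system below is a hypothetical stand-in, not the table above; it only illustrates that plain sympy.solve returns all roots regardless of sign:

```python
import sympy

x, y = sympy.symbols('x y')
# solve() ignores nonnegativity, so this system has two roots (x = ±1/2);
# we keep only the solution where every value is >= 0.
sols = sympy.solve([sympy.Eq(x**2, sympy.Rational(1, 4)),
                    sympy.Eq(y, 1 - x)], [x, y], dict=True)
nonneg = [s for s in sols if all(v >= 0 for v in s.values())]
assert len(nonneg) == 1
assert nonneg[0][x] == sympy.Rational(1, 2)
```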
To demonstrate the validity of the solution we'll redefine the table with it. The normalize_number_system method ensures that the number types all match the table's number system; here that means converting the Python floats to SymPy floats.
In [15]:
P[a , b , c ] = 0.11
P[a , b , c_] = 0.05+soln[z]
P[a , b_, c ] = 0.07
P[a , b_, c_] = 0.21
P[a_, b , c ] = 0.32
P[a_, b , c_] = 0.04
P[a_, b_, c ] = sympy.sqrt(soln[x])
P[a_, b_, c_] = soln[y]
P.normalize_number_system()
print(P)
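The kind of conversion normalize_number_system performs can be seen in plain SymPy: sympify wraps a native Python float as a sympy.Float, so all entries share one number type. A minimal sketch (the values are just the first few table entries):

```python
import sympy

# sympify turns native Python numbers into their SymPy counterparts,
# which is the conversion normalization applies across the table.
vals = [0.11, 0.05, 0.07]
sym_vals = [sympy.sympify(v) for v in vals]
assert all(isinstance(v, sympy.Float) for v in sym_vals)
```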
Now we can check that the solution satisfies the second axiom of probability.
In [16]:
print(P.total_probability())
print(P.is_valid)