• Basics of Probability Theory
    • Probability Theory
    • Frequentist Approach
    • Bayes' Theorem
    • Bayesian Approach
    • Why is there a need for the Bayesian approach
  • Inference
    • A few standard distributions
    • Prior
    • Posterior
    • Examples
  • Modelling
    • Learning parameters from the data
    • Decision Theory
    • Linear Regression example
    • Example of Naive Bayes
  • Issues with Bayesian Modelling (If time permits)
    • Complexity Issues
    • Approximate methods
    • Sampling Methods
    • Introduction to Probabilistic Graphical Models

Probability Theory

Probability is a measure of how likely an event is to occur.
Probability theory is the branch of mathematics concerned with probability: the analysis of random phenomena.

As a simple example, consider rolling a fair six-sided die: each face comes up with probability 1/6.
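
A quick simulation confirms this (a minimal sketch; the sample size of 100000 is arbitrary):

In [ ]:
import numpy as np

# Simulate 100000 rolls of a fair six-sided die
rolls = np.random.randint(1, 7, size=100000)

# Empirical frequency of rolling a six vs. the theoretical 1/6
print("Empirical:  ", np.mean(rolls == 6))
print("Theoretical:", 1 / 6)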

Random Variables

Discrete Random Variables

Continuous Random Variables

Probability Distributions
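
As a minimal sketch of the two kinds of random variables (the binomial and normal distributions here are just illustrative choices), scipy.stats exposes a probability mass function for discrete distributions and a probability density function for continuous ones:

In [ ]:
from scipy.stats import binom, norm

# Discrete: P(X = 3) for X ~ Binomial(n=10, p=0.5)
print(binom(10, 0.5).pmf(3))

# Continuous: density at 0 for X ~ Normal(0, 1)
print(norm(0, 1).pdf(0))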

Frequentist Approach

Shortcomings of the frequentist approach

Shortcomings of the Bayesian approach

The issue of the prior

Differences between the Bayesian and frequentist approaches

Bayesian Statistics

The essential characteristic of Bayesian methods is their explicit use of probability for quantifying uncertainty in inferences based on statistical data analysis.

The whole idea of Bayesian statistics revolves around this formula:

$$ P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)} $$

Here $P(\theta)$ is the prior, $P(D \mid \theta)$ the likelihood, $P(D)$ the evidence, and $P(\theta \mid D)$ the posterior.

In Bayesian statistics we assign probabilities to our parameters instead of settling on a single model, as we do with maximum likelihood. At prediction time this leaves us free to choose: we can predict using only the most probable parameter values, or predict with every model and take a weighted average according to the models' posterior probabilities.

As a practical example, consider estimating the bias $\theta$ of a coin. The Beta distribution is a conjugate prior for the binomial likelihood, so after observing $h$ heads and $t$ tails, a $\text{Beta}(a, b)$ prior simply updates to a $\text{Beta}(a + h, b + t)$ posterior.
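
A sketch of this update (the Beta(2, 2) prior and the counts of 7 heads and 3 tails are made-up numbers):

In [ ]:
from scipy.stats import beta

# Weak prior belief about the coin's bias, centred at 0.5
a, b = 2, 2

# Observed data: 7 heads, 3 tails
heads, tails = 7, 3

# By conjugacy the posterior is Beta(a + heads, b + tails) = Beta(9, 5)
posterior = beta(a + heads, b + tails)
print(posterior.mean())  # posterior mean of theta, 9/14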

The problem we face with Bayesian statistics, however, is the difficult (often high-dimensional) integration required to find the posterior distribution.

To deal with this problem we have different methods, such as sampling.
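
As a minimal sketch of the sampling idea (reusing the Beta(9, 5) posterior from the coin example above), an expectation under the posterior can be approximated by an average over random draws, with no integral in sight:

In [ ]:
from scipy.stats import beta

# Posterior from the coin example
posterior = beta(9, 5)

# Monte Carlo: approximate E[theta] by averaging draws from the posterior
samples = posterior.rvs(size=100000)
print("Monte Carlo estimate:", samples.mean())
print("Exact mean:          ", posterior.mean())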

Main steps in Bayesian Data Analysis:

  1. Setting up a full probability model.
  2. Conditioning on observed data.
  3. Evaluating the fit of the model and the implications of the resulting posterior distribution.

In [38]:
from scipy.stats import beta

# A sharply peaked prior: Beta(1000, 1000) concentrates tightly around 0.5
prior = beta(1000, 1000)

In [39]:
import numpy as np
import matplotlib.pyplot as plt

# Plot the prior density over [0, 1]
x = np.linspace(0, 1, 1000)
plt.plot(x, prior.pdf(x), alpha=0.3, c='g')


[figure: density of the Beta(1000, 1000) prior]

In [8]:
%matplotlib inline
from __future__ import print_function
from IPython.html.widgets import interact

In [55]:
from scipy.stats import beta

def plot(a, b):
    # Density of a Beta(a, b) distribution over [0, 1]
    x = np.linspace(0, 1, 1000)
    y = beta(a, b).pdf(x)
    plt.plot(x, y, c='g')
    plt.show()

# Interactive sliders; Beta parameters must be positive, so start the ranges at 1
interact(plot, a=(1, 1000), b=(1, 1000))


Linear regression example using Bayesian learning

We first generate synthetic data from the line y = 0.4x + 0.6 with additive Gaussian noise, and then recover a posterior distribution over the intercept and slope.


In [61]:
# Synthetic data: true intercept 0.6, true slope 0.4, unit-variance Gaussian noise
x = np.linspace(0, 10, 100)
y = 0.4 * x + 0.6
y_with_noise = y + np.random.randn(100)
plt.scatter(x, y_with_noise, alpha=0.5, c='g')


[figure: scatter plot of the noisy data]

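To complete the example, here is a minimal sketch of conjugate Bayesian linear regression with a zero-mean Gaussian prior over the weights and known noise precision. The prior precision alpha, the noise precision beta_, and the names Phi, S_N, m_N are my choices, following the standard conjugate update:

In [ ]:
import numpy as np

# Design matrix with a bias column, so the weights are [intercept, slope]
Phi = np.column_stack([np.ones_like(x), x])

alpha = 2.0   # precision of the zero-mean Gaussian prior over the weights
beta_ = 1.0   # noise precision (the noise added above has variance 1)

# By conjugacy the posterior over the weights is Gaussian:
#   S_N = (alpha * I + beta * Phi^T Phi)^-1
#   m_N = beta * S_N Phi^T y
S_N = np.linalg.inv(alpha * np.eye(2) + beta_ * Phi.T.dot(Phi))
m_N = beta_ * S_N.dot(Phi.T).dot(y_with_noise)

print("Posterior mean of [intercept, slope]:", m_N)
print("True values:                          [0.6, 0.4]")

Instead of a single fitted line we now have a full distribution over lines; the posterior mean should land close to the true intercept 0.6 and slope 0.4, with S_N quantifying the remaining uncertainty.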