Sebastian Raschka
last modified: 03/31/2014


I am really looking forward to your comments and suggestions to improve and extend this tutorial! Just send me a quick note
via Twitter: @rasbt
or Email: bluewoodtree@gmail.com


Problem Category

  • Statistical Pattern Recognition
  • Supervised Learning
  • Parametric Learning
  • Bayes Decision Theory
  • Univariate data
  • 2-class problem
  • equal scale parameters (the Cauchy distribution has no finite variance)
  • equal priors
  • Cauchy model (2 parameters)
  • with conditional risk (zero-one loss function)



Given information:

model: continuous univariate Cauchy distribution for the class-conditional densities

$p(x | \omega_i) = \frac{1}{\pi b} \; \cdot \; \frac{1}{1 \; + \; \bigg(\frac{x - a_i}{b} \bigg)^2} \quad i = 1, 2$

Prior probabilities:

$P(\omega_1) = P(\omega_2) = 0.5$

Loss functions:

where

$\lambda(\alpha_i|\omega_j) = \lambda_{ij}$,

the loss incurred if action $\alpha_i$ is taken when the true class is $\omega_j$ (where action $\alpha_i$ corresponds to classifying the sample as $\omega_i$)

$\lambda = \begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$

Parameters

$a_1 = 0, \quad a_2 = 2 ,\quad b=0.5$
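
As a quick sanity check of this parametrization (a side note, not part of the derivation): SciPy, if installed, ships the same density as scipy.stats.cauchy, with $a$ passed as loc and $b$ as scale. Both class-conditional densities should intersect at the midpoint $x = 1$:

In [ ]:
from scipy.stats import cauchy as cauchy_dist

# both densities evaluate to 1/(2.5*pi) ~ 0.1273 at the midpoint x = 1
print(cauchy_dist.pdf(1, loc=0, scale=0.5))  # p(x=1 | w_1)
print(cauchy_dist.pdf(1, loc=2, scale=0.5))  # p(x=1 | w_2)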



Deriving the decision boundary

Bayes' Rule:

$P(\omega_j|x) = \frac{p(x|\omega_j) \; \cdot \; P(\omega_j)}{p(x)}$

Risk Functions:

$R(\alpha_1|x) = \lambda_{11}P(\omega_1|x) + \lambda_{12}P(\omega_2|x)$

$R(\alpha_2|x) = \lambda_{21}P(\omega_1|x) + \lambda_{22}P(\omega_2|x)$
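
As a quick numerical illustration of these two formulas, both conditional risks can be computed at once as the product of the loss matrix with the posterior vector. The posterior values below are hypothetical, chosen only to show the mechanics:

In [ ]:
import numpy as np

loss = np.array([[0, 1],
                 [1, 0]])     # lambda_ij, the zero-one loss matrix from above

post = np.array([0.3, 0.7])   # hypothetical posteriors P(w_1|x), P(w_2|x)
risks = loss.dot(post)        # [R(alpha_1|x), R(alpha_2|x)]
print(risks)                  # -> [0.7 0.3]: alpha_2 has the smaller risk, decide w_2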

Decision Rule:

Decide $\omega_1 $ if $ R(\alpha_2|x) > R(\alpha_1|x) $ else decide $ \omega_2$.

$\Rightarrow \lambda_{21}P(\omega_1|x) + \lambda_{22}P(\omega_2|x) > \lambda_{11}P(\omega_1|x) + \lambda_{12}P(\omega_2|x)$

$\Rightarrow (\lambda_{21} - \lambda_{11}) \; P(\omega_1|x) > (\lambda_{12} - \lambda_{22}) \; P(\omega_2|x)$

$\Rightarrow \frac{P(\omega_1|x)}{P(\omega_2|x)} > \frac{(\lambda_{12} - \lambda_{22})}{(\lambda_{21} - \lambda_{11})} $

$\Rightarrow \frac{p(x|\omega_1) \; P{(\omega_1)}}{p(x|\omega_2) \; P{(\omega_2)}} > \frac{(\lambda_{12} - \lambda_{22})}{(\lambda_{21} - \lambda_{11})}$

$\Rightarrow \frac{p(x|\omega_1)}{p(x|\omega_2)} > \frac{(\lambda_{12} - \lambda_{22}) \; P{(\omega_2)}}{(\lambda_{21} - \lambda_{11}) \; P{(\omega_1)}}$

$\Rightarrow \frac{p(x|\omega_1)}{p(x|\omega_2)} > \frac{(1 - 0) \; (0.5)}{(1 - 0) \; (0.5)}$

$\Rightarrow \frac{p(x|\omega_1)}{p(x|\omega_2)} > 1$

$\Rightarrow \frac{1}{\pi b} \; \cdot \; \frac{1}{1 \; + \; \bigg(\frac{x - a_1}{b} \bigg)^2} > \frac{1}{\pi b} \; \cdot \; \frac{1}{1 \; + \; \bigg(\frac{x - a_2}{b} \bigg)^2}$

$\Rightarrow \frac{1}{1 \; + \; \bigg(\frac{x - a_1}{b} \bigg)^2} > \frac{1}{1 \; + \; \bigg(\frac{x - a_2}{b} \bigg)^2}$

Taking reciprocals flips the inequality:

$\Rightarrow \bigg(\frac{x - a_1}{b} \bigg)^2 < \bigg(\frac{x - a_2}{b} \bigg)^2$

Expanding both sides and cancelling $x^2$:

$\Rightarrow -2a_1x + a_1^2 < -2a_2x + a_2^2 \quad \Rightarrow \quad 2(a_2 - a_1) \; x < a_2^2 - a_1^2$

Dividing by $2(a_2 - a_1) > 0$ (since $a_2 > a_1$):

$\Rightarrow x < \frac{a_1+a_2}{2}$

Decide $\omega_1 $ if $ x < \frac{a_1+a_2}{2}$ else decide $\omega_2$.
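
This boundary can also be double-checked symbolically, e.g., with SymPy (assuming it is installed): solving $p(x|\omega_1) = p(x|\omega_2)$ for the given parameters should return the midpoint $x = 1$.

In [ ]:
import sympy

x = sympy.Symbol('x', real=True)
a1, a2, b = 0, 2, sympy.Rational(1, 2)
p1 = 1 / (sympy.pi * b * (1 + ((x - a1) / b)**2))
p2 = 1 / (sympy.pi * b * (1 + ((x - a2) / b)**2))
print(sympy.solve(sympy.Eq(p1, p2), x))  # [1] -> boundary at (a1 + a2)/2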


Plotting the class posterior probabilities and decision boundary


In [12]:
%pylab inline

import numpy as np
from matplotlib import pyplot as plt

def cauchy(x, a, b):
    """
    Calculates the Cauchy distribution's probability density 
    function.  
        
    """
    term1 = 1 / (np.pi * b)
    term2 = 1 / (1 + ((x - a)/b)**2)
    return term1 * term2

def decision_boundary(a1, a2):
    return (a1 + a2) / 2

def posterior(likelihood, prior):
    """
    Calculates the posterior probability via Bayes' rule,
    up to the scale factor p(x) (the evidence).
        
    """
    return likelihood * prior

# generating some sample data for plotting the posterior probabilities
x = np.arange(-10, 10, 0.05)

# probability density functions
posterior1 = posterior(cauchy(x, a=0, b=0.5), prior=0.5)
posterior2 = posterior(cauchy(x, a=2, b=0.5), prior=0.5)

# plotting probability density functions
plt.plot(x, posterior1)
plt.plot(x, posterior2)
plt.title('Posterior Probabilities w. Decision Boundary for a Cauchy Distribution')
plt.ylabel('P(w_j|x)')
plt.xlabel('random variable x')
plt.legend(['P(w_1|x)', 'P(w_2|x)'], loc='upper right')
plt.ylim([0,0.4])
plt.xlim([-4,6])

# plotting the decision boundary
bound = decision_boundary(a1=0, a2=2)
plt.vlines(bound, -10, 10, color='r', alpha=0.8, linestyle=':', linewidth=2)
plt.annotate('R1', xy=(0.0, 0.35), xytext=(0.0, 0.35))
plt.annotate('R2', xy=(2.0, 0.35), xytext=(2.0, 0.35))
plt.show()


Populating the interactive namespace from numpy and matplotlib


Classifying some random example data


In [23]:
import math 
import random as rnd

def cauchy(location, scale):
    """
    Generates random cauchy distributed data
    reference: http://www.johndcook.com/python_cauchy_rng.html
    
    """
    p = 0.0
    while p == 0.0:
        p = rnd.random()
        
    return location + scale*math.tan(math.pi*(p - 0.5))


# Parameters
a_1 = 0
a_2 = 2
b = 0.5

# Generating 20 Cauchy distributed random samples for class 1 & 2
x1_samples = [cauchy(a_1, b) for i in range(20)]
x2_samples = [cauchy(a_2, b) for i in range(20)]
y = [0.1 for i in range(20)]

plt.scatter(x1_samples, y, marker='o', color='green', s=40, alpha=0.5)
plt.scatter(x2_samples, y, marker='^', color='blue', s=40, alpha=0.5)
plt.ylim([0,0.4])
plt.xlim([-4,6])

# Plotting the decision boundary
bound = decision_boundary(a1=0, a2=2)
plt.vlines(bound, -10, 10, color='r', alpha=0.8, linestyle=':', linewidth=2)
plt.annotate('R1', xy=(0.0, 0.2), xytext=(0.0, 0.2))
plt.annotate('R2', xy=(2.0, 0.2), xytext=(2.0, 0.2))

# Plot annotation
plt.title('Classifying random example data from 2 classes')
plt.ylabel('')
plt.xlabel('random variable x')
plt.legend(['w_1', 'w_2'], loc='upper right')

plt.show()
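
As an aside, NumPy also ships a built-in Cauchy sampler, so the hand-rolled generator above could equivalently be replaced by a shifted and scaled standard Cauchy draw (a sketch, not used in the cells below):

In [ ]:
# location + scale * standard Cauchy variate
x1_np = a_1 + b * np.random.standard_cauchy(20)
x2_np = a_2 + b * np.random.standard_cauchy(20)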



Calculating the empirical error rate


In [21]:
bound = decision_boundary(a1=0, a2=2)

w1_as_w2, w2_as_w1 = 0, 0
for x1, x2 in zip(x1_samples, x2_samples):
    if x1 > bound:       # w_1 sample falls into region R2
        w1_as_w2 += 1
    if x2 <= bound:      # w_2 sample falls into region R1
        w2_as_w1 += 1

emp_err = (w1_as_w2 + w2_as_w1) / float(len(x1_samples) + len(x2_samples))
    
print('Empirical Error: {}%'.format(emp_err * 100))


Empirical Error: 10.0%
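
With only 20 samples per class, this estimate fluctuates quite a bit from run to run. A quick sketch (reusing the cauchy sampler and bound from the cells above) with far more samples should settle near the theoretical Bayes error derived below:

In [ ]:
n = 100000

# draw a large sample from each class and count the misclassifications
big1 = [cauchy(a_1, b) for i in range(n)]
big2 = [cauchy(a_2, b) for i in range(n)]
errors = sum(x > bound for x in big1) + sum(x <= bound for x in big2)

print('Empirical Error ({} samples per class): {:.2f}%'
      .format(n, 100.0 * errors / (2 * n)))  # ~14.76%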

Calculating the Bayes Error

$P(error) = \int_{x=-\infty}^{\infty} P(error|x) \; \cdot \; p(x) \; dx$

$P(error|x) = \left\{ \begin{array}{l l} P(\omega_1 | x) & \quad \text{if $x > \frac{a_1 + a_2}{2}$}\\ P(\omega_2 | x) & \quad \text{if $x < \frac{a_1 + a_2}{2}$}\end{array} \right.$

$\Rightarrow \int_{x=-\infty}^{\frac{a_1+a_2}{2}} P(\omega_2|x) \; \cdot \; p(x) \; dx + \int_{x=\frac{a_1+a_2}{2}}^{\infty} P(\omega_1|x) \; \cdot \; p(x) \; dx$

$\Rightarrow \int_{x=-\infty}^{\frac{a_1+a_2}{2}} p(x|\omega_2) \; \cdot \; P(\omega_2) \; dx + \int_{x=\frac{a_1+a_2}{2}}^{\infty} p(x|\omega_1) \; \cdot \; P(\omega_1) \; dx$

$\Rightarrow \int_{x=-\infty}^{\frac{a_1+a_2}{2}} \frac{1}{\pi b} \; \cdot \; \frac{1}{1 \; + \; \bigg(\frac{x - a_2}{b} \bigg)^2} \; \cdot \; \frac{1}{2} \; dx + \int_{x=\frac{a_1+a_2}{2}}^{\infty} \frac{1}{\pi b} \; \cdot \; \frac{1}{1 \; + \; \bigg(\frac{x - a_1}{b} \bigg)^2} \; \cdot \; \frac{1}{2} \; dx$

$\Rightarrow \frac{1}{2} - \frac{1}{\pi} \; \tan^{-1} \bigg| \frac{a_2 - a_1}{2b} \bigg|$


In [11]:
def bayes_error(a1, a2, b):
    """ Calculates the Bayes error for 2 Cauchy densities with equal priors. """
    return 0.5 - np.arctan(np.fabs((a2 - a1) / (2*b))) / np.pi

print('Bayes Error: {:.2f}%'.format(100 * bayes_error(0, 2, 0.5)))


Bayes Error: 14.76%
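
The closed-form result can be cross-checked by integrating the two error terms numerically, e.g., with scipy.integrate.quad (assuming SciPy is available):

In [ ]:
from scipy.integrate import quad
from scipy.stats import cauchy as cauchy_dist

# P(w_2)*p(x|w_2) integrated over R1, plus P(w_1)*p(x|w_1) over R2
err_w2, _ = quad(lambda x: 0.5 * cauchy_dist.pdf(x, loc=2, scale=0.5), -np.inf, 1)
err_w1, _ = quad(lambda x: 0.5 * cauchy_dist.pdf(x, loc=0, scale=0.5), 1, np.inf)
print('Numerically integrated Bayes Error: {:.2f}%'.format(100 * (err_w2 + err_w1)))  # ~14.76%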
