We are going to sample 500 points each from three 2D Gaussian distributions. The means of the three Gaussians are $\mu_1 = [a, b]^T$, $\mu_2 = [a+2, b+4]^T$ and $\mu_3 = [a+4, b]^T$ respectively, where $a$ is the last digit of your roll number and $b$ is the second-last digit of your roll number.
The covariance matrices are all the identity: $\Sigma_1 = \Sigma_2 = \Sigma_3 = I$
To generate points from 2D Gaussians, we first need to know how to generate random numbers.
In [1]:
%matplotlib inline
# code to sample a random number between 0 & 1
# Try running this multiple times by pressing Ctrl-Enter
import numpy as np
import matplotlib.pyplot as plt
print(np.random.random())
In [14]:
print(np.random.randn())  # a sample from the standard normal N(0, 1)
In [6]:
points = np.random.normal(3, 1, 1000)
# A histogram plot. It looks like a gaussian distribution centered around 3
plt.hist(points)
plt.show()
In [7]:
mean = np.array([3, 3])
cov = np.eye(2) # the identity matrix
points = np.random.multivariate_normal(mean, cov, 1000)
# scatter plot with x axis as the first column of points and y axis as the second column
plt.scatter(points[:, 0], points[:, 1])
plt.show()
The means of the three Gaussians should be $\mu_1 = [a, b]^T$, $\mu_2 = [a+2, b+4]^T$ and $\mu_3 = [a+4, b]^T$ respectively, where $a$ is the last digit of your roll number and $b$ is the second-last digit of your roll number.
The covariance matrices are $\Sigma_1 = \Sigma_2 = \Sigma_3 = I$
In [8]:
# Example with a = 3, b = 0, so the means are [3, 0], [5, 4] and [7, 0]
d1 = np.random.multivariate_normal([3, 0], cov, 500)
d2 = np.random.multivariate_normal([5, 4], cov, 500)
d3 = np.random.multivariate_normal([7, 0], cov, 500)
data = np.vstack([d1, d2, d3])
plt.scatter(d1[:, 0], d1[:, 1], color='red')
plt.scatter(d2[:, 0], d2[:, 1], color='blue')
plt.scatter(d3[:, 0], d3[:, 1], color='green')
plt.show()
We have assumed that $\Sigma = \sigma^2 I$.
Convince yourself that the Maximum Likelihood Estimate for $\sigma^2$ is $\frac{1}{2n}\sum\limits_{i=1}^n (x_i-\mu)^T(x_i-\mu)$, where $n$ is the number of samples.
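A sketch of the argument (assuming $\mu$ known and dimension $D = 2$): the log-likelihood of the $n$ samples is

```latex
\ln L(\sigma^2) = -\sum_{i=1}^{n}\left[\frac{D}{2}\ln(2\pi\sigma^2)
                + \frac{(x_i-\mu)^T(x_i-\mu)}{2\sigma^2}\right]
```

Setting the derivative with respect to $\sigma^2$ to zero,

```latex
\frac{\partial \ln L}{\partial \sigma^2}
  = -\frac{nD}{2\sigma^2}
    + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i-\mu)^T(x_i-\mu) = 0
\;\Rightarrow\;
\hat\sigma^2 = \frac{1}{nD}\sum_{i=1}^{n}(x_i-\mu)^T(x_i-\mu)
             = \frac{1}{2n}\sum_{i=1}^{n}(x_i-\mu)^T(x_i-\mu) \quad (D = 2)
```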
Let's compute the maximum likelihood estimates for the three sets of data points (generated from 3 different Gaussians) separately, denote them as $\hat\sigma_1^2$, $\hat\sigma_2^2$ and $\hat\sigma_3^2$, and then take the combined estimate as the average of the three estimates.
In [ ]:
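One possible sketch of this computation (assumption: the example means $[3,0]$, $[5,4]$, $[7,0]$ used in the sampling cell above; the data is regenerated here so the cell runs on its own):

```python
import numpy as np

# Regenerate the three clusters (assumed example means with a = 3, b = 0)
cov = np.eye(2)
d1 = np.random.multivariate_normal([3, 0], cov, 500)
d2 = np.random.multivariate_normal([5, 4], cov, 500)
d3 = np.random.multivariate_normal([7, 0], cov, 500)

def sigma2_mle(d):
    # MLE of sigma^2 when Sigma = sigma^2 * I in 2D:
    # (1 / (2n)) * sum_i (x_i - mu)^T (x_i - mu), estimating mu by the sample mean
    mu = d.mean(axis=0)
    diff = d - mu
    return np.sum(diff * diff) / (2 * len(d))

s1, s2, s3 = sigma2_mle(d1), sigma2_mle(d2), sigma2_mle(d3)
sigma2_hat = (s1 + s2 + s3) / 3.0
print(sigma2_hat)  # should come out close to 1, since the true Sigma is I
```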
Refer to your notes/textbook to convince yourself that, in the particular case where all the normal distributions have the same prior and the same covariance matrix of the form $\sigma^2 I$, the discriminant functions are given by $$g_i(x) = \mu_i^T x - \frac{1}{2}\mu_i^T\mu_i$$ Find the point at which $g_1(x) = g_2(x) = g_3(x)$.
Draw the three decision boundaries by solving $g_1(x) = g_2(x)$, $g_1(x) = g_3(x)$ and $g_2(x) = g_3(x)$
In [ ]:
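A sketch of finding the meeting point (assumption: the example means with $a = 3$, $b = 0$ used earlier). Each equality $g_i(x) = g_j(x)$ is a line, $(\mu_i - \mu_j)^T x = \frac{1}{2}(\mu_i^T\mu_i - \mu_j^T\mu_j)$, and two of the three boundaries already pin down the common point:

```python
import numpy as np

# Assumed example means (a = 3, b = 0)
mu = np.array([[3.0, 0.0], [5.0, 4.0], [7.0, 0.0]])

# g_i(x) = mu_i^T x - 0.5 mu_i^T mu_i, so g_i(x) = g_j(x) is the line
#   (mu_i - mu_j)^T x = 0.5 * (mu_i^T mu_i - mu_j^T mu_j)
A = np.array([mu[0] - mu[1],
              mu[0] - mu[2]])
b = 0.5 * np.array([np.dot(mu[0], mu[0]) - np.dot(mu[1], mu[1]),
                    np.dot(mu[0], mu[0]) - np.dot(mu[2], mu[2])])
meet = np.linalg.solve(A, b)
print(meet)  # the meeting point of the three boundaries
```

Drawing each decision boundary then just means plotting the set of points satisfying $(\mu_i - \mu_j)^T x = \frac{1}{2}(\mu_i^T\mu_i - \mu_j^T\mu_j)$ over the scatter plot, for each pair $(i, j)$.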
The kernel density model is given by $$p(x) = \frac{1}{N} \sum_{i=1}^N \frac{1}{(2\pi h^2)^{D/2}} \exp\left(-\frac{(x-x_i)^T(x-x_i)}{2h^2}\right)$$ where $D$ is the dimension (which is 2 here), $h$ is the standard deviation (smoothing) parameter we have to set, and $N$ is the total number of samples.
In [ ]:
# Sample 100 points from an equal mixture of N(2, 1) and N(6, 1)
points = np.array([])
for i in range(100):
    if np.random.rand() > 0.5:
        points = np.append(points, np.random.normal(2, 1))
    else:
        points = np.append(points, np.random.normal(6, 1))
plt.hist(points)
plt.show()
Our $x$ ranges approximately from $-2$ to $10$. The pdf is given by $p(x) = \frac{1}{N} \sum\limits_{i=1}^N \frac{1}{(2\pi h^2)^{1/2}} \exp\left(-\frac{(x-x_i)^2}{2h^2}\right)$ for every value of $x$. To plot the estimated density, we compute this pdf over a range of $x$ from $-2$ to $10$, incrementing $x$ by $0.02$. Choose different values of the smoothing parameter $h$ to get the best density estimate. (Try $h = 0.08, 0.1, 0.15$, etc.) What value of $h$ recovers the bimodal distribution?
In [2]:
h = 0.08
X = np.arange(-2, 10, 0.02)
# for each point x in X, compute its estimated pdf
Y = np.array([])
N = len(points)
for x in X:
    t = 0
    for xi in points:
        t += np.exp(-(x - xi)**2 / (2 * h * h))
    y = (t / (2 * np.pi * h * h)**0.5) / N
    Y = np.append(Y, y)
plt.plot(X, Y)
plt.show()
Similarly, do density estimation for the data set above, which we sampled from three 2D Gaussians.
Note: it would be computationally expensive to calculate the density for all the points in the 2D plane, so do density estimation only for points in the square $[c-2, c+2] \times [d-2, d+2]$, where $(c, d)$ denotes the coordinates of the meeting point of the three discriminant lines in the Linear Discriminant Analysis we did above.
In [38]:
from mpl_toolkits.mplot3d import Axes3D
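A possible sketch of the 2D density estimate (assumption: the example means $[3,0]$, $[5,4]$, $[7,0]$, whose discriminant lines meet at $(c, d) = (5, 1.5)$; the data is regenerated so the cell runs on its own):

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # enables projection='3d'

# Regenerate the three clusters (assumed example means with a = 3, b = 0)
cov = np.eye(2)
d1 = np.random.multivariate_normal([3, 0], cov, 500)
d2 = np.random.multivariate_normal([5, 4], cov, 500)
d3 = np.random.multivariate_normal([7, 0], cov, 500)
data = np.vstack([d1, d2, d3])

h = 0.3            # smoothing parameter; try other values too
c, d = 5.0, 1.5    # assumed meeting point of the discriminant lines
xs = np.arange(c - 2, c + 2, 0.05)
ys = np.arange(d - 2, d + 2, 0.05)
X, Y = np.meshgrid(xs, ys)

# 2D kernel density: p(x) = (1/N) sum_i (1 / (2 pi h^2)) exp(-||x - x_i||^2 / (2 h^2))
Z = np.zeros_like(X)
for xi in data:
    Z += np.exp(-((X - xi[0])**2 + (Y - xi[1])**2) / (2 * h * h))
Z /= len(data) * 2 * np.pi * h * h

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z)
plt.show()
```

Rerun with a few different values of `h` and observe how the surface smooths out (large `h`) or becomes spiky (small `h`).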