Using numpy and matplotlib, plot the exponential distribution with $\lambda = 1$, $1.5$, and $2$. Include a legend that uses $\LaTeX$.
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
In [2]:
#Question 1.1
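x = np.linspace(0, 3, 1000)
# exponential pdf: lambda * exp(-lambda * x), so the prefactor equals lambda
exp1 = 1.0 * np.exp(-1.0 * x)
exp2 = 1.5 * np.exp(-1.5 * x)
exp3 = 2.0 * np.exp(-2.0 * x)
# raw strings keep the backslash in \lambda from being read as an escape
plt.plot(x, exp1, label=r'$\lambda = 1.0$')
plt.plot(x, exp2, label=r'$\lambda = 1.5$')
plt.plot(x, exp3, label=r'$\lambda = 2.0$')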
plt.legend(loc='upper right')
plt.show()
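As a quick sanity check on the curves, the sketch below integrates the exponential pdf $\lambda e^{-\lambda x}$ numerically over the plotted range and compares the result with the exact value $1 - e^{-3\lambda}$. The helper name `exponential_pdf` is introduced here only for illustration and is not part of the original solution.

```python
import numpy as np


def exponential_pdf(x, lam):
    """Exponential pdf, lam * exp(-lam * x), for x >= 0."""
    return lam * np.exp(-lam * x)


x = np.linspace(0, 3, 1000)
areas = {}
for lam in (1.0, 1.5, 2.0):
    y = exponential_pdf(x, lam)
    # trapezoid rule written out by hand
    areas[lam] = float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))
    exact = 1 - np.exp(-lam * 3)
    print(f"lambda = {lam}: numeric area = {areas[lam]:.4f}, exact = {exact:.4f}")
```

The area is slightly below 1 only because the plot stops at $x = 3$. Each curve starts at a height of $\lambda$ at $x = 0$.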
In [3]:
#Question 1.2
from scipy.special import comb
N = 10
p = 0.2
x = np.arange(0, N + 1)
b1 = comb(N, x) * p**(x) * (1 - p)**(N - x)
p = 0.5
b2 = comb(N, x) * p**(x) * (1 - p)**(N - x)
plt.plot(x, b1, 'o-', label="$N = 10$, $p = 0.2$")
plt.plot(x, b2, 'o-', label="$N = 10$, $p = 0.5$")
plt.legend()
plt.show()
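One way to check the binomial formula used above is to confirm that the probabilities sum to 1 and that the mean equals $Np$. This is a minimal sketch using only the standard library. The function name `binomial_pmf` is an illustrative assumption, and the two values of $p$ are the ones assigned in the cell above.

```python
from math import comb


def binomial_pmf(n, N, p):
    """Probability of n successes in N trials with success probability p."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)


N = 10
checks = {}
for p in (0.2, 0.5):
    pmf = [binomial_pmf(n, N, p) for n in range(N + 1)]
    total = sum(pmf)                             # should be 1
    mean = sum(n * q for n, q in enumerate(pmf)) # should be N * p
    checks[p] = (total, mean)
    print(f"p = {p}: sum = {total:.6f}, mean = {mean:.6f}, N * p = {N * p}")
```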
In [4]:
#Question 1.3
x = np.linspace(1, 8,100)
plt.plot(x, 2**x, label='$2^x$')
plt.plot(x, x**2, label='$x^2$')
plt.legend(loc='upper left')
plt.show()
If you execute the cell below, it will write the contents of the cell to a file called che116.mplstyle.
We will use this file in the future, so hold onto it. To load this style and use it, execute plt.style.use('che116.mplstyle'). Make the following changes to the file, write it, and then plot three interesting lines on the same graph:
View the comments in this file to learn what all the parameters do
In [5]:
%%writefile che116.mplstyle
#set the font-size and size of things
figure.figsize: 5, 3
axes.labelsize: 14.3
axes.titlesize: 15.6
xtick.labelsize: 13
ytick.labelsize: 13
legend.fontsize: 13
grid.linewidth: 1.3
lines.linewidth: 2.275
patch.linewidth: 0.39
lines.markersize: 9.1
lines.markeredgewidth: 0
xtick.major.width: 1.3
ytick.major.width: 1.3
xtick.minor.width: 0.65
ytick.minor.width: 0.65
xtick.major.pad: 9.1
ytick.major.pad: 9.1
axes.xmargin : 0
axes.ymargin : 0
#setup our colorscheme
patch.facecolor: 348ABD # blue
patch.edgecolor: EEEEEE
patch.antialiased: True
font.size: 12.0
text.color: black
axes.facecolor: E5E5E5
axes.edgecolor: bcbcbc
axes.linewidth: 1
axes.grid: False
axes.labelcolor: 555555
axes.axisbelow: True # grid/ticks are below elements (e.g., lines, text)
axes.prop_cycle: cycler('color', ['444444', '348ABD', '988ED5', '777777', 'FBC15E', '8EBA42', 'FFB5B8'])
# E24A33 : red
# 348ABD : blue
# 988ED5 : purple
# 777777 : gray
# FBC15E : yellow
# 8EBA42 : green
# FFB5B8 : pink
xtick.color: 555555
xtick.direction: out
ytick.color: 555555
ytick.direction: out
grid.color: white
grid.linestyle: - # solid line
figure.facecolor: white
figure.edgecolor: 0.50
#animation settings
animation.html : html5
In [6]:
#Incorrect answer -> no changes
plt.style.use('che116.mplstyle')
from math import pi
x = np.linspace(0, 2 * pi, 100)
plt.plot(x, np.sin(x))
plt.plot(x, np.sin(x)**2)
plt.plot(x, np.fabs(np.sin(x)))
plt.show()
In [7]:
%%writefile che116.mplstyle
#set the font-size and size of things
figure.figsize: 10.4, 7.15
axes.labelsize: 14.3
axes.titlesize: 15.6
xtick.labelsize: 13
ytick.labelsize: 13
legend.fontsize: 13
grid.linewidth: 1.3
lines.linewidth: 2.275
patch.linewidth: 0.39
lines.markersize: 9.1
lines.markeredgewidth: 0
xtick.major.width: 1.3
ytick.major.width: 1.3
xtick.minor.width: 0.65
ytick.minor.width: 0.65
xtick.major.pad: 9.1
ytick.major.pad: 9.1
axes.xmargin : 0
axes.ymargin : 0
#setup our colorscheme
patch.facecolor: 348ABD # blue
patch.edgecolor: EEEEEE
patch.antialiased: True
font.size: 12.0
text.color: black
axes.facecolor: E5E5E5
axes.edgecolor: bcbcbc
axes.linewidth: 1
axes.grid: True
axes.labelcolor: 555555
axes.axisbelow: True # grid/ticks are below elements (e.g., lines, text)
axes.prop_cycle: cycler('color', ['E24A33', '348ABD', '988ED5', '777777', 'FBC15E', '8EBA42', 'FFB5B8'])
# E24A33 : red
# 348ABD : blue
# 988ED5 : purple
# 777777 : gray
# FBC15E : yellow
# 8EBA42 : green
# FFB5B8 : pink
xtick.color: 555555
xtick.direction: out
ytick.color: 555555
ytick.direction: out
grid.color: white
grid.linestyle: - # solid line
figure.facecolor: white
figure.edgecolor: 0.50
#animation settings
animation.html : html5
In [8]:
#Correct Answer
plt.style.use('che116.mplstyle')
from math import pi
x = np.linspace(0, 2 * pi, 100)
plt.plot(x, np.sin(x))
plt.plot(x, np.sin(x)**2)
plt.plot(x, np.fabs(np.sin(x)))
plt.show()
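To confirm that a style file actually took effect, one option is to inspect `plt.rcParams` after loading it. The sketch below writes a two-line style file to a temporary directory and loads it. The file name `demo.mplstyle` and its contents are assumptions made for this illustration, not the full che116.mplstyle.

```python
import os
import tempfile

import matplotlib.pyplot as plt

# a tiny stand-in style with only the two settings we want to inspect
style_text = "figure.figsize: 10.4, 7.15\naxes.grid: True\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.mplstyle")
    with open(path, "w") as f:
        f.write(style_text)
    plt.style.use(path)  # accepts a path to a style file

figsize = list(plt.rcParams["figure.figsize"])
grid_on = plt.rcParams["axes.grid"]
print("figure.figsize:", figsize)
print("axes.grid:", grid_on)
```

If the printed values still match the old file, the style was probably not rewritten or reloaded before plotting.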
Create a plot with the following properties WITHOUT using a style file:
In [9]:
plt.figure(figsize=(8,6))
x = np.linspace(0, 2 * pi, 100)
plt.plot(x, np.cos(x))
plt.plot(x, np.sin(x))
plt.xlim(0, 2 * pi)
plt.ylim(-1.5, 1.5)
plt.hlines([-1, 1], 0, 2 * pi)
plt.title('Funny Title')
plt.text(pi, 0.5, '$E = mc^2$', fontdict={'fontsize': 24})
plt.show()
Mammograms are a testing procedure for breast cancer. The diagnosis procedure after a positive mammogram is incredibly complex, so we'll simplify a little bit here. If a mammogram test is positive, a woman will always return for a biopsy. A biopsy is the removal and analysis of a small amount of breast tissue. A positive biopsy will, depending on the diagnosis, lead to a mastectomy. The statistics from here on out are mostly correct, but a biopsy does not always follow a mammogram in real life. From ages 40 to 50, 45% of women who receive annual mammograms will have a false positive and 25% have a false negative. A false negative means that a woman had invasive breast cancer but the test did not show it. A false positive means that the test was positive but the woman had no cancer or benign cancer. A large study of biopsies shows that biopsies are correctly diagnosed 75% of the time (the state of cancer matches the state of the biopsy), with false positives being twice as likely as false negatives. You may assume that biopsies and mammograms are conditionally independent given the presence or absence of cancer. After a positive finding from both a mammogram and a biopsy, a mastectomy is performed, which has a 0.24% probability of mortality. The overall probability of having invasive breast cancer is 1.5% between the ages of 40 and 50. Answer the following questions:
Let's start by writing out what we know for this problem. Let $M$ be the rv for the mammogram, with 0 being a negative screening and 1 being a positive screening. Let $C$ be the rv for cancer, with 0 being no cancer or benign cancer and 1 being invasive cancer. We are given:
$$P(C = 1) = 0.015$$
$$P(M = 1\,|\, C = 0) = 0.45$$
$$P(M = 0\,|\, C = 1) = 0.25$$
We are being asked $$P(M = 1)$$
We can use marginalization of the conditional:
$$P(M = 1) = \sum_c P(M = 1\,|\,C = c) P(C = c)$$
Using $P(M = 1\,|\,C = 1) = 1 - P(M = 0\,|\,C = 1) = 0.75$:
$$P(M = 1) = 0.45 \times 0.985 + 0.75 \times 0.015 = 0.4545$$
We are being asked $$P(C = 1\,|\,M = 1)$$ We can use Bayes' theorem, since all our conditionals are given in the opposite way.
$$P(C = 1\,|\,M = 1) = \frac{P(M = 1\, | \, C = 1) P(C = 1)}{P(M = 1)}$$
We can plug in the numbers from above.
$$= \frac{0.75 \times 0.015}{0.4545} = 0.02475$$
The probability of the woman having invasive cancer after a positive mammogram is about 2.5%.
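The marginalization and Bayes' theorem steps for the mammogram can be reproduced in a few lines. This is a sketch using only the probabilities stated in the problem, and the variable names are illustrative.

```python
# given quantities from the problem statement
p_c1 = 0.015          # P(C = 1)
p_m1_given_c0 = 0.45  # false positive rate, P(M = 1 | C = 0)
p_m0_given_c1 = 0.25  # false negative rate, P(M = 0 | C = 1)

# complement gives the true positive rate, P(M = 1 | C = 1)
p_m1_given_c1 = 1 - p_m0_given_c1

# marginalize over C to get P(M = 1)
p_m1 = p_m1_given_c0 * (1 - p_c1) + p_m1_given_c1 * p_c1

# Bayes' theorem for P(C = 1 | M = 1)
p_c1_given_m1 = p_m1_given_c1 * p_c1 / p_m1

print(f"P(M = 1) = {p_m1:.4f}")
print(f"P(C = 1 | M = 1) = {p_c1_given_m1:.4f}")
```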
To die during a mastectomy, a woman must have a positive mammogram, a positive biopsy, and a fatal mastectomy. I will now use $B$ as the biopsy rv. If $B$ is 1, a mastectomy is performed. This question asks for $P(M = 1,B = 1, D = 1)$, where $D$ is the rv for dying during a mastectomy. $D$ is independent of the other rvs, and $M$ and $B$ are conditionally independent given $C$. This means we need to know about the biopsy statistics. The information we are given about the biopsy is not so straightforward. In particular we know that:
$$P(B = 1, C = 1) + P(B = 0, C = 0) = 0.75$$
We know these are joints because 75% of the time, our biopsy matches the cancer rv. There is no conditioning in that sentence. Furthermore, we know that:
$$\frac{P(B = 1\, | \, C = 0)}{P(B = 0 \, | \, C = 1)} = 2$$
Rearranging and using $P(C = 0) = 0.985$, we can rewrite that as:
$$\frac{P(B = 1, C = 0)}{P(B = 0, C = 1)} = 2\times \frac{P(C = 0)}{P(C = 1)} = 131.3$$
We know from marginalization that:
$$P(B = 1, C = 0) + P(B = 0, C = 0) = P(C = 0) = 0.985$$
$$P(B = 1, C = 1) + P(B = 0, C = 1) = P(C = 1) = 0.015$$
Adding these two equations and subtracting the 75% agreement equation gives $P(B = 1, C = 0) + P(B = 0, C = 1) = 0.25$. Together with the ratio, this lets you solve for all the quantities:
$$P(B = 0, C = 0) = 0.7369$$
$$P(B = 1, C = 0) = 0.2481$$
$$P(B = 0, C = 1) = 0.001889$$
$$P(B = 1, C = 1) = 0.01311$$
Let's now rearrange $P(M = 1, B = 1, D = 1)$:
$$P(M = 1, B = 1, D = 1) = P(B = 1, M = 1) P(D = 1)$$
$$P(B = 1, M = 1) = \sum_c P(B = 1, M = 1 \,|\, C = c) P(C = c)$$
$$P(B = 1, M = 1) = P(B = 1 \,|\, C = 0)P(M = 1 \,|\, C = 0)P(C = 0) + P(B = 1 \,|\, C = 1)P(M = 1 \,|\, C = 1)P(C = 1)$$
$$P(B = 1, M = 1) = P(B = 1, C = 0)P(M = 1 \,|\, C = 0) + P(B = 1, C = 1)P(M = 1 \,|\, C = 1)$$
$$P(B = 1, M = 1) = 0.2481\times 0.45 + 0.01311 \times 0.75 = 0.1215$$
Inserting back the mortality probability:
$$P(B = 1, M = 1, D = 1) = 0.1215\times 0.0024 = 0.000292 = 0.0292\%$$
There are two survival probabilities: one for those who had a mastectomy following a positive mammogram and biopsy, and one for those who had a false negative but received treatment later. We first need to find the probability of being in each of these two groups. We do not need to consider the $C = 0$ groups, since they cannot die from cancer. The first group contributes $P(B = 1, M = 1, C = 1, S = 0)$, where $S$ is survival after a mastectomy. The second group contributes $P(B = 0, M = 1, C = 1, S' = 0)$ and $P(M = 0, C = 1, S' = 0)$, where $S'$ is survival after later treatment.
We'll start with the first term:
$$P(B = 1, M = 1, C = 1, S = 0) = P(B = 1, M = 1\, |\, C = 1) P(C = 1) P(S = 0)$$
$$ = P(B = 1, C = 1) P(M = 1\, | \, C = 1) P(S = 0)$$
$$ = 0.01311 \times 0.75 \times (1 - 0.97) = 0.000295 = 0.0295\%$$
The next term:
$$P(B = 0, M = 1, C = 1, S' = 0) = P(B = 0, M = 1\, |\, C = 1) P(C = 1) P(S' = 0)$$
$$ = P(B = 0, C = 1) P(M = 1\, | \, C = 1) P(S' = 0)$$
$$ = 0.001889\times 0.75\times (1 - 0.93) = 0.0000992 = 0.00992\%$$
and lastly:
$$P(M = 0, C = 1, S' = 0) = P(M = 0\, |\, C = 1) P(C = 1) P(S' = 0)$$
$$ = 0.25 \times 0.015 \times 0.07 = 0.000263 = 0.0263\%$$
The probability of dying from cancer is the sum of these three terms: $0.066\%$
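The biopsy joint probabilities and the results that depend on them can also be obtained by solving the four constraints as a linear system. The sketch below is one possible way to do so with `np.linalg.solve`. The ordering of the unknowns and all variable names are choices made for this illustration, and the survival probabilities 0.97 and 0.93 are taken from the worked solution above.

```python
import numpy as np

p_c1 = 0.015
p_c0 = 1 - p_c1
ratio = 2 * p_c0 / p_c1  # P(B=1, C=0) / P(B=0, C=1)

# unknowns, in order: P(B=0,C=0), P(B=1,C=0), P(B=0,C=1), P(B=1,C=1)
A = np.array([
    [1.0, 0.0, 0.0, 1.0],     # biopsy matches cancer state 75% of the time
    [0.0, 1.0, -ratio, 0.0],  # false positives vs. false negatives
    [1.0, 1.0, 0.0, 0.0],     # marginal P(C = 0)
    [0.0, 0.0, 1.0, 1.0],     # marginal P(C = 1)
])
b = np.array([0.75, 0.0, p_c0, p_c1])
b0c0, b1c0, b0c1, b1c1 = np.linalg.solve(A, b)

p_m1_given_c0 = 0.45
p_m1_given_c1 = 0.75
p_m0_given_c1 = 0.25

# positive mammogram and positive biopsy, then mastectomy mortality
p_b1_m1 = b1c0 * p_m1_given_c0 + b1c1 * p_m1_given_c1
p_mastectomy_death = p_b1_m1 * 0.0024

# three ways to die from cancer (survival values assumed from the solution)
term_mastectomy = b1c1 * p_m1_given_c1 * (1 - 0.97)
term_missed_by_biopsy = b0c1 * p_m1_given_c1 * (1 - 0.93)
term_missed_by_mammogram = p_m0_given_c1 * p_c1 * (1 - 0.93)
p_cancer_death = (term_mastectomy + term_missed_by_biopsy
                  + term_missed_by_mammogram)

print(f"joints: {b0c0:.4f}, {b1c0:.4f}, {b0c1:.6f}, {b1c1:.5f}")
print(f"P(B = 1, M = 1) = {p_b1_m1:.4f}")
print(f"P(B = 1, M = 1, D = 1) = {p_mastectomy_death:.6f}")
print(f"P(death from cancer) = {100 * p_cancer_death:.3f}%")
```

Solving the system directly avoids carrying rounded intermediate values through the later products.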