In fact, symbolically: $\displaystyle \mathbf{P}[x\le X\le x+dx] = f_X(x) dx$
This shows that $f_X(x)$ by itself is not a probability, but rather some sort of probability intensity.
Definition:
If $X$ is a random variable, (discrete, continuous, or otherwise), we let $F_X(x) = \mathbf{P}[X\le x]$ for any $x\in \mathbb{R}$.
This $F_X(x)$ is called the cumulative distribution function of $X$.
Note: if $X$ has density $f_X$, then $\displaystyle F_X(x) = \int_{-\infty}^x f_X(y)~ dy$
This is from the definition of $f_X$ with $a=-\infty$.
Note: In this case, we also have $\displaystyle \frac{dF_X}{dx} = f_X(x)$
This follows directly from the so-called "fundamental theorem of Calculus".
Some facts:
$f_X(x) \ge 0$
$\displaystyle \int_{-\infty}^\infty f_X(x)~ dx = 1$
Also, when $X$ has a density, then $F_X$ is continuous.
Properties:
$F_X$ increases from $0$ to $1$
The limits $0$ at $-\infty$ and $1$ at $+\infty$ exist because of the "monotone convergence of probabilities".
For $X$ with density $f_X$, the value of $f_X$ at a single point or even a countable collection of points does not matter.
Example 1: (Fundamental): Uniform density on $[0,1]$. $f_X(x) = \left\{\begin{array}{lr}1 & \text{if }x\in[0,1] \\ 0 & \text{otherwise}\end{array}\right.$
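As a quick numerical check of Example 1 (a sketch added here, not part of the original notes): for the uniform density, $\mathbf{P}[a\le X\le b] = \int_a^b 1~dx = b-a$ whenever $0\le a\le b\le 1$, which we can confirm by simulation. The endpoints $a=0.2$, $b=0.7$, the sample size, and the seed below are arbitrary choices.
In [ ]:
import numpy as np

rng = np.random.default_rng(0)                  # arbitrary seed, for reproducibility
samples = rng.uniform(0, 1, size=100_000)       # draws from the Uniform[0,1] density

# For the uniform density on [0,1], P[a <= X <= b] = b - a
a, b = 0.2, 0.7
empirical = np.mean((samples >= a) & (samples <= b))
print(empirical, b - a)                         # the two numbers should be close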
Example 2: $X$ with density $f_X(x) = \left\{\begin{array}{lr}constant & \text{if }x\in [0,\frac{1}{2}] \\ 0 & \text{otherwise}\end{array}\right.$
Since $\displaystyle \int_{-\infty}^\infty f_X(x)~ dx = 1$, we conclude that $constant = 2$.
Example 3: $X$ with density $f_X(x) = \left\{\begin{array}{lr}constant & \text{if }x\in [a,b] \\ 0 & \text{otherwise}\end{array}\right.$
Then, by the same normalization, $\displaystyle constant = \frac{1}{b-a}$
Example 4: $X$ with density $\displaystyle f_X(x) = \left\{\begin{array}{lr}\displaystyle \frac{1}{2}x & \text{if }x\in [0,2] \\ 0 & \text{otherwise}\end{array}\right.$
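To check that Example 4 really gives a density, we can verify the normalization directly:
$$\int_{-\infty}^\infty f_X(x)~dx = \int_0^2 \frac{1}{2}x~dx = \left.\frac{x^2}{4}\right|_0^2 = 1$$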
Example 5: (fundamental) $X$ with density $\displaystyle f_X(x) = \left\{\begin{array}{lr}\displaystyle \lambda e^{-\lambda x} & \text{for }x > 0 \\ 0 & \text{otherwise}\end{array}\right.$
$\lambda>0$
Note: the only difference between the density above and the Laplace density is that the Laplace density has $|x|$ in the exponent, which makes it symmetric in $x$; as a result, it also needs a coefficient of $\frac{1}{2}$.
Let's compute the CDF of $X$: $$F_X(x) = \int_{-\infty}^x f_X(y)dy \\ f_X(y)\text{ is }0 \text{ if }y < 0 \\ \int_0^x f_X(y) dy = \int_0^x \lambda e^{-\lambda y} dy = \left. -e^{-\lambda y}\right]_0^x \\ = -e^{-\lambda x} + e^0 = 1 - e^{-\lambda x}$$
So we have proved that $F_X(x) = 1 - e^{-\lambda x}$ for $x\ge 0$
Given that we have already waited a time $a$ for an event to happen, what is the probability that we have to wait an extra time $h$ for the event to happen?
$$\mathbf{P}[X > a + h \mid X > a] = \frac{\mathbf{P}[X > a+h \ \text{and} \ X>a]}{\mathbf{P}[X>a]} \\ = \frac{\mathbf{P}[X > a+h]}{\mathbf{P}[X>a]} \\ =\displaystyle \frac{e^{-\lambda (a+h)}}{e^{-\lambda a}} = e^{-\lambda h}$$The result does not depend on $a$. This means the exponential distribution has the "memoryless" property.
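We can also see the memoryless property numerically (a small sketch, not from the original notes). The rate $\lambda$, the times $a$ and $h$, the sample size, and the seed below are arbitrary choices.
In [ ]:
import numpy as np

rng = np.random.default_rng(1)                  # arbitrary seed
lam, a, h = 2.0, 0.5, 0.3                       # arbitrary rate and waiting times
x = rng.exponential(scale=1/lam, size=1_000_000)

# Estimate P[X > a+h | X > a] from the samples and compare with e^{-lam*h}
conditional = np.mean(x > a + h) / np.mean(x > a)
print(conditional, np.exp(-lam * h))            # both values should be close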
Recall: the geometric distribution describes a random variable counting how long we have to wait to get the first success.
Let $Y\sim Geom(p)$, and consider the "survival function" $\mathbf{P}[Y > k]$ where $k$ is an integer. $\displaystyle \mathbf{P}[Y>k]= \mathbf{P}[\text{the first }k\text{ trials are all failures}] = (1-p)^k=\displaystyle e^{\displaystyle -k\ln \displaystyle (\frac{1}{1-p})}$
Compare with $\mathbf{P}[X > a] = e^{-\lambda a}$.
We see that this is actually the same formula: the value $\displaystyle \ln(\frac{1}{1-p})$ plays the role of $\lambda$ and therefore, $Y$ also has the memoryless property.
Remark: Essentially, if a continuous random variable $X$ has the memoryless property, then $X$ is exponential. Similarly, a memoryless discrete waiting time $Y$ is geometric.
Let $X$ have density $f_X$. Let $a>0$ and $b$ be constants, and let $Y=aX+b$. Then $Y$ has the density $\displaystyle f_Y(y) = \frac{1}{a}~ f_X\left(\frac{y-b}{a}\right)$. (For a general $a\neq 0$, replace $\frac{1}{a}$ by $\frac{1}{|a|}$.)
It can be proven as follows: $F_Y(y) = \mathbf{P}[aX+b \le y] = \mathbf{P}[\displaystyle X\le \frac{y-b}{a}] = F_X(\displaystyle \frac{y-b}{a})$ by the definition of the CDF (here we used $a>0$).
Therefore, $f_Y(y) = \frac{dF_Y}{dy} = \frac{1}{a} F'_X(\frac{y-b}{a}) = \frac{1}{a} f_X(\frac{y-b}{a})$
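As an illustration of this formula (a sketch with arbitrary choices: $X\sim Exp(1)$, $a=2$, $b=1$, and an arbitrary seed), we can transform samples of $X$ by $Y=aX+b$ and compare the histogram of $Y$ with $\frac{1}{a} f_X(\frac{y-b}{a})$.
In [ ]:
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)                  # arbitrary seed
a, b = 2.0, 1.0                                 # arbitrary constants with a > 0
x = rng.exponential(scale=1.0, size=200_000)    # X ~ Exp(1), density f_X(x) = e^{-x} for x > 0
y = a * x + b                                   # Y = aX + b

grid = np.linspace(b, b + 10, 200)
f_Y = (1 / a) * np.exp(-(grid - b) / a)         # (1/a) * f_X((y-b)/a)

plt.hist(y, bins=100, density=True, alpha=0.6, label='histogram of $Y=aX+b$')
plt.plot(grid, f_Y, 'r-', lw=2, label=r'$\frac{1}{a} f_X(\frac{y-b}{a})$')
plt.legend()
plt.show()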
Property
$$\mathbf{Var}[aX + b] = a^2 \mathbf{Var}[X]$$Let $Y\sim Unif[a,b]$. What is the variance of $Y$?
Recall that $\mathbf{E}[Y] = \frac{a+b}{2}$. Writing $Y = (b-a)X + a$ with $X\sim Unif[0,1]$, we get $\mathbf{Var}[Y] = \mathbf{Var}[(b-a)X + a] = (b-a)^2 \mathbf{Var}[X]$.
Now, $$\mathbf{Var}[X] = \mathbf{E}[X^2] - (\mathbf{E}[X])^2 $$
$\mathbf{E}[X^2] = \int_0^1 x^2 dx = \frac{1}{3}$
So, $\mathbf{Var}[X] = \frac{1}{3} - (\frac{1}{2})^2 = \frac{1}{12}$
Finally, $\mathbf{Var}[Y] = (b-a)^2 \mathbf{Var}[X] = \frac{(b-a)^2}{12}$
Compare this with the formula in the discrete case $\{1,2,...,N\}$: $\mathbf{Var}[X] = \frac{N^2-1}{12}$
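A quick simulation check of the $\frac{(b-a)^2}{12}$ formula (a sketch; the endpoints, sample size, and seed below are arbitrary):
In [ ]:
import numpy as np

rng = np.random.default_rng(3)                  # arbitrary seed
a, b = -1.0, 3.0                                # arbitrary interval endpoints
y = rng.uniform(a, b, size=1_000_000)           # Y ~ Unif[a, b]

print(np.var(y), (b - a)**2 / 12)               # the two values should be close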
Let $X\sim Expn(\lambda)$. What is $\mathbf{E}[X]$?
$\mathbf{E}[X] = \int_0^{+\infty}\lambda e^{-\lambda x} \ x\ dx $
Solve the integral by "integration by parts":
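Carrying out the integration by parts (with $u = x$ and $dv = \lambda e^{-\lambda x} dx$, so $v = -e^{-\lambda x}$):
$$\mathbf{E}[X] = \int_0^{+\infty} x~\lambda e^{-\lambda x}~dx = \left[-x e^{-\lambda x}\right]_0^{+\infty} + \int_0^{+\infty} e^{-\lambda x}~dx = 0 + \left[-\frac{1}{\lambda}e^{-\lambda x}\right]_0^{+\infty} = \frac{1}{\lambda}$$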
We proved that for $X\sim Expon(\lambda)$, $\mathbf{E}[X] =\displaystyle \frac{1}{\lambda}$
This matches the idea that, since the average number of arrivals of a Poisson process with parameter $\lambda$ in a time interval of length $1$ is $\lambda$, it is natural and satisfying that the average wait for the next event is $\frac{1}{\lambda}$.
Using the moments relation for exponential random variable, we can easily find the variance:
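Filling in this standard step: integrating by parts twice gives $\mathbf{E}[X^2] = \frac{2}{\lambda^2}$, and therefore
$$\mathbf{Var}[X] = \mathbf{E}[X^2] - (\mathbf{E}[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$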
Let $X\sim Exp(\lambda)$, then the survival function is $\mathbf{P}[X\ge x] = \displaystyle e^{\displaystyle -\lambda x}$.
$X$ is interpreted as a waiting time. $X$ is memoryless; this is the key property of exponential random variables.
The rest of the story is that the inter-arrival times of a Poisson process are independent of each other, and are exponentially distributed.
In fact, there is an equivalence between saying that events arrive according to such i.i.d. exponentially distributed inter-arrival times and saying that the arrival times are the jump times of a Poisson process.
Some distributions are mixed, with a density part and a discrete part. For example, we can define
$$X = \left\{\begin{array}{lcr} Y & if & Y\le B\\B & & otherwise\end{array}\right.$$where $Y$ is a random variable with a density and $B$ is a non-random constant. Then $X$ has a density below $B$, and $B$ is an atom (a point carrying positive probability mass) for $X$.
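As an illustration of such a mixed distribution (a sketch, not from the original notes): take $Y\sim Exp(1)$ and $B=1$, both arbitrary choices, and look at $X=\min(Y,B)$. A noticeable fraction of the samples sits exactly at $B$; that fraction is the discrete part $\mathbf{P}[X=B]=\mathbf{P}[Y\ge B]=e^{-B}$.
In [ ]:
import numpy as np

rng = np.random.default_rng(4)                  # arbitrary seed
B = 1.0                                         # arbitrary non-random cutoff
y = rng.exponential(scale=1.0, size=100_000)    # Y ~ Exp(1), an arbitrary choice of density
x = np.minimum(y, B)                            # X = Y if Y <= B, and X = B otherwise

# The probability mass sitting exactly at B should be P[Y >= B] = e^{-B}
print(np.mean(x == B), np.exp(-B))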
In [16]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Grid of x values on which to evaluate the exponential CDF and survival function
x = np.arange(0, 6, 0.1)

def CDF(lam, x):
    # CDF of Exp(lam): F(x) = 1 - exp(-lam * x) for x >= 0
    return 1 - np.exp(-lam * x)

def survival(lam, x):
    # Survival function of Exp(lam): P[X > x] = exp(-lam * x) for x >= 0
    return np.exp(-lam * x)

# Left panel: CDF of Exp(1); right panel: its survival function
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 4), sharey=True)
ax[0].plot(x, CDF(1, x), 'b-', lw=4)
ax[1].plot(x, survival(1, x), 'r-', lw=4)
plt.ylim((0, 1.3))
plt.show()
Let $X\sim Expon(1)$. Let $\lambda>0$ be fixed. Let $Y=\frac{1}{\lambda} X$. Let's find the density of $Y$.
By our linear transformation for density: $$f_Y(y) =\displaystyle \frac{1}{1/\lambda} f_X \left(\frac{y - 0}{1/\lambda}\right) = \lambda e^{-\lambda y}$$
Here, we recognize that $Y \sim Expon(\lambda)$.
Now let $U\sim Unif(0,1)$ and set $Y=-\ln U$. First of all, $Y > 0$, since $0 < U < 1$.
Let's try to compute the CDF of $Y$: $$F_Y(y) = \mathbf{P}[Y\le y] = \left\{\begin{array}{lcr} 0 & if & y \le 0 \\ \mathbf{P}[\ln U \ge -y] && otherwise\end{array}\right.$$
$$\mathbf{P}[\ln U\ge -y] = \mathbf{P}[U \ge e^{\displaystyle -y}]$$Note that in order for what follows to be legitimate, $e^{-y}$ has to lie in $[0,1]$, and in fact it does, since $y>0$.
Next, since $U\sim Unif(0,1)$, then for any $ 0 \le a \le 1$: $\mathbf{P}[U \ge a] = 1 - \mathbf{P}[U \le a] = 1 - a$
Therefore, $$\mathbf{P}[U \ge e^{-y}] = 1 - \mathbf{P}[U \le e^{-y}] = 1 - e^{-y}$$ and we recognize that this is in fact the CDF of an exponential random variable. So, $Y \sim Exp(\lambda = 1)$
We have proved that to simulate a draw of a random variable which is $Exp(\lambda=1)$, we just need to draw a uniform random variable in $(0,1)$ and take the opposite of its $\ln$.
If you want to simulate an exponential random variable, you can generate uniform random numbers and transform them as above.
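Here is a minimal sketch of that recipe (with an arbitrary rate $\lambda$, sample size, and seed), using the earlier fact that $\frac{1}{\lambda}\times Exp(1) \sim Exp(\lambda)$:
In [ ]:
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)                  # arbitrary seed
lam = 2.0                                       # arbitrary rate

u = rng.uniform(0, 1, size=200_000)             # U ~ Unif(0,1)
x = -np.log(1 - u) / lam                        # 1-U is also Unif(0,1); -ln(1-U) ~ Exp(1), then divide by lam

grid = np.linspace(0, 4, 200)
plt.hist(x, bins=100, density=True, alpha=0.6, label='simulated samples')
plt.plot(grid, lam * np.exp(-lam * grid), 'r-', lw=2, label=r'$\lambda e^{-\lambda x}$')
plt.legend()
plt.show()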
If $X\sim Exp(\lambda)$ then survival function $\mathbf{P}[X > x] = e^{\displaystyle -\lambda x}$
$$\mathbf{P}[Y > \lambda x] = e^{-\lambda x}$$$$\mathbf{P}[\frac{1}{\lambda} Y > x] = e^{-\lambda x}$$Here $Y\sim Exp(1)$ as above. Let's call $Z=\frac{1}{\lambda}Y$. Then $\mathbf{P}[Z > x] = e^{-\lambda x}$; therefore, we recognize that $Z\sim Exp(\lambda)$
Also, this shows that $\frac{1}{\lambda}$ is a so-called scale parameter for the exponential distribution. In particular, by multiplying an exponential random variable by a positive number, we get another exponential random variable.
More generally, for any class of random variables with CDF $F(x;c)$, we say that $c$ is a scale parameter if $F(x;c) = \tilde{F}(\frac{x}{c})$ for some fixed CDF $\tilde{F}$.
Note: for densities this translates as $f(x;c) = \frac{1}{c} ~ \tilde{f}(\frac{x}{c})$
$Exp(\lambda)$ has $\frac{1}{\lambda}$ as a scale parameter because
$$F_X(x) = 1 - e^{\displaystyle -\lambda x} = 1 - e^{\displaystyle -\frac{x}{\frac{1}{\lambda}}} $$Let $X$ have density $f_X$
Let $h$ be a strictly monotone function (either strictly increasing or strictly decreasing)
Let $Y=h(X)$ so $Y$ is also a random variable and has a density $f_Y$
$$f_Y(y) = f_X(h^{-1}(y))~ \frac{1}{\left|\frac{d~ h }{dx} \left(h^{-1}(y)\right)\right|}$$Recall that $Y$ has a density $f_Y$ if we can write symbolically $\mathbf{P}[Y \in dy] = f_Y(y) dy$
$$\mathbf{P}[Y \in dy] = \mathbf{P}[h(X) \in dy] = \mathbf{P}[X \in h^{-1}(dy)] \\ = \mathbf{P}[X \in~ (\text{interval around the value }h^{-1}(y)\text{ of width }|h^{-1}(dy)| )] \\ =f_X\left(h^{-1}(y)\right) \times \text{width of that interval} $$and the width of that interval is $\displaystyle |h^{-1}(dy)| = \left|\frac{d~h^{-1}}{dy}(y)\right| dy = \displaystyle \frac{1}{\left|\displaystyle \frac{d~y}{dx}\right|}~ dy$, which gives the formula above.
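As a numerical illustration of the change-of-variables formula (a sketch with arbitrary choices: $X\sim Unif(0,1)$ and $h(x)=x^2$, which is strictly increasing on $(0,1)$): the formula predicts $f_Y(y) = f_X(\sqrt{y})\cdot\frac{1}{2\sqrt{y}} = \frac{1}{2\sqrt{y}}$ for $0<y<1$.
In [ ]:
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(6)                  # arbitrary seed
x = rng.uniform(0, 1, size=200_000)             # X ~ Unif(0,1)
y = x**2                                        # Y = h(X) with h(x) = x^2

# Change-of-variables formula: f_Y(y) = f_X(h^{-1}(y)) / |h'(h^{-1}(y))| = 1 / (2*sqrt(y))
grid = np.linspace(0.01, 1, 200)
plt.hist(y, bins=100, density=True, alpha=0.6, label='histogram of $Y=X^2$')
plt.plot(grid, 1 / (2 * np.sqrt(grid)), 'r-', lw=2, label=r'$\frac{1}{2\sqrt{y}}$')
plt.legend()
plt.show()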
A pair of random variables $(X,Y)$ has a density $f_{X,Y}$ if for any intervals $[a,b]$ and $[c,d]$ the probability $$\mathbf{P}\left[X\in [a,b] \& Y \in [c,d]\right] = \int_a^b \left( \displaystyle \int_c^d f_{X,Y} (x,y)~dy \right)dx$$
or sometimes written as $${\int \displaystyle \int}_{[a,b]\times[c,d]} f_{X,Y}(x,y) ~dx ~dy$$
Note: $[a,b]\times [c,d]$ is a rectangle in the plane: $x$ ranges over $[a,b]$ and $y$ ranges over $[c,d]$.
If $X$ and $Y$ are independent, then $\mathbf{E}[XY] = \mathbf{E}[X] \mathbf{E}[Y]$
Equally importantly, if $X$ and $Y$ are independent with densities $f_X$ and $f_Y$, then the pair $(X,Y)$ has density $$f_{X,Y}(x,y) = f_X(x) f_Y(y)$$
Let $(X,Y)$ have density $f_{X,Y}(x,y) = \left\{\begin{array}{lrr}15 e^{-5x-3y} & if & x>0 \text{ and } y>0\\ 0 && otherwise\end{array}\right.$
What is the distribution of $(X,Y)$?
$$15 e^{-5x-3y} = 5e^{-5x} \times 3e^{-3y}$$Here $5e^{-5x}$ is the $Exp(5)$ density and $3e^{-3y}$ is the $Exp(3)$ density. Since the joint density is zero whenever $x$ or $y$ is negative, it is exactly the product of these two densities. Therefore, $X$ and $Y$ are independent, with $X\sim Exp(5)$ and $Y\sim Exp(3)$.
Let $(X,Y)$ have density $f_{X,Y} = \left\{\begin{array}{lrr} 8xy & for & 0\le x \le y \le 1 \\ 0 &&otherwise\end{array}\right.$
Note: from the condition $0\le x \le y \le 1$ we get that $X \le Y$. So there is an ordering relation between them, and therefore $X$ and $Y$ are NOT independent.
Note, we should check that $\displaystyle \int_{-\infty}^\infty \int_{-\infty}^\infty f_{X,Y}(x,y) ~dx ~dy = 1 $
$\displaystyle \int_{-\infty}^\infty \int_{-\infty}^\infty f_{X,Y}(x,y) ~dx ~dy = \displaystyle \int_0^1 \left(\int_0^y f_{X,Y}(x,y) ~dx \right)~dy = 1 $
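Carrying out this check:
$$\int_0^1 \left(\int_0^y 8xy~dx\right) dy = \int_0^1 8y\cdot\frac{y^2}{2}~dy = \int_0^1 4y^3~dy = 1$$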
With the same $X$ and $Y$ as given above, let $T=X+Y$. Find the probability $\mathbf{P}[T>1]$.
Since we want $T>1$ and we know that $X\le Y$, we have $2Y \ge X+Y > 1$, so the minimum possible value for $Y$ is $\frac{1}{2}$.
$$\mathbf{P}[X+Y > 1] = \mathbf{P}[X > 1 - Y] = \int_\frac{1}{2}^1 \left(\int_{\text{over all possible values of }x\text{ when }y\text{ is fixed}} 8xy ~dx \right)dy = \int_\frac{1}{2}^1 \int_{1-y}^{y} 8xy ~dx ~dy$$For a fixed $y$, the density is nonzero only when $0\le x\le y$, and we need $x > 1-y$; the range $[1-y, y]$ is nonempty exactly when $1-y < y$, i.e. when $y>\frac{1}{2}$, and this is already satisfied from the outer integral.
So, therefore, $$\mathbf{P}[T>1] = \int_\frac{1}{2}^1 \int_{1-y}^{y} 8xy ~dx ~dy = \int_\frac{1}{2}^1 8y \int_{1-y}^{y} x ~dx ~dy = \int_\frac{1}{2}^1 8y\left(\frac{1}{2}x^2\right)_{1-y}^{y} dy = \int_\frac{1}{2}^1 4y(2y-1)~ dy = \left[\frac{8}{3}y^3 - 2y^2\right]_\frac{1}{2}^1 = \frac{5}{6}$$
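We can double-check the value $\frac{5}{6}$ by simulation (a sketch, not from the original notes), using rejection sampling from the unit square; the sample size and seed are arbitrary.
In [ ]:
import numpy as np

rng = np.random.default_rng(7)                  # arbitrary seed
n = 800_000
x = rng.uniform(0, 1, size=n)
y = rng.uniform(0, 1, size=n)
u = rng.uniform(0, 1, size=n)

# Rejection sampling: the density 8xy (on x <= y) is bounded by 8 on the unit square,
# so accept a uniform proposal (x, y) with probability 8xy/8 = xy when x <= y.
accept = (x <= y) & (u < x * y)
xs, ys = x[accept], y[accept]

print(np.mean(xs + ys > 1), 5 / 6)              # Monte Carlo estimate of P[T > 1] vs exact value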
Let $f_{X,Y} = \left\{\begin{array}{lrr}\displaystyle 2e^{\displaystyle -x-y}&for& 0\le y <x <\infty \\ 0 &&otherwise\end{array}\right.$
Let's find $f_X(x)=?$
$$f_X(x) = \int_{-\infty}^\infty f_{X,Y}(x,y) dy = \int_0^x 2e^{-x-y} dy = 2e^{-x} \int_0^x e^{-y}dy = 2e^{-x} \left(-e^{-y}\right)_0^x = 2e^{-x}\left(1 - e^{-x}\right) \ \ \text{for }x\ge 0$$Let's find $f_Y(y)=?$
$$f_Y(y) = \int_y^\infty 2e^{-x-y} dx = 2e^{-y} \int_y^\infty e^{-x}dx = 2e^{-y} \left(-e^{-x}\right)_y^\infty = 2e^{-y}\left(0 + e^{-y}\right) = 2e^{-2y} \ \ \text{for }y\ge 0$$This is the $Exp(2)$ density, so $Y\sim Exp(2)$. With this example, let $T=X+Y$. Does $T$ have a density?
Let's find the CDF $F_T$ and look for $\frac{d~F_T}{dt}$. We compute for $t>0$.
For $t>0$: $$F_T(t) = \mathbf{P}[X+Y \le t] = \int_?^? \left(\int_?^? 2e^{-x-y}~dx\right) dy$$
We have to find the limits on the integrals. We start by $T=X+Y \le t$, therefore, $Y \le X \le t - Y \Longrightarrow Y \le t - Y$ and we get $Y \le \frac{t}{2}$.
$$\Longrightarrow \ F_T(t) = \int_0^\frac{t}{2} \left(\int_y^{t-y} 2e^{-x-y} ~dx \right) dy = \int_0^\frac{t}{2} 2e^{-y}\left( \int_y^{t-y} e^{-x} ~dx \right) dy \\ = \int_0^\frac{t}{2} 2e^{-y} \left(-e^{-x}\right)_y^{t-y} dy = \int_0^\frac{t}{2} 2e^{-y}\left(-e^{y-t} + e^{-y} \right) dy = \int_0^\frac{t}{2} \left(-2e^{-t} +2 e^{-2y}\right) dy = \left(-2e^{-t}y - e^{-2y}\right)_0^\frac{t}{2} \\= -2~\frac{t}{2}~e^{-t} - e^{-2\times \frac{t}{2}} + 1 = 1 - e^{-t} - te^{-t}$$We can see that the result is differentiable, which means that the derivative of $F_T(t)$ exists. So, the density of $T$ exists and $$f_T(t) = \frac{d~F_T(t)}{dt} = \left\{\begin{array}{lrr}t~e^{-t} & for & t\ge 0\\ 0 && otherwise\end{array}\right.$$
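We can also check the density $t e^{-t}$ by simulation (a sketch, not from the original notes). One way to sample from this joint density: the marginal computed above is $f_Y(y)=2e^{-2y}$, and the conditional density is $f_{X\mid Y}(x\mid y) = \frac{2e^{-x-y}}{2e^{-2y}} = e^{-(x-y)}$ for $x>y$, so we can draw $Y\sim Exp(2)$ and then set $X = Y + Exp(1)$. Sample size and seed are arbitrary.
In [ ]:
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(8)                  # arbitrary seed
n = 200_000
y = rng.exponential(scale=1/2, size=n)          # Y ~ Exp(2), the marginal found above
x = y + rng.exponential(scale=1.0, size=n)      # given Y = y, X - y ~ Exp(1)
t = x + y                                       # T = X + Y

grid = np.linspace(0, 10, 300)
plt.hist(t, bins=100, density=True, alpha=0.6, label='histogram of $T=X+Y$')
plt.plot(grid, grid * np.exp(-grid), 'r-', lw=2, label=r'$t e^{-t}$')
plt.legend()
plt.show()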
The CDF of $(X,Y)$ is $$F_{X,Y}(x,y) = \mathbf{P}[X \le x \text{ AND } Y\le y]$$
If $\displaystyle \frac{\partial^2 ~F_{X,Y}}{\partial x ~ \partial y}$ exists, then, it is the density of $(X,Y)$: $f_{X,Y}(x,y) = \displaystyle \frac{\partial^2 ~F_{X,Y}}{\partial x ~ \partial y}$
Note:
It's very easy to compute the CDF of the maximum $X_{n,n} = \max(X_1,X_2,\dots,X_n)$ of $n$ random variables.
$$F_{X_{n,n}}(x) = \mathbf{P}[X_{n,n} \le x]$$If the max of some numbers is less than or equal to something, then all of those numbers are less than or equal to it, so $$F_{X_{n,n}}(x) = \mathbf{P}\left[X_1\le x~\&~X_2\le x~\&\dots\&~X_n\le x\right]$$Now, assume that the $X_i$'s are independent. As a result, we can write
$$\mathbf{P}\left[X_1\le x~\&~X_2\le x~\&\dots\&~X_n\le x\right] = \mathbf{P}[X_1\le x]~\mathbf{P}[X_2\le x]\cdots\mathbf{P}[X_n\le x] = F_{X_1}(x)~F_{X_2}(x)\cdots F_{X_n}(x)$$What about the min?
The CDF does not work for the min as it did for max. But instead, the survival function works:
$$\mathbf{P}[\min(X_1,X_2,\dots,X_n) \ge x] = \mathbf{P}[X_1\ge x~\&~X_2\ge x~\&\dots\&~X_n\ge x] = \text{(with the independence assumption) } \\ = \mathbf{P}[X_1\ge x]~\mathbf{P}[X_2\ge x]\cdots\mathbf{P}[X_n\ge x]$$Let $X_1,X_2, \dots, X_n$ be independent exponential random variables with parameters $\lambda_1,\lambda_2,\dots,\lambda_n$.
Then, for each $X_i$: $\mathbf{P}[X_i \ge x] = e^{-\lambda_i x}$
The survival function for the min of $X_1,X_2, ...X_n$, will be $$\mathbf{P}[\min(X_1,X_2, ...X_n)\ge x] = \prod_i \mathbf{P}[X_i \ge x] = e^{-\lambda_1 x}~e^{-\lambda_2 x} ... ~e^{-\lambda_n x} = e^{\displaystyle -(\lambda_1 + \lambda_2 + ... +\lambda_n)x}$$
Therefore, $\min(X_1,X_2, ...X_n)\sim \displaystyle Exp(\lambda_1 + \lambda_2 + ... +\lambda_n)$
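A quick simulation check of this fact (a sketch; the rates, test point, sample size, and seed are arbitrary):
In [ ]:
import numpy as np

rng = np.random.default_rng(9)                  # arbitrary seed
lams = np.array([0.5, 1.0, 2.5])                # arbitrary rates lambda_1, ..., lambda_n
n = 1_000_000

# Column i holds samples of X_i ~ Exp(lams[i]); take the minimum across each row
samples = rng.exponential(scale=1/lams, size=(n, len(lams)))
m = samples.min(axis=1)

x0 = 0.3                                        # arbitrary test point
print(np.mean(m >= x0), np.exp(-lams.sum() * x0))   # survival of the min vs e^{-(sum of rates) * x0}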