uniform
: A uniform continuous random variable.
norm
: A normal continuous random variable.
beta
: A beta continuous random variable.
gamma
: A gamma continuous random variable.
t
: A Student’s t continuous random variable.
chi2
: A chi-squared continuous random variable.
f
: An F continuous random variable.
multivariate_normal
: A multivariate normal random variable.
dirichlet
: A Dirichlet random variable.
wishart
: A Wishart random variable.
bernoulli
: A Bernoulli discrete random variable.
binom
: A binomial discrete random variable.
boltzmann
: A Boltzmann (truncated discrete exponential) random variable.

rvs
: Generate random samples.
pdf or pmf
: Probability density function (probability mass function for discrete distributions).
cdf
: Cumulative distribution function.
stats
: Return the mean, variance, (Fisher’s) skew, or (Fisher’s) kurtosis.
moment
: Non-central moments of the distribution.
fit
: Parameter estimation.

random_state
: Seed for the random number generator.
size
: Shape of the samples to generate.
loc
: Usually the mean.
scale
: Usually the standard deviation.
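The methods and parameters listed above can be tried together on one distribution. The following is a minimal sketch, assuming a normal distribution with `loc=10` and `scale=10`; the sample size, the seed, and the binomial parameters are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

# Frozen normal distribution with loc (mean) 10 and scale (standard deviation) 10.
rv = stats.norm(loc=10, scale=10)

# rvs: draw samples; random_state fixes the seed, size sets the output shape.
samples = rv.rvs(size=5000, random_state=0)

# stats: mean, variance, skew and excess kurtosis in one call.
mean, var, skew, kurt = rv.stats(moments="mvsk")

# moment: non-central moment of the given order, here E[X^2] = var + mean^2.
second_moment = rv.moment(2)

# fit: maximum-likelihood estimates of loc and scale from the samples.
loc_hat, scale_hat = stats.norm.fit(samples)

# Discrete distributions expose pmf instead of pdf.
coin = stats.binom(n=10, p=0.3)
p_three = coin.pmf(3)

print(mean, var, skew, kurt)
print(second_moment, loc_hat, scale_hat, p_three)
```

The fitted `loc_hat` and `scale_hat` should land close to, but not exactly on, the true values, because they are estimated from a finite sample.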
In [1]:
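import scipy as sp
import scipy.stats  # makes sp.stats available
rv = sp.stats.norm(loc=10, scale=10)  # frozen normal: mean 10, standard deviation 10
rv.rvs(size=(3,10), random_state=1)  # 3x10 array of samples, seeded for reproducibility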
Out[1]:
In [2]:
sns.histplot(rv.rvs(size=10000, random_state=1), kde=True, stat="density")  # distplot is deprecated in recent seaborn
Out[2]:
In [3]:
xx = np.linspace(-40, 60, 1000)
pdf = rv.pdf(xx)
plt.plot(xx, pdf)
Out[3]:
In [4]:
cdf = rv.cdf(xx)
plt.plot(xx, cdf)
Out[4]:
In [5]:
import scipy.constants
sp.constants.pi  # prefer scipy.constants.pi over the old top-level sp.pi alias
Out[5]:
In [6]:
import scipy.constants
In [7]:
sp.constants.c # speed of light
Out[7]:
In [8]:
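import scipy.special
x = np.linspace(-3, 3, 1000)
y1 = sp.special.erf(x)  # error function
a = plt.subplot(211)
plt.plot(x, y1)
plt.title("erf")
a.xaxis.set_ticklabels([])  # hide x tick labels on the upper panel
y2 = sp.special.expit(x)  # logistic sigmoid, 1 / (1 + exp(-x))
plt.subplot(212)
plt.plot(x, y2)
plt.title("logistic")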
Out[8]:
In [5]:
import scipy.linalg
A = np.array([[1, 2],
              [3, 4]])
sp.linalg.inv(A)  # matrix inverse
Out[5]:
In [6]:
sp.linalg.det(A)
Out[6]:
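The difference between regression and interpolation.
Interpolation: given a set of data points, assume every point is exactly correct and fill in the values between them.
It is not used in data analysis, because forcing the curve through every point overfits.
Its main purpose is to make a plot look smooth.
Regression: does not assume the given data are exactly correct, but...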
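The contrast described above can be checked numerically. This is a minimal sketch on hypothetical data (a straight line plus Gaussian noise, not data from these notes): cubic interpolation reproduces every noisy point exactly, while a least-squares line from `np.polyfit` leaves a residual at the data points.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical noisy data: a straight line y = 2x + 1 plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 11)
y = 2 * x + 1 + rng.normal(scale=1.0, size=x.size)

# Interpolation: the curve passes through every data point exactly.
f_interp = interp1d(x, y, kind="cubic")

# Regression: a least-squares line that does not have to hit any point.
slope, intercept = np.polyfit(x, y, deg=1)

# Largest deviation from the data at the original x positions.
interp_residual = np.abs(f_interp(x) - y).max()
regress_residual = np.abs(slope * x + intercept - y).max()

print(interp_residual, regress_residual)
print(slope, intercept)
```

The interpolant's residual is zero up to rounding, which means it has also reproduced the noise. The regression line ignores the noise and recovers a slope and intercept near the underlying line.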
In [7]:
from scipy.interpolate import interp1d
x = np.linspace(0, 10, num=11, endpoint=True)
y = np.cos(-x**2/9.0)
f = interp1d(x, y)
f2 = interp1d(x, y, kind='cubic')
xnew = np.linspace(0, 10, num=41)
plt.plot(x, y, 'o', xnew, f(xnew), '-', xnew, f2(xnew), '--')
plt.legend(['data', 'linear', 'cubic'])
Out[7]:
In [8]:
x, y = np.mgrid[-1:1:20j, -1:1:20j]
z = (x+y) * np.exp(-6.0*(x*x+y*y))
plt.pcolormesh(x, y, z)
Out[8]:
In [9]:
xnew, ynew = np.mgrid[-1:1:100j, -1:1:100j]
tck = sp.interpolate.bisplrep(x, y, z, s=0)
znew = sp.interpolate.bisplev(xnew[:,0], ynew[0,:], tck)
plt.pcolormesh(xnew, ynew, znew)
Out[9]:
In [10]:
from scipy import optimize
In [14]:
# Finding the global minimum
# The local minima problem
def f(x):
    return x**2 + 10*np.sin(x)
x = np.arange(-10, 10, 0.1)
plt.plot(x, f(x))
Out[14]:
In [16]:
result = optimize.minimize(f, 4)
print(result)
x0 = result['x']
x0
Out[16]:
In [17]:
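# local minimum found when starting from x = 4
# (plt.hold was removed in Matplotlib 3.0; successive plot calls share the axes by default)
plt.plot(x, f(x))
plt.scatter(x0, f(x0), s=200)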
Out[17]:
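One simple way around the local-minimum problem shown above is to restart the local search from several points and keep the best result. This is a minimal multi-start sketch, not a method taken from these notes; the grid of starting points is an arbitrary assumption.

```python
import numpy as np
from scipy import optimize

def f(x):
    return x**2 + 10 * np.sin(x)

# A single local search started at x = 4 stops in the nearby local minimum.
local = optimize.minimize(f, 4)

# Multi-start: repeat the local search from several starting points
# (an arbitrary grid chosen for illustration) and keep the lowest result.
starts = np.linspace(-10, 10, 9)
results = [optimize.minimize(f, x0) for x0 in starts]
best = min(results, key=lambda r: r.fun)

print(local.x, local.fun)
print(best.x, best.fun)
```

Multi-start gives no guarantee of finding the global minimum; it only improves the odds when the starting points cover the region of interest.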
In [18]:
x1 = optimize.minimize(sixhump, (1, 1))['x']
x2 = optimize.minimize(sixhump, (-1, -1))['x']
print(x1, x2)
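The cell above calls `sixhump`, which is not defined in this part of the notes. The sketch below assumes it is the standard six-hump camelback test function; if the original definition differs, the results will differ too.

```python
import numpy as np
from scipy import optimize

def sixhump(x):
    """Six-hump camelback function of a 2-vector x (assumed definition)."""
    return ((4 - 2.1 * x[0]**2 + x[0]**4 / 3) * x[0]**2
            + x[0] * x[1]
            + (-4 + 4 * x[1]**2) * x[1]**2)

# Two local searches from different starting points, as in the cell above.
res1 = optimize.minimize(sixhump, (1, 1))
res2 = optimize.minimize(sixhump, (-1, -1))

print(res1.x, res1.fun)
print(res2.x, res2.fun)
```

This function has several local minima, so the two starting points may settle in different places. Comparing `res1.fun` and `res2.fun` shows which run found the lower value.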
In [20]:
time_step = 0.02
period = 5.
time_vec = np.arange(0, 20, time_step)
sig = np.sin(2 * np.pi / period * time_vec) + 0.5 * np.random.randn(time_vec.size)
plt.plot(sig)
Out[20]:
In [21]:
import scipy.fftpack
sample_freq = sp.fftpack.fftfreq(sig.size, d=time_step)
sig_fft = sp.fftpack.fft(sig)
pidxs = np.where(sample_freq > 0)
freqs, power = sample_freq[pidxs], np.abs(sig_fft)[pidxs]
freq = freqs[power.argmax()]
plt.stem(freqs[:50], power[:50])
plt.xlabel('Frequency [Hz]')
plt.ylabel('power')
Out[21]:
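The spectrum above should peak at the frequency of the underlying sine wave, 1 / period = 0.2 Hz. The following minimal sketch checks that numerically. It swaps in the newer `scipy.fft` module in place of `scipy.fftpack`, and the fixed seed and explicit sample count are assumptions added so the run is repeatable.

```python
import numpy as np
from scipy import fft

time_step = 0.02
period = 5.0
n_samples = 1000                       # 20 seconds of data
time_vec = np.arange(n_samples) * time_step

rng = np.random.default_rng(0)         # fixed seed so the run is repeatable
sig = np.sin(2 * np.pi / period * time_vec) + 0.5 * rng.standard_normal(n_samples)

# rfft returns only the non-negative frequencies of a real signal.
freqs = fft.rfftfreq(n_samples, d=time_step)
magnitude = np.abs(fft.rfft(sig))

# Skip the zero-frequency bin, then locate the strongest component.
peak_freq = freqs[1:][magnitude[1:].argmax()]

print(peak_freq, 1 / period)
```

With 20 seconds of data the frequency resolution is 0.05 Hz, so the 0.2 Hz component falls exactly on a bin and stands well above the noise.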