In [1]:
using PyPlot
PyPlot.svg(true)
using DataFrames
We shall discuss the relationship between the statics of the Higher Education Entrance Examination and the Maxwell–Boltzmann distribution of thermology in this article.
Consider the distribution of the velocity of the gas molecule in a certain temprature $T$, where the mess of a single molecule is $m$ and the Boltzmann's constant is $k$:
$$ f(v) = 4 \pi ({m \over 2 \pi k T})^\frac{3}{2} e^{- \frac{m v^2}{2 k T}} v^2 $$
In [2]:
table = readtable("exam_data.csv")
A = Array(table)
Out[2]:
In [3]:
B = hcat(map(x->x[1]>425?x:[x[1], x[2]/10], [A[x, :] for x in 1:size(A)[1]])...)'
Out[3]:
Assume the following equation is approximately correct:
$$ y = \alpha e^{-\beta x^2} \cdot x^2$$thus
$$ \frac{y}{x^2} = \alpha \cdot e^{-\beta x^2} $$let $ u = \log \frac{y}{x^2} $, $ v = x^2 $
$$ u = - \beta v + \log \alpha $$
In [4]:
highscore = max(B[:, 1]...)
x = (highscore .- B[2:end, 1])
y = B[2:end, 2]
_u = log.(y./(x.^2))
_v = x.^2
real_(x) = x!=Inf&&x!=-Inf&&x!=NaN
u = _u[real_.(_u)]
v = _v[real_.(_u)]
k, b = [v ones(length(v))]\u
α = e^b
β = -k
v_ = linspace(0, 500000, 1000)
#plot(x′, α.*e.^(-β.*x′.^2).*x′.^2)
#plot(x′, 2.*e.^(-0.5.*x′.^2).*x′.^2)
plot(v, u, "rx")
plot(v_, k.*v_.+b, "b--")
Out[4]:
In [7]:
_u′ = log.(y./(x.^2))[20:290]
_v′ = (x.^2)[20:290]
u′ = _u′[real_.(_u′)]
v′ = _v′[real_.(_u′)]
k′, b′ = [v′ ones(length(v′))]\u′
α′ = e^b′
β′ = -k′
v′′ = linspace(0, 200000, 1000)
#plot(x′, α.*e.^(-β.*x′.^2).*x′.^2)
#plot(x′, 2.*e.^(-0.5.*x′.^2).*x′.^2)
plot(v, u, "rx")
plot(v_, k.*v_.+b, "b--")
plot(v′′, k′.*v′′.+b′, "g--")
Out[7]:
In [6]:
plot(highscore.-x, y, "r-", label="original curve")
t = highscore:-1:1
plot(t, α.*e.^(-β.*(highscore.-t).^2).*(highscore.-t).^2, "b--", linewidth=0.7, label=L"fitting curve of $y = \alpha e^{-\beta x^2} \cdot x^2$")
plot(t, α′.*e.^(-β′.*(highscore.-t).^2).*(highscore.-t).^2, "g-", linewidth=0.5, label="fitting curve with some value removed")
title("National Higher Education Entrance Examination stastics in 2017")
legend()
axis([700, 0, 0, 200])
xticks(700:-25:0, rotation=40)
xlabel("score")
ylabel("frequency")
grid(true)