1. Determine the true means, $\hat{\bar{v}}_{a}$, for $v_{1}$, $v_{2}$, ..., $v_{5}$
In [1]:
include("q1/p1.jl")
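For reference, a minimal sketch of the calculation, assuming the universe of data is stored as an $M \times 5$ matrix v with one column per dataset (the names here are stand-ins; q1/p1.jl may organize this differently):
using Statistics
M = 1_000_000                # stand-in size; the real v is read from the data files
v = randn(M, 5)              # stand-in data matrix, one column per dataset vₐ
v̄̂ = vec(mean(v, dims=1))     # true means: v̄̂[a] = (1/M) Σᵢ v[i,a]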
2. Consider values $N$ = 1,000 and $N$ = 10,000. There are $M/N$ samples of this size in our $M$ values. Histogram the sample means for these values of $N$ and determine the true standard deviation of the means $\hat{\sigma}_{\bar{v}_{a},N}$.
Recall that $\hat{\sigma}_{\bar{v}_{a},N} = \frac{\sigma}{\sqrt{N}}$, where $\sigma$ is the standard deviation of the overall population. So we should see that $$\bigg(\frac{\hat{\sigma}_{\bar{v}_{a},N_{1}}}{\hat{\sigma}_{\bar{v}_{a},N_{2}}}\bigg)^{2} = \frac{N_{2}}{N_{1}} = 10$$ in this case. This is roughly the result we get (see below).
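A sketch of the kind of comparison q1/p2.jl is assumed to make, using uncorrelated stand-in data (only the block-means construction matters here):
using Statistics
M, N1, N2 = 1_000_000, 1_000, 10_000                     # stand-in sizes
v1 = randn(M)                                            # stand-in for one dataset, e.g. v[:,1]
blockmeans(x, N) = vec(mean(reshape(x, N, :), dims=1))   # the M/N sample means of size N
σN1 = std(blockmeans(v1, N1))                            # σ̂ of the means for N = 1,000
σN2 = std(blockmeans(v1, N2))                            # σ̂ of the means for N = 10,000
(σN1 / σN2)^2                                            # should be close to N2/N1 = 10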
In [2]:
include("q1/p2.jl")
In [3]:
histograms[1]
Out[3]:
In [4]:
histograms[2]
Out[4]:
In [5]:
histograms[3]
Out[5]:
In [6]:
histograms[4]
Out[6]:
In [7]:
histograms[5]
Out[7]:
3. You can now determine the true autocorrelation function for each variable, $\hat{C}_{v_{a}, n}$, which is given by:
$$\hat{C}_{v_{a}, n} = \frac{1}{M-n} \sum_{i=1}^{M-n} \big(v_{a,i+n} - \hat{\bar{v}}_{a}\big)\big(v_{a,i} - \hat{\bar{v}}_{a}\big)$$
$n$ goes from 0 to some maximum value $n_{cut}$ with $n_{cut} \ll M$. Plot $\hat{C}_{v_{a},n}/\hat{C}_{v_{a},0}$ versus $n$ for $a = 1...5$.
We want to pick our bin sizes such that $\hat{C}_{v_{a},n}$ is small for lags $n$ larger than the bin size.
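A minimal sketch of the estimator above (the function name autocov and the stand-in data are assumptions; q1/p3.jl may differ in detail):
using Statistics
function autocov(x, v̄, ncut)                     # Ĉ_{v,n} for n = 0 … ncut
    M = length(x)
    [sum((x[i+n] - v̄) * (x[i] - v̄) for i in 1:M-n) / (M - n) for n in 0:ncut]
end
x = randn(10_000)                                # stand-in data; the real script works on v[:,a]
Ĉ = autocov(x, mean(x), 300)
Ĉ ./ Ĉ[1]                                        # the normalized autocorrelation Ĉₙ/Ĉ₀ that gets plotted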
In [8]:
include("q1/p3.jl");
In [9]:
plots[1]
Out[9]:
In [10]:
plots[2]
Out[10]:
In [11]:
plots[3]
Out[11]:
In [12]:
plots[4]
Out[12]:
In [13]:
plots[5]
Out[13]:
4. Find the integrated autocorrelation times
$$ \hat{\tau}_{int,v_{a}} \equiv \frac{1}{2}\frac{1}{\hat{C}_{v_{a},0}} \sum_{n=-n_{cut}}^{n_{cut}} \hat{C}_{v_{a},n} = \sum_{n=0}^{n_{cut}} \frac{\hat{C}_{v_{a},n}}{\hat{C}_{v_{a},0}} - \frac{1}{2} $$
The second form uses the symmetry $\hat{C}_{v_{a},-n} = \hat{C}_{v_{a},n}$; the $-\frac{1}{2}$ appears since the zeroth term equals 1.
Estimate a value for $n_{cut}$ from your plots. $n_{cut}$ should be large enough that $\hat{C}_{v_{a},n}/\hat{C}_{v_{a},0}$ has gotten close enough to zero that the value of $\hat{\tau}_{int,v_{a}}$ is not affected by modest changes in $n_{cut}$.
Judging by the above plots, $\hat{C}_{v_{a},n}/\hat{C}_{v_{a},0}$ reaches zero at around $n = 100$ for each dataset. Since $2\tau_{int}$ is the separation between unrelated measurements, pick $n_{cut} = 300$ to be on the safe side.
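Given the normalized autocorrelation, the sum above is a one-liner; a sketch, re-using Ĉ from the autocov sketch in part 3:
τ̂int(Ĉ) = sum(Ĉ ./ Ĉ[1]) - 0.5    # Σₙ Ĉₙ/Ĉ₀ − 1/2 for n = 0 … n_cut
τ̂ = τ̂int(Ĉ)                       # ≈ 1/2 for uncorrelated data, larger when correlations persist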
In [14]:
include("q1/p4.jl")
5. Calculate the true standard deviation of the data, i.e.
$$ \hat{\sigma}_{v_{a}}^{2} \equiv \frac{1}{M-1} \sum_{i=1}^{M} (v_{a,i} - \hat{\bar{v}}_{a})^{2} $$
For a sample of size $N$, we should have
$$ \hat{\sigma}_{\bar{v}_{a},N} = \sqrt{\frac{2\hat{\tau}_{int,v_{a}}}{N}} \hat{\sigma}_{v_{a}} $$
giving the testable hypothesis that
$$ N = 2 \hat{\tau}_{int,v_{a}} \frac{\hat{\sigma}_{v_{a}}^{2}}{\hat{\sigma}_{\bar{v}_{a},N}^{2}} $$
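A sketch of this consistency check with uncorrelated stand-in data, for which $\hat{\tau}_{int} \approx 1/2$ (q1/p5.jl presumably uses the real datasets and their measured $\hat{\tau}_{int,v_{a}}$):
using Statistics
x = randn(1_000_000); N = 1_000; τ̂ = 0.5        # stand-ins only
σ̂  = std(x)                                     # true std dev of the data, 1/(M−1) normalization
σ̂N = std(vec(mean(reshape(x, N, :), dims=1)))   # true std dev of the means of blocks of size N
2τ̂ * σ̂^2 / σ̂N^2                                 # should come out close to N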
In [15]:
include("q1/p5.jl")
6. Calculate the true covariance matrix for the data, defined by
$$ \hat{c}_{v_{a},v_{b}} = \frac{1}{M} \sum_{i=1}^{M} (v_{a,i} - \hat{\bar{v}}_{a}) (v_{b,i} - \hat{\bar{v}}_{b}) $$
It is customary to define a normalized version of $\hat{c}_{v_{a},v_{b}}$ by
$$ \hat{\rho}_{v_{a},v_{b}} \equiv \frac{\hat{c}_{v_{a},v_{b}}}{\hat{\sigma}_{v_{a}}\hat{\sigma}_{v_{b}}} $$
Some scratch work for problem 6:
We can nicely express the covariance matrix $\hat{c}$ as
$$ \hat{c} = \frac{1}{M} D^{T}D $$
or in code as
ĉ = transpose(D) * D ./ M
where
D = v - ones(M) * transpose(v̄̂)
and we can write $\hat{\rho}_{v_{a},v_{b}}$ as
ρ̂ = ĉ ./ ( σ * transpose(σ) )
In [16]:
include("q1/p6.jl")
Out[16]:
In [17]:
using PyPlot
surf(ρ̂)
xlabel("va")
ylabel("vb")
title("Normalized Covariance between Datasets")
7. Pick two groups of data from the full universe of data. One should have N = 1,000 and the other should have N = 10,000. These two groups represent results one might get from simulations. We want to see how well these groups reproduce the true statistical results for these data.
a. Estimate the autocorrelation function $C_{v_{a},n}$ from these two groups and the integrated autocorrelation time.
b. Use these to determine the standard deviation of the mean $\sigma_{\bar{v}_{a},N}$.
c. Compare this with the results from the universe of data. Also compare the normalized covariance matrix $\rho_{v_{a},v_{b}}$ from these small samples with the universe of data.
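A sketch of the comparison in part (c); since the normalization factors cancel in $\hat{\rho}$, the normalized covariance is just the correlation matrix, so cor() can stand in for the scratch-work code above (the data matrix here is a stand-in):
using Statistics
v = randn(1_000_000, 5)         # stand-in for the full universe of data
ρ̂  = cor(v)                     # "true" normalized covariance from all M values
ρ1 = cor(v[1:1_000, :])         # estimate from a group with N = 1,000
ρ2 = cor(v[1:10_000, :])        # estimate from a group with N = 10,000
ρ̂ ./ ρ1, ρ̂ ./ ρ2                # element-wise comparisons, as in the cells below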
In [18]:
include("q1/p7.jl")
In [19]:
plots1[1]
Out[19]:
In [20]:
plots1[2]
Out[20]:
In [21]:
plots1[3]
Out[21]:
In [22]:
plots1[4]
Out[22]:
In [23]:
plots1[5]
Out[23]:
In [24]:
plots2[1]
Out[24]:
In [25]:
plots2[2]
Out[25]:
In [26]:
plots2[3]
Out[26]:
In [27]:
plots2[4]
Out[27]:
In [28]:
plots2[5]
Out[28]:
In [29]:
ρratio1 = ρ̂ ./ ρ1
In [30]:
ρrelerr1 = ones(5,5) - ρratio1
Even for the smaller of the two sample sizes, $N1 = 1,000$, the estimate is fairly close to the true value, except for the correlations between datasets v1 and v5, which are off for both sample sizes.
In [31]:
ρratio2 = ρ̂ ./ ρ2
In [32]:
ρrelerr2 = ones(5,5) - ρratio2
Again, the estimates are (for the most part) significantly better for the larger sample size. Let's compare the ratio of the relative errors of the two estimators:
In [33]:
ρrelerrratio = ρrelerr1 ./ ρrelerr2
Take the geometric mean of the off-diagonal elements to see how much of a reduction in error we get, on average, in going from $N_{1}$ = 1,000 to $N_{2}$ = 10,000.
In [34]:
ρoffdiagratio = copy(ρrelerrratio)   # copy so ρrelerrratio itself is not modified
for i in 1:5
    ρoffdiagratio[i,i] = 1.0         # set diagonals to 1 so they drop out of the product
end
mn = prod(ρoffdiagratio)^(1/20)      # geometric mean over the 20 off-diagonal elements
While the overall picture is a little muddled (there are some correlation factors where the smaller sample size coincidentally gives a more accurate measurement), the correlation factors estimated in this particular case with the larger sample size are, on average, roughly 2 to 3 times better in terms of relative error than those from the smaller sample size.
1. Break the $M$ measurements up into groups of size $N$, calculate $\bar{v}_{a}$ for each group and then calculate $f_{i}(\bar{v}_{a})$ for each group. Calculate these functions of the data means for all $M/N$ groups and find the standard deviation of $f_{i}(\bar{v}_{a})$, $\hat{\sigma}_{f_{i},N}$.
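A sketch of this step with stand-in names: v is the $M \times 5$ data matrix and f is a hypothetical function standing in for the actual $f_{i}$ (presumably defined in q2/p0.jl):
using Statistics
v = randn(1_000_000, 5); N = 1_000               # stand-ins
f(v̄) = v̄[1] * v̄[2] + v̄[3]^2                      # hypothetical example, not the real fᵢ
groupmean(k) = vec(mean(v[(k-1)*N+1:k*N, :], dims=1))
fvals = [f(groupmean(k)) for k in 1:size(v, 1) ÷ N]
σ̂fN = std(fvals)                                 # σ̂_{fᵢ,N}: std dev of f over all M/N groups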
In [35]:
include("q2/p0.jl")
include("q2/p1.jl")
Out[35]:
In [36]:
include("q2/p2.jl")
In [37]:
include("q2/p3.jl")
In [38]:
σvsbplots[1]
Out[38]:
In [39]:
σvsbplots[2]
Out[39]:
In [40]:
σvsbplots[3]
Out[40]:
In [41]:
σvsbplots[4]
Out[41]:
In [42]:
σvsbplots[5]
Out[42]:
The estimated standard deviations scale with the square root of the bin size, which intuitively makes sense, given that jackknife resampling suppresses variance as the bin count increases. Random fluctuations due to the falling bin count become apparent beyond around b = 60, but near b = 40 (the rough autocorrelation time we calculated for each of our datasets), the estimated variance is stable.
4. Now calculate $f_{i}(v^{\prime}_{a,k})$ for each of the $N/b$ jackknife blocks. You can then determine $\sigma_{f_{i},N}$ from
$$ \sigma^{2}_{f_{i},N} = \frac{N/b - 1}{N/b} \sum^{N/b}_{k=1}(f_{i}(v^{\prime}_{a,k})-f_{i}(\bar{v}_{a}))^2 $$
Again, do this for a few values of $b$ that are comparable to the integrated autocorrelation time. How does $\sigma_{f_{i},N}$ compare with $\hat{\sigma}_{f_{i},N}$ from part 1?
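A sketch of the jackknife estimate for a single group, with the same stand-in names as above; $v^{\prime}_{a,k}$ is taken here to be the usual delete-one-block mean, which may differ in detail from what q2/p4.jl does:
using Statistics
x = randn(1_000, 5); N, b = size(x, 1), 40       # one group of size N, bin size b ≳ 2τ_int
f(v̄) = v̄[1] * v̄[2] + v̄[3]^2                      # hypothetical example function again
v̄ = vec(mean(x, dims=1)); nb = N ÷ b             # nb = N/b jackknife blocks
v′(k) = (N .* v̄ .- b .* vec(mean(x[(k-1)*b+1:k*b, :], dims=1))) ./ (N - b)   # block-k-deleted means
σ²fN = (nb - 1) / nb * sum((f(v′(k)) - f(v̄))^2 for k in 1:nb)
sqrt(σ²fN)                                       # σ_{fᵢ,N} to compare with σ̂_{fᵢ,N} from part 1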
In [43]:
include("q2/p4.jl")
In [44]:
σfvsb[1]
Out[44]:
In [45]:
σfvsb[2]
Out[45]:
In [46]:
σfvsb[3]
Out[46]:
The jackknife-estimated standard deviations for the functions are each larger than our naive estimates. It seems that correlations between the variables were an important factor.
Choosing two values for $N$, estimate $\tau_{int}$ for each of the $M/N$ samples of size $N$ in the universe of data, as a function of $n_{cut}$. Then find the standard deviation $\sigma_{\tau,N}$ of $\tau_{int}$. Does the standard deviation with $n_{cut} \sim N$ decrease as $N$ increases?
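A sketch for one choice of $N$, re-using the autocov and τ̂int sketches from question 1 (the data, $N$, and $n_{cut}$ here are stand-ins; q3/p1.jl does this for both values of $N$):
using Statistics
x = randn(1_000_000); N, ncut = 10_000, 300
samples = [x[(k-1)*N+1:k*N] for k in 1:length(x) ÷ N]      # the M/N samples of size N
τs = [τ̂int(autocov(s, mean(s), ncut)) for s in samples]    # τ_int estimated from each sample
στN = std(τs)                                              # σ_{τ,N}; repeat for the other N and compare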
In [47]:
include("q3/p1.jl");
In [48]:
στintN1./στintN2
Out[48]:
Indeed, the standard deviations are around 3 times higher for the smaller value of N.
1. Make measurements of the temperature, potential energy, and the time average of the virial, which is given by
$$ \sum_{i} \sum_{j>i} r_{ij} \frac{\partial{V}_{ij}}{\partial{r}_{ij}} $$
for every molecular dynamics time step.
Read the temperature, potential energy, and virial datasets into arrays:
In [1]:
include("q4/p1.jl");
Out[1]:
2. Find autocorrelation times for those three quantities.
In [39]:
include("q4/p2.jl")
Note that these autocorrelation times should be multiplied by 10, since that's the number of actual simulation steps between each recorded measurement. Plot autocorrelations against step separation to get a sense of the autocorrelation times:
In [33]:
mdautocorr[1]
Out[33]:
In [34]:
mdautocorr[2]
Out[34]:
In [35]:
mdautocorr[3]
Out[35]:
In [36]:
mdautocorr[4]
Out[36]:
In [37]:
mdautocorr[5]
Out[37]:
In [38]:
mdautocorr[6]
Out[38]:
The distances between uncorrelated steps for the virial, temperature, and potential energy at $T = 1.304$ all seem to be around 100 simulation steps, corresponding to 10 steps in our saved data (since we only saved these quantities every 10 simulation steps). At $T = 1.069$, the autocorrelation settles down for all three quantities at around 250 steps. We use these values in determining $n_{cut}$ for each quantity in the jackknife part.
3. Measure the covariance matrix for these 3 quantities.
In [44]:
include("q4/p3.jl")
Out[44]:
4. Use $\tau_{int}$, binning, and jackknife resampling to get an error estimate on the pressure.
Recall:
$$ \frac{P}{\rho kT} = 1 - \frac{1}{6NkT} \left\langle \sum_{i \neq j} r_{ij} \frac{\partial{V}}{\partial{r_{ij}}} \right\rangle $$
which gives pressure in natural units.
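A sketch of the pressure estimate and a binned-jackknife error, with every name and number a stand-in (q4/p4.jl works from the recorded virial time series). Here W is the $\sum_{i}\sum_{j>i}$ sum recorded in part 1; since $\sum_{i \neq j} = 2\sum_{j>i}$, the $1/(6NkT)$ in the formula above becomes $1/(3NkT)$:
using Statistics
W = randn(10_000) .- 100.0                      # stand-in for the recorded virial series
Np, T, b = 500, 1.304, 25                       # particle count, temperature (k = 1), bin size ≳ 2τ_int
P(w̄) = 1 - w̄ / (3Np * T)                        # P/(ρkT) from the mean of the j>i virial sum
M = length(W); nb = M ÷ b; W̄ = mean(W)
W′ = [(M * W̄ - b * mean(W[(k-1)*b+1:k*b])) / (M - b) for k in 1:nb]   # bin-k-deleted means
σ²P = (nb - 1) / nb * sum((P(w) - P(W̄))^2 for w in W′)
P(W̄), sqrt(σ²P)                                 # P/(ρkT) and its jackknife error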
In [ ]:
include("q4/p4.jl")
Note: I tried changing what we discussed (i.e. calculating the energy by multiplying by 0.5 instead of 16), but it only gave more nonsensical looking results, so I'm keeping things as they were, since the behaviour seems qualitatively correct.