To do this problem, we need to call a few functions from pmag and ipmag to create the Fisher distributed set of directions and then plot them.
In [1]:
# import PmagPy functions
import pmagpy.pmag as pmag # import the pmag module
import pmagpy.ipmag as ipmag
import numpy as np # while we are at it, let's import numpy
# allows plotting figures in the notebook
%matplotlib inline
Let's first use ipmag.fishrot() to generate a Fisher distributed set of directions and then plot it with the pair of functions ipmag.plot_net() and ipmag.plot_di().
In [2]:
help(ipmag.fishrot)
So we can just define the parameters and assign the output to a list:
In [6]:
prob1a_data=ipmag.fishrot(k=25,n=20, dec=12,inc=45)
In [7]:
help(ipmag.plot_di)
In [8]:
ipmag.plot_net(1)
ipmag.plot_di(di_block=prob1a_data,title="Fisher distributed directions")
Here we have to write a program to calculate the Fisher statistics for the data in Problem 1a (here called prob1a_data). We can re-use some of the code we wrote in the Chapter 2 problem sets (dir2cart and cart2dir, which also happen to be part of the pmag module) and write a new function for doing the Fisher statistics. Notice the use of a doc string (enclosed in triple quotes) at the front of the function.
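For reference, the quantities computed below are the resultant vector length $R$ of the $N$ unit vectors, the precision parameter $\kappa \simeq \frac{N-1}{N-R}$, the circular standard deviation CSD $\simeq \frac{81^{\circ}}{\sqrt{\kappa}}$, and the 95% confidence angle given by $\cos \alpha_{95} = 1 - \frac{N-R}{R}\left[\left(\frac{1}{0.05}\right)^{\frac{1}{N-1}} - 1\right]$.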
In [9]:
def fisher_mean(data):
    """
    Calculates Fisher statistics for data, where data is a nested list (or array) of dec, inc pairs.
    Returns a dictionary fpars with dec, inc, n, r, k, alpha95 and csd as keys.
    dec and inc are the Fisher mean declination and inclination for the data set.
    """
    R = 0 # initialize R, the resultant vector length, to zero
    Xbar, X = [0, 0, 0], [] # initialize Xbar, the container for the mean x,y,z
    fpars = {} # dictionary for the Fisher statistics
    N = len(data) # number of data points incoming
    if N < 2: # you can't do Fisher statistics on fewer than 2 directions
        return fpars
    # call to dir2cart transforms directions to cartesian coordinates
    X = pmag.dir2cart(data) # pmag must be imported before calling
    for i in range(len(X)):
        for c in range(3):
            Xbar[c] += X[i][c] # get the sum of each cartesian coordinate
    for c in range(3):
        R += Xbar[c]**2 # get the sum of squares of the component sums
    R = np.sqrt(R) # take the square root
    for c in range(3):
        Xbar[c] = Xbar[c]/R # normalize by the resultant vector length
    mean_dir = pmag.cart2dir(Xbar) # convert back to directions
    fpars["dec"] = mean_dir[0]
    fpars["inc"] = mean_dir[1]
    fpars["n"] = N
    fpars["r"] = R
    if N != R: # if perfectly aligned, kappa is infinite!
        k = (N-1.)/(N-R)
        fpars["k"] = k
        csd = 81./np.sqrt(k)
    else:
        fpars['k'] = 'inf'
        csd = 0.
    b = 20.**(1./(N-1.)) - 1
    a = 1 - b*(N-R)/R
    if a < -1: a = -1
    a95 = np.arccos(a)*180./np.pi
    fpars["alpha95"] = a95
    fpars["csd"] = csd
    if a < 0: fpars["alpha95"] = 180.0 # a cone of more than 90 degrees is meaningless
    return fpars
In [10]:
fpars= fisher_mean(prob1a_data)
print (fpars) # it isn't pretty, but it works.
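If we want something prettier, we can format the values ourselves. Here is a minimal sketch that assumes the fpars dictionary from the cell above (note that fpars['k'] is the string 'inf' for perfectly aligned data, so we only format it as a float because we know this data set is scattered):
In [ ]:
# format the Fisher statistics by hand (assumes fpars from the cell above)
print ('Dec = %7.1f, Inc = %7.1f'%(fpars['dec'],fpars['inc']))
print ('N = %i, R = %10.4f, k = %8.2f'%(fpars['n'],fpars['r'],fpars['k']))
print ('alpha95 = %7.1f, CSD = %7.1f'%(fpars['alpha95'],fpars['csd']))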
Here we call ipmag.fishrot() again.
In [11]:
prob1c_data=ipmag.fishrot(k=25,n=20, dec=12,inc=45)
fpars2=fisher_mean(prob1c_data)
print (fpars2)
Now we can write a little function to calculate Watson's F statistic (Equation 11.16 in the Essentials textbook).
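For two data sets with resultants $R_1$ and $R_2$ and a combined resultant $R$ (with $N = N_1 + N_2$ total directions), the statistic is:

$$F = (N-2)\frac{R_1 + R_2 - R}{N - R_1 - R_2}$$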
In [12]:
def watsons_f(DI1, DI2):
    # first calculate R for the combined data set, then R1 and R2 for each individually
    DI = np.concatenate((DI1, DI2), axis=0) # create a new array from two smaller ones
    fpars = fisher_mean(DI) # re-use our function from Problem 1b
    fpars1 = fisher_mean(DI1)
    fpars2 = fisher_mean(DI2)
    N = fpars['n']
    R = fpars['r']
    R1 = fpars1['r']
    R2 = fpars2['r']
    F = (N-2.)*((R1+R2-R)/(N-R1-R2))
    return F
And now we call it.
In [13]:
print ('F = %10.5f'%watsons_f(prob1a_data,prob1c_data))
So, how do we figure out what the critical value of F is? One option is an F table, for example the one at http://www.socr.ucla.edu/Applets.dir/F_Table.html. Looking at the table for $\alpha$=0.05 (95% confidence) with df$_1$=2 and df$_2$=2(N-2), where N is the total number of points in the combined data set, I found the value 3.117. Our value above is much lower than that, so indeed these two data sets are probably drawn from the same distribution.
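If you happen to have scipy installed, you can also skip the table lookup and compute the critical value directly. Here is a quick sketch (scipy is assumed to be available; it is not part of PmagPy):
In [ ]:
from scipy.stats import f # assumes scipy is installed
N=40 # total number of directions in the combined data set
print (f.ppf(0.95,2,2*(N-2))) # 95th percentile of F with df1=2, df2=2(N-2)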
Better yet, there is a function pmag.fcalc( ) that will also help...
In [14]:
help(pmag.fcalc)
In [15]:
pmag.fcalc(2,2*(fpars['n']+fpars2['n']-2))
Out[15]:
So here all our work will pay off, because we can just re-use the functions we already wrote.
In [17]:
prob1d_data=ipmag.fishrot(k=25,n=20, dec=55,inc=60)
print ('F = %10.5f'%(watsons_f(prob1a_data,prob1d_data))) # we already did the transpose thing
And those two are clearly NOT drawn from the same distribution!
To check with the function pmag.watsons_f:
In [18]:
help(pmag.watsons_f)
So now we do this:
In [19]:
pmag.watsons_f(prob1a_data,prob1c_data)
Out[19]:
These are the values for F and the critical value of F for this N. The first value is way less than 3.1, so these two data sets are likely drawn from the same distribution.
Now we re-run it with prob1d_data and see what we get:
In [20]:
pmag.watsons_f(prob1a_data,prob1d_data)
Out[20]:
Looks like these two are different....
So our code works the same way pmag.watsons_f does.
Now we can run the function ipmag.common_mean_watson( ), designed for use within a notebook, and see how the two Watson methods compare.
In [21]:
help(ipmag.common_mean_watson)
In [37]:
ipmag.common_mean_watson(prob1a_data,prob1c_data,plot='yes')
Out[37]:
Re-doing this for the other pair of data sets:
In [38]:
ipmag.common_mean_watson(prob1a_data,prob1d_data,plot='yes')
Out[38]:
So the two methods give the same answer.
For this, we use ipmag.fishrot() as before, but that returns a nested list of declination, inclination data. To work on just the inclinations, I turn it into an array and take the transpose (so declination is in the first row and inclination is in the second). Then I can get the inclinations out easily and calculate the average inclination value (using the NumPy functions for Gaussian statistics).
In [56]:
prob2a_data=np.array(ipmag.fishrot(k=20,n=20,inc=50)) # I'm using the same N and kappa as for the others
Incs=prob2a_data.transpose()[1] # fish out the inclinations
Incs
Out[56]:
In [57]:
Incs.mean() # Gaussian mean of the inclinations
Out[57]:
For this, I can re-use fisher_mean( ) from above. I like this function stuff - it saves a lot of time!
In [59]:
fisher_mean(prob2a_data)
Out[59]:
See how the Fisher mean is steeper than the Gaussian mean? Averaging the inclinations alone (throwing away the declinations) biases the estimate to be too shallow.
I'm going to call pmag.doincfish( ) to see if we can get a less biased estimate for inclination.
In [60]:
help(pmag.doincfish)
In [61]:
print (pmag.doincfish(Incs))
So that program did better at fixing the problem!
For this problem, we first need to read in the data file ps11_prob3a.dat from the Chapter_11 folder and peel off the declination, inclination data.
In [73]:
data=np.loadtxt('Chapter_11/ps11_prob3a.dat') # read in the data
import pandas as pd # let's use this cool module for wrangling data!
df=pd.DataFrame(data,columns=['dec','inc','dip_dir','dip'])
print (df.head())
And because we now know how, let's make a quickie plot.
In [74]:
di_block=df[['dec','inc']].values
ipmag.plot_net(2) # make the net
ipmag.plot_di(di_block=di_block) # plot the dec,inc data
Now we can use the function pmag.doprinc( ) to calculate the principal component of the directional data.
In [75]:
help(pmag.doprinc)
In [76]:
princ_pars=pmag.doprinc(di_block)
print (princ_pars)
Any vector with an angle of > 90 from the principal direction princ_pars['dec'] and princ_pars['inc'] belongs in the reverse polarity group. So we can use the function pmag.angle( ) to separate the two groups. First let's paste the principal directions onto our pandas DataFrame.
In [77]:
df['principal_dec']=princ_pars['dec'] # add the principal components to the pandas DataFrame
df['principal_inc']=princ_pars['inc']
print (df.head()) # I love how this works
Let's find out what pmag.angle expects.
In [78]:
help(pmag.angle)
So first we have to set up D2 with the principal directions (D1 is already set up - it is our di_block). Then we call pmag.angle( ).
In [80]:
PD=np.array([df['principal_dec'],df['principal_inc']]).transpose()
df['angle']=pmag.angle(di_block,PD)
print (df.head())
Now all we have to do is test the angle and sort the two directions into their own frames.
In [81]:
Mode_1= df[df['angle']<=90] # these are the normal directions
Mode_2=df[df['angle']>90] # these are the reverse directions
In [84]:
Mode_2_flip=Mode_2.copy() # make a copy of Mode_2.
Mode_2_flip['dec']=(Mode_2['dec']-180.)%360. # subtracts 180 but does modular 360 (stays 0=>360 this way.)
Mode_2_flip['inc']=-Mode_2['inc'] # take the negative of the inclination
flip_df=pd.concat([Mode_1,Mode_2_flip]) # this is the list of all the data, flipping the reverse mode.
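As a quick sanity check (a sketch re-using the plotting functions from Problem 1a), we can plot the flipped data set, which should now form a single tight group:
In [ ]:
ipmag.plot_net(3) # make another equal area net
ipmag.plot_di(di_block=flip_df[['dec','inc']].values,title="Flipped directions")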
Now we can call our old friend fisher_mean from Problem 1b and finish the job.
In [85]:
DIflip=np.array([flip_df['dec'],flip_df['inc']]).transpose()
geo_stats=fisher_mean(DIflip)
# print out the statistics for the (flipped) geographic reference frame.
print ('Fisher statistics for flipped data set',geo_stats)
Now we can call pmag.dotilt_V( ), similar to what we did in Chapter 9, but this one works on arrays of input data. We can get the values from the pandas DataFrame flip_df using the syntax df.Series.values, where df is a DataFrame and Series is the column name.
In [86]:
flip_data=np.array([flip_df.dec.values,flip_df.inc.values,flip_df.dip_dir.values,flip_df.dip.values]).transpose()
rot=pmag.dotilt_V(flip_data)
flip_df['dec_rot']=rot[0]
flip_df['inc_rot']=rot[1]
flip_df.head()
Out[86]:
Now we repeat the fisher_mean( ) call with the tilt-corrected data:
In [87]:
DItilt=np.array([flip_df['dec_rot'],flip_df['inc_rot']]).transpose()
tilt_stats=fisher_mean(DItilt)
print ('Fisher statistics for tilt-corrected data',tilt_stats)
To finish this off, we take the ratio of the two k values (before and after tilt correction) and compare it with the F-table value ($\alpha$=.05 and degrees of freedom 2(N$_1$-1) and 2(N$_2$-1)). And now you know that there is also pmag.fcalc( ) to look up the critical value for us.
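In other words, we test

$$F = \frac{k_1}{k_2}$$

against the critical F value with $2(N_1-1)$ and $2(N_2-1)$ degrees of freedom; if the ratio exceeds the critical value, the two precision parameters are significantly different at 95% confidence.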
In [88]:
print (geo_stats['k']/tilt_stats['k'])
In [89]:
print (pmag.fcalc(2*(geo_stats['n']-1),2*(tilt_stats['n']-1)))
So our F value would pass the McElhinny fold test.