RESULTS
Add SFRs to the analysis. Focus on the size, stellar mass, and SFR relations.
In [4]:
import numpy as np
from pylab import *
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
import rpy2.robjects as robjects
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
In [2]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py
In [3]:
# updating sigma vs LX to include errorbars on sigma values
plotsigmaLx()
In [144]:
%run ~/Dropbox/pythonCode/LCSbiweight.py
In [145]:
getbiweightall()
In [146]:
plotall()
Bianca asked about the specifics of how I calculate local density. I need to dig through the code to figure out exactly what I do.
It looks like I calculate local density in LCSwritemasterNSA.py. The local density measurements are written to
homedir+'research/LocalClusters/NSAmastertables/LocalDensityTables/'+self.prefix+'_localdensity.fits'
and are read by LCSReadmasterBaseNSA.py as self.ld.
When the columns are merged, LCSmergespiralcats.py calls LCSReadmasterBaseNSA.py for each cluster, and the following columns are added to the merged table:
ld_columns=['SIGMA_NN','SIGMA_5','SIGMA_10','RHOMASS']
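For reference, the standard Nth-nearest-neighbor surface density is $\Sigma_N = N/(\pi d_N^2)$. Below is a minimal sketch of that definition; this is my assumption for what SIGMA_5/SIGMA_10 compute, and the actual code in LCSwritemasterNSA.py may differ (e.g., in the velocity cuts applied and the units of the distances).
import numpy as np

def sigma_N(ra, dec, N=5):
    # projected surface density to the Nth nearest neighbor,
    # Sigma_N = N/(pi d_N^2); d_N here is in degrees (flat-sky approx),
    # so convert to physical units as needed. Sketch only, not the LCS code.
    sigma = np.zeros(len(ra))
    for i in range(len(ra)):
        dra = (ra - ra[i]) * np.cos(np.radians(dec[i]))
        d = np.sqrt(dra**2 + (dec - dec[i])**2)
        dN = np.sort(d)[N]  # index 0 is the galaxy itself
        sigma[i] = N / (np.pi * dN**2)
    return sigma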
The local density measurements use the SDSS spectroscopic sample, so all $r < 17.7$ galaxies are included. It would be more appropriate to use an absolute magnitude cut because we are sampling clusters over a range of redshifts: an apparent magnitude cut makes Coma's local density measurements greater than Hercules's simply because of the different distances to the two clusters (at Coma, $m-M \approx 35$, the $r=17.7$ limit reaches roughly a magnitude deeper in $M_r$ than at Hercules). Therefore, I should redo the local density measurements using an absolute r-band magnitude cut.
The appropriate absolute magnitude cut is the $M_r$ that corresponds to $r=17.7$ at the distance of Hercules. I already refer to this in the paper.
The stellar mass limit is determined primarily by the SDSS spectroscopic limit of $r=17.7$. This corresponds to an absolute magnitude of $M_r = -18.3$ at the distance of Hercules ($m-M=35.97$), our most distant cluster.
Therefore, I should use an $M_r = -18.3$ cut when calculating local density.
Checking the distance modulus:
$$ m_2 - m_1 = 2.5 \log_{10}(f_1/f_2) $$
$$ m - M = 2.5 \log_{10}\left( \frac{L}{4 \pi (10~{\rm pc})^2} \, \frac{4 \pi d_{\rm pc}^2}{L} \right) $$
$$ m - M = 5 \log_{10} (d_{\rm pc}) - 5 $$
For Hercules, $z = 0.037$, so
$$ d = z c / H_0 = (0.037)(3\times 10^5~{\rm km/s})/(70~{\rm km/s/Mpc}) = 158.6~{\rm Mpc}. $$
Thus
$$ m - M = 5 \log_{10} (158.6 \times 10^6) - 5 = 36.00. $$
The absolute magnitude corresponding to $r = 17.7$ is then
$$ M = m - 36.0 = 17.7 - 36.0 = -18.3. $$
Run:
LCSCalcLocalDensity.py -m -18.3 -v 2500
LCSmergespiralcats.py -r
mergedata()
s=spirals()
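A quick numerical check of the distance-modulus arithmetic above (plain numpy, nothing project-specific):
import numpy as np
z, c, H0 = 0.037, 3.e5, 70.              # Hercules redshift; c in km/s; H0 in km/s/Mpc
d_pc = z * c / H0 * 1.e6                 # Hubble-flow distance in pc
mM = 5. * np.log10(d_pc) - 5.            # distance modulus
print 'm - M = %.2f'%(mM)                # -> 36.00
print 'M_r at r=17.7 = %.2f'%(17.7 - mM) # -> -18.30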
In [19]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --diskonly 1
In [116]:
flag = np.ones(len(s.s.RA),'bool')
print 'starting sample size = ',len(s.s.RA)
flag = flag & s.lirflag
print 'sample size after lir cut = ',sum(flag)
flag = flag & ~s.agnflag
print 'sample size after agn cut = ',sum(flag)
flag = flag & s.sizeflag
print 'sample size after size cut = ',sum(flag)
#flag = flag & s.massflag
#print 'sample size after mass cut = ',sum(flag)
flag = flag & (s.galfitflag & ~s.badfits)
print 'sample with successful galfit models = ',sum(flag)
f = flag
#flag = flag & s.s['matchflag']
#print 'sample size after gim2d cut = ',sum(flag)
#flag = flag & s.sbflag
#print 'sample size after surface brightness cut = ',sum(flag)
# why do we lose ~100 galaxies based on the gim2d cut?
# how many are in SDSS but don't have gim2d data?
print 'ASIDE: number of SDSS galaxies without gim2d fit = ',sum((s.s.ISDSS > -1) & ~s.s['matchflag'])
#print 'NSAID of SDSS galaxies without gim2d match: ',s.s.NSAID[(s.s.ISDSS > -1) & ~s.s['matchflag']]
flag = flag & s.gim2dflag
print 'sample size after gim2d cut = ',sum(flag)
#flag = flag & s.massflag
#print 'sample size after mass cut = ',sum(flag)
#print 'galaxies that meet all criteria but have galfitflag = False'
#t = f & (s.sb_obs < 20) & ~s.galfitflag
#c = s.s.CLUSTER[t]
#nsa = s.s.NSAID[t]
#for i in range(len(c)):
# print c[i],nsa[i]
#flag = flag & s.galfitflag & (s.s.fcre1/s.s.fcre1err > .5)
print 'sample size after galfit cut = ',sum(flag)
#flag = flag & s.blueflag2
#print 'sample size after color cut = ',sum(flag)
print '\t cluster galaxies = ',sum(flag & s.membflag)
print '\t field galaxies = ',sum(flag & ~s.membflag)
print ''
print 'number of galaxies with galfit model but not B/T',sum(s.galfitflag & ~s.gim2dflag)# & ~s.agnflag)
In [58]:
# number of galaxies that should be fit with galfit
print sum(s.lirflag & ~s.agnflag & s.sizeflag & s.massflag)
# number of these that fall below sb cut
print sum(s.lirflag & ~s.agnflag & s.sizeflag & s.massflag & (s.sb_obs > 20.))
# number of these with successful galfit fits
print sum(s.lirflag & ~s.agnflag & s.sizeflag & s.massflag& s.galfitflag)
# number of these with B/T fits
print sum(s.lirflag & ~s.agnflag & s.sizeflag & s.massflag & s.galfitflag & s.gim2dflag)
print 'fraction with B/T = ', 189./234
print 'number that are missing B/T = ',234-189
In [55]:
# print ids of galaxies that meet selection criteria but have galfitflag = 0
flag = s.lirflag & ~s.agnflag & s.sizeflag & s.massflag & ~s.galfitflag & (s.sb_obs < 20.)
index = np.arange(len(s.s.RA))[flag]
for i in index:
    print s.s.CLUSTER[i],s.s.NSAID[i]
Want to see if there is anything systematic about the galaxies that have no B/T fit.
Could use sersic index instead of B/T, and normalize by Re rather than Re disk. I will make this change if the referee requests it. Otherwise, I will just say that it doesn't affect our results.
In [22]:
flag = np.ones(len(s.s.RA),'bool')
#print 'starting sample size = ',len(s.s.RA)
flag = flag & s.lirflag
#print 'sample size after lir cut = ',sum(flag)
flag = flag & ~s.agnflag
#print 'sample size after agn cut = ',sum(flag)
flag = flag & s.sizeflag
#print 'sample size after size cut = ',sum(flag)
flag = flag & s.massflag
#print 'sample size after mass cut = ',sum(flag)
noBT = flag & s.galfitflag & ~s.gim2dflag
BT = flag & s.galfitflag & s.gim2dflag
print 'number of galaxies that meet all selection criteria, have galfit model, but not B/T',sum(noBT)
print 'number of galaxies that meet all selection criteria, have galfit model, AND B/T',sum(BT)
# compare sersic index of those with and without BT fits
plt.figure(figsize=(6,4))
plt.hist(s.s.SERSIC_N[noBT],cumulative=True,normed=True,histtype='step',color='blue')
plt.hist(s.s.SERSIC_N[BT],cumulative=True,normed=True,histtype='step',color='red')
ks(s.s.SERSIC_N[BT],s.s.SERSIC_N[noBT])
plt.xlabel('Sersic N')
plt.ylabel('Cumulative Distribution')
# compare sersic N with B/T
plt.figure(figsize=(6,4))
newflag = flag & s.gim2dflag
plt.plot(s.s.SERSIC_N[newflag],s.s.B_T_r[newflag],'bo')
plt.plot(s.s.ng[newflag],s.s.B_T_r[newflag],'ro')
spearman(s.s.ng[newflag],s.s.B_T_r[newflag])
# these are highly correlated. could use sersic N instead of B/T
Out[22]:
In [27]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 9. --diskonly 0
In [28]:
plt.figure(figsize=(6,4))
plt.hist(s.sizeratio[noBT],cumulative=True,normed=True,histtype='step',label='No B/T')
plt.hist(s.sizeratio[BT],cumulative=True,normed=True,histtype='step',label='B/T')
ks(s.sizeratio[BT],s.sizeratio[noBT])
plt.xlabel('Re(24)/Rd')
plt.ylabel('Cumulative Distribution')
plt.legend()
Out[28]:
In [43]:
flag = np.ones(len(s.s.RA),'bool')
#print 'starting sample size = ',len(s.s.RA)
flag = flag & s.lirflag
#print 'sample size after lir cut = ',sum(flag)
flag = flag & ~s.agnflag
#print 'sample size after agn cut = ',sum(flag)
flag = flag & s.sizeflag
#print 'sample size after size cut = ',sum(flag)
flag = flag & s.massflag
#print 'sample size after mass cut = ',sum(flag)
noBT = flag & ~s.gim2dflag
BT = flag & s.galfitflag & s.gim2dflag
print sum(noBT)
# print ID information for galaxies with no B/T fits
# RA DEC RUN CAMCOL FIELD RERUN
print 'RA DEC ISDSS RUN CAMCOL FIELD RERUN'
RA = s.s.RA[noBT]
DEC = s.s.DEC[noBT]
RUN = s.s.RUN[noBT]
CAMCOL = s.s.CAMCOL[noBT]
FIELD = s.s.FIELD[noBT]
RERUN = s.s.RERUN[noBT]
ISDSS = s.s.ISDSS[noBT]
for i in range(len(RA)):
    print '%12.8f %12.8f %10i %4i %6i %6i %6i'%(RA[i],DEC[i],ISDSS[i],RUN[i],CAMCOL[i],FIELD[i],RERUN[i])
In [22]:
print 'GALAXIES WITH B/T FIT'
a = s.membflag[BT]
print 'MEMBFLAG: mean, std = %.2f +/- %.2f'%(np.mean(a),np.std(a))
print 'GALAXIES WITH NO B/T FIT'
a = s.membflag[noBT]
print 'MEMBFLAG: mean, std = %.2f +/- %.2f'%(np.mean(a),np.std(a))
In [25]:
plt.figure(figsize=(4,3))
plt.hist(s.s.B_T_r[s.gim2dflag])
# so Luc does fit B/T = 1
Out[25]:
NGC6107 146875 - point source, wants to make size less than 1 pix over a large range of starting parameters, set galfitflag = True
NGC6107 146878 - made cutout (100"x100"), ran and it converged.
MKW8 18098 - little dwarf, not a great fit - leave as is
AWM4 68283 - dwarf, started w/bigger mag = 14.5, and it converged
Was using wrong PRF here. Reset and moving forward. Not sure why that was set to use mips one.
In [10]:
# trying to get objID for galaxies that don't have B/T fits so Luc can run code
flag = s.galfitflag & ~s.gim2dflag
names = s.s.NSAID[flag]
ra = s.s.RA[flag]
dec=s.s.DEC[flag]
#for i in range(len(names)):
# print names[i], ra[i], dec[i]
print '%%%%%%%%%%%%'
# trying to get objID for galaxies that don't have B/T fits so Luc can run code
for i in range(len(names)):
    print '%12.8f, %12.8f'%(ra[i], dec[i])
In [19]:
# what about sersic index as proxy for B/T?
plt.figure()
flag = s.gim2dflag & ~s.agnflag & s.lirflag & s.sizeflag
plt.plot(s.s.ng[flag],s.s.B_T_r[flag],'bo')
plt.ylabel('B/T')
plt.xlabel('Sersic Index')
Out[19]:
In [21]:
flag = s.galfitflag & s.massflag & s.lirflag & s.sizeflag & ~s.agnflag
spearman(s.s.SERSIC_N[flag],s.sizeratio[flag])
flag = flag & s.gim2dflag
spearman(s.s.B_T_r[flag],s.sizeratio[flag])
Out[21]:
In [22]:
USE_DISK_ONLY
Out[22]:
In [65]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --diskonly 1
In [66]:
s.compare_cluster_field()
No serious systematics between field and cluster samples except for redshift.
I updated LCSmaketileplot.py so that a given galaxy isn't repeated when randomly selecting galaxies within a certain range of size ratios (using numpy.random.shuffle); a sketch of the selection logic follows.
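Something like this (hypothetical thresholds; the real code in LCSmaketileplot.py differs in detail):
import numpy as np
# indices of galaxies in the desired size-ratio range
ids = np.where(s.sampleflag & (s.sizeratio > 0.1) & (s.sizeratio < 0.3))[0]
np.random.shuffle(ids)  # shuffle in place...
subset = ids[:12]       # ...so the first N picks contain no repeats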
In [ ]:
# run LCSmaketileplot.py
%run ~/Dropbox/pythonCode/LCSmaketileplot.py
# plot galaxies with size ratios between 0.1 and .3
# 0 < Re(24)/Re(r) < 0.3
paperplots()
#s.plotcolor_subset(size1=0,size2=0.3,showall=False)
#savefig(homedir+'research/LocalClusters/SamplePlots/cutouts_smallsize.png')
#savefig(homedir+'research/LocalClusters/SamplePlots/cutouts_smallsize.eps')
# plot galaxies with size ratios between 0.3 and 0.6
# 0.3 < Re(24)/Re(r) < 0.6
#s.plotcolor_subset(size1=0.4,size2=0.6,showall=False)
#savefig(homedir+'research/LocalClusters/SamplePlots/cutouts_medsize.png')
#savefig(homedir+'research/LocalClusters/SamplePlots/cutouts_medsize.eps')
# plot galaxies with size ratios greater than 0.75
# 0.75 < Re(24)/Re(r)
#s.plotcolor_subset(size1=0.75,size2=2,showall=False)
#savefig(homedir+'research/LocalClusters/SamplePlots/cutouts_largesize.png')
#savefig(homedir+'research/LocalClusters/SamplePlots/cutouts_largesize.eps')
As of 1/6/16, need to make one more pass through the sample and remove galaxies that are blended with nearby companion. Not sure if people think the numbers in each panel are useful.
Galaxies that are blended with a nearby companion are:
ALSO running this from ipython and using paperplots().
Changed in May 2016 to normalize the size of the 24um emission using the size of the r-band disk only. The disk sizes are from the Simard+11 B/D decompositions. The sense of the results is the same, but the average ratio of $R_e(24)/R_e(r)$ is now close to one; when I normalized using $R_e$ from the single-component Sersic fit, the average/median ratio was $R_e(24)/R_e(r) = 0.6-0.7$.
In [86]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --diskonly 1
In [70]:
plotpositionson24()
In [87]:
plotRe24vsReall()
In [73]:
s.plotsizehist(colorflag=True)
In [74]:
s.plotsizehist(colorflag=True,btcut=.3)
In [23]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --diskonly 1 --minmass 9
In [24]:
print 'CLUSTER MEMBERS'
dat = s.sizeratio[s.sampleflag & s.membflag]
print 'size ratio mean (median) +/- std (err mean) = %.2f (%.2f) +/- %.2f (%.2f)'%(np.mean(dat),np.median(dat),np.std(dat),np.std(dat)/np.sqrt(1.*len(dat)))
print 'FIELD GALAXIES'
dat = s.sizeratio[s.sampleflag & ~s.membflag]
print 'size ratio mean (median) +/- std (err mean) = %.2f (%.2f) +/- %.2f (%.2f)'%(np.mean(dat),np.median(dat),np.std(dat),np.std(dat)/np.sqrt(1.*len(dat)))
print 'KS TEST COMPARING CLUSTER AND FIELD'
ks(s.sizeratio[s.sampleflag & s.membflag],s.sizeratio[s.sampleflag & ~s.membflag])
Out[24]:
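For reference, ks here wraps a two-sample K-S test; a minimal scipy equivalent (my assumption for what the helper in LCSanalyzeblue.py does):
from scipy.stats import ks_2samp
# compare cluster and field size-ratio distributions
D, p = ks_2samp(s.sizeratio[s.sampleflag & s.membflag],
                s.sizeratio[s.sampleflag & ~s.membflag])
print 'K-S D = %.3f, p = %.3e'%(D, p)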
Looking at Group vs Field as well
In [25]:
btcut = 1.3
flag = s.sampleflag & s.membflag & (s.s.B_T_r < btcut)
groupflag = flag & ((s.s.CLUSTER == 'MKW11') | (s.s.CLUSTER == 'MKW8') | (s.s.CLUSTER == 'AWM4') | (s.s.CLUSTER == 'NGC6107'))
clusterflag = flag & ((s.s.CLUSTER == 'Hercules') | (s.s.CLUSTER == 'A1367') | (s.s.CLUSTER == 'A2052') | (s.s.CLUSTER == 'A2063'))
bigmommaflag = flag & (s.s.CLUSTER == 'Coma')
fieldflag = s.sampleflag & (s.s.B_T_r < btcut) & ~s.membflag & ~s.dvflag
nearfieldflag = s.sampleflag & (s.s.B_T_r < btcut) & ~s.membflag & s.dvflag
In [26]:
#sizegroup = s.sizeratio[groupflag]
print 'CLUSTER MEMBERS'
dat = s.sizeratio[clusterflag | bigmommaflag]
print 'size ratio mean (median) +/- std (err mean) = %.2f (%.2f) +/- %.2f (%.2f)'%(np.mean(dat),np.median(dat),np.std(dat),np.std(dat)/np.sqrt(1.*len(dat)))
print 'GROUP MEMBERS'
dat = s.sizeratio[groupflag]
print 'size ratio mean (median) +/- std (err mean) = %.2f (%.2f) +/- %.2f (%.2f)'%(np.mean(dat),np.median(dat),np.std(dat),np.std(dat)/np.sqrt(1.*len(dat)))
print 'FIELD GALAXIES'
dat = s.sizeratio[fieldflag | nearfieldflag]
print 'size ratio mean (median) +/- std (err mean) = %.2f (%.2f) +/- %.2f (%.2f)'%(np.mean(dat),np.median(dat),np.std(dat),np.std(dat)/np.sqrt(1.*len(dat)))
In [80]:
s.plotsizedvdr(plothexbin=True,plotmembcut=True,plotbadfits=False,hexbinmax=40)
In [81]:
s.plotsizestellarmassblue()
In [82]:
print 'Field size ratios (Re(24)/Re(r)):'
flag=s.sampleflag & ~s.agnflag & ~s.membflag & s.gim2dflag
print 'mean = %5.2f'%(np.mean(s.sizeratio[flag]))
print 'weighted mean = %5.2f'%(ws.weighted_mean(s.sizeratio[flag],weights=1./s.sizeratioERR[flag]))
print 'Cluster size ratios:'
flag=s.sampleflag & ~s.agnflag & s.membflag & s.gim2dflag
print 'mean = %5.2f'%(np.mean(s.sizeratio[flag]))
print 'weighted mean = %5.2f'%(ws.weighted_mean(s.sizeratio[flag],weights=1./s.sizeratioERR[flag]))
The right panel shows the weighted mean vs. stellar mass. The size ratio of cluster galaxies lies below that of the field, but there is no trend between size and stellar mass.
Need to change this so that the solid and dashed lines show the weighted mean for the field; as of now, they show the unweighted mean. Also need to rescale the y axis to reflect the higher size ratios that I get when using the disk size only.
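For reference, the weighted mean as called above uses weights of $1/\sigma$ rather than the usual inverse variance $1/\sigma^2$. A minimal sketch of ws.weighted_mean as I use it (the real helper may differ):
import numpy as np

def weighted_mean(x, weights):
    # weights = 1./sizeratioERR in the calls above (1/sigma, not 1/sigma**2)
    x = np.asarray(x)
    weights = np.asarray(weights)
    return np.sum(weights * x) / np.sum(weights)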
In [88]:
s.plotsizeBTblue()
This shows $R_e(24)/R_e(r)$ vs. B/T for cluster and field galaxies. The left panels show the values for individual galaxies. The $\rho$ value indicates the 68% confidence interval for the Spearman rank correlation coefficient, which I determine using Monte Carlo resampling. The Monte Carlo code assumes that the errors reported by galfit for $R_e(24)$ are Gaussian; this is not likely to be true, but it's better than nothing.
The correlation is less significant among field galaxies.
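A minimal sketch of that resampling (assuming Gaussian errors, as noted above; the actual spearman_with_errors helper may differ in detail):
import numpy as np
from scipy.stats import spearmanr

def spearman_with_errors(x, y, yerr, Nmc=1000):
    rho = np.zeros(Nmc)
    for i in range(Nmc):
        # perturb y by its (assumed Gaussian) errors and recompute rho
        rho[i] = spearmanr(x, np.random.normal(y, scale=yerr))[0]
    # 68% confidence interval on the Spearman coefficient
    return np.percentile(rho, 16), np.median(rho), np.percentile(rho, 84)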
In [89]:
s.plotsizeNUVrblue()
In [85]:
spearman(s.sizeratio[s.sampleflag],s.ur[s.sampleflag])
Out[85]:
In [6]:
s.plotcolorcolor3panel()
Trying to call R functions from Python using rpy2. This will allow me to better document the procedure and results because I can include it in this notebook!
Need to implement the error analysis here; use the Monte Carlo method again.
In the end, it might be better to remove galaxies with large errors on the 24um size; they are probably just adding noise.
In [1]:
#%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 9. --diskonly 0
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py
In [27]:
ppcor = rpackages.importr('ppcor')
pcor = robjects.r['pcor'] # can calculate partial correlation coeff for matrix
pcor_test = robjects.r['pcor.test']
print robjects.r['pi']
In [36]:
flag = s.sampleflag & (s.s.SIGMA_5 != 0)#& ~s.membflag
rmass = robjects.FloatVector(s.logstellarmass[flag])
rsize = robjects.FloatVector(s.sizeratio[flag])
rBT = robjects.FloatVector(s.s.B_T_r[flag])
#rBT = robjects.FloatVector(s.s.SERSIC_N[flag])
rrad = robjects.FloatVector(s.s.DR_R200[flag])
rsigma5 = robjects.FloatVector(s.s.SIGMA_5[flag])
rsigma10 = robjects.FloatVector(s.s.SIGMA_10[flag])
rsigmaNN = robjects.FloatVector(s.s.SIGMA_NN[flag])
print sum(flag)
In [33]:
print rBT[0:10]
In [37]:
print 'Partial Correlation between size and BT, controlling for Sigma_5, M, and projected radius'
t = pcor_test(rsize,rBT,[rsigma5,rmass,rrad],method='spearman')
print t
print 'RESULT: size and B/T are strongly correlated'
print ''
print ''
print '** Partial Correlation between size and M, controlling for B/T, Sigma_5**'
t = pcor_test(rsize,rmass,[rBT,rsigma5],method='spearman')
print t
print 'RESULT: size and stellar mass are NOT correlated'
print ''
print ''
print '** Partial Correlation between size and Sigma_5, controlling for B/T, M**'
t = pcor_test(rsize,rsigma5,[rBT,rmass],method='spearman')
print t
print 'RESULT: size and Sigma_5 are strongly correlated'
In [35]:
print '** Partial Correlation between size and M, controlling for B/T, Sigma_10**'
t = pcor_test(rsize,rmass,[rBT,rsigma10],method='spearman')
print t
print 'RESULT: size and stellar mass are NOT correlated'
print ''
print ''
print 'Partial Correlation between size and B/T, controlling for Sigma_10, M, and projected radius'
t = pcor_test(rsize,rBT,[rsigma10,rmass,rrad],method='spearman')
print t
print 'RESULT: size and B/T are strongly correlated'
print ''
print ''
print '** Partial Correlation between size and Sigma_10, controlling for B/T, M**'
t = pcor_test(rsize,rsigma10,[rBT,rmass],method='spearman')
print t
print 'RESULT: size and Sigma_10 are strongly correlated (more so than sigma_5)'
print ''
print ''
print '** Partial Correlation between size and Sigma_NN, controlling for B/T, Sigma_10 **'
t = pcor_test(rsize,rsigmaNN,[rBT,rsigma10],method='spearman')
print t
print 'RESULT: size and Sigma_NN are weakly correlated (even after controlling for Sigma_10 and B/T)'
In [20]:
# how to get value back from robject
t[1][0]
Out[20]:
In [44]:
print 'RUNNING AGAIN WITHOUT COMA IN SAMPLE'
print ''
flag = nc.sampleflag & (nc.s.SIGMA_5 != 0)#& ~s.membflag
rmass = robjects.FloatVector(nc.logstellarmass[flag])
rsize = robjects.FloatVector(nc.sizeratio[flag])
rsizeerr = robjects.FloatVector(nc.sizeratioERR[flag])
rBT = robjects.FloatVector(nc.s.B_T_r[flag])
#rBT = robjects.FloatVector(nc.s.SERSIC_N[flag])
rrad = robjects.FloatVector(nc.s.DR_R200[flag])
rsigma5 = robjects.FloatVector(nc.s.SIGMA_5[flag])
print '** Partial Correlation between size and M, controlling for B/T, Sigma_5**'
t = pcor_test(rsize,rmass,[rBT,rsigma5],method='spearman')
print t
print 'RESULT: size and stellar mass are NOT correlated'
print ''
print ''
print 'Partial Correlation between size and BT, controlling for Sigma_5, M, and projected radius'
t = pcor_test(rsize,rBT,[rsigma5,rmass,rrad],method='spearman')
print t
print 'RESULT: size and B/T are strongly correlated'
print ''
print ''
print '** Partial Correlation between size and Sigma_5, controlling for B/T, M**'
t = pcor_test(rsize,rsigma5,[rBT,rmass],method='spearman')
print t
print 'RESULT: size and Sigma_5 are correlated at only the ~2-sigma level when Coma is removed'
In [38]:
#%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 9. --diskonly 0
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py
ppcor = rpackages.importr('ppcor')
pcor = robjects.r['pcor'] # can calculate partial correlation coeff for matrix
pcor_test = robjects.r['pcor.test']
In [53]:
def pcor_with_errors(x,y,yerr,z,Nmc=1000,plotflag=False,method='spearman'):
    '''
    This is pretty custom to my specific needs in this paper:
    I am always looking to measure the correlation between an X variable
    and size (Re24/Rer), so this function takes only the y error.
    Pass x, y, yerr, and z as robjects.FloatVector.
    z is a list of variables to control for, e.g.:
        z = [rsigma5,rmass,rrad]
    Nmc = number of Monte Carlo realizations (default = 1000).
    Returns: arrays coeff, pvalue, stat containing the correlation
    coefficient, p-value, and test statistic for each realization.
    '''
    coeff = np.zeros(Nmc,'f')
    pvalue = np.zeros(Nmc,'f')
    stat = np.zeros(Nmc,'f')
    for i in range(Nmc):
        # perturb the size ratios by their errors and redo the partial correlation
        ysim = np.random.normal(y,scale=yerr,size=len(y))
        rysim = robjects.FloatVector(ysim)
        t = pcor_test(x,rysim,z,method=method)
        coeff[i] = t[0][0]
        pvalue[i] = t[1][0]
        stat[i] = t[2][0]
    if plotflag:
        plt.figure()
        plt.hexbin(coeff,np.log10(pvalue))
        plt.xlabel('$Corr \ Coeff$')
        plt.ylabel('$log_{10}(p-value)$')
        #plt.axhline(y=np.log10(1-.68),ls=':',color='r',lw=2)
        plt.axhline(y=np.log10(1-.95),ls='--',color='y',lw=2)
        plt.axhline(y=np.log10(1-.997),ls='--',color='r',lw=1)
    print 'mean, median, 68 percent confidence interval values:'
    print '\t cor coeff = %4.2f (%4.2f) [%4.2f, %4.2f]'%(np.mean(coeff),np.median(coeff),np.percentile(coeff,16),np.percentile(coeff,84))
    print '\t p-value = %4.2e (%4.2e) [%4.2e, %4.2e]'%(np.mean(pvalue),np.median(pvalue),np.percentile(pvalue,16),np.percentile(pvalue,84))
    print '\t statistic = %5.4f (%5.4f) [%5.4f, %5.4f]'%(np.mean(stat),np.median(stat),np.percentile(stat,16),np.percentile(stat,84))
    k,pnorm = normaltest(coeff)
    print 'probability that distribution of cor coeff is normal = %5.4f'%(pnorm)
    k,pnorm = normaltest(pvalue)
    print 'probability that distribution of p-values is normal = %5.4f'%(pnorm)
    return coeff,pvalue,stat
In [16]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py
In [70]:
flag = s.sampleflag & (s.s.SIGMA_5 != 0)#&
rmass = robjects.FloatVector(s.logstellarmass[flag])
rsize = robjects.FloatVector(s.sizeratio[flag])
rsizeerr = robjects.FloatVector(s.sizeratioERR[flag])
rBT = robjects.FloatVector(s.gim2d.B_T_r[flag])
#rBT = robjects.FloatVector(s.s.SERSIC_N[flag])
rrad = robjects.FloatVector(s.s.DR_R200[flag])
rsigma5 = robjects.FloatVector((s.s.SIGMA_5[flag]))
print sum(flag)
In [71]:
#rizeerr = robjects.FloatVector(np.ones(sum(flag),'f'))
print 'Partial Correlation between size and M, controlling for B/T, Sigma_5'
coeff,p,stat = pcor_with_errors(rmass,rsize,rsizeerr,[rBT,rsigma5],plotflag=False)
print 'RESULT: size and stellar mass are NOT correlated'
print ''
print ''
print 'Partial Correlation between size and BT, controlling for Sigma_5, M, and projected radius'
coeff,p,stat = pcor_with_errors(rBT,rsize,rsizeerr,[rsigma5,rmass],method='spearman',plotflag=False)
#print t
print 'RESULT: size and B/T are strongly correlated'
print ''
print ''
print 'Partial Correlation between size and Sigma_5, controlling for B/T, M'
coeff,p,stat = pcor_with_errors(rsigma5,rsize,rsizeerr,[rBT,rmass],method='spearman',plotflag=False)
#print t
print 'RESULT: size and Sigma_5 are strongly correlated'
In [66]:
flag = s.sampleflag & (s.s.SIGMA_5 != 0)#& ~s.membflag
rmass = robjects.FloatVector(s.logstellarmass[flag])
rsize = robjects.FloatVector(s.sizeratio[flag])
rsizeerr = robjects.FloatVector(s.sizeratioERR[flag])
rBT = robjects.FloatVector(s.gim2d.B_T_r[flag])
#rBT = robjects.FloatVector(s.s.SERSIC_N[flag])
rrad = robjects.FloatVector(s.s.DR_R200[flag])
rsigma5 = robjects.FloatVector((s.s.SIGMA_5[flag]))
print (sum(flag))
In [67]:
print 'IF WE LIMIT SAMPLE TO GALAXIES WITH B/T FITS'
#rizeerr = robjects.FloatVector(np.ones(sum(flag),'f'))
print 'Partial Correlation between size and M, controlling for B/T, Sigma_5'
coeff,p,stat = pcor_with_errors(rmass,rsize,rsizeerr,[rBT,rsigma5],plotflag=False)
print 'RESULT: size and stellar mass are NOT correlated'
print ''
print ''
print 'Partial Correlation between size and BT, controlling for Sigma_5, M, and projected radius'
coeff,p,stat = pcor_with_errors(rBT,rsize,rsizeerr,[rsigma5,rmass],method='spearman',plotflag=False)
#print t
print 'RESULT: size and B/T are strongly correlated'
print ''
print ''
print 'Partial Correlation between size and Sigma_5, controlling for B/T, M'
coeff,p,stat = pcor_with_errors(rsigma5,rsize,rsizeerr,[rBT,rmass],method='spearman',plotflag=False)
#print t
print 'RESULT: size and Sigma_5 are strongly correlated'
In [72]:
flag = nc.sampleflag & (nc.s.SIGMA_5 != 0)#& ~s.membflag
rmass = robjects.FloatVector(nc.logstellarmass[flag])
rsize = robjects.FloatVector(nc.sizeratio[flag])
rsizeerr = robjects.FloatVector(nc.sizeratioERR[flag])
rBT = robjects.FloatVector(nc.gim2d.B_T_r[flag])
#rBT = robjects.FloatVector(nc.s.SERSIC_N[flag])
rrad = robjects.FloatVector(nc.s.DR_R200[flag])
rsigma5 = robjects.FloatVector((nc.s.SIGMA_5[flag]))
In [73]:
print 'REPEATING WITHOUT COMA'
#rizeerr = robjects.FloatVector(np.ones(sum(flag),'f'))
print 'Partial Correlation between size and M, controlling for B/T, Sigma_5'
coeff,p,stat = pcor_with_errors(rmass,rsize,rsizeerr,[rBT,rsigma5],plotflag=False)
print 'RESULT: size and stellar mass are NOT strongly correlated'
print ''
print ''
print 'Partial Correlation between size and BT, controlling for Sigma_5, M, and projected radius'
coeff,p,stat = pcor_with_errors(rBT,rsize,rsizeerr,[rsigma5,rmass],method='spearman',plotflag=False)
#print t
print 'RESULT: size and B/T are strongly correlated'
print ''
print ''
print 'Partial Correlation between size and Sigma_5, controlling for B/T, M'
coeff,p,stat = pcor_with_errors(rsigma5,rsize,rsizeerr,[rBT,rmass],method='spearman',plotflag=False)
#print t
print 'RESULT: size and Sigma_5 are not strongly correlated'
Yay! This gives the same results as running it in RStudio!
The right thing to do is to run pcor on the full matrix of mass, size, B/T, sigma5, and projected radius (and anything else that might affect size?).
I can't figure out how to implement this using rpy2, so I'm skipping it for now.
Below are things I've tried that didn't work.
In [ ]:
data = np.array([s.sizeratio[flag],s.s.B_T_r[flag]])
d = robjects.FloatVector(data)
data.shape
print data[0,3]
print data
In [ ]:
data_matrix = robjects.r['matrix'](data,nrow = 2)#,rrad,rsigma5,rmass], nrow = 5)
print data_matrix
In [ ]:
pc = pcor(data_matrix, method = 'spearman')
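One approach that should work (untested here): R fills matrices column-major, so stack the variables as columns and flatten in Fortran order before building the R matrix:
import numpy as np
import rpy2.robjects as robjects
# columns: size, B/T, stellar mass, Sigma_5, projected radius
data = np.column_stack([s.sizeratio[flag], s.s.B_T_r[flag],
                        s.logstellarmass[flag], s.s.SIGMA_5[flag],
                        s.s.DR_R200[flag]])
rdata = robjects.r['matrix'](robjects.FloatVector(data.flatten(order='F')),
                             nrow=data.shape[0])
pc = pcor(rdata, method='spearman')
print pc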
Wondering why I need a mass cut at all, as long as the field and cluster samples are comparable. What is the effect of a mass cut at 9 vs. 9.75?
If the mass cut is increased to 9.75, then we lose any significant correlation between size and B/T and/or size and environment. I'll check with the R code as well.
The partial correlation coefficient analysis indicates that the size-B/T correlation is still significant, but a residual correlation between size and environment is not detected.
So which is right? What about 9.5? What about 9.3? Below I rerun the money plots with mass cuts of 9, 9.3, 9.5, and 9.75, plus a low cut further down for comparison.
RESULT
In [195]:
# here is min mass = 9.
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 9.
makemoneyplots()
In [177]:
# here is min mass = 9.3
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 9.3
makemoneyplots()
In [170]:
# here is min mass = 9.5
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py
makemoneyplots()
In [193]:
# here is min mass = 9.75
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 9.75
makemoneyplots()
In [144]:
flag = s.sampleflag & (s.logstellarmass > 9.) & (s.logstellarmass < 9.8) & (s.s.B_T_r < .2)
memb = flag & s.membflag
field = flag & ~s.membflag
# compare color
print 'comparing NUV-r color of cluster and field galaxies'
ks(s.NUVr[memb],s.NUVr[field])
Out[144]:
How are the SFRs different? Plot a cumulative distribution.
In [145]:
plt.figure()
t = plt.hist(s.SFR_BEST[memb],bins=sum(memb),cumulative=True,normed=True,color='k',histtype='step',label='Cluster')
#print sum(memb),sum(field),s.SFR_BEST[memb]
t = plt.hist(s.SFR_BEST[field],bins=sum(field),cumulative=True,normed=True,color='0.5',histtype='step',label='Field')
plt.xlabel('SFR')
plt.legend(loc = 'upper left')
plt.ylim(-.05,1.05)
# compare SFR
print 'comparing SFR of cluster and field galaxies'
ks(s.SFR_BEST[memb],s.SFR_BEST[field])
Out[145]:
In [146]:
plt.figure()
t = plt.hist(s.ur[memb],bins=sum(memb),cumulative=True,normed=True,color='k',histtype='step',label='Cluster')
#print sum(memb),sum(field),s.SFR_BEST[memb]
t = plt.hist(s.ur[field],bins=sum(field),cumulative=True,normed=True,color='0.5',histtype='step',label='Field')
plt.xlabel('u-r')
plt.legend(loc = 'upper left')
plt.ylim(-.05,1.05)
print 'comparing u-r color of cluster and field galaxies'
ks(s.ur[memb],s.ur[field])
Out[146]:
In [19]:
plt.figure()
flag = s.sampleflag
sp = plt.scatter(s.s.SIGMA_5[flag],s.sizeratio[flag],c=s.s.B_T_r[flag],s=60,vmin=0,vmax=.4)
plt.colorbar()
plt.gca().set_xscale('log')
t = spearman_with_errors(s.s.SIGMA_5[flag],s.sizeratio[flag],s.sizeratioERR[flag])
In [ ]:
s.plotsizesb()
At first glance this seems disconcerting, but the two variables are correlated by construction because I use Re(24) to calculate the surface brightness, so a larger Re(24) necessarily gives a fainter surface brightness.
R partial correlation analysis shows some very strong correlations between B/A and parameters I would expect it to be correlated with:
stellar mass
SFR
surface density
NUV-24 color (this one is not surprising)
In [194]:
# here is min mass = 7.
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 7.
makemoneyplots()
In [9]:
figure()
plot(s.s.fcaxisratio1[s.sampleflag & ~s.agnflag],s.s.SIZE_RATIO[s.sampleflag & ~s.agnflag],'bo')
xlabel('B/A')
ylabel('Re(24)/Re(r)')
spearman(s.s.fcaxisratio1[s.sampleflag & ~s.agnflag],s.s.SIZE_RATIO[s.sampleflag & ~s.agnflag])
Out[9]:
In [ ]:
s.s.columns
In [ ]:
figure()
plot(s.s.fcaxisratio1[s.sampleflag & ~s.agnflag],s.s.FLUX24[s.sampleflag & ~s.agnflag],'bo')
xlabel('B/A')
ylabel('FLUX24')
spearman(s.s.fcaxisratio1[s.sampleflag & ~s.agnflag],s.s.FLUX24[s.sampleflag & ~s.agnflag])
In [ ]:
figure()
plot(s.s.fcaxisratio1[s.sampleflag & ~s.agnflag],s.NUVr[s.sampleflag & ~s.agnflag],'bo')
xlabel('B/A')
ylabel('NUV-r')
spearman(s.s.fcaxisratio1[s.sampleflag & ~s.agnflag],s.NUVr[s.sampleflag & ~s.agnflag])
I make the claim that the MIPS data have much lower resolution and S/N than is typically used with galfit, but Greg said no.
The median redshift of my sample is $z = 0.03$. At this distance the scale is $0.604~{\rm kpc/arcsec}$. The MIPS pixel size is 2.45 arcsec, so this gives 1.48 kpc per resolution element.
The Hubble pixel scale is 0.1 arcsec/pixel. Therefore, to get 1.48 kpc per resolution element we would need a redshift where the angular scale is $\sim$14.8 kpc/arcsec. Using Ned Wright's cosmology calculator, the scale never gets that high: it peaks around $z = 1.6$ at 8.615 kpc/arcsec.
So it looks like my statement is correct.
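A quick check of the plate-scale numbers above (a sketch assuming astropy is available and a flat H0 = 70, Om0 = 0.3 cosmology, so the values differ slightly from the pure Hubble-law estimate):
from astropy.cosmology import FlatLambdaCDM
import astropy.units as u
cosmo = FlatLambdaCDM(H0=70, Om0=0.3)
scale = cosmo.kpc_proper_per_arcmin(0.03).to(u.kpc/u.arcsec)
print 'scale at z=0.03 =', scale   # ~0.60 kpc/arcsec
print 'MIPS pixel =', 2.45*scale   # ~1.48 kpc
print 'scale at z=1.6 =', cosmo.kpc_proper_per_arcmin(1.6).to(u.kpc/u.arcsec)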
In [21]:
s.printsize()
In [29]:
%run ~/Dropbox/pythonCode/LCSanalyzeblue.py --minmass 7.
In [31]:
flag = s.sampleflag & (s.sizeratio < .7) & ~s.membflag & (s.s.DELTA_V > 3.)
print sum(flag)
print s.s.NSAID[flag]
Out[31]:
In [34]:
flag = s.sampleflag & (s.sizeratio < .7) & ~s.membflag & (s.s.DELTA_V < 3.) & (s.s.DR_R200 > 2.)
print sum(flag)
print s.s.NSAID[flag]