Outline

GC per Galaxy

Harris Data Inspection

clean data
- add iMType
- exclude 0: Milky Way Galaxy & 356: A1689-BCG with NaN VMag
- remove duplicate NGC4417(228=NaN), select better result for VCC-1386 (273=NaN)
missing data
- VMag -> KMag: linear Regression for low Magnitude galaxies

* VMag == null: 1 MWG
* KMag == null: 77
* -Ngc == 0: 6
* sigma == null: 148
* Reff == null: 79
* -logMd == null: 166
* -logMGC == null: 1
* logMB == null: 357

explore data
- MType correlation: sigma, KMag, Reff...
- Ngc correlation: KMag, VMag, Reff...
- pairplot for different iMType: E & S0 have bimode
  - Harris fig 10 & MType not agree; classification statistics not agree
- skymap

GWGC Data Inspection

clean data
- duplicate are not affecting
missing VMag
- cross check Galaxy with Harris
- kde fitting with B-Mag or I-Mag: b_max=-19.6276150479/I_max=-21.0082876063 while v_max=-20.5158289467

Ngc estimation model

Model Selection
- Ngc vs. logMd: jointplot with regression, well fit; however, errorbar very wild
- Sn vs. VMag: also dirty, but doable
LOWESS model
- prediction plot of Sn vs VMag, Sn is cut to 0 for VMag notin GCpG.VMag.range (-11.17, -24.19)
- total amount of GC: 649111
- skymap of Galaxy with GC
- Ngc vs VMag regression plot with distribution

55 Globular Cluster Data

data explore
- age vs. Z, HBtype, ...
GC age estimator
- random number generator based on kde
- kde plot for estimation and data

Build GC library

relational database?
- VGG DataFrame
- GC@G DataFrame
- BHB@GC DataFrame
bhlib: load all BH info
GClib: manipulate from VGG.Ngc & assign random Model
BHBdata: draw BHB from BHlib, combine host GC & Galaxy info. from GClib

host galaxy info	GC info	BH info
1-5	6-7	8-14
GC, RA, DEC, Dist, VMag	Model, T_GC(GC birth time=13Gyr-Age), [Fe/H, f_b, ...]	T_eject, Type1, Type2, M1, M2, Seperation,Ecc, (BHlib.Model=GClib.Model)

1       2   3    4     5     6      7    8        9      10     11     12     13  14
GALAXY, RA, DEC, DIST, VMag, Model, Age, T_eject, Type1, Type2, Mass1, Mass2, AP, ECC
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.015 14.0 14.0 22.743 7.4624 7.3712 0.1834
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.015 14.0 14.0 23.152 8.137 59.156 0.2299
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 22.379 6.9613 124.3 0.9227
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 10.707 7.7546 10.758 0.0
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 22.894 7.7425 438.42 0.0
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 22.963 7.8389 113.13 0.0
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 23.403 8.6012 48.091 0.6402
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 23.582 8.8515 26.886 0.5168
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 23.77 9.2126 91.431 0.7603
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 25.773 11.163 141.72 0.2939

GC model setting

Harris MW GC property

To improve

use DM halo mass for correlation: SDSS DR7 for galaxy another catalog SDSS
HDF5 to store simulation data for easier visualization: HDF5 for python HDF5 for fortran

To explore

Tidal effect to binary stellar evolution & cross section change for non rigid body tidal disruption
NSC & MBH on Mass-$\sigma$ plot early paper
Massive BH formation: runaway collision channels Gürkan 04 runaway in YSC



In [1]:

    
%pylab inline
import numpy as np
from datetime import datetime
import random
import pandas as pd
import os









    



Populating the interactive namespace from numpy and matplotlib

Harris Catalog: Sn vs VMag with LOWESS model



In [2]:

    
from scipy.interpolate import interp1d
import statsmodels.api as sm

## load Harris catalog
GCpG=pd.read_csv("/Users/domi/Dropbox/Research/Local_universe/data/GCpG.csv") # read csv version data
# GCpG[GCpG.duplicated()]
GCpG.drop_duplicates(['Name'],inplace=True)

## calculate Sn 
Sn=GCpG.Ngc*10**(0.4*(GCpG.VMag+15.0))
sn_fit=GCpG[['VMag']]
sn_fit['Sn']=pd.Series(Sn,index=sn_fit.index)

## LOWESS model to predict value

# introduce some floats in our x-values
x = sn_fit.VMag
y = sn_fit.Sn
# lowess will return our "smoothed" data with a y value for at every x-value
lowess = sm.nonparametric.lowess(y, x, frac=0.36)
# unpack the lowess smoothed points to their values
lowess_x = list(zip(*lowess))[0]
lowess_y = list(zip(*lowess))[1]
# run scipy's interpolation. There is also extrapolation I believe
f = interp1d(lowess_x, lowess_y, bounds_error=False)









    



/Users/domi/anaconda3/envs/py35/lib/python3.5/site-packages/ipykernel/__main__.py:12: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

GW Galaxy catalog with $d<30Mpc$ -> VGG



In [3]:

    
## load Galaxy catalog
Galaxy=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/GWGCCatalog_IV.csv',
                   delim_whitespace=True)

## Convert data with ~
i=0
for column in Galaxy:
    i+=1
    if (i>4 and i<23):
        Galaxy[column]=Galaxy[column].convert_objects(convert_numeric=True)
        
VGG=Galaxy[Galaxy.Dist<30]

## add and convert new VMag for VGG
# VGG['VMag']=pd.Series(VGG.Abs_Mag_B + 4.83 - 5.48,index=VGG.index)
# extra=VGG.Abs_Mag_B.isnull()&VGG.Abs_Mag_I.notnull()
# VGG.VMag[extra]=VGG.Abs_Mag_I[extra] + 4.83 - 4.08









    



/Users/domi/anaconda3/envs/py35/lib/python3.5/site-packages/ipykernel/__main__.py:10: FutureWarning: convert_objects is deprecated.  Use the data-type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.

VGG: convert B/I to VMag based on VGG&Harris



In [4]:

    
import seaborn as sns

# ## use cross checked galaxy for magnitude convertion
# ind_GCpG=GCpG.Name.isin(VGG.Name)
# ind_VGG=VGG.Name.isin(GCpG.Name)

# f_joint, (ax1, ax2, ax3) = subplots(3, sharex=True, sharey=True, figsize=(8,8))
# xlim([-25,-10])
# p_v=sns.kdeplot(GCpG.VMag[ind_GCpG].get_values(),shade=True,ax=ax1,label='VMag in Harris')
# x,y = p_v.get_lines()[0].get_data()
# v_max=x[y.argmax()]
# ax1.vlines(v_max, 0, y.max())

# p_b=sns.kdeplot(VGG.Abs_Mag_B[ind_VGG].get_values(),shade=True,ax=ax2,label='BMag in White')
# x,y = p_b.get_lines()[0].get_data()
# b_max=x[y.argmax()]
# ax2.vlines(b_max, 0, y.max())

# p_I=sns.kdeplot(VGG.Abs_Mag_I[ind_VGG].get_values(),shade=True,ax=ax3,label='IMag in White')
# x,y = p_I.get_lines()[0].get_data()
# I_max=x[y.argmax()]
# ax3.vlines(I_max, 0, y.max())

## add and convert new VMag for VGG: VMag = V_max + B/I - B/I_max
VGG['VMag']=pd.Series(VGG.Abs_Mag_B -20.515828946699045 + 19.627615047946385,index=VGG.index)
extra=VGG.Abs_Mag_B.isnull()&VGG.Abs_Mag_I.notnull()
VGG.VMag[extra]=VGG.Abs_Mag_I[extra] -20.515828946699045 + 21.008287606264027









    



/Users/domi/anaconda3/envs/py35/lib/python3.5/site-packages/ipykernel/__main__.py:25: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
/Users/domi/anaconda3/envs/py35/lib/python3.5/site-packages/ipykernel/__main__.py:27: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
/Users/domi/anaconda3/envs/py35/lib/python3.5/site-packages/pandas/core/generic.py:4702: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._update_inplace(new_data)
/Users/domi/anaconda3/envs/py35/lib/python3.5/site-packages/IPython/core/interactiveshell.py:2881: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  exec(code_obj, self.user_global_ns, self.user_ns)

Compute Ngc for VGG



In [5]:

    
## calculate Ngc based on Sn from f(VMag)
VGG['Ngc']=pd.Series((map(lambda x: 0 if isnan(f(x)) else int(f(x)/10**(0.4*(x+15))),VGG.VMag)),index=VGG.index)
# sum(VGG.Ngc)









    



/Users/domi/anaconda3/envs/py35/lib/python3.5/site-packages/ipykernel/__main__.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  from ipykernel import kernelapp as app

Estimate GC age based on 55 MWGCs

Kernel Density Estimation: Predict KDE & Generate Data

Simulate from Kernel Density Estimate (empirical PDF)



In [6]:

    
GCage=pd.read_csv("/Users/domi/Dropbox/Research/Local_universe/data/55GC_age.csv")



In [7]:

    
def generate_rand_from_pdf(pdf, x_grid):
    cdf = np.cumsum(pdf) # pdf from kde plot x128
    cdf = cdf / cdf[-1]  # normalization
    values = np.random.rand(sum(VGG.Ngc)) # sample size Ngc
    value_bins = np.searchsorted(cdf, values) # group Ngc into nearest 128 bins
    random_from_cdf = x_grid[value_bins]
    return random_from_cdf  # return Ngc



In [8]:

    
age_kde=sns.kdeplot(GCage.Age,kernel='gau',bw='silverman')
#x_grid = np.linspace(min(GCage.Age)-1, max(GCage.Age)+1, len(age_kde.get_lines()[0].get_data()[1])) 
age_curve=age_kde.get_lines()[0].get_data()
f2=interp1d(age_curve[0],age_curve[1],kind='cubic')



In [9]:

    
age_grid=np.linspace(min(age_curve[0]), max(age_curve[0]), 10000)



In [10]:

    
## original method, age not well spreaded. 

# x_grid = np.linspace(min(GCage.Age)-1, max(GCage.Age)+1, len(age_kde.get_lines()[0].get_data()[1])) 
# # define how many GC need to estimate the age
# GCage_from_kde = generate_rand_from_pdf(age_kde.get_lines()[0].get_data()[1], x_grid) 
# sns.distplot(GCage_from_kde)



In [11]:

    
## Age better spreaded
GCage_from_kde = generate_rand_from_pdf(f2(age_grid), age_grid) 
# sns.distplot(GCage_from_kde)



In [12]:

    
# cut off at 13.5 Gyrs
GCage_from_kde[GCage_from_kde>=13.5]=np.random.choice(GCage_from_kde[GCage_from_kde<13.5],sum(GCage_from_kde>=13.5))



In [13]:

    
# numpy.savetxt("/Users/domi/Dropbox/Research/Local_universe/data/GCage.dat", GCage_from_kde, delimiter=",")



In [14]:

    
# GCage_from_kde=numpy.loadtxt("/Users/domi/Dropbox/Research/Local_universe/data/GCage.dat",delimiter=",")

Multiple realization

load bhblib
load GClib
build BHBdata

Build BH LIB

Random samples in Pandas



In [15]:

    
## Load BHLIB
bhlib=pd.DataFrame()
###################################
for i in range(1,325): # model
    for j in range(1,11):  # model id
###################################
        bhe=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/BHsystem/%d-%d-bhe.dat' %(i,j),
                    usecols=[0, 2, 3, 4, 6, 8, 10, 20], names=['T_eject','Type1','Type2','M1','M2','Seperation','Ecc','Model'], 
                    header=None, delim_whitespace=True)
        bhe.Model='%d-%d' %(i,j)
        bhlib=pd.concat([bhlib,bhe],ignore_index=False)



In [16]:

    
# BHB binary
bhblib=bhlib[bhlib.Type1==14*(bhlib.Type2==14)].copy(deep=True)
bhsys=bhlib[-(bhlib.Type1==14*(bhlib.Type2==14))].copy(deep=True)



In [17]:

    
bhblib=bhblib.drop(bhblib.columns[[1, 2]], axis=1)



In [18]:

    
bhblib.to_csv('/Users/domi/Dropbox/Research/Local_universe/data/bhblib.dat',
              index=True, sep=' ', header=True, float_format='%2.6f')



In [19]:

    
# bhblib=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/bhblib.dat',
#               sep=' ', index_col=0)



In [20]:

    
print(shape(bhsys),shape(bhblib),shape(bhlib))









    



(11095, 8) (87364, 6) (98459, 8)

bhlib: T_eject Type1 Type2 M1 M2 Seperation Ecc Model

Build GClib, assign GC Model to each GC in VGG Galaxy, add Age to GC Model

Slow, only run it when necessary



In [23]:

    
# # index of the Galaxy with GC
# ind_GC=VGG.index[VGG.Ngc>0]
# # initialize the GClib
# GClib=pd.DataFrame()

# for row in ind_GC:
#     GClib=GClib.append([VGG.ix[[row],['RA','Dec','Dist','VMag']]]*VGG.Ngc[row]) # creat GCs in each Galaxy

# GClib=GClib.reset_index()
# GClib=GClib.rename(columns = {'index':'Galaxy'})

# # write to file
# GClib.to_csv('/Users/domi/Dropbox/Research/Local_universe/data/GClib.dat',
#              index=False,sep=' ', header=False, float_format='%2.6f')



In [21]:

    
GClib=pd.DataFrame()
GClib=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/GClib.dat',
                sep=' ',header=None, names=['Galaxy','RA','Dec','Dist','VMag','Model','Age'])

# assign model to GClib
GClib['Model']=GClib['RA'].apply(lambda x: str(randint(1,325))+'-'+str(randint(1,11)))
# assign Age to GClib
GClib['Age']=np.random.choice(GCage_from_kde,size(GCage_from_kde))

GClib: Galaxy RA Dec Dist VMag Model Age

Galaxy is the index of GC in VGG



In [23]:

    
# to check pd.merge correctly assign the right BH infor from bhlib to GClib based on Model
display(GClib.ix[[133],GClib.columns],pd.merge(GClib.ix[[133],GClib.columns],bhblib,on='Model'),bhblib[bhblib.Model=='64-9'])



In [24]:

    
# with pd.option_context('display.max_columns', None):
#     display(bhlib.sample(5),VGG.sample(5))

Build BHBdata and write to file



In [25]:

    
# ## Slow
# BHBdata=pd.merge(GClib,bhblib,on='Model')
# BHBdata.to_csv('/Users/domi/Dropbox/Research/Local_universe/data/BHBdata.dat.gz',
#                index=False,sep=' ', header=False, float_format='%2.6f',compression='gzip')



In [ ]:



In [ ]:



In [23]:

    
from gcpg import build



In [45]:

    
reload(build)









    Out[45]:





<module 'gcpg.build' from 'gcpg/build.py'>



In [24]:

    
build.run(1)









    Out[24]:





1



In [9]:

    
from astropy.table import Table



In [30]:

    
help(Table.read)









    



Help on method read in module astropy.table.table:

read(cls, *args, **kwargs) method of __builtin__.type instance
    Read and parse a data table and return as a Table.
    
    This function provides the Table interface to the astropy unified I/O
    layer.  This allows easily reading a file in many supported data formats
    using syntax such as::
    
      >>> from astropy.table import Table
      >>> dat = Table.read('table.dat', format='ascii')
      >>> events = Table.read('events.fits', format='fits')
    
    The arguments and keywords (other than ``format``) provided to this function are
    passed through to the underlying data reader (e.g. `~astropy.io.ascii.read`).
    
    The available built-in formats are:
    
    =========================== ==== ===== ============= ==========
               Format           Read Write Auto-identify Deprecated
    =========================== ==== ===== ============= ==========
                          ascii  Yes   Yes            No           
                   ascii.aastex  Yes   Yes            No           
                    ascii.basic  Yes   Yes            No           
                      ascii.cds  Yes    No            No           
         ascii.commented_header  Yes   Yes            No           
                      ascii.csv  Yes   Yes            No           
                  ascii.daophot  Yes    No            No           
                     ascii.ecsv  Yes   Yes            No           
               ascii.fast_basic  Yes   Yes            No           
    ascii.fast_commented_header  Yes   Yes            No           
                 ascii.fast_csv  Yes   Yes            No           
           ascii.fast_no_header  Yes   Yes            No           
                 ascii.fast_rdb  Yes   Yes            No           
                 ascii.fast_tab  Yes   Yes            No           
              ascii.fixed_width  Yes   Yes            No           
    ascii.fixed_width_no_header  Yes   Yes            No           
     ascii.fixed_width_two_line  Yes   Yes            No           
                     ascii.html  Yes   Yes           Yes           
                     ascii.ipac  Yes   Yes            No           
                    ascii.latex  Yes   Yes           Yes           
                ascii.no_header  Yes   Yes            No           
                      ascii.rdb  Yes   Yes           Yes           
               ascii.sextractor  Yes    No            No           
                      ascii.tab  Yes   Yes            No           
                           fits  Yes   Yes           Yes           
                           hdf5  Yes   Yes           Yes           
                        votable  Yes    No            No           
                         aastex  Yes   Yes            No        Yes
                            cds  Yes    No            No        Yes
                            csv  Yes   Yes           Yes        Yes
                        daophot  Yes    No            No        Yes
                           html  Yes   Yes            No        Yes
                           ipac  Yes   Yes            No        Yes
                          latex  Yes   Yes            No        Yes
                            rdb  Yes   Yes            No        Yes
    =========================== ==== ===== ============= ==========
    
    Deprecated format names like ``aastex`` will be removed in a future version.
    Use the full name (e.g. ``ascii.aastex``) instead.



In [17]:

    
from astropy.time import Time



In [18]:

    
Time('2015-11-2').gps









    Out[18]:





1130457617.0



In [ ]:



In [ ]:

	Galaxy	RA	Dec	Dist	VMag	Model	Age	T_eject	M1	M2	Seperation	Ecc
0	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.1401	24.971	11.4590	15.927	0.6669
1	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.2287	37.825	56.7130	64.290	0.1099
2	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.3125	14.788	29.0980	85.299	0.9994
3	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.3168	27.653	21.9470	228.440	0.6048
4	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.3168	27.984	22.5070	123.260	0.8348
5	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.3170	26.214	29.0010	232.590	0.8620
6	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.4201	81.988	56.9500	42.399	0.4616
7	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.4613	18.768	25.1270	38229.000	0.4662
8	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	0.4733	26.200	15.3020	107.360	0.5582
9	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	1.2949	29.481	31.1780	58.136	0.9673
10	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	1.3678	48.780	9.2538	81.883	0.5036
11	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	1.5758	27.536	28.2480	25.898	0.9574
12	133	0.05414	16.14555	13.183	-20.568214	64-9	11.805858	2.0091	21.239	26.4280	110.190	0.2828

	T_eject	M1	M2	Seperation	Ecc	Model
0	0.1401	24.971	11.4590	15.927	0.6669	64-9
1	0.2287	37.825	56.7130	64.290	0.1099	64-9
2	0.3125	14.788	29.0980	85.299	0.9994	64-9
3	0.3168	27.653	21.9470	228.440	0.6048	64-9
4	0.3168	27.984	22.5070	123.260	0.8348	64-9
5	0.3170	26.214	29.0010	232.590	0.8620	64-9
6	0.4201	81.988	56.9500	42.399	0.4616	64-9
7	0.4613	18.768	25.1270	38229.000	0.4662	64-9
8	0.4733	26.200	15.3020	107.360	0.5582	64-9
9	1.2949	29.481	31.1780	58.136	0.9673	64-9
10	1.3678	48.780	9.2538	81.883	0.5036	64-9
11	1.5758	27.536	28.2480	25.898	0.9574	64-9
14	2.0091	21.239	26.4280	110.190	0.2828	64-9