0: Milky Way Galaxy & 356: A1689-BCG with NaN VMag
NGC4417 (228=NaN), select better result for VCC-1386 (273=NaN)
* VMag == null: 1 MWG
* KMag == null: 77
* -Ngc == 0: 6
* sigma == null: 148
* Reff == null: 79
* -logMd == null: 166
* -logMGC == null: 1
* logMB == null: 357
relational database?
bhlib: load all BH info
GClib: manipulate from VGG.Ngc & assign random Model
BHBdata: draw BHB from BHlib, combine host GC & Galaxy info. from GClib
| host galaxy info | GC info | BH info | 
|---|---|---|
| 1-5 | 6-7 | 8-14 | 
| GC, RA, DEC, Dist, VMag | Model, T_GC (GC birth time = 13 Gyr - Age), [Fe/H, f_b, ...] | T_eject, Type1, Type2, M1, M2, Seperation, Ecc, (BHlib.Model = GClib.Model) | 
1       2   3    4     5     6      7    8        9      10     11     12     13  14
GALAXY, RA, DEC, DIST, VMag, Model, Age, T_eject, Type1, Type2, Mass1, Mass2, AP, ECC
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.015 14.0 14.0 22.743 7.4624 7.3712 0.1834
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.015 14.0 14.0 23.152 8.137 59.156 0.2299
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 22.379 6.9613 124.3 0.9227
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 10.707 7.7546 10.758 0.0
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 22.894 7.7425 438.42 0.0
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 22.963 7.8389 113.13 0.0
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 23.403 8.6012 48.091 0.6402
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 23.582 8.8515 26.886 0.5168
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 23.77 9.2126 91.431 0.7603
25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 0.0301 14.0 14.0 25.773 11.163 141.72 0.2939
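The 14-column layout above can be read back with explicit column names. This is a minimal sketch, assuming the file is whitespace-separated with no header row, as the sample rows suggest. The two rows are copied from the listing above, and `COLUMNS` is a name introduced here only for illustration.

```python
import io
import pandas as pd

# Column names follow the header line above; COLUMNS is an illustrative name.
COLUMNS = ["GALAXY", "RA", "DEC", "DIST", "VMag", "Model", "Age",
           "T_eject", "Type1", "Type2", "Mass1", "Mass2", "AP", "ECC"]

# Two sample rows copied from the listing above.
sample = io.StringIO(
    "25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 "
    "0.015 14.0 14.0 22.743 7.4624 7.3712 0.1834\n"
    "25 0.00792 17.22027 16.222 -16.7982138988 207-5 11.1515151515 "
    "0.015 14.0 14.0 23.152 8.137 59.156 0.2299\n"
)

bhb = pd.read_csv(sample, sep=r"\s+", header=None, names=COLUMNS)

# Columns 1-5 describe the host galaxy, 6-7 the GC, 8-14 the binary.
galaxy_info = bhb.iloc[:, 0:5]
gc_info = bhb.iloc[:, 5:7]
bh_info = bhb.iloc[:, 7:14]
```

The `Model` column stays a string (for example `207-5`), which is what a later merge on `Model` relies on.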
In [1]:
    
%pylab inline
import numpy as np
from datetime import datetime
import random
import pandas as pd
import os
    
    
In [2]:
    
from scipy.interpolate import interp1d
import statsmodels.api as sm
## load Harris catalog
GCpG=pd.read_csv("/Users/domi/Dropbox/Research/Local_universe/data/GCpG.csv") # read csv version data
# GCpG[GCpG.duplicated()]
GCpG.drop_duplicates(['Name'],inplace=True)
## calculate Sn 
Sn=GCpG.Ngc*10**(0.4*(GCpG.VMag+15.0))
sn_fit=GCpG[['VMag']].copy()  # copy to avoid writing into a view of GCpG
sn_fit['Sn']=pd.Series(Sn,index=sn_fit.index)
## LOWESS model to predict Sn from VMag
x = sn_fit.VMag
y = sn_fit.Sn
# lowess returns the smoothed data as (x, y) pairs sorted by x, with one y value per x value
lowess = sm.nonparametric.lowess(y, x, frac=0.36)
# unpack the lowess smoothed points into their x and y values
lowess_x = lowess[:, 0]
lowess_y = lowess[:, 1]
# interpolate the smoothed curve with scipy; with bounds_error=False, x values outside the fitted range return NaN
f = interp1d(lowess_x, lowess_y, bounds_error=False)
    
    
In [3]:
    
## load Galaxy catalog
Galaxy=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/GWGCCatalog_IV.csv',
                   delim_whitespace=True)
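## Convert columns 5-22 (which may contain "~" placeholders) to numeric; unparseable entries become NaN
for column in Galaxy.columns[4:22]:
    Galaxy[column]=pd.to_numeric(Galaxy[column],errors='coerce')

VGG=Galaxy[Galaxy.Dist<30].copy()  # copy so that new columns can be added safely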
## add and convert new VMag for VGG
# VGG['VMag']=pd.Series(VGG.Abs_Mag_B + 4.83 - 5.48,index=VGG.index)
# extra=VGG.Abs_Mag_B.isnull()&VGG.Abs_Mag_I.notnull()
# VGG.VMag[extra]=VGG.Abs_Mag_I[extra] + 4.83 - 4.08
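The numeric conversion in this cell turns non-numeric placeholders into NaN. A small sketch of that coercion with `pd.to_numeric`, using a made-up column (the values are not from the catalog):

```python
import pandas as pd

# Toy column mixing numbers with a "~" placeholder (values are made up).
raw = pd.Series(["12.5", "~", "-3", "0.07"])

# errors="coerce" turns anything that cannot be parsed into NaN.
clean = pd.to_numeric(raw, errors="coerce")
```

After the conversion, comparisons such as `Dist < 30` simply evaluate to `False` for the NaN entries, so rows with missing distances drop out of the selection.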
    
    
In [4]:
    
import seaborn as sns
# ## use cross-checked galaxies for magnitude conversion
# ind_GCpG=GCpG.Name.isin(VGG.Name)
# ind_VGG=VGG.Name.isin(GCpG.Name)
# f_joint, (ax1, ax2, ax3) = subplots(3, sharex=True, sharey=True, figsize=(8,8))
# xlim([-25,-10])
# p_v=sns.kdeplot(GCpG.VMag[ind_GCpG].get_values(),shade=True,ax=ax1,label='VMag in Harris')
# x,y = p_v.get_lines()[0].get_data()
# v_max=x[y.argmax()]
# ax1.vlines(v_max, 0, y.max())
# p_b=sns.kdeplot(VGG.Abs_Mag_B[ind_VGG].get_values(),shade=True,ax=ax2,label='BMag in White')
# x,y = p_b.get_lines()[0].get_data()
# b_max=x[y.argmax()]
# ax2.vlines(b_max, 0, y.max())
# p_I=sns.kdeplot(VGG.Abs_Mag_I[ind_VGG].get_values(),shade=True,ax=ax3,label='IMag in White')
# x,y = p_I.get_lines()[0].get_data()
# I_max=x[y.argmax()]
# ax3.vlines(I_max, 0, y.max())
## add and convert new VMag for VGG: VMag = V_max + (B or I) - (B_max or I_max)
VGG['VMag']=VGG.Abs_Mag_B -20.515828946699045 + 19.627615047946385
extra=VGG.Abs_Mag_B.isnull()&VGG.Abs_Mag_I.notnull()
# use .loc instead of chained indexing so the assignment always reaches VGG
VGG.loc[extra,'VMag']=VGG.Abs_Mag_I[extra] -20.515828946699045 + 21.008287606264027
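The two offsets in this cell have the form `V_max + (mag - band_max)`. The sketch below writes that out as a function. It assumes the three constants are the distribution peaks found by the commented-out KDE code, which is inferred from the cell rather than stated in it. The names `V_MAX`, `B_MAX`, `I_MAX` and `vmag_from_band` are introduced here for illustration.

```python
# Constants copied from the cell above; their interpretation as KDE peaks is an assumption.
V_MAX = -20.515828946699045
B_MAX = -19.627615047946385
I_MAX = -21.008287606264027

def vmag_from_band(mag, band_max, v_max=V_MAX):
    """Shift a B- or I-band magnitude by the offset between distribution peaks."""
    return v_max + (mag - band_max)

# A galaxy sitting exactly at the B-band peak maps onto the V-band peak.
v_from_b = vmag_from_band(B_MAX, B_MAX)

# -18.0 is a made-up I-band magnitude, used only to exercise the function.
v_from_i = vmag_from_band(-18.0, I_MAX)
```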
    
    
In [5]:
    
## calculate Ngc based on Sn from f(VMag)
VGG['Ngc']=pd.Series([0 if np.isnan(f(x)) else int(f(x)/10**(0.4*(x+15))) for x in VGG.VMag],index=VGG.index)
# sum(VGG.Ngc)
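The Ngc estimate inverts the specific-frequency definition used earlier, `Sn = Ngc * 10**(0.4*(VMag+15))`. A minimal round-trip sketch with made-up numbers; the function names are illustrative and not part of the notebook:

```python
def specific_frequency(ngc, vmag):
    """S_N = N_GC * 10**(0.4 * (M_V + 15))."""
    return ngc * 10 ** (0.4 * (vmag + 15.0))

def ngc_from_sn(sn, vmag):
    """Invert the definition and truncate to a whole number of clusters."""
    return int(sn / 10 ** (0.4 * (vmag + 15.0)))

# Made-up values for illustration only: 500 clusters at M_V = -20.
vmag = -20.0
sn = specific_frequency(500, vmag)
ngc = ngc_from_sn(sn, vmag)
```

Because `int()` truncates toward zero, a predicted count just below a whole number is rounded down, so faint galaxies with a predicted count below one end up with `Ngc = 0`.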
    
    
In [6]:
    
GCage=pd.read_csv("/Users/domi/Dropbox/Research/Local_universe/data/55GC_age.csv")
    
In [7]:
    
def generate_rand_from_pdf(pdf, x_grid):
    # inverse-CDF sampling: draw one value per GC from x_grid, weighted by pdf
    cdf = np.cumsum(pdf) # cumulative sum of the pdf evaluated on x_grid
    cdf = cdf / cdf[-1]  # normalization
    values = np.random.rand(sum(VGG.Ngc)) # sample size = total Ngc (uses the global VGG)
    value_bins = np.searchsorted(cdf, values) # map each uniform draw to a bin of x_grid
    random_from_cdf = x_grid[value_bins]
    return random_from_cdf  # one sampled value per GC
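The function above is an instance of inverse-CDF sampling on a grid. The sketch below shows the same idea in a self-contained form, with the sample size passed as an argument instead of being read from the global `VGG`. The toy Gaussian density, the grid limits and the name `sample_from_pdf` are assumptions made for illustration.

```python
import numpy as np

def sample_from_pdf(pdf, x_grid, n, rng=None):
    """Draw n values from x_grid with probability proportional to pdf."""
    rng = np.random.default_rng() if rng is None else rng
    cdf = np.cumsum(pdf)
    cdf = cdf / cdf[-1]                  # normalize so the last entry is 1
    u = rng.random(n)                    # uniform draws in [0, 1)
    idx = np.searchsorted(cdf, u)        # first grid bin whose cdf >= u
    return x_grid[idx]

# Toy density: a Gaussian bump centred at 11 on a grid from 8 to 14.
x_grid = np.linspace(8.0, 14.0, 1000)
pdf = np.exp(-0.5 * ((x_grid - 11.0) / 0.8) ** 2)
samples = sample_from_pdf(pdf, x_grid, n=5000, rng=np.random.default_rng(0))
```

A finer `x_grid` gives a smoother set of sampled values, since every draw is snapped to a grid point.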
    
In [8]:
    
age_kde=sns.kdeplot(GCage.Age,kernel='gau',bw='silverman')
#x_grid = np.linspace(min(GCage.Age)-1, max(GCage.Age)+1, len(age_kde.get_lines()[0].get_data()[1])) 
age_curve=age_kde.get_lines()[0].get_data()
f2=interp1d(age_curve[0],age_curve[1],kind='cubic')
    
    
In [9]:
    
age_grid=np.linspace(min(age_curve[0]), max(age_curve[0]), 10000)
    
In [10]:
    
## original method: sampled ages are not well spread
# x_grid = np.linspace(min(GCage.Age)-1, max(GCage.Age)+1, len(age_kde.get_lines()[0].get_data()[1])) 
# # define how many GC need to estimate the age
# GCage_from_kde = generate_rand_from_pdf(age_kde.get_lines()[0].get_data()[1], x_grid) 
# sns.distplot(GCage_from_kde)
    
In [11]:
    
## ages better spread (finer grid from the interpolated KDE curve)
GCage_from_kde = generate_rand_from_pdf(f2(age_grid), age_grid) 
# sns.distplot(GCage_from_kde)
    
In [12]:
    
# cut off at 13.5 Gyrs
GCage_from_kde[GCage_from_kde>=13.5]=np.random.choice(GCage_from_kde[GCage_from_kde<13.5],sum(GCage_from_kde>=13.5))
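The cutoff replaces every age at or above 13.5 Gyr with a random draw from the ages below it, so the array length is unchanged. A toy sketch of that step; the normal distribution here is a stand-in and not the notebook's KDE sample:

```python
import numpy as np

rng = np.random.default_rng(1)
ages = rng.normal(12.0, 1.0, size=2000)   # toy ages in Gyr, not real data

cutoff = 13.5
too_old = ages >= cutoff

# Replace each over-age value with a random draw from the accepted ages.
ages[too_old] = rng.choice(ages[~too_old], size=too_old.sum())
```

Resampling keeps the shape of the distribution below the cutoff, whereas clipping would pile all rejected values up at exactly 13.5.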
    
In [13]:
    
# numpy.savetxt("/Users/domi/Dropbox/Research/Local_universe/data/GCage.dat", GCage_from_kde, delimiter=",")
    
In [14]:
    
# GCage_from_kde=numpy.loadtxt("/Users/domi/Dropbox/Research/Local_universe/data/GCage.dat",delimiter=",")
    
In [15]:
    
## Load BHLIB
frames=[]
###################################
for i in range(1,325): # model
    for j in range(1,11):  # model id
###################################
        bhe=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/BHsystem/%d-%d-bhe.dat' %(i,j),
                    usecols=[0, 2, 3, 4, 6, 8, 10, 20], names=['T_eject','Type1','Type2','M1','M2','Seperation','Ecc','Model'], 
                    header=None, delim_whitespace=True)
        bhe['Model']='%d-%d' %(i,j)  # label every row with its "model-id" string
        frames.append(bhe)
# concatenate once instead of growing bhlib inside the loop
bhlib=pd.concat(frames,ignore_index=False)
    
In [16]:
    
# BHB binary: both components have type 14
is_bhb=(bhlib.Type1==14)&(bhlib.Type2==14)
bhblib=bhlib[is_bhb].copy(deep=True)
# all other systems (~ negates a boolean mask)
bhsys=bhlib[~is_bhb].copy(deep=True)
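Splitting the library into binaries with two type-14 components and everything else can be checked on a toy table. This is a sketch with made-up type codes and masses; `toy`, `both_bh`, `binaries` and `others` are illustrative names.

```python
import pandas as pd

# Toy table; the type codes and masses are made up for illustration.
toy = pd.DataFrame({
    "Type1": [14, 14, 13, 0],
    "Type2": [14, 13, 14, 1],
    "M1":    [22.7, 10.1, 1.4, 0.5],
})

both_bh = (toy.Type1 == 14) & (toy.Type2 == 14)
binaries = toy[both_bh].copy()    # both components are type 14
others = toy[~both_bh].copy()     # everything else

# For comparison: the multiplication form also matches rows with Type1 == 0.
mult_form = toy.Type1 == 14 * (toy.Type2 == 14)
```

An expression such as `Type1 == 14*(Type2 == 14)` is true for the last toy row as well, because both sides evaluate to 0 there, so the explicit `&` mask is the safer way to write the selection.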
    
In [17]:
    
bhblib=bhblib.drop(bhblib.columns[[1, 2]], axis=1)
    
In [18]:
    
bhblib.to_csv('/Users/domi/Dropbox/Research/Local_universe/data/bhblib.dat',
              index=True, sep=' ', header=True, float_format='%2.6f')
    
In [19]:
    
# bhblib=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/bhblib.dat',
#               sep=' ', index_col=0)
    
In [20]:
    
print(shape(bhsys),shape(bhblib),shape(bhlib))
    
    
bhlib: T_eject Type1   Type2   M1  M2  Seperation  Ecc Model
In [23]:
    
# # index of the Galaxy with GC
# ind_GC=VGG.index[VGG.Ngc>0]
# # initialize the GClib
# GClib=pd.DataFrame()
# for row in ind_GC:
#     GClib=GClib.append([VGG.ix[[row],['RA','Dec','Dist','VMag']]]*VGG.Ngc[row]) # create GCs in each Galaxy
# GClib=GClib.reset_index()
# GClib=GClib.rename(columns = {'index':'Galaxy'})
# # write to file
# GClib.to_csv('/Users/domi/Dropbox/Research/Local_universe/data/GClib.dat',
#              index=False,sep=' ', header=False, float_format='%2.6f')
    
In [21]:
    
GClib=pd.read_csv('/Users/domi/Dropbox/Research/Local_universe/data/GClib.dat',
                sep=' ',header=None, names=['Galaxy','RA','Dec','Dist','VMag','Model','Age'])
# assign a random model label to each GC (np.random.randint excludes the upper bound: models 1-324, ids 1-10)
GClib['Model']=GClib['RA'].apply(lambda x: str(np.random.randint(1,325))+'-'+str(np.random.randint(1,11)))
# assign Age to GClib
GClib['Age']=np.random.choice(GCage_from_kde,np.size(GCage_from_kde))
    
GClib: Galaxy  RA  Dec Dist    VMag    Model   Age
Galaxy is the VGG index of each GC's host galaxy
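Each GC receives a random `model-id` label that matches the `%d-%d-bhe.dat` file naming. A vectorized sketch of that assignment, assuming the same ranges as the loading loop (models 1 to 324, ids 1 to 10); `n_gc` and the seed are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_gc = 1000                       # toy number of clusters

# Upper bounds are exclusive: models 1..324, ids 1..10.
model = rng.integers(1, 325, size=n_gc)
model_id = rng.integers(1, 11, size=n_gc)

# Build the "model-id" strings used as the join key.
labels = [f"{m}-{k}" for m, k in zip(model, model_id)]
```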
In [23]:
    
# check that pd.merge assigns the right BH info from bhblib to GClib based on Model
display(GClib.loc[[133]],pd.merge(GClib.loc[[133]],bhblib,on='Model'),bhblib[bhblib.Model=='64-9'])
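The check above relies on how `pd.merge` behaves when the key repeats on both sides: every GC row is paired with every binary that shares its `Model`. A toy sketch with made-up values:

```python
import pandas as pd

# Toy tables; all values are made up for illustration.
gcs = pd.DataFrame({"Galaxy": [25, 25, 31],
                    "Model": ["64-9", "207-5", "64-9"]})
bhbs = pd.DataFrame({"Model": ["64-9", "64-9", "207-5"],
                     "M1": [22.7, 10.7, 23.1],
                     "M2": [7.5, 7.8, 8.1]})

# Inner join on Model; the key is many-to-many.
merged = pd.merge(gcs, bhbs, on="Model")

# Each GC row is repeated once per binary in its model.
per_gc = merged.groupby(["Galaxy", "Model"]).size()
```

The merged table therefore grows with the number of binaries per model, which is consistent with the full merge being slow on the complete library.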
    
    
    
    
In [24]:
    
# with pd.option_context('display.max_columns', None):
#     display(bhlib.sample(5),VGG.sample(5))
    
In [25]:
    
# ## Slow
# BHBdata=pd.merge(GClib,bhblib,on='Model')
# BHBdata.to_csv('/Users/domi/Dropbox/Research/Local_universe/data/BHBdata.dat.gz',
#                index=False,sep=' ', header=False, float_format='%2.6f',compression='gzip')
    
In [23]:
    
from gcpg import build
    
In [45]:
    
from importlib import reload  # reload is not a builtin in Python 3
reload(build)
    
    Out[45]:
In [24]:
    
build.run(1)
    
    Out[24]:
In [9]:
    
from astropy.table import Table
    
In [30]:
    
help(Table.read)
    
    
In [17]:
    
from astropy.time import Time
    
In [18]:
    
Time('2015-11-2').gps
    
    Out[18]:
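The `.gps` value is the number of seconds since the GPS epoch, 1980-01-06 00:00:00 UTC, including leap seconds. The stdlib sketch below reproduces that count by hand. It assumes the GPS-UTC offset is supplied by the caller (13 s at 2000-01-01, 17 s at 2015-11-02), whereas astropy tracks leap seconds internally. `utc_to_gps_seconds` is an illustrative helper, not an astropy function.

```python
from datetime import datetime

GPS_EPOCH = datetime(1980, 1, 6)

def utc_to_gps_seconds(utc, leap_seconds):
    """Seconds since the GPS epoch, given the GPS-UTC offset at that date."""
    return (utc - GPS_EPOCH).total_seconds() + leap_seconds

# The offsets are passed in by hand; they are assumptions of this sketch.
gps_2000 = utc_to_gps_seconds(datetime(2000, 1, 1), leap_seconds=13)
gps_2015 = utc_to_gps_seconds(datetime(2015, 11, 2), leap_seconds=17)
```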