See Week 8 for a review on Pandas
Today we'll be using a dataset by the Resolve Survey. RESOLVE is a volume-limited census of stellar, gas, and dynamical mass as well as star formation and merging within >50,000 cubic Mpc of the nearby cosmic web, reaching down to the dwarf galaxy regime and up to structures on tens of Mpc scales such as filaments, walls, and voids.
In [1]:
# Importing modules
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
sns.set_context("notebook")
First, we need to read in the data from RESOLVE.
In [2]:
## Reading in data
# Column Names
colnames = ['RA','DEC','CZobs','Mr','HaloID','logMHalo',
'NGalH','FLAG','CZreal','DISTC','Vptot',
'Vptang','Morph','logMstar','rmag','umag',
'FSMGR','GalMatchFlag','ur_col','MhiMass','GroupID',
'GroupNGals','GroupRproj','GroupCZdisp',
'GroupLogMHalo','GroupGalType']
# We use `pd.read_csv` to read CSV files
# The argument in `sep`, i.e. "\s+" tells pandas to read white spaces
RS = pd.read_csv('./data/Resolve_Catalogue.dat',
sep='\s+',
skiprows=2,
names=colnames)
In [3]:
RS.head()
Out[3]:
RA
DEC
CZobs
Mr
HaloID
logMHalo
NGalH
FLAG
CZreal
DISTC
...
FSMGR
GalMatchFlag
ur_col
MhiMass
GroupID
GroupNGals
GroupRproj
GroupCZdisp
GroupLogMHalo
GroupGalType
0
222.361029
13.152031
7462.85561
-21.264
198732
15.141
469
0
9255.587090
1.098726
...
0.200007
ECO
2.091679
2.041738e+09
0
1
0.000000
0.00
12.225
1
1
222.808069
12.397480
7396.39554
-22.864
198732
15.141
469
0
9313.492255
1.300715
...
0.089128
ECO
2.181833
8.912519e+09
1
1
0.000000
0.00
13.172
1
2
222.219307
12.979642
6653.37177
-19.477
198732
15.141
469
0
9293.469960
0.802355
...
0.133072
ECO
2.220791
3.548133e+08
2
18
0.240688
277.11
13.621
0
3
222.153140
13.081357
5945.69782
-20.666
198732
15.141
469
0
9292.222436
1.018345
...
0.372099
ECO
1.339117
8.511383e+09
3
1
0.000000
0.00
11.989
1
4
222.540987
12.821932
6795.38705
-21.867
198732
15.141
469
0
9288.730044
0.404541
...
0.368589
ECO
1.815042
2.137960e+09
4
5
0.154728
235.29
13.138
1
5 rows × 26 columns
In [4]:
RS.shape
Out[4]:
(5080, 26)
Let's show some plots for this dataset
In [5]:
sns.jointplot("logMHalo","logMstar",data=RS, color='green')
Out[5]:
<seaborn.axisgrid.JointGrid at 0x10f1d9690>
In [6]:
g = sns.lmplot(x="logMHalo", y="logMstar", hue="FLAG", data=RS, size=8)
In [7]:
sns.lmplot(x="logMHalo", y="GroupLogMHalo", hue="FLAG", data=RS, size=8)
Out[7]:
<seaborn.axisgrid.FacetGrid at 0x10fff6050>
Now that we have looked at some of the data, calculating some statistics for it. Let's first create a subsample of the dataset
In [31]:
RS1 = RS.loc[ (RS.logMHalo >= 11) & (RS.logMHalo <= 12)]
print(RS1.shape)
# Resetting Indices
RS1.reset_index(inplace=True)
RS1.head()
(2730, 26)
Out[31]:
index
RA
DEC
CZobs
Mr
HaloID
logMHalo
NGalH
FLAG
CZreal
...
FSMGR
GalMatchFlag
ur_col
MhiMass
GroupID
GroupNGals
GroupRproj
GroupCZdisp
GroupLogMHalo
GroupGalType
0
32
134.634782
-0.513661
2560.74945
-19.378
281739
11.606
1
1
2389.978581
...
0.462300
ECO
1.659651
2.570394e+09
11
1
0.000000
0.00
11.512
1
1
36
135.134024
46.477611
3658.44177
-20.172
282776
11.675
1
1
4155.831904
...
1.293450
ECO
1.210722
9.332556e+09
14
1
0.000000
0.00
11.801
1
2
37
135.812900
46.573182
3503.62320
-18.793
282779
11.360
1
1
4126.863319
...
0.752197
ECO
1.395662
2.951213e+09
15
2
0.090258
73.24
12.077
0
3
38
135.791237
46.371980
3608.43558
-20.721
282781
11.892
1
1
4136.444575
...
0.514906
ECO
1.762735
2.454707e+09
15
2
0.090258
73.24
12.077
1
4
39
141.414228
47.734262
3592.98311
-19.252
283291
11.806
3
0
4055.758985
...
1.687002
ECO
1.203329
3.548138e+09
16
3
0.054441
14.31
12.221
0
5 rows × 27 columns
In [54]:
RS1.loc[:,'RArad' ] = RS1.RA.map( lambda x: np.radians(x))
RS1.loc[:,'DECrad'] = RS1.DEC.map(lambda x: np.radians(x))
RS1[['RA','RArad','DEC','DECrad']].head()
Out[54]:
RA
RArad
DEC
DECrad
0
134.634782
2.349820
-0.513661
-0.008965
1
135.134024
2.358534
46.477611
0.811187
2
135.812900
2.370382
46.573182
0.812855
3
135.791237
2.370004
46.371980
0.809344
4
141.414228
2.468144
47.734262
0.833120
In [58]:
RS1.loc[:,'RArad'].apply(np.degrees)
RS1['RArad'].head()
Out[58]:
0 2.349820
1 2.358534
2 2.370382
3 2.370004
4 2.468144
Name: RArad, dtype: float64
In [10]:
# Obtaining unique list of `HaloID`
print(RS1['HaloID'].shape)
RS1['HaloID'].head()
(2730,)
Out[10]:
0 281739
1 282776
2 282779
3 282781
4 283291
Name: HaloID, dtype: int64
Number of unique elements in HaloID
In [11]:
HaloID_unq = RS1.HaloID.unique()
HaloID_unq.shape
Out[11]:
(2405,)
In [12]:
HaloID_unq
Out[12]:
array([ 281739, 282776, 282779, ..., 1769274, 1769811, 1769883])
You can also plot distributions, x-y plots, etc using Pandas.
Let's plot the distribution of logMstar
.
See more here: http://pandas.pydata.org/pandas-docs/stable/visualization.html
In [20]:
RS1.head()
Out[20]:
index
RA
DEC
CZobs
Mr
HaloID
logMHalo
NGalH
FLAG
CZreal
...
ur_col
MhiMass
GroupID
GroupNGals
GroupRproj
GroupCZdisp
GroupLogMHalo
GroupGalType
RArad
DECrad
0
32
134.634782
-0.513661
2560.74945
-19.378
281739
11.606
1
1
2389.978581
...
1.659651
2.570394e+09
11
1
0.000000
0.00
11.512
1
2.349820
-0.008965
1
36
135.134024
46.477611
3658.44177
-20.172
282776
11.675
1
1
4155.831904
...
1.210722
9.332556e+09
14
1
0.000000
0.00
11.801
1
2.358534
0.811187
2
37
135.812900
46.573182
3503.62320
-18.793
282779
11.360
1
1
4126.863319
...
1.395662
2.951213e+09
15
2
0.090258
73.24
12.077
0
2.370382
0.812855
3
38
135.791237
46.371980
3608.43558
-20.721
282781
11.892
1
1
4136.444575
...
1.762735
2.454707e+09
15
2
0.090258
73.24
12.077
1
2.370004
0.809344
4
39
141.414228
47.734262
3592.98311
-19.252
283291
11.806
3
0
4055.758985
...
1.203329
3.548138e+09
16
3
0.054441
14.31
12.221
0
2.468144
0.833120
5 rows × 29 columns
Let's group data by the GroupID
In [25]:
Groups_dict = RS1.groupby('GroupID')
Groups_dict
Out[25]:
<pandas.core.groupby.DataFrameGroupBy object at 0x111141a90>
You can calcualte the sum
of all columns, even though sometimes it
would not make sense to do it
In [27]:
Groups_dict.sum()
Out[27]:
index
RA
DEC
CZobs
Mr
HaloID
logMHalo
NGalH
FLAG
CZreal
...
FSMGR
ur_col
MhiMass
GroupNGals
GroupRproj
GroupCZdisp
GroupLogMHalo
GroupGalType
RArad
DECrad
GroupID
2
1376
221.840954
13.170086
6471.97371
-20.334
329126
11.902
1
1
6718.518758
...
0.140674
1.980040
1.412539e+09
18
0.240688
277.11
13.621
0
3.871855
0.229861
11
32
134.634782
-0.513661
2560.74945
-19.378
281739
11.606
1
1
2389.978581
...
0.462300
1.659651
2.570394e+09
1
0.000000
0.00
11.512
1
2.349820
-0.008965
14
36
135.134024
46.477611
3658.44177
-20.172
282776
11.675
1
1
4155.831904
...
1.293450
1.210722
9.332556e+09
1
0.000000
0.00
11.801
1
2.358534
0.811187
15
75
271.604137
92.945162
7112.05878
-39.514
565560
23.252
2
2
8263.307894
...
1.267103
3.158397
5.405920e+09
4
0.180516
146.48
24.154
1
4.740386
1.622199
16
120
424.030674
142.999764
10817.20762
-59.562
849873
35.418
9
1
12174.507866
...
3.244401
5.414899
8.241571e+09
9
0.163323
42.93
36.663
1
7.400731
2.495817
17
85
313.547838
32.546649
5204.49813
-39.424
567438
23.460
4
1
5596.463143
...
6.690735
3.393892
1.044847e+10
4
0.002866
71.72
23.856
1
5.472442
0.568046
18
89
281.848772
91.648853
7269.08157
-39.732
567674
23.530
4
1
8135.351431
...
0.493270
3.312586
1.028967e+10
4
0.134670
99.96
24.118
1
4.919189
1.599574
19
46
157.586636
48.572398
3573.40651
-19.150
284054
11.680
1
1
3967.525066
...
0.814564
1.449894
2.454705e+09
1
0.000000
0.00
11.429
1
2.750406
0.847748
20
47
140.765926
1.976329
2868.57700
-19.408
284061
11.420
1
1
2872.762118
...
0.122841
2.304186
3.890450e+08
1
0.000000
0.00
11.529
1
2.456829
0.034493
21
48
138.073734
2.411931
2778.31482
-19.504
284291
11.582
1
1
2835.919200
...
0.306532
1.811672
1.621812e+09
1
0.000000
0.00
11.566
1
2.409841
0.042096
22
99
300.873734
90.025563
7178.04751
-37.705
568728
23.054
4
1
7894.825551
...
0.831315
3.768253
1.798101e+09
4
0.014326
14.30
23.200
1
5.251237
1.571242
23
51
143.089444
46.579813
3592.03982
-18.780
284394
11.561
1
1
4071.099481
...
0.598305
1.689678
7.762471e+08
1
0.000000
0.00
11.194
1
2.497382
0.812971
24
105
268.856882
47.360939
5160.58882
-38.200
568894
23.104
4
1
5546.536342
...
0.912425
3.118435
3.519127e+09
4
0.117478
19.98
23.374
1
4.692438
0.826604
25
54
157.348751
41.615085
3574.88872
-19.603
284582
11.493
1
1
3837.559625
...
0.183973
1.952976
1.023292e+09
1
0.000000
0.00
11.599
1
2.746254
0.726320
26
55
162.693188
1.961879
2597.58034
-19.294
284615
11.479
1
1
2601.087229
...
0.226535
2.245443
2.691535e+08
1
0.000000
0.00
11.486
1
2.839532
0.034241
27
56
157.460001
49.488156
3591.90825
-20.171
284695
11.456
1
1
3976.642566
...
1.150765
1.377150
6.760832e+09
1
0.000000
0.00
11.800
1
2.748195
0.863731
28
57
154.988872
34.509459
2571.39141
-19.508
284863
11.437
1
1
2858.866877
...
0.415138
1.852596
9.332558e+08
1
0.000000
0.00
11.569
1
2.705066
0.602304
29
58
141.940558
17.033201
2650.67690
-18.800
284912
11.390
1
1
2796.632604
...
0.153223
2.180119
1.949847e+08
1
0.000000
0.00
11.246
1
2.477330
0.297285
30
59
142.293349
44.636027
3536.51570
-19.459
285027
11.421
1
1
4011.339702
...
0.197674
1.325869
5.370314e+09
1
0.000000
0.00
11.549
1
2.483487
0.779046
31
121
421.183035
0.429715
7131.96169
-39.266
603912
22.952
4
1
6586.841058
...
2.385056
2.584658
1.026811e+10
4
0.051576
36.00
23.834
1
7.351031
0.007500
34
68
212.282651
2.338598
3085.75016
-19.072
302035
11.253
1
1
3385.519790
...
1.063081
1.290844
5.370319e+09
1
0.000000
0.00
11.399
1
3.705031
0.040816
35
69
211.798464
3.995041
2810.03796
-19.208
302069
11.613
1
1
3248.862125
...
0.190598
1.381033
4.677355e+09
1
0.000000
0.00
11.453
1
3.696581
0.069727
36
673
212.174133
5.381462
3114.65284
-19.228
307184
11.380
1
1
2937.728600
...
0.394274
1.700705
2.238722e+09
17
0.236390
257.23
13.859
0
3.703137
0.093924
38
86
211.087234
3.402649
3375.15985
-18.947
302148
11.774
1
1
3271.091928
...
5.917758
1.092800
5.370311e+09
1
0.000000
0.00
11.346
1
3.684167
0.059387
39
176
422.701455
7.998941
7017.20687
-39.768
604466
23.095
2
2
6580.283489
...
0.149476
4.821240
1.677295e+09
6
0.272206
378.68
24.390
1
7.377532
0.139608
40
88
212.197331
2.983055
3377.05188
-20.878
302274
11.990
1
1
3439.729113
...
0.157861
2.236076
1.288250e+09
1
0.000000
0.00
12.076
1
3.703542
0.052064
41
4686
1069.342879
22.153717
22868.84505
-97.327
1613918
58.577
7
4
24817.288333
...
9.706564
8.588776
1.182789e+10
650
1.840975
3534.55
71.750
0
18.663554
0.386655
47
136
218.110631
-0.474472
3298.73044
-20.337
302654
11.881
1
1
3713.675833
...
1.560939
1.238695
5.888431e+09
2
0.020057
102.47
12.154
1
3.806749
-0.008281
48
137
217.408884
1.073528
3689.52809
-19.039
302686
11.258
1
1
3791.755414
...
0.390536
1.508501
2.951212e+09
1
0.000000
0.00
11.386
1
3.794501
0.018737
50
302
437.596073
0.876629
7124.28258
-39.600
605431
23.455
2
2
7557.338338
...
1.125030
3.078099
6.399664e+09
4
0.126074
263.94
23.916
1
7.637492
0.015300
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
3021
5042
181.540914
38.772784
7455.67089
-19.343
1747151
11.312
1
1
7478.296457
...
0.659876
1.286974
3.388442e+09
1
0.000000
0.00
11.500
1
3.168487
0.676713
3022
5043
196.480535
39.110130
6284.05072
-19.134
1747163
11.338
1
1
6383.299571
...
0.204553
2.113695
4.897794e+08
1
0.000000
0.00
11.425
1
3.429232
0.682601
3023
5044
180.138126
43.598741
5160.45686
-19.252
1747165
11.324
1
1
5396.837248
...
0.053944
2.153711
3.235936e+08
1
0.000000
0.00
11.471
1
3.144003
0.760942
3024
5045
170.442172
28.923516
6073.01371
-19.454
1747185
11.226
1
1
6211.177928
...
0.906933
1.294022
4.897787e+09
1
0.000000
0.00
11.545
1
2.974777
0.504811
3025
5046
189.293884
26.504427
7278.13218
-19.227
1747192
11.354
1
1
7204.081386
...
1.955318
1.193350
4.265791e+09
1
0.000000
0.00
11.463
1
3.303802
0.462590
3026
5047
194.256326
31.051162
6777.01163
-19.015
1747195
11.320
1
1
6826.537972
...
2.169908
1.620665
1.778280e+09
1
0.000000
0.00
11.377
1
3.390412
0.541945
3027
5048
186.561296
30.761089
5891.61225
-18.909
1747211
11.323
1
1
5926.337095
...
0.595122
1.330795
1.949845e+09
1
0.000000
0.00
11.326
1
3.256109
0.536882
3028
5050
178.617072
46.740076
6716.89863
-19.467
1747267
11.306
1
1
6991.574758
...
1.463564
1.177580
5.128619e+09
1
0.000000
0.00
11.551
1
3.117456
0.815768
3029
5051
179.935088
34.079822
6448.46123
-19.023
1747268
11.344
1
1
6536.768504
...
0.795007
1.466522
2.041740e+09
1
0.000000
0.00
11.381
1
3.140460
0.594805
3030
5052
191.954131
34.376651
5983.22710
-18.915
1747272
11.257
1
1
5975.862182
...
0.136626
1.296644
2.951211e+09
1
0.000000
0.00
11.331
1
3.350232
0.599986
3031
5053
188.798977
35.992546
7252.91671
-19.121
1747320
11.327
1
1
7239.302066
...
0.218624
1.832817
2.137965e+09
1
0.000000
0.00
11.422
1
3.295164
0.628188
3032
5054
197.703967
31.057696
6184.44301
-18.883
1747322
11.311
1
1
6172.350613
...
0.052827
1.703667
1.288250e+09
1
0.000000
0.00
11.313
1
3.450585
0.542059
3033
5055
198.738687
47.101130
6878.82904
-18.779
1747380
11.305
1
1
7003.185527
...
1.084265
1.368940
1.412536e+09
1
0.000000
0.00
11.186
1
3.468644
0.822070
3034
5056
177.805632
38.976741
7319.89246
-19.115
1747385
11.242
1
1
7407.814068
...
0.661404
1.217119
4.073803e+09
1
0.000000
0.00
11.420
1
3.103294
0.680272
3035
5057
205.518590
28.272752
6852.72107
-18.975
1747390
11.288
1
1
6795.855810
...
0.266980
2.240971
2.137960e+08
1
0.000000
0.00
11.360
1
3.586976
0.493453
3036
5058
183.424051
34.075996
6469.06771
-18.903
1747409
11.293
1
1
6571.591618
...
1.639086
1.229792
3.090297e+09
1
0.000000
0.00
11.323
1
3.201354
0.594738
3037
5059
189.737700
35.372671
7386.18756
-19.000
1747616
11.249
1
1
7353.243883
...
0.071320
2.101870
4.897794e+08
1
0.000000
0.00
11.369
1
3.311548
0.617370
3038
5060
194.535351
39.521058
7106.22670
-19.036
1747646
11.236
1
1
7120.347680
...
0.213695
1.622625
9.772379e+08
1
0.000000
0.00
11.386
1
3.395282
0.689773
3039
5061
189.455632
40.706769
6535.03388
-18.880
1747659
11.226
1
1
6697.603189
...
2.280949
1.126925
2.691540e+09
1
0.000000
0.00
11.312
1
3.306625
0.710467
3040
5062
188.822804
35.106492
5989.50850
-18.798
1747701
11.229
1
1
6021.227993
...
0.540425
1.716230
7.079457e+08
1
0.000000
0.00
11.238
1
3.295580
0.612724
3041
5063
178.790815
45.958268
7043.30642
-18.913
1747762
11.250
1
1
7266.875328
...
0.335774
1.780758
7.413110e+08
1
0.000000
0.00
11.330
1
3.120488
0.802123
3042
5064
206.961728
29.870739
6938.13588
-19.294
1766927
11.491
1
1
6937.666670
...
0.093860
2.263285
3.388445e+08
4
0.098854
116.12
12.799
0
3.612164
0.521343
3044
10139
415.800925
56.850251
13965.42429
-40.212
3536000
23.410
2
2
13936.910932
...
0.550681
4.273449
2.721539e+09
4
0.151862
63.34
24.362
1
7.257095
0.992224
3045
10143
420.583605
62.950970
14897.36483
-39.325
3536502
23.227
2
2
14825.958141
...
2.328639
3.173143
6.170059e+09
4
0.272206
34.76
23.792
1
7.340569
1.098702
3046
5073
208.252977
33.128485
7400.77434
-18.877
1768410
11.604
2
0
7487.473602
...
5.218676
1.155014
2.454709e+09
1
0.000000
0.00
11.311
1
3.634700
0.578201
3047
5074
211.572491
24.948825
7240.41820
-19.068
1768843
11.624
1
1
7263.552369
...
2.149163
0.977274
4.897791e+09
1
0.000000
0.00
11.397
1
3.692637
0.435439
3048
5075
208.408550
30.109803
6979.92873
-19.789
1769134
11.625
1
1
7024.615893
...
0.583560
1.787069
1.023292e+09
1
0.000000
0.00
11.662
1
3.637415
0.525515
3049
10153
411.450877
56.076102
14141.70787
-38.107
3538548
23.050
4
1
14139.803707
...
3.205733
2.428322
5.588691e+09
4
0.068768
21.22
23.342
1
7.181173
0.978713
3050
5078
208.674537
27.215947
6940.82469
-18.801
1769811
11.424
1
1
6868.269133
...
0.100881
1.860897
5.623415e+08
1
0.000000
0.00
11.247
1
3.642058
0.475008
3051
5079
211.909499
25.671006
7396.22195
-18.877
1769883
11.274
1
1
7384.348483
...
5.218676
1.155014
2.454709e+09
1
0.000000
0.00
11.311
1
3.698518
0.448044
2244 rows × 26 columns
You can also get back a particular group
In [48]:
# Group with the most number of Galaxies
GroupID_max = int(RS1.GroupID.loc[RS1.GroupNGals==RS1.GroupNGals.max()].unique())
# Getting the stats for that Group
Groups_dict.get_group(GroupID_max)
Out[48]:
CZobs
CZreal
DEC
DECrad
DISTC
FLAG
FSMGR
GalMatchFlag
GroupCZdisp
GroupGalType
...
RA
RArad
Vptang
Vptot
index
logMHalo
logMstar
rmag
umag
ur_col
244
4184.84668
4433.164364
4.731734
0.082584
0.000000
1
2.215381
ECO
706.91
0
...
212.719898
3.712663
627.97968
675.29264
538
11.628
10.032284
-20.288746
-19.064381
1.224365
410
4924.61353
5034.595634
4.314873
0.075309
0.000000
1
0.442543
ECO
706.91
0
...
214.448100
3.742825
553.74793
564.56429
778
11.941
10.599734
-19.646400
-17.251831
2.394569
413
4882.03830
4945.885136
4.393863
0.076687
0.000000
1
0.203016
ECO
706.91
0
...
213.896606
3.733200
464.00287
468.37493
801
11.578
10.293238
-18.854952
-16.671577
2.183374
501
4472.13537
5202.238398
4.334079
0.075644
0.000000
1
6.603787
ECO
706.91
0
...
214.155222
3.737714
323.99750
798.76455
1284
11.715
10.287772
-19.358240
-18.197712
1.160528
502
4405.21117
5201.404801
4.379168
0.076431
0.072612
0
0.241837
ECO
706.91
0
...
214.123053
3.737152
287.37216
846.46739
1285
11.715
9.280000
-19.149672
-17.523731
1.625940
5 rows × 28 columns
You can compute different statistics for the different columns
In [49]:
Groups_dict.get_group(GroupID_max).describe()
Out[49]:
CZobs
CZreal
DEC
DECrad
DISTC
FLAG
FSMGR
GroupCZdisp
GroupGalType
GroupLogMHalo
...
RA
RArad
Vptang
Vptot
index
logMHalo
logMstar
rmag
umag
ur_col
count
5.000000
5.000000
5.000000
5.000000
5.000000
5.000000
5.000000
5.00
5.0
5.00
...
5.000000
5.000000
5.000000
5.000000
5.000000
5.000000
5.000000
5.000000
5.000000
5.000000
mean
4573.769010
4963.457667
4.430743
0.077331
0.014522
0.800000
1.941313
706.91
0.0
14.35
...
213.868576
3.732711
451.420028
670.692760
937.200000
11.715400
10.098606
-19.459602
-17.741846
1.717755
std
319.422742
316.312926
0.171308
0.002990
0.032473
0.447214
2.737242
0.00
0.0
0.00
...
0.671383
0.011718
145.731638
157.730799
333.350716
0.139116
0.499827
0.546421
0.920608
0.556157
min
4184.846680
4433.164364
4.314873
0.075309
0.000000
0.000000
0.203016
706.91
0.0
14.35
...
212.719898
3.712663
287.372160
468.374930
538.000000
11.578000
9.280000
-20.288746
-19.064381
1.160528
25%
4405.211170
4945.885136
4.334079
0.075644
0.000000
1.000000
0.241837
706.91
0.0
14.35
...
213.896606
3.733200
323.997500
564.564290
778.000000
11.628000
10.032284
-19.646400
-18.197712
1.224365
50%
4472.135370
5034.595634
4.379168
0.076431
0.000000
1.000000
0.442543
706.91
0.0
14.35
...
214.123053
3.737152
464.002870
675.292640
801.000000
11.715000
10.287772
-19.358240
-17.523731
1.625940
75%
4882.038300
5201.404801
4.393863
0.076687
0.000000
1.000000
2.215381
706.91
0.0
14.35
...
214.155222
3.737714
553.747930
798.764550
1284.000000
11.715000
10.293238
-19.149672
-17.251831
2.183374
max
4924.613530
5202.238398
4.731734
0.082584
0.072612
1.000000
6.603787
706.91
0.0
14.35
...
214.448100
3.742825
627.979680
846.467390
1285.000000
11.941000
10.599734
-18.854952
-16.671577
2.394569
8 rows × 26 columns
In [60]:
RS1.cov()
Out[60]:
index
RA
DEC
CZobs
Mr
HaloID
logMHalo
NGalH
FLAG
CZreal
...
ur_col
MhiMass
GroupID
GroupNGals
GroupRproj
GroupCZdisp
GroupLogMHalo
GroupGalType
RArad
DECrad
index
2.181596e+06
-2.298427e+04
5.586957e+03
7.772780e+05
1.931854e+01
6.154578e+08
-1.441836e+01
-2.628546e+01
8.913818e+00
8.106301e+05
...
2.619069e+01
-2.665743e+11
1.311627e+06
-8.450746e+02
-8.782327e+00
-1.349381e+04
-8.693728e+01
5.784895e+01
-4.011512e+02
9.751079e+01
RA
-2.298427e+04
1.089537e+03
-3.420840e+01
2.757614e+03
-2.955545e-01
-6.679839e+06
1.474443e-01
-2.572352e-01
1.350572e-01
1.508786e+03
...
-4.593881e-02
6.882014e+08
-1.330206e+04
1.396574e+01
1.476375e-01
2.422790e+02
1.521332e+00
-8.423405e-01
1.901602e+01
-5.970492e-01
DEC
5.586957e+03
-3.420840e+01
2.319493e+02
-2.092093e+03
1.229002e-02
3.753073e+06
-2.776626e-02
2.496057e-01
-7.613347e-02
-1.448510e+03
...
-7.458507e-04
-5.427480e+08
3.905507e+03
-6.081837e+00
-1.928684e-02
-8.218754e+01
-4.941584e-01
2.200978e-01
-5.970492e-01
4.048280e+00
CZobs
7.772780e+05
2.757614e+03
-2.092093e+03
1.654774e+06
-2.310160e+01
1.476149e+08
9.432305e+00
-4.304231e+00
-1.158965e+00
1.575706e+06
...
3.806679e+01
-1.371571e+11
4.454930e+05
-2.168341e+02
-9.466808e-01
-2.640655e+03
1.055219e+01
-1.348847e-01
4.812945e+01
-3.651391e+01
Mr
1.931854e+01
-2.955545e-01
1.229002e-02
-2.310160e+01
3.456844e-01
2.438294e+03
-7.860392e-02
-4.982106e-03
-2.889540e-02
-2.440613e+01
...
-7.208798e-02
-4.127290e+08
1.591201e+01
6.463370e-03
-1.080272e-03
-1.447294e+00
-1.095948e-01
-4.498804e-02
-5.158399e-03
2.145012e-04
HaloID
6.154578e+08
-6.679839e+06
3.753073e+06
1.476149e+08
2.438294e+03
2.304506e+11
-2.017190e+03
2.410209e+02
-4.727293e+01
1.623230e+08
...
1.829660e+03
-5.093372e+13
3.804580e+08
-2.043365e+05
-1.349997e+03
-3.194161e+06
-1.997173e+04
1.213782e+04
-1.165852e+05
6.550347e+04
logMHalo
-1.441836e+01
1.474443e-01
-2.776626e-02
9.432305e+00
-7.860392e-02
-2.017190e+03
4.046042e-02
3.145101e-02
-1.402054e-02
9.859384e+00
...
1.709318e-02
8.438130e+07
-1.129491e+01
6.450592e-02
1.211647e-03
1.595945e+00
3.725773e-02
-9.509890e-03
2.573389e-03
-4.846127e-04
NGalH
-2.628546e+01
-2.572352e-01
2.496057e-01
-4.304231e+00
-4.982106e-03
2.410209e+02
3.145101e-02
2.507830e-01
-1.088891e-01
-7.419269e+00
...
5.959124e-03
-1.206168e+07
-1.685324e+01
2.462191e-01
4.749852e-03
6.838984e+00
6.283496e-02
-9.178099e-02
-4.489601e-03
4.356440e-03
FLAG
8.913818e+00
1.350572e-01
-7.613347e-02
-1.158965e+00
-2.889540e-02
-4.727293e+01
-1.402054e-02
-1.088891e-01
1.065840e-01
-2.967087e-01
...
2.494139e-03
6.993297e+07
6.058182e+00
-1.120744e-01
-2.192924e-03
-3.117333e+00
-2.805117e-02
7.410140e-02
2.357192e-03
-1.328780e-03
CZreal
8.106301e+05
1.508786e+03
-1.448510e+03
1.575706e+06
-2.440613e+01
1.623230e+08
9.859384e+00
-7.419269e+00
-2.967087e-01
1.536303e+06
...
3.694438e+01
-1.332002e+11
4.676088e+05
-1.325822e+02
-2.442582e-01
-2.289977e+03
1.287176e+01
-6.949726e-01
2.633328e+01
-2.528127e+01
DISTC
-2.850860e-01
-2.433786e-02
7.560089e-04
7.343221e-01
2.474714e-03
1.727109e+02
1.626163e-03
1.014831e-02
-9.851798e-03
4.615477e-01
...
-3.733074e-04
-6.462987e+06
-2.776427e-01
9.311207e-03
3.099434e-04
2.935209e-01
2.728544e-03
-6.825898e-03
-4.247758e-04
1.319484e-05
Vptot
-8.204723e+04
2.212872e+03
-6.798065e+02
4.119205e+04
-7.070875e-01
-2.892913e+07
1.618646e+00
1.535922e+00
-1.208064e+00
4.957065e+04
...
1.543374e+00
-2.023708e+10
-5.143447e+04
1.260621e+02
1.656184e+00
1.684616e+03
1.329682e+01
-8.997785e+00
3.862190e+01
-1.186486e+01
Vptang
-9.014971e+04
2.843134e+03
-9.531855e+02
6.529761e+04
-2.970740e-01
-3.435219e+07
1.420597e+00
1.500098e+00
-9.836101e-01
6.382373e+04
...
1.965117e+00
-1.886259e+10
-5.658740e+04
8.171065e+01
1.294948e+00
1.469719e+03
1.129570e+01
-8.118135e+00
4.962205e+01
-1.663622e+01
logMstar
-2.100861e+01
3.433979e-01
-9.543213e-02
1.807609e+01
-1.686256e-01
-4.153282e+03
6.186612e-02
-3.336331e-02
6.182902e-02
1.940680e+01
...
3.543189e-02
2.202997e+08
-1.728105e+01
1.834790e-02
3.831666e-04
-4.884735e-02
4.254488e-02
4.604768e-02
5.993424e-03
-1.665605e-03
rmag
9.695980e+00
-4.198897e-02
-4.127163e-02
-2.869988e+01
3.524043e-01
-5.526142e+02
-7.891440e-02
-2.965146e-03
-3.047496e-02
-2.973787e+01
...
-7.865790e-02
-3.910549e+08
1.023456e+01
6.105523e-03
-1.189317e-03
-1.444331e+00
-1.109829e-01
-4.856009e-02
-7.328458e-04
-7.203258e-04
umag
3.588669e+01
-8.792803e-02
-4.201727e-02
9.366916e+00
2.803163e-01
1.277058e+03
-6.182122e-02
2.993990e-03
-2.798082e-02
7.206523e+00
...
1.263307e-01
-1.510898e+09
2.417838e+01
-7.343608e-03
-1.640577e-03
-1.566172e+00
-8.919846e-02
-4.200461e-02
-1.534634e-03
-7.333396e-04
FSMGR
-3.949605e+01
-2.077870e-02
-3.316090e-01
-2.043116e+01
1.039979e-01
-8.796008e+03
-1.966349e-02
1.519587e-03
-8.545597e-03
-1.669011e+01
...
-3.078101e-01
2.385122e+09
-2.446775e+01
3.987326e-01
2.861589e-03
2.970075e+00
-8.278981e-03
-2.637713e-02
-3.626567e-04
-5.787669e-03
ur_col
2.619069e+01
-4.593881e-02
-7.458507e-04
3.806679e+01
-7.208798e-02
1.829660e+03
1.709318e-02
5.959124e-03
2.494139e-03
3.694438e+01
...
2.049886e-01
-1.119843e+09
1.394381e+01
-1.344921e-02
-4.512595e-04
-1.218404e-01
2.178444e-02
6.555485e-03
-8.017834e-04
-1.301755e-05
MhiMass
-2.665743e+11
6.882014e+08
-5.427480e+08
-1.371571e+11
-4.127290e+08
-5.093372e+13
8.438130e+07
-1.206168e+07
6.993297e+07
-1.332002e+11
...
-1.119843e+09
1.566881e+19
-1.756067e+11
-1.745911e+07
4.917855e+06
2.866463e+09
1.414368e+08
7.499084e+07
1.201138e+07
-9.472739e+06
GroupID
1.311627e+06
-1.330206e+04
3.905507e+03
4.454930e+05
1.591201e+01
3.804580e+08
-1.129491e+01
-1.685324e+01
6.058182e+00
4.676088e+05
...
1.394381e+01
-1.756067e+11
7.965887e+05
-6.626612e+02
-6.720297e+00
-9.953393e+03
-6.641786e+01
4.325604e+01
-2.321647e+02
6.816395e+01
GroupNGals
-8.450746e+02
1.396574e+01
-6.081837e+00
-2.168341e+02
6.463370e-03
-2.043365e+05
6.450592e-02
2.462191e-01
-1.120744e-01
-1.325822e+02
...
-1.344921e-02
-1.745911e+07
-6.626612e+02
3.722428e+01
1.700104e-01
2.560579e+02
1.309322e+00
-6.030891e-01
2.437481e-01
-1.061481e-01
GroupRproj
-8.782327e+00
1.476375e-01
-1.928684e-02
-9.466808e-01
-1.080272e-03
-1.349997e+03
1.211647e-03
4.749852e-03
-2.192924e-03
-2.442582e-01
...
-4.512595e-04
4.917855e+06
-6.720297e+00
1.700104e-01
2.613553e-03
2.293824e+00
1.660395e-02
-1.217465e-02
2.576760e-03
-3.366188e-04
GroupCZdisp
-1.349381e+04
2.422790e+02
-8.218754e+01
-2.640655e+03
-1.447294e+00
-3.194161e+06
1.595945e+00
6.838984e+00
-3.117333e+00
-2.289977e+03
...
-1.218404e-01
2.866463e+09
-9.953393e+03
2.560579e+02
2.293824e+00
3.399523e+03
1.823995e+01
-1.241124e+01
4.228566e+00
-1.434443e+00
GroupLogMHalo
-8.693728e+01
1.521332e+00
-4.941584e-01
1.055219e+01
-1.095948e-01
-1.997173e+04
3.725773e-02
6.283496e-02
-2.805117e-02
1.287176e+01
...
2.178444e-02
1.414368e+08
-6.641786e+01
1.309322e+00
1.660395e-02
1.823995e+01
1.792074e-01
-1.010384e-01
2.655225e-02
-8.624691e-03
GroupGalType
5.784895e+01
-8.423405e-01
2.200978e-01
-1.348847e-01
-4.498804e-02
1.213782e+04
-9.509890e-03
-9.178099e-02
7.410140e-02
-6.949726e-01
...
6.555485e-03
7.499084e+07
4.325604e+01
-6.030891e-01
-1.217465e-02
-1.241124e+01
-1.010384e-01
1.698882e-01
-1.470161e-02
3.841431e-03
RArad
-4.011512e+02
1.901602e+01
-5.970492e-01
4.812945e+01
-5.158399e-03
-1.165852e+05
2.573389e-03
-4.489601e-03
2.357192e-03
2.633328e+01
...
-8.017834e-04
1.201138e+07
-2.321647e+02
2.437481e-01
2.576760e-03
4.228566e+00
2.655225e-02
-1.470161e-02
3.318921e-01
-1.042047e-02
DECrad
9.751079e+01
-5.970492e-01
4.048280e+00
-3.651391e+01
2.145012e-04
6.550347e+04
-4.846127e-04
4.356440e-03
-1.328780e-03
-2.528127e+01
...
-1.301755e-05
-9.472739e+06
6.816395e+01
-1.061481e-01
-3.366188e-04
-1.434443e+00
-8.624691e-03
3.841431e-03
-1.042047e-02
7.065581e-02
27 rows × 27 columns
In [61]:
RS1.corr()
Out[61]:
index
RA
DEC
CZobs
Mr
HaloID
logMHalo
NGalH
FLAG
CZreal
...
ur_col
MhiMass
GroupID
GroupNGals
GroupRproj
GroupCZdisp
GroupLogMHalo
GroupGalType
RArad
DECrad
index
1.000000
-0.471436
0.248366
0.409091
0.022246
0.868005
-0.048530
-0.035537
0.018485
0.442789
...
0.039165
-0.045595
0.994961
-0.093777
-0.116307
-0.156689
-0.139040
0.095023
-0.471436
0.248366
RA
-0.471436
1.000000
-0.068048
0.064945
-0.015229
-0.421557
0.022207
-0.015562
0.012533
0.036878
...
-0.003074
0.005267
-0.451524
0.069347
0.087490
0.125888
0.108874
-0.061913
1.000000
-0.068048
DEC
0.248366
-0.068048
1.000000
-0.106786
0.001373
0.513336
-0.009064
0.032727
-0.015312
-0.076734
...
-0.000108
-0.009003
0.287319
-0.065452
-0.024771
-0.092555
-0.076646
0.035062
-0.068048
1.000000
CZobs
0.409091
0.064945
-0.106786
1.000000
-0.030544
0.239041
0.036453
-0.006682
-0.002760
0.988251
...
0.065360
-0.026936
0.388020
-0.027628
-0.014395
-0.035207
0.019377
-0.000254
0.064945
-0.106786
Mr
0.022246
-0.015229
0.001373
-0.030544
1.000000
0.008639
-0.664644
-0.016921
-0.150537
-0.033490
...
-0.270806
-0.177340
0.030323
0.001802
-0.035940
-0.042219
-0.440324
-0.185642
-0.015229
0.001373
HaloID
0.868005
-0.421557
0.513336
0.239041
0.008639
1.000000
-0.020890
0.001003
-0.000302
0.272805
...
0.008418
-0.026804
0.887975
-0.069766
-0.055008
-0.114119
-0.098276
0.061344
-0.421557
0.513336
logMHalo
-0.048530
0.022207
-0.009064
0.036453
-0.664644
-0.020890
1.000000
0.312227
-0.213503
0.039545
...
0.187691
0.105977
-0.062914
0.052562
0.117827
0.136080
0.437545
-0.114704
0.022207
-0.009064
NGalH
-0.035537
-0.015562
0.032727
-0.006682
-0.016921
0.001003
0.312227
1.000000
-0.666023
-0.011953
...
0.026283
-0.006085
-0.037707
0.080586
0.185531
0.234225
0.296397
-0.444654
-0.015562
0.032727
FLAG
0.018485
0.012533
-0.015312
-0.002760
-0.150537
-0.000302
-0.213503
-0.666023
1.000000
-0.000733
...
0.016874
0.054115
0.020791
-0.056266
-0.131390
-0.163768
-0.202968
0.550679
0.012533
-0.015312
CZreal
0.442789
0.036878
-0.076734
0.988251
-0.033490
0.272805
0.039545
-0.011953
-0.000733
1.000000
...
0.065833
-0.027149
0.422695
-0.017532
-0.003855
-0.031687
0.024531
-0.001360
0.036878
-0.076734
DISTC
-0.005257
-0.020082
0.001352
0.015548
0.114639
0.009799
0.220190
0.551942
-0.821899
0.010142
...
-0.022457
-0.044470
-0.008473
0.041566
0.165126
0.137113
0.175550
-0.451053
-0.020082
0.001352
Vptot
-0.275506
0.332498
-0.221382
0.158817
-0.005965
-0.298882
0.039911
0.015212
-0.018353
0.198353
...
0.016907
-0.025356
-0.285819
0.102477
0.160674
0.143300
0.155784
-0.108270
0.332498
-0.221382
Vptang
-0.305539
0.431187
-0.313307
0.254108
-0.002529
-0.358225
0.035355
0.014995
-0.015082
0.257771
...
0.021728
-0.023855
-0.317389
0.067043
0.126802
0.126187
0.133575
-0.098597
0.431187
-0.313307
logMstar
-0.030929
0.022622
-0.013625
0.030555
-0.623644
-0.018813
0.668791
-0.144868
0.411812
0.034046
...
0.170170
0.121018
-0.042102
0.006539
0.016298
-0.001822
0.218536
0.242929
0.022622
-0.013625
rmag
0.010307
-0.001997
-0.004255
-0.035031
0.941120
-0.001807
-0.616006
-0.009297
-0.146569
-0.037672
...
-0.272786
-0.155118
0.018005
0.001571
-0.036528
-0.038896
-0.411644
-0.184987
-0.001997
-0.004255
umag
0.036088
-0.003957
-0.004098
0.010815
0.708145
0.003951
-0.456495
0.008880
-0.127300
0.008636
...
0.414436
-0.566932
0.040237
-0.001788
-0.047664
-0.039897
-0.312963
-0.151366
-0.003957
-0.004098
FSMGR
-0.021495
-0.000506
-0.017502
-0.012767
0.142185
-0.014729
-0.078580
0.002439
-0.021041
-0.010824
...
-0.546496
0.484352
-0.022037
0.052534
0.044995
0.040947
-0.015721
-0.051442
-0.000506
-0.017502
ur_col
0.039165
-0.003074
-0.000108
0.065360
-0.270806
0.008418
0.187691
0.026283
0.016874
0.065833
...
1.000000
-0.624848
0.034506
-0.004869
-0.019496
-0.004615
0.113659
0.035128
-0.003074
-0.000108
MhiMass
-0.045595
0.005267
-0.009003
-0.026936
-0.177340
-0.026804
0.105977
-0.006085
0.054115
-0.027149
...
-0.624848
1.000000
-0.049706
-0.000723
0.024302
0.012420
0.084405
0.045963
0.005267
-0.009003
GroupID
0.994961
-0.451524
0.287319
0.388020
0.030323
0.887975
-0.062914
-0.037707
0.020791
0.422695
...
0.034506
-0.049706
1.000000
-0.121692
-0.147284
-0.191269
-0.175788
0.117584
-0.451524
0.287319
GroupNGals
-0.093777
0.069347
-0.065452
-0.027628
0.001802
-0.069766
0.052562
0.080586
-0.056266
-0.017532
...
-0.004869
-0.000723
-0.121692
1.000000
0.545063
0.719807
0.506939
-0.239821
0.069347
-0.065452
GroupRproj
-0.116307
0.087490
-0.024771
-0.014395
-0.035940
-0.055008
0.117827
0.185531
-0.131390
-0.003855
...
-0.019496
0.024302
-0.147284
0.545063
1.000000
0.769547
0.767216
-0.577775
0.087490
-0.024771
GroupCZdisp
-0.156689
0.125888
-0.092555
-0.035207
-0.042219
-0.114119
0.136080
0.234225
-0.163768
-0.031687
...
-0.004615
0.012420
-0.191269
0.719807
0.769547
1.000000
0.738987
-0.516446
0.125888
-0.092555
GroupLogMHalo
-0.139040
0.108874
-0.076646
0.019377
-0.440324
-0.098276
0.437545
0.296397
-0.202968
0.024531
...
0.113659
0.084405
-0.175788
0.506939
0.767216
0.738987
1.000000
-0.579065
0.108874
-0.076646
GroupGalType
0.095023
-0.061913
0.035062
-0.000254
-0.185642
0.061344
-0.114704
-0.444654
0.550679
-0.001360
...
0.035128
0.045963
0.117584
-0.239821
-0.577775
-0.516446
-0.579065
1.000000
-0.061913
0.035062
RArad
-0.471436
1.000000
-0.068048
0.064945
-0.015229
-0.421557
0.022207
-0.015562
0.012533
0.036878
...
-0.003074
0.005267
-0.451524
0.069347
0.087490
0.125888
0.108874
-0.061913
1.000000
-0.068048
DECrad
0.248366
-0.068048
1.000000
-0.106786
0.001373
0.513336
-0.009064
0.032727
-0.015312
-0.076734
...
-0.000108
-0.009003
0.287319
-0.065452
-0.024771
-0.092555
-0.076646
0.035062
-0.068048
1.000000
27 rows × 27 columns
You can handle missing values in DataFrames without a problem.
Check "Handling of missing values" to learn more about this.
In [63]:
df = pd.DataFrame({'int_col' : [1,2,6,8,-1], 'float_col' : [0.1, 0.2,0.2,10.1,None], 'str_col' : ['a','b',None,'c','a']})
df
Out[63]:
float_col
int_col
str_col
0
0.1
1
a
1
0.2
2
b
2
0.2
6
None
3
10.1
8
c
4
NaN
-1
a
We can use the dropna
function to drop NaN numbers
In [66]:
df.dropna()
Out[66]:
float_col
int_col
str_col
0
0.1
1
a
1
0.2
2
b
3
10.1
8
c
In [67]:
df3 = df.copy()
mean = df3['float_col'].mean()
df3
Out[67]:
float_col
int_col
str_col
0
0.1
1
a
1
0.2
2
b
2
0.2
6
None
3
10.1
8
c
4
NaN
-1
a
In [68]:
df3['float_col'].fillna(mean)
Out[68]:
0 0.10
1 0.20
2 0.20
3 10.10
4 2.65
Name: float_col, dtype: float64
In [69]:
df3
Out[69]:
float_col
int_col
str_col
0
0.1
1
a
1
0.2
2
b
2
0.2
6
None
3
10.1
8
c
4
NaN
-1
a
In [ ]:
Content source: VandyAstroML/Vanderbilt_Computational_Bootcamp
Similar notebooks: