Basic data analysis with Pandas

Today we will learn how to:

  • load a single-cell RNA-seq expression dataset
  • access the data
  • group the data
  • compute summary statistics

We will use a real scRNA-seq dataset from Tirosh et al., Science, 2016.

Please first download the dataset from this link.


In [1]:
import sys,os
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import pandas as pd
import numpy as np
import scipy
sns.set_style('white')

In [2]:
# Create the output folder for figures if it does not exist yet.
os.makedirs('Figures', exist_ok=True)

Loading Data

Pandas can load data from multiple sources:

  • Delimited files ("TSV", "CSV", etc)
  • Excel files
  • JSON
  • HTML
  • and everything else listed here...

Delimited files are very common, and the example dataset we will use is in this format (tab-separated). Loading the other formats works similarly.
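
As a minimal sketch of how a delimited file is read, the snippet below parses a tiny made-up tab-separated table from an in-memory string instead of a file on disk. The gene and cell names are placeholders, not values from the Tirosh dataset.

```python
import io
import pandas as pd

# A tiny, made-up tab-separated table (placeholder names and values).
tsv_text = "Cell\tcell_A\tcell_B\ngene_1\t0.0\t1.5\ngene_2\t2.0\t0.0\n"

# sep="\t" selects tab as the delimiter;
# index_col=0 uses the first column as the row labels.
toy = pd.read_csv(io.StringIO(tsv_text), sep="\t", index_col=0)

print(toy)
print(toy.shape)
```

For a comma-separated file the same call works with sep="," (which is the default).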


In [3]:
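# Path to the downloaded file; adjust it to where you saved the dataset.
file_loc = "/Users/murat/Data/tirosh/GSE72056_melanoma_single_cell_revised_v2.txt.gz"
# The file is tab-delimited; the first column holds the row names.
sc_data = pd.read_csv(file_loc, sep="\t", index_col=0)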

In [4]:
sc_data


Out[4]:
Cy72_CD45_H02_S758_comb CY58_1_CD45_B02_S974_comb Cy71_CD45_D08_S524_comb Cy81_FNA_CD45_B01_S301_comb Cy80_II_CD45_B07_S883_comb Cy81_Bulk_CD45_B10_S118_comb Cy72_CD45_D09_S717_comb Cy74_CD45_A03_S387_comb Cy71_CD45_B05_S497_comb Cy80_II_CD45_C09_S897_comb ... CY75_1_CD45_CD8_7__S265_comb CY75_1_CD45_CD8_3__S127_comb CY75_1_CD45_CD8_1__S61_comb CY75_1_CD45_CD8_1__S12_comb CY75_1_CD45_CD8_1__S25_comb CY75_1_CD45_CD8_7__S223_comb CY75_1_CD45_CD8_1__S65_comb CY75_1_CD45_CD8_1__S93_comb CY75_1_CD45_CD8_1__S76_comb CY75_1_CD45_CD8_7__S274_comb
Cell
tumor 72.00000 58.00000 71.00000 81.00000 80.00000 81.00000 72.00000 74.00000 71.00000 80.00000 ... 75.00000 75.000000 75.000000 75.000000 75.000000 75.00000 75.00000 75.00000 75.000000 75.00000
malignant(1=no,2=yes,0=unresolved) 1.00000 1.00000 2.00000 2.00000 2.00000 2.00000 1.00000 1.00000 2.00000 2.00000 ... 1.00000 1.000000 1.000000 1.000000 1.000000 1.00000 1.00000 1.00000 1.000000 1.00000
non-malignant cell type (1=T,2=B,3=Macro.4=Endo.,5=CAF;6=NK) 2.00000 1.00000 0.00000 0.00000 0.00000 0.00000 1.00000 1.00000 0.00000 0.00000 ... 1.00000 1.000000 1.000000 1.000000 1.000000 1.00000 1.00000 1.00000 1.000000 0.00000
C9orf152 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RPS11 9.21720 8.37450 9.31300 7.88760 8.32910 7.83360 8.37370 8.13380 8.43730 7.59680 ... 0.00000 7.863900 5.850500 0.626390 6.273400 5.48890 4.92620 7.09580 3.997000 3.98970
ELMO2 0.00000 0.00000 2.12630 0.00000 0.00000 0.77400 0.00000 0.00000 0.00000 0.38294 ... 0.00000 0.000000 3.157200 4.793200 0.000000 0.00000 5.52960 0.00000 0.000000 0.00000
CREB3L1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PNMA1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 2.51420 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
MMP2 0.00000 0.00000 0.73812 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 2.86970 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
TMEM216 0.00000 0.00000 0.00000 0.00000 3.79490 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 3.682900 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.29007
TRAF3IP2-AS1 2.85140 2.09830 0.61730 0.96495 1.47350 3.15800 1.30800 1.38020 0.00000 1.12630 ... 3.75450 1.747100 1.787700 1.384000 1.340900 2.40520 1.69880 1.55890 0.471250 1.67460
LRRC37A5P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
LOC653712 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.54798 0.000000 0.000000 0.000000 0.000721 0.00000 0.00000 0.00000 0.000000 0.00000
C10orf90 0.00000 0.00000 0.00000 3.40690 1.44680 2.78760 0.00000 0.00000 1.73980 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
ZHX3 0.00000 0.52907 0.00000 0.51197 0.00000 1.49520 1.10970 0.00000 1.28210 0.00000 ... 1.89240 0.282530 0.329430 0.878360 0.336740 0.53614 0.32554 0.46722 0.449290 0.35422
ERCC5 0.00000 0.00000 0.00000 0.00000 2.28660 2.37410 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 5.072500 0.53382 0.66077 5.27270 0.000000 2.92570
GPR98 0.00000 0.00000 0.00000 0.00000 0.00000 0.02290 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RXFP3 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CTAGE10P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
APBB2 0.17377 0.00000 0.00000 1.11570 0.00000 0.29278 0.00000 0.28688 0.00000 0.00000 ... 0.43193 0.000000 0.091346 0.033131 0.000000 0.00000 0.00000 0.25625 0.044071 0.19094
KLHL13 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
KRTAP10-8 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 1.61900 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PDCL3 0.00000 0.00000 0.00000 4.76640 2.99000 2.35220 0.00000 0.00000 3.89220 2.84320 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
AEN 0.00000 0.00000 0.00000 0.00000 2.31610 3.47680 0.00000 0.00000 4.10310 4.56970 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 3.38020 0.000000 6.49550
FRG2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.26504 0.000000 0.000000 0.000000 0.158650 0.21004 0.17355 0.00000 0.000000 0.00000
DECR1 0.00000 0.00000 5.09060 0.00000 3.30030 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 1.366000 0.000000 0.00000 0.00000 0.00000 3.826700 0.00000
SALL1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.172980 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GGT3P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CADM4 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RPS18 10.07200 7.47900 10.98600 9.61720 9.98130 10.30500 9.75130 8.74800 9.24680 9.57730 ... 8.19610 8.415500 7.091700 0.000000 8.029200 8.36490 8.65190 7.12710 8.997400 1.92120
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
CNST 0.00000 0.00000 0.00000 0.00000 1.92640 2.18080 0.00000 0.00000 3.62840 0.00000 ... 0.14580 0.000000 0.000000 0.789670 0.000000 0.00000 3.88480 0.62217 0.000000 0.00000
ERCC2 0.00000 0.00000 0.00000 0.00000 0.00000 2.35810 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 5.47660
EAF2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
VPS13B 0.47093 0.00000 4.09810 1.64660 0.00000 1.20730 0.00000 0.00000 0.00000 0.00000 ... 0.16349 6.656700 0.000000 5.809900 3.463600 0.44281 0.00000 0.00000 0.000000 5.81820
TP53TG3B 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 3.50170 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
C14orf101 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 1.61340 0.000000 0.000000 0.000000 0.000000 0.00000 2.57820 0.00000 1.787200 0.00000
ST18 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PSMB9 0.00000 0.94261 0.00000 0.00000 3.31090 2.47200 0.00000 0.00000 3.02010 3.83630 ... 1.57680 6.442400 7.282700 5.912200 5.644600 6.58150 7.42070 5.68740 6.471700 6.50470
ADRBK2 0.27023 0.64708 3.89930 0.00000 0.00000 0.10969 0.00000 0.00000 0.85758 0.00000 ... 2.16790 0.072676 0.635890 0.249530 0.547600 0.58503 0.10893 0.12135 0.268260 0.41887
HCLS1 5.88820 7.68530 0.00000 0.00000 0.00000 0.00000 0.00000 1.98040 0.00000 0.00000 ... 0.00000 6.643100 4.323600 0.965700 6.834200 0.93640 5.30360 3.22540 6.160100 6.31980
GPR15 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
LOC100093631 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
TTTY17A 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CSF2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
MIR515-1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
SLC2A11 0.00000 0.00000 0.00000 0.00000 0.00000 2.03630 0.00000 0.00000 0.00000 0.00000 ... 0.82884 0.000000 0.000000 0.000000 0.000000 0.28620 0.00000 0.20950 0.000000 0.40439
SELO 0.00000 0.00000 0.00000 0.00000 1.34770 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GRIP2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GOLGA8B 0.55385 1.45470 0.00000 3.25500 0.34369 1.27620 0.00000 0.00000 0.00000 0.11636 ... 0.00000 0.507200 0.066124 0.000000 0.403930 2.32910 0.00000 0.34443 0.000000 3.10200
MIR4691 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GPLD1 0.62667 1.05450 0.99639 0.23143 0.00000 0.44996 0.67536 0.98477 0.00000 0.15575 ... 2.61240 0.160170 0.894100 0.808400 0.482300 1.29620 0.99245 0.97516 0.492080 1.06820
SNORD115-39 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RAB8A 0.00000 0.00000 2.76340 4.19370 2.57050 3.75260 0.00000 0.00000 0.00000 0.00000 ... 4.08420 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RXFP2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PCIF1 0.00000 0.00000 3.67820 0.00000 0.00000 4.54490 0.00000 0.00000 1.87890 0.00000 ... 0.00000 0.000000 4.613600 5.142300 3.771200 0.00000 5.54650 0.00000 0.000000 5.00810
PIK3IP1 7.60690 0.00000 0.00000 0.00000 0.00000 0.00000 6.54570 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 6.408700 0.00000 3.73840 0.00000 0.000000 6.75350
SNRPD2 0.00000 0.00000 3.98710 5.26390 6.08240 5.64240 0.00000 0.00000 5.94510 6.52150 ... 4.44710 4.760700 0.000000 4.743300 5.404100 6.09860 0.00000 0.00000 0.000000 5.93130
SLC39A6 0.00000 0.00000 3.87770 3.76600 1.78160 4.36790 0.00000 0.00000 3.44980 0.43189 ... 0.00000 0.000000 0.475460 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 5.23980
CTSC 2.66380 6.99010 1.61260 4.84170 4.46070 1.88240 1.07180 1.69380 1.11970 0.88830 ... 2.33570 1.315000 2.067100 3.192600 1.461500 3.64640 7.00040 1.96150 7.191800 6.15880
AQP7 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000

23689 rows × 4645 columns

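The table mixes several types of data: the first three rows (tumor, malignant, non-malignant cell type) are per-cell annotations, while the remaining rows hold gene expression values.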

Let's separate them.
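
One possible way to do the separation, sketched on a small made-up table with the same layout (three annotation rows followed by gene rows). The names toy, annotations and expression are illustrative only, and the real data may end up being separated differently.

```python
import pandas as pd

# Made-up table mimicking the layout above: 3 annotation rows, then gene rows.
toy = pd.DataFrame(
    {"cell_A": [72.0, 1.0, 2.0, 0.0, 9.2], "cell_B": [58.0, 2.0, 0.0, 1.1, 8.4]},
    index=["tumor", "malignant", "cell_type", "gene_1", "gene_2"],
)

# Positional slicing: the first three rows are per-cell annotations...
annotations = toy.iloc[:3]
# ...and every remaining row is one gene's expression across cells.
expression = toy.iloc[3:]

print(annotations.shape)
print(expression.shape)
```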

Accessing the DataFrame

There are two main ways to access elements:

  • by label (the index or column name), using .loc
  • by integer position, using .iloc
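
A minimal sketch of the two access styles on a made-up table (all names are placeholders): .loc selects by label, .iloc selects by zero-based integer position, and Index.get_loc translates a label into its position.

```python
import pandas as pd

toy = pd.DataFrame(
    {"cell_A": [0.0, 9.2, 2.1], "cell_B": [0.0, 8.4, 0.0]},
    index=["gene_1", "gene_2", "gene_3"],
)

by_label = toy.loc["gene_2", "cell_B"]   # row and column given by name
by_position = toy.iloc[1, 1]             # same entry given by zero-based position

# Look up which integer position a row label corresponds to.
position_of_gene_2 = toy.index.get_loc("gene_2")

print(by_label, by_position, position_of_gene_2)
```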

In [5]:
sc_data


Out[5]:
Cy72_CD45_H02_S758_comb CY58_1_CD45_B02_S974_comb Cy71_CD45_D08_S524_comb Cy81_FNA_CD45_B01_S301_comb Cy80_II_CD45_B07_S883_comb Cy81_Bulk_CD45_B10_S118_comb Cy72_CD45_D09_S717_comb Cy74_CD45_A03_S387_comb Cy71_CD45_B05_S497_comb Cy80_II_CD45_C09_S897_comb ... CY75_1_CD45_CD8_7__S265_comb CY75_1_CD45_CD8_3__S127_comb CY75_1_CD45_CD8_1__S61_comb CY75_1_CD45_CD8_1__S12_comb CY75_1_CD45_CD8_1__S25_comb CY75_1_CD45_CD8_7__S223_comb CY75_1_CD45_CD8_1__S65_comb CY75_1_CD45_CD8_1__S93_comb CY75_1_CD45_CD8_1__S76_comb CY75_1_CD45_CD8_7__S274_comb
Cell
tumor 72.00000 58.00000 71.00000 81.00000 80.00000 81.00000 72.00000 74.00000 71.00000 80.00000 ... 75.00000 75.000000 75.000000 75.000000 75.000000 75.00000 75.00000 75.00000 75.000000 75.00000
malignant(1=no,2=yes,0=unresolved) 1.00000 1.00000 2.00000 2.00000 2.00000 2.00000 1.00000 1.00000 2.00000 2.00000 ... 1.00000 1.000000 1.000000 1.000000 1.000000 1.00000 1.00000 1.00000 1.000000 1.00000
non-malignant cell type (1=T,2=B,3=Macro.4=Endo.,5=CAF;6=NK) 2.00000 1.00000 0.00000 0.00000 0.00000 0.00000 1.00000 1.00000 0.00000 0.00000 ... 1.00000 1.000000 1.000000 1.000000 1.000000 1.00000 1.00000 1.00000 1.000000 0.00000
C9orf152 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RPS11 9.21720 8.37450 9.31300 7.88760 8.32910 7.83360 8.37370 8.13380 8.43730 7.59680 ... 0.00000 7.863900 5.850500 0.626390 6.273400 5.48890 4.92620 7.09580 3.997000 3.98970
ELMO2 0.00000 0.00000 2.12630 0.00000 0.00000 0.77400 0.00000 0.00000 0.00000 0.38294 ... 0.00000 0.000000 3.157200 4.793200 0.000000 0.00000 5.52960 0.00000 0.000000 0.00000
CREB3L1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PNMA1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 2.51420 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
MMP2 0.00000 0.00000 0.73812 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 2.86970 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
TMEM216 0.00000 0.00000 0.00000 0.00000 3.79490 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 3.682900 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.29007
TRAF3IP2-AS1 2.85140 2.09830 0.61730 0.96495 1.47350 3.15800 1.30800 1.38020 0.00000 1.12630 ... 3.75450 1.747100 1.787700 1.384000 1.340900 2.40520 1.69880 1.55890 0.471250 1.67460
LRRC37A5P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
LOC653712 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.54798 0.000000 0.000000 0.000000 0.000721 0.00000 0.00000 0.00000 0.000000 0.00000
C10orf90 0.00000 0.00000 0.00000 3.40690 1.44680 2.78760 0.00000 0.00000 1.73980 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
ZHX3 0.00000 0.52907 0.00000 0.51197 0.00000 1.49520 1.10970 0.00000 1.28210 0.00000 ... 1.89240 0.282530 0.329430 0.878360 0.336740 0.53614 0.32554 0.46722 0.449290 0.35422
ERCC5 0.00000 0.00000 0.00000 0.00000 2.28660 2.37410 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 5.072500 0.53382 0.66077 5.27270 0.000000 2.92570
GPR98 0.00000 0.00000 0.00000 0.00000 0.00000 0.02290 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RXFP3 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CTAGE10P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
APBB2 0.17377 0.00000 0.00000 1.11570 0.00000 0.29278 0.00000 0.28688 0.00000 0.00000 ... 0.43193 0.000000 0.091346 0.033131 0.000000 0.00000 0.00000 0.25625 0.044071 0.19094
KLHL13 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
KRTAP10-8 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 1.61900 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PDCL3 0.00000 0.00000 0.00000 4.76640 2.99000 2.35220 0.00000 0.00000 3.89220 2.84320 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
AEN 0.00000 0.00000 0.00000 0.00000 2.31610 3.47680 0.00000 0.00000 4.10310 4.56970 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 3.38020 0.000000 6.49550
FRG2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.26504 0.000000 0.000000 0.000000 0.158650 0.21004 0.17355 0.00000 0.000000 0.00000
DECR1 0.00000 0.00000 5.09060 0.00000 3.30030 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 1.366000 0.000000 0.00000 0.00000 0.00000 3.826700 0.00000
SALL1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.172980 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GGT3P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CADM4 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RPS18 10.07200 7.47900 10.98600 9.61720 9.98130 10.30500 9.75130 8.74800 9.24680 9.57730 ... 8.19610 8.415500 7.091700 0.000000 8.029200 8.36490 8.65190 7.12710 8.997400 1.92120
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
CNST 0.00000 0.00000 0.00000 0.00000 1.92640 2.18080 0.00000 0.00000 3.62840 0.00000 ... 0.14580 0.000000 0.000000 0.789670 0.000000 0.00000 3.88480 0.62217 0.000000 0.00000
ERCC2 0.00000 0.00000 0.00000 0.00000 0.00000 2.35810 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 5.47660
EAF2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
VPS13B 0.47093 0.00000 4.09810 1.64660 0.00000 1.20730 0.00000 0.00000 0.00000 0.00000 ... 0.16349 6.656700 0.000000 5.809900 3.463600 0.44281 0.00000 0.00000 0.000000 5.81820
TP53TG3B 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 3.50170 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
C14orf101 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 1.61340 0.000000 0.000000 0.000000 0.000000 0.00000 2.57820 0.00000 1.787200 0.00000
ST18 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PSMB9 0.00000 0.94261 0.00000 0.00000 3.31090 2.47200 0.00000 0.00000 3.02010 3.83630 ... 1.57680 6.442400 7.282700 5.912200 5.644600 6.58150 7.42070 5.68740 6.471700 6.50470
ADRBK2 0.27023 0.64708 3.89930 0.00000 0.00000 0.10969 0.00000 0.00000 0.85758 0.00000 ... 2.16790 0.072676 0.635890 0.249530 0.547600 0.58503 0.10893 0.12135 0.268260 0.41887
HCLS1 5.88820 7.68530 0.00000 0.00000 0.00000 0.00000 0.00000 1.98040 0.00000 0.00000 ... 0.00000 6.643100 4.323600 0.965700 6.834200 0.93640 5.30360 3.22540 6.160100 6.31980
GPR15 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
LOC100093631 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
TTTY17A 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CSF2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
MIR515-1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
SLC2A11 0.00000 0.00000 0.00000 0.00000 0.00000 2.03630 0.00000 0.00000 0.00000 0.00000 ... 0.82884 0.000000 0.000000 0.000000 0.000000 0.28620 0.00000 0.20950 0.000000 0.40439
SELO 0.00000 0.00000 0.00000 0.00000 1.34770 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GRIP2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GOLGA8B 0.55385 1.45470 0.00000 3.25500 0.34369 1.27620 0.00000 0.00000 0.00000 0.11636 ... 0.00000 0.507200 0.066124 0.000000 0.403930 2.32910 0.00000 0.34443 0.000000 3.10200
MIR4691 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GPLD1 0.62667 1.05450 0.99639 0.23143 0.00000 0.44996 0.67536 0.98477 0.00000 0.15575 ... 2.61240 0.160170 0.894100 0.808400 0.482300 1.29620 0.99245 0.97516 0.492080 1.06820
SNORD115-39 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RAB8A 0.00000 0.00000 2.76340 4.19370 2.57050 3.75260 0.00000 0.00000 0.00000 0.00000 ... 4.08420 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RXFP2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PCIF1 0.00000 0.00000 3.67820 0.00000 0.00000 4.54490 0.00000 0.00000 1.87890 0.00000 ... 0.00000 0.000000 4.613600 5.142300 3.771200 0.00000 5.54650 0.00000 0.000000 5.00810
PIK3IP1 7.60690 0.00000 0.00000 0.00000 0.00000 0.00000 6.54570 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 6.408700 0.00000 3.73840 0.00000 0.000000 6.75350
SNRPD2 0.00000 0.00000 3.98710 5.26390 6.08240 5.64240 0.00000 0.00000 5.94510 6.52150 ... 4.44710 4.760700 0.000000 4.743300 5.404100 6.09860 0.00000 0.00000 0.000000 5.93130
SLC39A6 0.00000 0.00000 3.87770 3.76600 1.78160 4.36790 0.00000 0.00000 3.44980 0.43189 ... 0.00000 0.000000 0.475460 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 5.23980
CTSC 2.66380 6.99010 1.61260 4.84170 4.46070 1.88240 1.07180 1.69380 1.11970 0.88830 ... 2.33570 1.315000 2.067100 3.192600 1.461500 3.64640 7.00040 1.96150 7.191800 6.15880
AQP7 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000

23689 rows × 4645 columns

For example, 'RPS11' and 'ELMO2' are the index labels (the 'names') of two rows, while 'Cy72_CD45_H02_S758_comb' is the name of the first column.

They are also the fifth and sixth rows and the first column, respectively.

Let's select the 'RPS11' row:


In [6]:
rps11 = sc_data.loc['RPS11']
rps11


Out[6]:
Cy72_CD45_H02_S758_comb         9.21720
CY58_1_CD45_B02_S974_comb       8.37450
Cy71_CD45_D08_S524_comb         9.31300
Cy81_FNA_CD45_B01_S301_comb     7.88760
Cy80_II_CD45_B07_S883_comb      8.32910
Cy81_Bulk_CD45_B10_S118_comb    7.83360
Cy72_CD45_D09_S717_comb         8.37370
Cy74_CD45_A03_S387_comb         8.13380
Cy71_CD45_B05_S497_comb         8.43730
Cy80_II_CD45_C09_S897_comb      7.59680
Cy74_CD45_F09_S453_comb         8.68750
CY58_1_CD45_D03_S999_comb       8.93280
Cy72_CD45_C01_S697_comb         8.74090
Cy71_CD45_D07_S523_comb         7.48190
Cy81_FNA_CD45_C12_S228_comb     7.37830
Cy81_FNA_CD45_E05_S341_comb     8.13500
Cy74_CD45_D04_S424_comb         6.74030
Cy74_CD45_C11_S419_comb         8.85560
Cy81_Bulk_CD45_E10_S154_comb    7.60490
Cy80_II_CD45_H07_S955_comb      8.54360
CY58_1_CD45_F01_S1021_comb      9.50820
Cy81_FNA_CD45_D09_S333_comb     8.85780
Cy71_CD45_E12_S540_comb         8.25750
Cy81_FNA_CD45_D07_S235_comb     7.85580
Cy72_CD45_D07_S715_comb         7.44300
Cy72_CD45_E08_S728_comb         9.28850
Cy71_CD45_B07_S499_comb         7.28340
Cy71_CD45_G02_S554_comb         8.46210
Cy71_CD45_E03_S531_comb         7.75130
Cy74_CD45_H03_S471_comb         7.11580
                                 ...   
CY75_1_CD45_CD8_3__S113_comb    1.25290
CY75_1_CD45_CD8_1__S66_comb     1.54910
CY75_1_CD45_CD8_3__S129_comb    6.69060
CY75_1_CD45_CD8_8__S360_comb    6.83000
CY75_1_CD45_CD8_8__S300_comb    6.03340
CY75_1_CD45_CD8_1__S58_comb     4.58920
CY75_1_CD45_CD8_1__S7_comb      0.00000
CY75_1_CD45_CD8_3__S117_comb    5.65600
CY75_1_CD45_CD8_3__S134_comb    3.98630
CY75_1_CD45_CD8_1__S95_comb     5.64330
CY75_1_CD45_CD8_7__S258_comb    5.05350
CY75_1_CD45_CD8_8__S348_comb    6.23060
CY75_1_CD45_CD8_7__S222_comb    6.02180
CY75_1_CD45_CD8_3__S132_comb    6.06410
CY75_1_CD45_CD8_3__S139_comb    5.82400
CY75_1_CD45_CD8_3__S187_comb    0.00000
CY75_1_CD45_CD8_7__S278_comb    5.29550
CY75_1_CD45_CD8_7__S242_comb    7.06450
CY75_1_CD45_CD8_8__S334_comb    6.12320
CY75_1_CD45_CD8_8__S384_comb    4.97080
CY75_1_CD45_CD8_7__S265_comb    0.00000
CY75_1_CD45_CD8_3__S127_comb    7.86390
CY75_1_CD45_CD8_1__S61_comb     5.85050
CY75_1_CD45_CD8_1__S12_comb     0.62639
CY75_1_CD45_CD8_1__S25_comb     6.27340
CY75_1_CD45_CD8_7__S223_comb    5.48890
CY75_1_CD45_CD8_1__S65_comb     4.92620
CY75_1_CD45_CD8_1__S93_comb     7.09580
CY75_1_CD45_CD8_1__S76_comb     3.99700
CY75_1_CD45_CD8_7__S274_comb    3.98970
Name: RPS11, Length: 4645, dtype: float64

Now let's select the fifth row.

Keep in mind that Python is zero-indexed (i.e. the index of the first element is 0), so the fifth row is at position 4.


In [7]:
fifth = sc_data.iloc[4,:]
fifth


Out[7]:
Cy72_CD45_H02_S758_comb         9.21720
CY58_1_CD45_B02_S974_comb       8.37450
Cy71_CD45_D08_S524_comb         9.31300
Cy81_FNA_CD45_B01_S301_comb     7.88760
Cy80_II_CD45_B07_S883_comb      8.32910
Cy81_Bulk_CD45_B10_S118_comb    7.83360
Cy72_CD45_D09_S717_comb         8.37370
Cy74_CD45_A03_S387_comb         8.13380
Cy71_CD45_B05_S497_comb         8.43730
Cy80_II_CD45_C09_S897_comb      7.59680
Cy74_CD45_F09_S453_comb         8.68750
CY58_1_CD45_D03_S999_comb       8.93280
Cy72_CD45_C01_S697_comb         8.74090
Cy71_CD45_D07_S523_comb         7.48190
Cy81_FNA_CD45_C12_S228_comb     7.37830
Cy81_FNA_CD45_E05_S341_comb     8.13500
Cy74_CD45_D04_S424_comb         6.74030
Cy74_CD45_C11_S419_comb         8.85560
Cy81_Bulk_CD45_E10_S154_comb    7.60490
Cy80_II_CD45_H07_S955_comb      8.54360
CY58_1_CD45_F01_S1021_comb      9.50820
Cy81_FNA_CD45_D09_S333_comb     8.85780
Cy71_CD45_E12_S540_comb         8.25750
Cy81_FNA_CD45_D07_S235_comb     7.85580
Cy72_CD45_D07_S715_comb         7.44300
Cy72_CD45_E08_S728_comb         9.28850
Cy71_CD45_B07_S499_comb         7.28340
Cy71_CD45_G02_S554_comb         8.46210
Cy71_CD45_E03_S531_comb         7.75130
Cy74_CD45_H03_S471_comb         7.11580
                                 ...   
CY75_1_CD45_CD8_3__S113_comb    1.25290
CY75_1_CD45_CD8_1__S66_comb     1.54910
CY75_1_CD45_CD8_3__S129_comb    6.69060
CY75_1_CD45_CD8_8__S360_comb    6.83000
CY75_1_CD45_CD8_8__S300_comb    6.03340
CY75_1_CD45_CD8_1__S58_comb     4.58920
CY75_1_CD45_CD8_1__S7_comb      0.00000
CY75_1_CD45_CD8_3__S117_comb    5.65600
CY75_1_CD45_CD8_3__S134_comb    3.98630
CY75_1_CD45_CD8_1__S95_comb     5.64330
CY75_1_CD45_CD8_7__S258_comb    5.05350
CY75_1_CD45_CD8_8__S348_comb    6.23060
CY75_1_CD45_CD8_7__S222_comb    6.02180
CY75_1_CD45_CD8_3__S132_comb    6.06410
CY75_1_CD45_CD8_3__S139_comb    5.82400
CY75_1_CD45_CD8_3__S187_comb    0.00000
CY75_1_CD45_CD8_7__S278_comb    5.29550
CY75_1_CD45_CD8_7__S242_comb    7.06450
CY75_1_CD45_CD8_8__S334_comb    6.12320
CY75_1_CD45_CD8_8__S384_comb    4.97080
CY75_1_CD45_CD8_7__S265_comb    0.00000
CY75_1_CD45_CD8_3__S127_comb    7.86390
CY75_1_CD45_CD8_1__S61_comb     5.85050
CY75_1_CD45_CD8_1__S12_comb     0.62639
CY75_1_CD45_CD8_1__S25_comb     6.27340
CY75_1_CD45_CD8_7__S223_comb    5.48890
CY75_1_CD45_CD8_1__S65_comb     4.92620
CY75_1_CD45_CD8_1__S93_comb     7.09580
CY75_1_CD45_CD8_1__S76_comb     3.99700
CY75_1_CD45_CD8_7__S274_comb    3.98970
Name: RPS11, Length: 4645, dtype: float64

How can we compare these?

Let's retrieve the values as an array and see if they are all equal to each other.


In [8]:
rps11.values


Out[8]:
array([ 9.2172,  8.3745,  9.313 , ...,  7.0958,  3.997 ,  3.9897])

In [9]:
fifth.values


Out[9]:
array([ 9.2172,  8.3745,  9.313 , ...,  7.0958,  3.997 ,  3.9897])

In [10]:
fifth.values == rps11.values


Out[10]:
array([ True,  True,  True, ...,  True,  True,  True], dtype=bool)

We get a lot of boolean values...


In [11]:
len(fifth.values)


Out[11]:
4645

4645 values in fact, one for each column (i.e. cell).

The first three and the last three look identical, but there are thousands of others. If even one of them differs, the two rows are not the same. How can we check them all?


In [12]:
np.all(fifth.values == rps11.values)


Out[12]:
True

All the entries are identical.
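
Two other ways to ask the same yes/no question, sketched on made-up Series: np.array_equal compares two arrays in a single call, and Series.equals compares two Series directly (values and index labels). The variable names and numbers are placeholders.

```python
import numpy as np
import pandas as pd

a = pd.Series([9.2, 8.4, 0.0], index=["cell_A", "cell_B", "cell_C"], name="gene_1")
b = a.copy()
c = a.copy()
c.iloc[2] = 0.5  # change a single entry

print(np.array_equal(a.values, b.values))  # identical arrays
print(np.array_equal(a.values, c.values))  # one entry differs
print(a.equals(b))
print(a.equals(c))
```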

What if the numbers were ever so slightly off, but not by enough for us to mind?

Example: the same data stored with a different, less memory-consuming data type (float16 instead of float64).


In [13]:
fifth_full = fifth.values
fifth_full


Out[13]:
array([ 9.2172,  8.3745,  9.313 , ...,  7.0958,  3.997 ,  3.9897])

In [14]:
fifth_full.dtype


Out[14]:
dtype('float64')

In [15]:
fifth_small = fifth.values.astype(np.float16)
fifth_small


Out[15]:
array([ 9.21875   ,  8.375     ,  9.3125    , ...,  7.09765625,
        3.99609375,  3.99023438], dtype=float16)

In [16]:
fifth_small.dtype


Out[16]:
dtype('float16')

In [17]:
np.all(fifth_small == fifth_full)


Out[17]:
False

In [18]:
np.any(fifth_small == fifth_full)


Out[18]:
True

In [19]:
np.allclose(fifth_full, fifth_small, rtol=1e-3)


Out[19]:
True

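np.allclose checks whether all values are close to each other.

'Close' means that the difference is within a tolerance built from a relative part (rtol) and an absolute part (atol): |a - b| <= atol + rtol * |b| for every pair of elements.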
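
To make the tolerance concrete, here is a small sketch that applies the element-wise test |a - b| <= atol + rtol * |b| by hand and compares the result with np.allclose. It uses four of the values printed above rather than the full row, and the variable names are illustrative.

```python
import numpy as np

full = np.array([9.2172, 8.3745, 9.313, 3.9897])  # float64 values from the output above
small = full.astype(np.float16)                    # the same values rounded to 16-bit floats

rtol, atol = 1e-3, 1e-8  # atol=1e-8 is NumPy's default absolute tolerance

# Apply the tolerance test by hand, in float64 to avoid extra rounding.
small64 = small.astype(np.float64)
manual = np.abs(full - small64) <= atol + rtol * np.abs(small64)

print(manual.all())
print(np.allclose(full, small, rtol=rtol, atol=atol))

# Memory used by each array, in bytes.
print(full.nbytes, small.nbytes)
```

The rounding error of float16 is well below the chosen relative tolerance of 1e-3, which is why the check passes even though exact equality fails.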

Pandas Series

What if we wanted to sort the cells by their expression of a particular gene?

First we need to get the expression of that gene in every cell.

Let's do this for an example gene, CD4.


In [20]:
cd4 = sc_data.loc['CD4']
cd4


Out[20]:
Cy72_CD45_H02_S758_comb         0.0000
CY58_1_CD45_B02_S974_comb       0.0000
Cy71_CD45_D08_S524_comb         0.0000
Cy81_FNA_CD45_B01_S301_comb     0.0000
Cy80_II_CD45_B07_S883_comb      0.0000
Cy81_Bulk_CD45_B10_S118_comb    0.0000
Cy72_CD45_D09_S717_comb         0.0000
Cy74_CD45_A03_S387_comb         0.0000
Cy71_CD45_B05_S497_comb         0.0000
Cy80_II_CD45_C09_S897_comb      0.0000
Cy74_CD45_F09_S453_comb         0.0000
CY58_1_CD45_D03_S999_comb       1.6121
Cy72_CD45_C01_S697_comb         0.0000
Cy71_CD45_D07_S523_comb         0.0000
Cy81_FNA_CD45_C12_S228_comb     0.0000
Cy81_FNA_CD45_E05_S341_comb     0.0000
Cy74_CD45_D04_S424_comb         0.0000
Cy74_CD45_C11_S419_comb         0.0000
Cy81_Bulk_CD45_E10_S154_comb    0.0000
Cy80_II_CD45_H07_S955_comb      0.0000
CY58_1_CD45_F01_S1021_comb      0.0000
Cy81_FNA_CD45_D09_S333_comb     0.0000
Cy71_CD45_E12_S540_comb         0.0000
Cy81_FNA_CD45_D07_S235_comb     0.0000
Cy72_CD45_D07_S715_comb         6.8683
Cy72_CD45_E08_S728_comb         0.0000
Cy71_CD45_B07_S499_comb         0.0000
Cy71_CD45_G02_S554_comb         0.0000
Cy71_CD45_E03_S531_comb         6.2290
Cy74_CD45_H03_S471_comb         0.0000
                                 ...  
CY75_1_CD45_CD8_3__S113_comb    0.0000
CY75_1_CD45_CD8_1__S66_comb     0.0000
CY75_1_CD45_CD8_3__S129_comb    0.0000
CY75_1_CD45_CD8_8__S360_comb    0.0000
CY75_1_CD45_CD8_8__S300_comb    0.0000
CY75_1_CD45_CD8_1__S58_comb     0.0000
CY75_1_CD45_CD8_1__S7_comb      0.0000
CY75_1_CD45_CD8_3__S117_comb    0.0000
CY75_1_CD45_CD8_3__S134_comb    0.0000
CY75_1_CD45_CD8_1__S95_comb     0.0000
CY75_1_CD45_CD8_7__S258_comb    0.0000
CY75_1_CD45_CD8_8__S348_comb    0.0000
CY75_1_CD45_CD8_7__S222_comb    0.0000
CY75_1_CD45_CD8_3__S132_comb    0.0000
CY75_1_CD45_CD8_3__S139_comb    0.0000
CY75_1_CD45_CD8_3__S187_comb    0.0000
CY75_1_CD45_CD8_7__S278_comb    3.6080
CY75_1_CD45_CD8_7__S242_comb    0.0000
CY75_1_CD45_CD8_8__S334_comb    0.0000
CY75_1_CD45_CD8_8__S384_comb    0.0000
CY75_1_CD45_CD8_7__S265_comb    0.0000
CY75_1_CD45_CD8_3__S127_comb    0.0000
CY75_1_CD45_CD8_1__S61_comb     0.0000
CY75_1_CD45_CD8_1__S12_comb     0.0000
CY75_1_CD45_CD8_1__S25_comb     0.0000
CY75_1_CD45_CD8_7__S223_comb    0.0000
CY75_1_CD45_CD8_1__S65_comb     0.0000
CY75_1_CD45_CD8_1__S93_comb     0.0000
CY75_1_CD45_CD8_1__S76_comb     0.0000
CY75_1_CD45_CD8_7__S274_comb    0.0000
Name: CD4, Length: 4645, dtype: float64

In [21]:
type(sc_data)


Out[21]:
pandas.core.frame.DataFrame

In [22]:
type(cd4)


Out[22]:
pandas.core.series.Series
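The relationship between the two types can be sketched on a tiny made-up table (the gene and cell names below are hypothetical). Selecting one row label with `.loc` returns a Series whose index is the DataFrame's column labels and whose name is the row label.

```python
import pandas as pd

# Toy expression table: rows are genes, columns are cells (made-up values)
toy = pd.DataFrame(
    {"cell_a": [0.0, 5.1], "cell_b": [2.3, 0.0], "cell_c": [7.8, 1.2]},
    index=["GENE_X", "GENE_Y"],
)

# Selecting a single row label gives a Series
row = toy.loc["GENE_X"]

print(type(toy).__name__, type(row).__name__)
print(row.name)         # the row label that was selected
print(list(row.index))  # the column labels of the DataFrame
```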

Let's sort the cells from highest to lowest CD4 expression:


In [23]:
cd4.sort_values(ascending=False)


Out[23]:
CY67_NEG_A_CAGAGAGG_AAGGAGTA                        8.5618
Cy81_FNA_CD45_B07_S211_comb                         8.3902
cy60_1_cd_45_pos_3_E11_S347_comb                    8.2329
cy84_Primary_CD45_pos_H02_S470_comb                 8.0477
cy94_cd45pos_4_C07_S31_comb                         7.9178
Cy74_CD45_B06_S402_comb                             7.9126
Cy72_CD45_F09_S741_comb                             7.8871
cy60_1_cd_45_pos_3_D09_S333_comb                    7.8352
cy60_1_cd_45_pos_4_H02_S86_comb                     7.8248
cy94_cd45pos_4_D02_S38_comb                         7.7832
Cy72_CD45_E01_S721_comb                             7.7783
cy60_1_cd_45_pos_3_B04_S304_comb                    7.7727
cy79-p1-CD45-pos-PD1-pos-AS-C5-R1-B02-S206-comb     7.7467
cy79-p1-CD45-pos-PD1-neg-AS-C1-R2-B06-S498-comb     7.7088
CY94CD45POS_1_C07_S127_comb                         7.7087
cy80-CD45-pos-PD1-pos-G08-S176-comb                 7.7018
CY89A_Core_15_B03_S15_comb                          7.7003
cy88_cd_45_pos_G05_S461_comb                        7.6726
cy60_1_cd_45_pos_3_G05_S365_comb                    7.6284
cy84_Primary_CD45_pos_B09_S405_comb                 7.5779
Cy81_FNA_CD45_H07_S283_comb                         7.5529
cy80-CD45-neg-G12-S468-comb                         7.5460
cy80-CD45-pos-PD1-pos-F08-S164-comb                 7.5276
Cy74_CD45_G08_S464_comb                             7.4999
CY88CD45_150813_B09_S309_comb                       7.4931
CY94CD45POS_1_D01_S133_comb                         7.4926
cy80-CD45-pos-PD1-pos-B09-S117-comb                 7.4776
cy79-p3-CD45-pos-PD1-pos-AS-C2-R1-C08-S416-comb     7.4726
cy60_1_cd_45_pos_4_B09_S21_comb                     7.4721
CY94CD45POS_1_B10_S118_comb                         7.4714
                                                     ...  
cy84_Primary_CD45_pos_C09_S417_comb                 0.0000
cy84_Primary_CD45_pos_B05_S401_comb                 0.0000
cy60_1_cd_45_pos_3_C06_S318_comb                    0.0000
cy60_1_cd_45_pos_HC_pos_HLDAR_pos_C11_S995_comb     0.0000
cy60_1_cd_45_pos_3_E01_S337_comb                    0.0000
cy60_1_cd_45_pos_HC_pos_HLDAR_pos_A09_S969_comb     0.0000
cy60_1_cd_45_pos_HC_pos_HLDAR_pos_E05_S1013_comb    0.0000
cy60_1_cd_45_pos_HC_pos_HLDAR_pos_C07_S991_comb     0.0000
cy60_1_cd_45_pos_3_F04_S352_comb                    0.0000
cy60_1_cd_45_pos_3_C12_S324_comb                    0.0000
cy60_1_cd_45_pos_HC_pos_HLDAR_pos_A02_S962_comb     0.0000
cy60_1_cd_45_pos_3_A05_S293_comb                    0.0000
cy80_CD_90_pos_F08_S932_comb                        0.0000
cy80_CD_90_pos_A04_S868_comb                        0.0000
cy60_1_cd_45_pos_4_A06_S6_comb                      0.0000
cy60_1_cd_45_pos_4_B04_S16_comb                     0.0000
cy84_Primary_CD45_pos_G03_S459_comb                 0.0000
cy60_1_cd_45_pos_3_E10_S346_comb                    0.0000
cy84_Primary_CD45_pos_H07_S475_comb                 0.0000
cy84_Primary_CD45_pos_G11_S467_comb                 0.0000
cy84_Primary_CD45_pos_E07_S439_comb                 0.0000
cy84_Primary_CD45_pos_B02_S398_comb                 0.0000
cy84_Primary_CD45_pos_D06_S426_comb                 0.0000
cy84_Primary_CD45_pos_G08_S464_comb                 0.0000
cy84_Primary_CD45_pos_C11_S419_comb                 0.0000
cy84_Primary_CD45_pos_B06_S402_comb                 0.0000
cy84_Primary_CD45_pos_H03_S471_comb                 0.0000
cy84_Primary_CD45_pos_G10_S466_comb                 0.0000
cy84_Primary_CD45_pos_D08_S428_comb                 0.0000
Cy72_CD45_H02_S758_comb                             0.0000
Name: CD4, Length: 4645, dtype: float64

Let's find the index of the 10 cells with the highest CD4 expression


In [24]:
cd4.sort_values(ascending=False).index[:10]


Out[24]:
Index(['CY67_NEG_A_CAGAGAGG_AAGGAGTA', 'Cy81_FNA_CD45_B07_S211_comb',
       'cy60_1_cd_45_pos_3_E11_S347_comb',
       'cy84_Primary_CD45_pos_H02_S470_comb', 'cy94_cd45pos_4_C07_S31_comb',
       'Cy74_CD45_B06_S402_comb', 'Cy72_CD45_F09_S741_comb',
       'cy60_1_cd_45_pos_3_D09_S333_comb', 'cy60_1_cd_45_pos_4_H02_S86_comb',
       'cy94_cd45pos_4_D02_S38_comb'],
      dtype='object')
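As an alternative sketch, pandas also offers `Series.nlargest`, which returns the top entries without the explicit sort-and-slice step. The toy Series below uses made-up cell names and values; its top two values are distinct, so both approaches agree.

```python
import pandas as pd

# Toy expression values for one gene across five cells (made-up)
expr = pd.Series(
    {"cell_a": 0.0, "cell_b": 7.9, "cell_c": 3.6, "cell_d": 8.5, "cell_e": 0.0}
)

# Approach used above: sort everything, then slice the index
top2_sorted = expr.sort_values(ascending=False).index[:2]

# Alternative: ask directly for the two largest entries
top2_nlargest = expr.nlargest(2).index

print(list(top2_sorted))
print(list(top2_nlargest))
```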

How about the 10 highest and 10 lowest?

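First, a distinction: slicing the index gives a pandas Index object, while its .values attribute gives the underlying NumPy array.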


In [25]:
type(cd4.sort_values(ascending=False).index[:10])


Out[25]:
pandas.core.indexes.base.Index

In [26]:
type(cd4.sort_values(ascending=False).index[:10].values)


Out[26]:
numpy.ndarray

In [27]:
[cd4.sort_values(ascending=False).index[:10].values, 
                cd4.sort_values(ascending=False).index[-10:].values]


Out[27]:
[array(['CY67_NEG_A_CAGAGAGG_AAGGAGTA', 'Cy81_FNA_CD45_B07_S211_comb',
        'cy60_1_cd_45_pos_3_E11_S347_comb',
        'cy84_Primary_CD45_pos_H02_S470_comb',
        'cy94_cd45pos_4_C07_S31_comb', 'Cy74_CD45_B06_S402_comb',
        'Cy72_CD45_F09_S741_comb', 'cy60_1_cd_45_pos_3_D09_S333_comb',
        'cy60_1_cd_45_pos_4_H02_S86_comb', 'cy94_cd45pos_4_D02_S38_comb'], dtype=object),
 array(['cy84_Primary_CD45_pos_E07_S439_comb',
        'cy84_Primary_CD45_pos_B02_S398_comb',
        'cy84_Primary_CD45_pos_D06_S426_comb',
        'cy84_Primary_CD45_pos_G08_S464_comb',
        'cy84_Primary_CD45_pos_C11_S419_comb',
        'cy84_Primary_CD45_pos_B06_S402_comb',
        'cy84_Primary_CD45_pos_H03_S471_comb',
        'cy84_Primary_CD45_pos_G10_S466_comb',
        'cy84_Primary_CD45_pos_D08_S428_comb', 'Cy72_CD45_H02_S758_comb'], dtype=object)]

These are two separate arrays, so let's concatenate them into one.


In [28]:
np.concatenate([cd4.sort_values(ascending=False).index[:10].values, 
                cd4.sort_values(ascending=False).index[-10:].values])


Out[28]:
array(['CY67_NEG_A_CAGAGAGG_AAGGAGTA', 'Cy81_FNA_CD45_B07_S211_comb',
       'cy60_1_cd_45_pos_3_E11_S347_comb',
       'cy84_Primary_CD45_pos_H02_S470_comb',
       'cy94_cd45pos_4_C07_S31_comb', 'Cy74_CD45_B06_S402_comb',
       'Cy72_CD45_F09_S741_comb', 'cy60_1_cd_45_pos_3_D09_S333_comb',
       'cy60_1_cd_45_pos_4_H02_S86_comb', 'cy94_cd45pos_4_D02_S38_comb',
       'cy84_Primary_CD45_pos_E07_S439_comb',
       'cy84_Primary_CD45_pos_B02_S398_comb',
       'cy84_Primary_CD45_pos_D06_S426_comb',
       'cy84_Primary_CD45_pos_G08_S464_comb',
       'cy84_Primary_CD45_pos_C11_S419_comb',
       'cy84_Primary_CD45_pos_B06_S402_comb',
       'cy84_Primary_CD45_pos_H03_S471_comb',
       'cy84_Primary_CD45_pos_G10_S466_comb',
       'cy84_Primary_CD45_pos_D08_S428_comb', 'Cy72_CD45_H02_S758_comb'], dtype=object)
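The same result can be obtained while sorting only once. This is a minimal sketch on a made-up Series (cell names and values are hypothetical); the intermediate name `ranked` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Toy expression values with no ties (made-up)
expr = pd.Series(
    {"cell_a": 0.5, "cell_b": 7.9, "cell_c": 3.6,
     "cell_d": 8.5, "cell_e": 0.1, "cell_f": 6.2}
)

# Sort once and keep the ordered cell names as a NumPy array
ranked = expr.sort_values(ascending=False).index.values

# Join the first two and last two names into a single array
extremes = np.concatenate([ranked[:2], ranked[-2:]])

print(extremes)
```

Note that when many cells share the same value (for CD4, most cells are at 0.0), the ones that land in the last positions are just one arbitrary subset of the tied cells.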

Let's find the average expression over all the cells


In [29]:
cd4.mean()


Out[29]:
1.0229234858988172

Alternatively


In [30]:
np.mean(cd4)


Out[30]:
1.0229234858988172

or


In [31]:
sum(cd4.values)/len(cd4)


Out[31]:
1.0229234858988172

Let's see if they are all close:


In [32]:
np.allclose(sum(cd4.values)/len(cd4),[cd4.mean(),np.mean(cd4.values)])


Out[32]:
True

Something interesting happened there: a single number was compared against a list of two numbers.

Broadcasting

Broadcasting, where a scalar or a smaller array is stretched to match the shape of a larger one, works with pandas Series and DataFrame objects as well.
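To spell out what happened in the `np.allclose` call above, here is a small sketch with made-up numbers (`scalar` and `candidates` are illustrative names). The scalar is conceptually stretched to the length of the list, the comparison is made element by element, and `np.allclose` then reduces the result to a single boolean.

```python
import numpy as np

scalar = 1.5
candidates = [1.5, 1.5 + 1e-12]

# What the scalar looks like once stretched to the shape of the list
stretched = np.broadcast_to(scalar, (2,))

# Element-wise closeness check: one boolean per candidate
elementwise = np.isclose(scalar, candidates)

# allclose reduces the element-wise result to a single answer
result = np.allclose(scalar, candidates)

print(stretched, elementwise, result)
```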

Let's see some examples:


In [33]:
cd4+10


Out[33]:
Cy72_CD45_H02_S758_comb         10.0000
CY58_1_CD45_B02_S974_comb       10.0000
Cy71_CD45_D08_S524_comb         10.0000
Cy81_FNA_CD45_B01_S301_comb     10.0000
Cy80_II_CD45_B07_S883_comb      10.0000
Cy81_Bulk_CD45_B10_S118_comb    10.0000
Cy72_CD45_D09_S717_comb         10.0000
Cy74_CD45_A03_S387_comb         10.0000
Cy71_CD45_B05_S497_comb         10.0000
Cy80_II_CD45_C09_S897_comb      10.0000
Cy74_CD45_F09_S453_comb         10.0000
CY58_1_CD45_D03_S999_comb       11.6121
Cy72_CD45_C01_S697_comb         10.0000
Cy71_CD45_D07_S523_comb         10.0000
Cy81_FNA_CD45_C12_S228_comb     10.0000
Cy81_FNA_CD45_E05_S341_comb     10.0000
Cy74_CD45_D04_S424_comb         10.0000
Cy74_CD45_C11_S419_comb         10.0000
Cy81_Bulk_CD45_E10_S154_comb    10.0000
Cy80_II_CD45_H07_S955_comb      10.0000
CY58_1_CD45_F01_S1021_comb      10.0000
Cy81_FNA_CD45_D09_S333_comb     10.0000
Cy71_CD45_E12_S540_comb         10.0000
Cy81_FNA_CD45_D07_S235_comb     10.0000
Cy72_CD45_D07_S715_comb         16.8683
Cy72_CD45_E08_S728_comb         10.0000
Cy71_CD45_B07_S499_comb         10.0000
Cy71_CD45_G02_S554_comb         10.0000
Cy71_CD45_E03_S531_comb         16.2290
Cy74_CD45_H03_S471_comb         10.0000
                                 ...   
CY75_1_CD45_CD8_3__S113_comb    10.0000
CY75_1_CD45_CD8_1__S66_comb     10.0000
CY75_1_CD45_CD8_3__S129_comb    10.0000
CY75_1_CD45_CD8_8__S360_comb    10.0000
CY75_1_CD45_CD8_8__S300_comb    10.0000
CY75_1_CD45_CD8_1__S58_comb     10.0000
CY75_1_CD45_CD8_1__S7_comb      10.0000
CY75_1_CD45_CD8_3__S117_comb    10.0000
CY75_1_CD45_CD8_3__S134_comb    10.0000
CY75_1_CD45_CD8_1__S95_comb     10.0000
CY75_1_CD45_CD8_7__S258_comb    10.0000
CY75_1_CD45_CD8_8__S348_comb    10.0000
CY75_1_CD45_CD8_7__S222_comb    10.0000
CY75_1_CD45_CD8_3__S132_comb    10.0000
CY75_1_CD45_CD8_3__S139_comb    10.0000
CY75_1_CD45_CD8_3__S187_comb    10.0000
CY75_1_CD45_CD8_7__S278_comb    13.6080
CY75_1_CD45_CD8_7__S242_comb    10.0000
CY75_1_CD45_CD8_8__S334_comb    10.0000
CY75_1_CD45_CD8_8__S384_comb    10.0000
CY75_1_CD45_CD8_7__S265_comb    10.0000
CY75_1_CD45_CD8_3__S127_comb    10.0000
CY75_1_CD45_CD8_1__S61_comb     10.0000
CY75_1_CD45_CD8_1__S12_comb     10.0000
CY75_1_CD45_CD8_1__S25_comb     10.0000
CY75_1_CD45_CD8_7__S223_comb    10.0000
CY75_1_CD45_CD8_1__S65_comb     10.0000
CY75_1_CD45_CD8_1__S93_comb     10.0000
CY75_1_CD45_CD8_1__S76_comb     10.0000
CY75_1_CD45_CD8_7__S274_comb    10.0000
Name: CD4, Length: 4645, dtype: float64
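The same behaviour is easier to inspect on a tiny made-up Series (the cell names, values, and the name `GENE_X` are hypothetical). Each operation is applied to every element, and the result is again a Series with the original index labels and name.

```python
import numpy as np
import pandas as pd

# Toy expression values for one gene across three cells (made-up)
expr = pd.Series({"cell_a": 0.0, "cell_b": 1.5, "cell_c": 4.0}, name="GENE_X")

shifted = expr + 10         # the scalar 10 is added to every element
scaled = expr * 10          # every element is multiplied by 10
squared = np.power(expr, 2) # NumPy functions also work element-wise on a Series

print(shifted)
print(scaled)
print(squared)
```

Because the index travels with the values, the results can still be looked up by cell name, for example `shifted["cell_b"]`.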

In [34]:
list(zip(cd4, cd4+10, cd4*10, np.power(cd4,2)))


Out[34]:
[(0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.6121, 11.6121, 16.121000000000002, 2.5988664100000003),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.8683, 16.868299999999998, 68.68299999999999, 47.173544889999995),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.229, 16.229, 62.29, 38.800441),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.3262, 16.3262, 63.262, 40.02080644),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.25096, 10.25096, 2.5096000000000003, 0.0629809216),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.3087, 15.3087, 53.087, 28.18229569),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.9637, 15.9637, 59.637, 35.56571769),
 (7.3498, 17.349800000000002, 73.498, 54.01956004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (3.8989, 13.8989, 38.989, 15.201421209999998),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.1467, 11.1467, 11.467, 1.3149208900000002),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.1459, 17.1459, 71.459, 51.06388681),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.0365, 16.0365, 60.365, 36.43933225),
 (3.5389, 13.5389, 35.388999999999996, 12.52381321),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.7435, 15.743500000000001, 57.435, 32.98779225),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.18395999999999998, 10.18396, 1.8396, 0.033841281599999995),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (3.4985, 13.4985, 34.985, 12.23950225),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.3433, 16.3433, 63.433, 40.23745489),
 (5.2837, 15.2837, 52.836999999999996, 27.917485689999996),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.8589, 16.8589, 68.589, 47.04450921),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.4567, 14.4567, 44.56699999999999, 19.86217489),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (2.9776, 12.977599999999999, 29.775999999999996, 8.86610176),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.2868, 15.2868, 52.868, 27.950254240000003),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.3722, 17.3722, 73.72200000000001, 54.34933284),
 (0.0, 10.0, 0.0, 0.0),
 (4.5999, 14.5999, 45.998999999999995, 21.15908001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.1038, 16.1038, 61.038, 37.256374439999995),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.4999, 17.4999, 74.999, 56.24850001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.1607, 15.1607, 51.607, 26.632824490000004),
 (0.0, 10.0, 0.0, 0.0),
 (6.7699, 16.7699, 67.699, 45.83154601),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.9474, 16.947400000000002, 69.474, 48.26636676),
 (0.0, 10.0, 0.0, 0.0),
 (6.7087, 16.7087, 67.087, 45.00665569),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.9126, 17.9126, 79.126, 62.609238760000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.4797, 11.4797, 14.797, 2.18951209),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.6581, 16.6581, 66.581, 44.33029561),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.4297, 17.4297, 74.297, 55.20044209),
 (5.8158, 15.8158, 58.158, 33.823529640000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.5145, 15.5145, 55.144999999999996, 30.40971025),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.2372, 16.2372, 62.372, 38.902663839999995),
 (4.3346, 14.3346, 43.346000000000004, 18.78875716),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (2.654, 12.654, 26.54, 7.043716),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.9206700000000001, 10.92067, 9.206700000000001, 0.8476332489000001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.7783, 17.7783, 77.783, 60.501950889999996),
 (0.0, 10.0, 0.0, 0.0),
 (4.3379, 14.337900000000001, 43.379000000000005, 18.81737641),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.9115, 10.9115, 9.115, 0.8308322499999999),
 (7.2987, 17.2987, 72.987, 53.271021690000005),
 (4.7862, 14.786200000000001, 47.862, 22.90771044),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.359, 11.359, 13.59, 1.846881),
 (5.0027, 15.0027, 50.027, 25.02700729),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.6759, 16.6759, 66.759, 44.56764081000001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.0795, 17.0795, 70.795, 50.11932025000001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.7175, 16.7175, 67.175, 45.124806250000006),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.5947, 15.5947, 55.946999999999996, 31.300668089999995),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.2445, 17.244500000000002, 72.44500000000001, 52.482780250000005),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.3743, 14.3743, 43.742999999999995, 19.134500489999997),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.5214, 15.5214, 55.214, 30.485857959999997),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.50589, 10.50589, 5.0588999999999995, 0.25592469209999996),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (2.2992, 12.299199999999999, 22.991999999999997, 5.28632064),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.2729, 16.2729, 62.729, 39.34927441),
 (0.0, 10.0, 0.0, 0.0),
 (6.2216, 16.2216, 62.215999999999994, 38.70830656),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.9773, 14.9773, 49.772999999999996, 24.773515289999995),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.7256, 15.7256, 57.256, 32.78249536),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.9635, 16.9635, 69.63499999999999, 48.490332249999994),
 (5.061, 15.061, 50.61, 25.613720999999998),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.7328, 16.7328, 67.328, 45.33059584),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.2225, 16.2225, 62.225, 38.71950625),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.8871, 17.8871, 78.87100000000001, 62.20634641),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.5529, 17.5529, 75.529, 57.046298410000006),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (8.3902, 18.3902, 83.902, 70.39545604),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.0426, 11.0426, 10.426, 1.08701476),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.5271, 16.5271, 65.271, 42.60303441),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.8729, 16.8729, 68.729, 47.236754409999996),
 (4.8938, 14.893799999999999, 48.937999999999995, 23.949278439999997),
 (5.8639, 15.863900000000001, 58.639, 34.38532321),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.8859, 15.8859, 58.859, 34.643818810000006),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.3143, 17.3143, 73.143, 53.498984490000005),
 (6.6823, 16.682299999999998, 66.823, 44.65313328999999),
 (4.263999999999999, 14.264, 42.63999999999999, 18.181695999999995),
 (0.0, 10.0, 0.0, 0.0),
 (6.3458, 16.3458, 63.458, 40.269177639999995),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.3962, 14.3962, 43.962, 19.32657444),
 (4.8835, 14.8835, 48.834999999999994, 23.848572249999997),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.1109, 15.1109, 51.109, 26.12129881),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.8157, 14.8157, 48.157, 23.190966489999997),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.6392, 15.639199999999999, 56.391999999999996, 31.800576639999996),
 (6.7307, 16.7307, 67.307, 45.302322489999995),
 (6.3741, 16.3741, 63.741, 40.629150810000006),
 (5.3644, 15.3644, 53.644, 28.776787359999997),
 (4.6823, 14.6823, 46.82299999999999, 21.923933289999997),
 (6.0308, 16.0308, 60.308, 36.37054864),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.6053, 14.6053, 46.053, 21.20878809),
 (0.0, 10.0, 0.0, 0.0),
 (4.726, 14.725999999999999, 47.26, 22.335076),
 (0.8278200000000001,
  10.827820000000001,
  8.278200000000002,
  0.6852859524000002),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.8003, 11.8003, 18.003, 3.24108009),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.5336, 11.5336, 15.336, 2.3519289600000004),
 (6.0275, 16.0275, 60.275, 36.33075625),
 (0.0, 10.0, 0.0, 0.0),
 (4.9043, 14.9043, 49.043, 24.05215849),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (2.0656, 12.0656, 20.656, 4.266703359999999),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.1791, 15.1791, 51.791, 26.82307681),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.7755, 16.7755, 67.755, 45.90740025),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.9067, 16.9067, 69.067, 47.70250489),
 (6.9317, 16.9317, 69.31700000000001, 48.048464890000005),
 (0.0, 10.0, 0.0, 0.0),
 (0.5109600000000001, 10.51096, 5.1096, 0.2610801216000001),
 (0.0, 10.0, 0.0, 0.0),
 (2.5135, 12.5135, 25.135, 6.317682250000001),
 (0.0, 10.0, 0.0, 0.0),
 (6.6041, 16.6041, 66.041, 43.61413681),
 (3.7012, 13.7012, 37.012, 13.698881440000001),
 (7.2297, 17.2297, 72.297, 52.26856209),
 (5.6042, 15.604199999999999, 56.041999999999994, 31.407057639999994),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.8525, 15.8525, 58.525, 34.25175625),
 (4.852, 14.852, 48.52, 23.541904000000002),
 (0.0, 10.0, 0.0, 0.0),
 (4.7581, 14.758099999999999, 47.580999999999996, 22.639515609999997),
 (5.562, 15.562000000000001, 55.620000000000005, 30.935844000000003),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.3357, 14.3357, 43.357, 18.79829449),
 (6.0099, 16.009900000000002, 60.099000000000004, 36.11889801),
 (0.0, 10.0, 0.0, 0.0),
 (4.2815, 14.281500000000001, 42.815000000000005, 18.331242250000003),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.4366, 15.4366, 54.366, 29.556619560000005),
 (0.0, 10.0, 0.0, 0.0),
 (7.3797, 17.3797, 73.797, 54.459972089999994),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.7206, 15.720600000000001, 57.206, 32.725264360000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.5897, 15.5897, 55.897, 31.244746089999996),
 (0.0, 10.0, 0.0, 0.0),
 (0.8710600000000001, 10.87106, 8.710600000000001, 0.7587455236000001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.9132, 15.9132, 59.132, 34.965934239999996),
 (6.1637, 16.1637, 61.637, 37.99119769000001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (3.8315, 13.8315, 38.315, 14.68039225),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.7553, 15.7553, 57.553, 33.12347809),
 (5.3478, 15.3478, 53.478, 28.598964840000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.4726, 17.4726, 74.726, 55.83975076),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.4555, 15.4555, 54.555, 29.76248025),
 (5.7834, 15.7834, 57.834, 33.447715560000006),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.4364, 16.4364, 64.364, 41.427244959999996),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.2961, 15.2961, 52.961, 28.04867521),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.3732, 15.3732, 53.732, 28.87127824),
 (0.0, 10.0, 0.0, 0.0),
 (2.6774, 12.6774, 26.774, 7.16847076),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.0025, 16.0025, 60.025000000000006, 36.03000625000001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (1.4777, 11.4777, 14.777000000000001, 2.1835972900000002),
 (0.0, 10.0, 0.0, 0.0),
 (7.7467, 17.7467, 77.467, 60.01136089),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.707000000000001, 15.707, 57.07000000000001, 32.569849000000005),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.9471, 15.947099999999999, 59.471, 35.36799841),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.4978, 16.497799999999998, 64.978, 42.22140484),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.9199, 15.9199, 59.199, 35.045216010000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.7088, 17.7088, 77.088, 59.425597440000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.0599, 14.059899999999999, 40.599, 16.48278801),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.8381, 16.8381, 68.381, 46.75961161),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.8831, 15.883099999999999, 58.830999999999996, 34.61086561),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.7091, 14.7091, 47.091, 22.175622810000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.0779, 16.0779, 60.778999999999996, 36.94086840999999),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.317, 15.317, 53.17, 28.270489),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.2369, 16.2369, 62.369, 38.89892161),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.9555, 15.9555, 59.555, 35.46798025),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.6451, 14.6451, 46.451, 21.57695401),
 (0.0, 10.0, 0.0, 0.0),
 (1.0942, 11.0942, 10.942, 1.1972736400000001),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.9931, 15.9931, 59.931, 35.917247610000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (3.3759, 13.3759, 33.759, 11.39670081),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.6316, 16.6316, 66.316, 43.97811856),
 (5.3845, 15.3845, 53.845, 28.99284025),
 (5.7119, 15.7119, 57.119, 32.625801609999996),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.2005, 17.200499999999998, 72.005, 51.84720025),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.3278, 15.3278, 53.278, 28.38545284),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (7.4123, 17.412300000000002, 74.123, 54.942191290000004),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (5.728, 15.728, 57.28, 32.809984),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (4.1982, 14.1982, 41.982, 17.62488324),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (6.3431, 16.3431, 63.431, 40.23491761),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 (0.0, 10.0, 0.0, 0.0),
 ...]

DataFrame level summary operations

So far, we have mostly summarized data held in a Series.

That is the easier case, because a Series is one-dimensional and behaves much like a labeled list.

What if we want to operate on the whole expression dataset?

Example goal: normalize expression

Let's rescale the expression values to the range 0 to 1, so that 0 corresponds to the lowest and 1 to the highest expression value in the whole dataset.

We can't use the data frame as it stands, because its first three rows hold annotations (tumor, malignancy status, cell type) rather than expression values. We need to strip them away.


In [35]:
sc_data


Out[35]:
Cy72_CD45_H02_S758_comb CY58_1_CD45_B02_S974_comb Cy71_CD45_D08_S524_comb Cy81_FNA_CD45_B01_S301_comb Cy80_II_CD45_B07_S883_comb Cy81_Bulk_CD45_B10_S118_comb Cy72_CD45_D09_S717_comb Cy74_CD45_A03_S387_comb Cy71_CD45_B05_S497_comb Cy80_II_CD45_C09_S897_comb ... CY75_1_CD45_CD8_7__S265_comb CY75_1_CD45_CD8_3__S127_comb CY75_1_CD45_CD8_1__S61_comb CY75_1_CD45_CD8_1__S12_comb CY75_1_CD45_CD8_1__S25_comb CY75_1_CD45_CD8_7__S223_comb CY75_1_CD45_CD8_1__S65_comb CY75_1_CD45_CD8_1__S93_comb CY75_1_CD45_CD8_1__S76_comb CY75_1_CD45_CD8_7__S274_comb
Cell
tumor 72.00000 58.00000 71.00000 81.00000 80.00000 81.00000 72.00000 74.00000 71.00000 80.00000 ... 75.00000 75.000000 75.000000 75.000000 75.000000 75.00000 75.00000 75.00000 75.000000 75.00000
malignant(1=no,2=yes,0=unresolved) 1.00000 1.00000 2.00000 2.00000 2.00000 2.00000 1.00000 1.00000 2.00000 2.00000 ... 1.00000 1.000000 1.000000 1.000000 1.000000 1.00000 1.00000 1.00000 1.000000 1.00000
non-malignant cell type (1=T,2=B,3=Macro.4=Endo.,5=CAF;6=NK) 2.00000 1.00000 0.00000 0.00000 0.00000 0.00000 1.00000 1.00000 0.00000 0.00000 ... 1.00000 1.000000 1.000000 1.000000 1.000000 1.00000 1.00000 1.00000 1.000000 0.00000
C9orf152 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RPS11 9.21720 8.37450 9.31300 7.88760 8.32910 7.83360 8.37370 8.13380 8.43730 7.59680 ... 0.00000 7.863900 5.850500 0.626390 6.273400 5.48890 4.92620 7.09580 3.997000 3.98970
ELMO2 0.00000 0.00000 2.12630 0.00000 0.00000 0.77400 0.00000 0.00000 0.00000 0.38294 ... 0.00000 0.000000 3.157200 4.793200 0.000000 0.00000 5.52960 0.00000 0.000000 0.00000
CREB3L1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PNMA1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 2.51420 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
MMP2 0.00000 0.00000 0.73812 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 2.86970 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
TMEM216 0.00000 0.00000 0.00000 0.00000 3.79490 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 3.682900 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.29007
TRAF3IP2-AS1 2.85140 2.09830 0.61730 0.96495 1.47350 3.15800 1.30800 1.38020 0.00000 1.12630 ... 3.75450 1.747100 1.787700 1.384000 1.340900 2.40520 1.69880 1.55890 0.471250 1.67460
LRRC37A5P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
LOC653712 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.54798 0.000000 0.000000 0.000000 0.000721 0.00000 0.00000 0.00000 0.000000 0.00000
C10orf90 0.00000 0.00000 0.00000 3.40690 1.44680 2.78760 0.00000 0.00000 1.73980 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
ZHX3 0.00000 0.52907 0.00000 0.51197 0.00000 1.49520 1.10970 0.00000 1.28210 0.00000 ... 1.89240 0.282530 0.329430 0.878360 0.336740 0.53614 0.32554 0.46722 0.449290 0.35422
ERCC5 0.00000 0.00000 0.00000 0.00000 2.28660 2.37410 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 5.072500 0.53382 0.66077 5.27270 0.000000 2.92570
GPR98 0.00000 0.00000 0.00000 0.00000 0.00000 0.02290 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RXFP3 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CTAGE10P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
APBB2 0.17377 0.00000 0.00000 1.11570 0.00000 0.29278 0.00000 0.28688 0.00000 0.00000 ... 0.43193 0.000000 0.091346 0.033131 0.000000 0.00000 0.00000 0.25625 0.044071 0.19094
KLHL13 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
KRTAP10-8 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 1.61900 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PDCL3 0.00000 0.00000 0.00000 4.76640 2.99000 2.35220 0.00000 0.00000 3.89220 2.84320 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
AEN 0.00000 0.00000 0.00000 0.00000 2.31610 3.47680 0.00000 0.00000 4.10310 4.56970 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 3.38020 0.000000 6.49550
FRG2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.26504 0.000000 0.000000 0.000000 0.158650 0.21004 0.17355 0.00000 0.000000 0.00000
DECR1 0.00000 0.00000 5.09060 0.00000 3.30030 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 1.366000 0.000000 0.00000 0.00000 0.00000 3.826700 0.00000
SALL1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.172980 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GGT3P 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CADM4 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RPS18 10.07200 7.47900 10.98600 9.61720 9.98130 10.30500 9.75130 8.74800 9.24680 9.57730 ... 8.19610 8.415500 7.091700 0.000000 8.029200 8.36490 8.65190 7.12710 8.997400 1.92120
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
CNST 0.00000 0.00000 0.00000 0.00000 1.92640 2.18080 0.00000 0.00000 3.62840 0.00000 ... 0.14580 0.000000 0.000000 0.789670 0.000000 0.00000 3.88480 0.62217 0.000000 0.00000
ERCC2 0.00000 0.00000 0.00000 0.00000 0.00000 2.35810 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 5.47660
EAF2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
VPS13B 0.47093 0.00000 4.09810 1.64660 0.00000 1.20730 0.00000 0.00000 0.00000 0.00000 ... 0.16349 6.656700 0.000000 5.809900 3.463600 0.44281 0.00000 0.00000 0.000000 5.81820
TP53TG3B 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 3.50170 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
C14orf101 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 1.61340 0.000000 0.000000 0.000000 0.000000 0.00000 2.57820 0.00000 1.787200 0.00000
ST18 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PSMB9 0.00000 0.94261 0.00000 0.00000 3.31090 2.47200 0.00000 0.00000 3.02010 3.83630 ... 1.57680 6.442400 7.282700 5.912200 5.644600 6.58150 7.42070 5.68740 6.471700 6.50470
ADRBK2 0.27023 0.64708 3.89930 0.00000 0.00000 0.10969 0.00000 0.00000 0.85758 0.00000 ... 2.16790 0.072676 0.635890 0.249530 0.547600 0.58503 0.10893 0.12135 0.268260 0.41887
HCLS1 5.88820 7.68530 0.00000 0.00000 0.00000 0.00000 0.00000 1.98040 0.00000 0.00000 ... 0.00000 6.643100 4.323600 0.965700 6.834200 0.93640 5.30360 3.22540 6.160100 6.31980
GPR15 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
LOC100093631 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
TTTY17A 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
CSF2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
MIR515-1 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
SLC2A11 0.00000 0.00000 0.00000 0.00000 0.00000 2.03630 0.00000 0.00000 0.00000 0.00000 ... 0.82884 0.000000 0.000000 0.000000 0.000000 0.28620 0.00000 0.20950 0.000000 0.40439
SELO 0.00000 0.00000 0.00000 0.00000 1.34770 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GRIP2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GOLGA8B 0.55385 1.45470 0.00000 3.25500 0.34369 1.27620 0.00000 0.00000 0.00000 0.11636 ... 0.00000 0.507200 0.066124 0.000000 0.403930 2.32910 0.00000 0.34443 0.000000 3.10200
MIR4691 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
GPLD1 0.62667 1.05450 0.99639 0.23143 0.00000 0.44996 0.67536 0.98477 0.00000 0.15575 ... 2.61240 0.160170 0.894100 0.808400 0.482300 1.29620 0.99245 0.97516 0.492080 1.06820
SNORD115-39 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RAB8A 0.00000 0.00000 2.76340 4.19370 2.57050 3.75260 0.00000 0.00000 0.00000 0.00000 ... 4.08420 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
RXFP2 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000
PCIF1 0.00000 0.00000 3.67820 0.00000 0.00000 4.54490 0.00000 0.00000 1.87890 0.00000 ... 0.00000 0.000000 4.613600 5.142300 3.771200 0.00000 5.54650 0.00000 0.000000 5.00810
PIK3IP1 7.60690 0.00000 0.00000 0.00000 0.00000 0.00000 6.54570 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 6.408700 0.00000 3.73840 0.00000 0.000000 6.75350
SNRPD2 0.00000 0.00000 3.98710 5.26390 6.08240 5.64240 0.00000 0.00000 5.94510 6.52150 ... 4.44710 4.760700 0.000000 4.743300 5.404100 6.09860 0.00000 0.00000 0.000000 5.93130
SLC39A6 0.00000 0.00000 3.87770 3.76600 1.78160 4.36790 0.00000 0.00000 3.44980 0.43189 ... 0.00000 0.000000 0.475460 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 5.23980
CTSC 2.66380 6.99010 1.61260 4.84170 4.46070 1.88240 1.07180 1.69380 1.11970 0.88830 ... 2.33570 1.315000 2.067100 3.192600 1.461500 3.64640 7.00040 1.96150 7.191800 6.15880
AQP7 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ... 0.00000 0.000000 0.000000 0.000000 0.000000 0.00000 0.00000 0.00000 0.000000 0.00000

23689 rows × 4645 columns

Let's split off the annotations in the first three rows and keep the remaining rows as the expression data.


In [36]:
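# The first three rows hold annotations; .copy() makes them independent of sc_data
patient = sc_data.iloc[0,:].copy()
malignant = sc_data.iloc[1,:].copy()
celltype = sc_data.iloc[2,:].copy()
# Everything from the fourth row on is gene expression
exp = sc_data.iloc[3:,:]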

Cell identity is now stored in two separate variables:

  1. malignant
  2. celltype

Let's unite them under celltype by giving malignant cells their own code, 7.


In [37]:
celltype[malignant==2] = 7
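
As a rough illustration of the split and merge steps above, here is a minimal sketch on a made-up table. The frame `toy`, its row labels and all of its values are invented for this example. The sketch takes explicit copies of the annotation rows so that the mask assignment cannot write back into the source frame.

```python
import pandas as pd

# Toy stand-in for sc_data: two annotation rows followed by two gene rows.
# All names and values here are made up for illustration.
toy = pd.DataFrame(
    {"cell_a": [1.0, 2.0, 0.0, 5.1],
     "cell_b": [2.0, 0.0, 3.2, 0.0],
     "cell_c": [1.0, 1.0, 0.0, 7.4]},
    index=["malignant", "celltype", "GENE1", "GENE2"],
)

# Independent copies of the annotation rows; the gene rows stay as a slice.
toy_malignant = toy.iloc[0, :].copy()
toy_celltype = toy.iloc[1, :].copy()
toy_exp = toy.iloc[2:, :]

# Boolean mask assignment: wherever malignant == 2, overwrite the cell type with 7.
toy_celltype[toy_malignant == 2] = 7

print(toy_celltype.tolist())         # [2.0, 7.0, 1.0]
print(toy.loc["celltype"].tolist())  # source frame unchanged: [2.0, 0.0, 1.0]
```

Only cell_b is malignant in this toy table, so only its cell type code changes.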

Retrieving values

To normalize, we first need all the expression values as one flat array.


In [38]:
vals = exp.values.flatten()

In [39]:
plt.figure(figsize=(9,6))
_=sns.distplot(vals[vals>0])
plt.savefig("Figures"+os.sep+"tirosh_exp_hist.png")


Note: We'll cover plots later. For now, take this code as given; it plots a histogram of the non-zero expression values.

Now we need to normalize each expression value to the 0-1 interval.

We need the minimum and maximum expression values.


In [40]:
min_val = np.min(vals)
max_val = np.max(vals)
print('Min: {0:.2f}, Max: {1:.2f}'.format(min_val,max_val))


Min: 0.00, Max: 15.92

Now let's normalize each expression value:


In [41]:
# Min-max scaling: subtract the minimum, then divide by the range
norm_exp = (exp - min_val) / (max_val - min_val)
norm_exp


Out[41]:
Cy72_CD45_H02_S758_comb CY58_1_CD45_B02_S974_comb Cy71_CD45_D08_S524_comb Cy81_FNA_CD45_B01_S301_comb Cy80_II_CD45_B07_S883_comb Cy81_Bulk_CD45_B10_S118_comb Cy72_CD45_D09_S717_comb Cy74_CD45_A03_S387_comb Cy71_CD45_B05_S497_comb Cy80_II_CD45_C09_S897_comb ... CY75_1_CD45_CD8_7__S265_comb CY75_1_CD45_CD8_3__S127_comb CY75_1_CD45_CD8_1__S61_comb CY75_1_CD45_CD8_1__S12_comb CY75_1_CD45_CD8_1__S25_comb CY75_1_CD45_CD8_7__S223_comb CY75_1_CD45_CD8_1__S65_comb CY75_1_CD45_CD8_1__S93_comb CY75_1_CD45_CD8_1__S76_comb CY75_1_CD45_CD8_7__S274_comb
Cell
C9orf152 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
RPS11 0.578861 0.525937 0.584877 0.495359 0.523086 0.491968 0.525887 0.510821 0.529881 0.477096 ... 0.000000 0.493871 0.367424 0.039339 0.393984 0.344715 0.309376 0.445632 0.251021 0.250562
ELMO2 0.000000 0.000000 0.133536 0.000000 0.000000 0.048609 0.000000 0.000000 0.000000 0.024049 ... 0.000000 0.000000 0.198279 0.301024 0.000000 0.000000 0.347271 0.000000 0.000000 0.000000
CREB3L1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
PNMA1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.157897 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
MMP2 0.000000 0.000000 0.046356 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.180224 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
TMEM216 0.000000 0.000000 0.000000 0.000000 0.238328 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.231294 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.018217
TRAF3IP2-AS1 0.179074 0.131778 0.038768 0.060601 0.092539 0.198329 0.082145 0.086680 0.000000 0.070734 ... 0.235791 0.109722 0.112272 0.086918 0.084212 0.151052 0.106688 0.097902 0.029596 0.105169
LRRC37A5P 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
LOC653712 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.034414 0.000000 0.000000 0.000000 0.000045 0.000000 0.000000 0.000000 0.000000 0.000000
C10orf90 0.000000 0.000000 0.000000 0.213961 0.090862 0.175068 0.000000 0.000000 0.109263 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
ZHX3 0.000000 0.033227 0.000000 0.032153 0.000000 0.093902 0.069692 0.000000 0.080519 0.000000 ... 0.118847 0.017744 0.020689 0.055163 0.021148 0.033671 0.020445 0.029342 0.028216 0.022246
ERCC5 0.000000 0.000000 0.000000 0.000000 0.143604 0.149099 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.318564 0.033525 0.041498 0.331137 0.000000 0.183741
GPR98 0.000000 0.000000 0.000000 0.000000 0.000000 0.001438 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
RXFP3 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
CTAGE10P 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
APBB2 0.010913 0.000000 0.000000 0.070068 0.000000 0.018387 0.000000 0.018017 0.000000 0.000000 ... 0.027126 0.000000 0.005737 0.002081 0.000000 0.000000 0.000000 0.016093 0.002768 0.011991
KLHL13 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
KRTAP10-8 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.101677 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
PDCL3 0.000000 0.000000 0.000000 0.299341 0.187779 0.147723 0.000000 0.000000 0.244439 0.178559 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
AEN 0.000000 0.000000 0.000000 0.000000 0.145456 0.218351 0.000000 0.000000 0.257684 0.286987 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.212284 0.000000 0.407932
FRG2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.016645 0.000000 0.000000 0.000000 0.009964 0.013191 0.010899 0.000000 0.000000 0.000000
DECR1 0.000000 0.000000 0.319701 0.000000 0.207266 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.085788 0.000000 0.000000 0.000000 0.000000 0.240325 0.000000
SALL1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.010864 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
GGT3P 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
CADM4 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
RPS18 0.632544 0.469698 0.689945 0.603982 0.626848 0.647177 0.612403 0.549394 0.580720 0.601476 ... 0.514733 0.528512 0.445375 0.000000 0.504252 0.525334 0.543359 0.447598 0.565057 0.120656
SLC10A7 0.000000 0.000000 0.220631 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.369648 0.397362 0.000000 0.000000
CFHR5 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
OR2K2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
CNST 0.000000 0.000000 0.000000 0.000000 0.120982 0.136959 0.000000 0.000000 0.227872 0.000000 ... 0.009157 0.000000 0.000000 0.049593 0.000000 0.000000 0.243974 0.039074 0.000000 0.000000
ERCC2 0.000000 0.000000 0.000000 0.000000 0.000000 0.148094 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.343943
EAF2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
VPS13B 0.029575 0.000000 0.257370 0.103410 0.000000 0.075821 0.000000 0.000000 0.000000 0.000000 ... 0.010268 0.418056 0.000000 0.364875 0.217522 0.027809 0.000000 0.000000 0.000000 0.365396
TP53TG3B 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.219915 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
C14orf101 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.101325 0.000000 0.000000 0.000000 0.000000 0.000000 0.161917 0.000000 0.112240 0.000000
ST18 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
PSMB9 0.000000 0.059198 0.000000 0.000000 0.207932 0.155247 0.000000 0.000000 0.189669 0.240928 ... 0.099027 0.404597 0.457370 0.371299 0.354493 0.413333 0.466037 0.357181 0.406437 0.408510
ADRBK2 0.016971 0.040638 0.244885 0.000000 0.000000 0.006889 0.000000 0.000000 0.053858 0.000000 ... 0.136149 0.004564 0.039935 0.015671 0.034391 0.036741 0.006841 0.007621 0.016847 0.026306
HCLS1 0.369792 0.482654 0.000000 0.000000 0.000000 0.000000 0.000000 0.124374 0.000000 0.000000 ... 0.000000 0.417202 0.271532 0.060648 0.429203 0.058808 0.333078 0.202562 0.386868 0.396898
GPR15 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
LOC100093631 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
TTTY17A 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
CSF2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
MIR515-1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
SLC2A11 0.000000 0.000000 0.000000 0.000000 0.000000 0.127884 0.000000 0.000000 0.000000 0.000000 ... 0.052053 0.000000 0.000000 0.000000 0.000000 0.017974 0.000000 0.013157 0.000000 0.025397
SELO 0.000000 0.000000 0.000000 0.000000 0.084639 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
GRIP2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
GOLGA8B 0.034783 0.091358 0.000000 0.204421 0.021585 0.080148 0.000000 0.000000 0.000000 0.007308 ... 0.000000 0.031853 0.004153 0.000000 0.025368 0.146273 0.000000 0.021631 0.000000 0.194813
MIR4691 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
GPLD1 0.039356 0.066225 0.062576 0.014534 0.000000 0.028258 0.042414 0.061846 0.000000 0.009781 ... 0.164065 0.010059 0.056151 0.050769 0.030290 0.081404 0.062328 0.061242 0.030904 0.067085
SNORD115-39 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
RAB8A 0.000000 0.000000 0.173548 0.263374 0.161433 0.235672 0.000000 0.000000 0.000000 0.000000 ... 0.256497 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
RXFP2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
PCIF1 0.000000 0.000000 0.230999 0.000000 0.000000 0.285430 0.000000 0.000000 0.117999 0.000000 ... 0.000000 0.000000 0.289744 0.322948 0.236840 0.000000 0.348333 0.000000 0.000000 0.314520
PIK3IP1 0.477730 0.000000 0.000000 0.000000 0.000000 0.000000 0.411085 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.402481 0.000000 0.234780 0.000000 0.000000 0.424135
SNRPD2 0.000000 0.000000 0.250399 0.330585 0.381988 0.354355 0.000000 0.000000 0.373366 0.409565 ... 0.279288 0.298983 0.000000 0.297890 0.339390 0.383006 0.000000 0.000000 0.000000 0.372499
SLC39A6 0.000000 0.000000 0.243528 0.236513 0.111888 0.274314 0.000000 0.000000 0.216655 0.027124 ... 0.000000 0.000000 0.029860 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.329071
CTSC 0.167293 0.438994 0.101275 0.304070 0.280142 0.118219 0.067311 0.106374 0.070320 0.055787 ... 0.146687 0.082585 0.129819 0.200502 0.091785 0.229002 0.439641 0.123187 0.451661 0.386786
AQP7 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000

23686 rows × 4645 columns


In [42]:
print('Normalized min: {0:.1f}, max: {1:.1f}'.format(norm_exp.values.min(), norm_exp.values.max()))


Normalized min: 0.0, max: 1.0

Can we do this in one line?


In [43]:
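# Min-max scaling divides by the range (max - min); here the minimum is 0, so the range equals the maximum
norm_exp2 = (exp - exp.values.min()) / (exp.values.max() - exp.values.min())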

Is the output equivalent to the other method?


In [44]:
np.all(norm_exp == norm_exp2)


Out[44]:
True

Yes, these are equivalent.
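As a sanity check on the formula, here is a minimal sketch of min-max scaling on a small made-up DataFrame (the `toy` values and names are illustrative and do not come from the Tirosh dataset). It divides by the range (max minus min), which also holds when the minimum is not zero, and it compares the two results with `np.allclose` so that floating-point rounding cannot cause a false mismatch.

```python
import numpy as np
import pandas as pd

# Toy expression matrix (made-up values): genes as rows, cells as columns
toy = pd.DataFrame(
    [[2.0, 4.0, 6.0],
     [3.0, 10.0, 2.0]],
    index=["geneA", "geneB"],
    columns=["cell1", "cell2", "cell3"],
)

# Step-by-step version
lo = toy.values.min()
hi = toy.values.max()
toy_norm = (toy - lo) / (hi - lo)

# One-line version
toy_norm2 = (toy - toy.values.min()) / (toy.values.max() - toy.values.min())

# Compare with a tolerance instead of exact equality
same = np.allclose(toy_norm.values, toy_norm2.values)
print(same, toy_norm.values.min(), toy_norm.values.max())
```

Exact comparison with `==` is fine when both versions perform the same arithmetic, but a tolerance is the safer habit once the two computations differ in operation order.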

What would the distribution be?


In [45]:
norm_vals = norm_exp.values.flatten()
plt.figure(figsize=(9,6))
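# Note: distplot is deprecated in recent seaborn releases; histplot(..., kde=True) is the suggested replacement
_=sns.distplot(norm_vals[norm_vals>0])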
plt.savefig("Figures"+os.sep+"tirosh_normexp_hist.png")


The distribution has the same shape as before; only the scale has changed, since all values now lie in the 0-1 interval.

What if we want to analyze groups of data?

Example: calculate average expression of T-cell marker CD3D in each cell type

Our expression dataset no longer contains the cell type information.

First, let's replace cell type indices with human readable names.


In [46]:
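celltype = sc_data.iloc[2,:].copy()  # copy, so that sc_data itself is left unchanged
celltype[malignant==2] = 7  # give malignant cells (malignant == 2) their own index
celltype_dict = {1:"T",2:"B",3:"Macro",4:"Endo",5:"CAF",6:"NK",7:"Mal",0:"unspec"}
celltype.rename('celltype',inplace=True)
celltype = celltype.apply(lambda x:celltype_dict[x])  # translate each index to its name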

Let's get a cell count using the value_counts() method of a Series.


In [47]:
celltype.value_counts()


Out[47]:
T         2064
Mal       1257
B          515
unspec     506
Macro      125
Endo        65
CAF         61
NK          52
Name: celltype, dtype: int64
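The relabel-and-count pattern can be tried on a tiny example. This is a minimal sketch with made-up codes for six hypothetical cells; `codes`, `code_to_name` and the cell names are illustrative only. It uses `Series.map` with a dictionary, which is an alternative to the `apply` call above.

```python
import pandas as pd

# Made-up cell type codes for six hypothetical cells
codes = pd.Series([1, 2, 1, 0, 3, 1],
                  index=["c1", "c2", "c3", "c4", "c5", "c6"],
                  name="celltype")

# Same style of lookup table as above, shortened for the toy example
code_to_name = {0: "unspec", 1: "T", 2: "B", 3: "Macro"}

# Series.map looks up each value in the dictionary
names = codes.map(code_to_name)

# value_counts tallies the labels, most frequent first
counts = names.value_counts()
print(counts)
```

One difference worth knowing: `map` yields NaN for a code that is missing from the dictionary, whereas an explicit dictionary lookup inside `apply` raises a KeyError.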

Now let's group...

Grouping

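The groupby method splits the rows of a DataFrame into groups that share a key, so that each group can be summarized separately.

Because groupby works on rows by default, we'll first transpose the data so that each cell is a row, and then apply the grouping.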


In [48]:
# np.float was removed in NumPy 1.24; the built-in float is equivalent
exp = pd.DataFrame(sc_data.iloc[3:,:], dtype=float)
print(exp.shape)
expT = exp.transpose()
print(expT.shape)


(23686, 4645)
(4645, 23686)

In [49]:
exp_g = expT.groupby(celltype.values)
print(type(exp_g))


<class 'pandas.core.groupby.groupby.DataFrameGroupBy'>
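To see what the grouping produces without the full dataset, here is a minimal sketch on a made-up matrix of four cells and two genes (`toyT`, `labels` and all values are hypothetical). It passes an array of labels, one per row, to `groupby`, then inspects the group sizes and the group membership.

```python
import numpy as np
import pandas as pd

# Toy matrix after transposing: cells as rows, genes as columns (made-up values)
toyT = pd.DataFrame(
    {"geneA": [5.0, 0.0, 7.0, 1.0], "geneB": [0.0, 3.0, 0.0, 4.0]},
    index=["c1", "c2", "c3", "c4"],
)

# One label per row, in row order (hypothetical labels)
labels = np.array(["T", "B", "T", "B"])

toy_g = toyT.groupby(labels)

# size() gives the number of rows in each group
sizes = toy_g.size()

# groups maps each label to the row index values that belong to it
members = {key: list(idx) for key, idx in toy_g.groups.items()}
print(sizes)
print(members)
```

The label array must have the same length and order as the rows being grouped, which is why the transposed matrix and the cell type Series need to come from the same columns of the original table.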

In [50]:
exp_g.groups


Out[50]:
{'B': Index(['Cy72_CD45_H02_S758_comb', 'Cy72_CD45_D03_S711_comb',
        'Cy72_CD45_A07_S679_comb', 'Cy74_CD45_B08_S404_comb',
        'Cy74_CD45_C03_S411_comb', 'Cy72_CD45_A05_S677_comb',
        'Cy74_CD45_D08_S428_comb', 'Cy72_CD45_B09_S693_comb',
        'Cy81_FNA_CD45_H05_S281_comb', 'Cy72_CD45_F10_S742_comb',
        ...
        'CY94CD45POS_1_E08_S152_comb', 'CY94CD45POS_1_G12_S180_comb',
        'CY88CD45POS_7_G03_S267_comb', 'CY94CD45POS_1_H03_S183_comb',
        'CY88CD45POS_7_C05_S221_comb', 'CY94CD45POS_1_D03_S135_comb',
        'CY94_CD45NEG_CD90POS_2_B04_S16_comb',
        'CY94_CD45NEG_CD90POS_2_F09_S69_comb',
        'CY94_CD45NEG_CD90POS_2_B11_S23_comb',
        'CY94_CD45NEG_CD90POS_2_F04_S64_comb'],
       dtype='object', length=515),
 'CAF': Index(['Cy80_II_CD45_G05_S941_comb', 'Cy80_II_CD45_H03_S951_comb',
        'cy80-CD45-neg-D12-S912-comb', 'cy82-CD45-pos-1-C12-S516-comb',
        'cy82-CD45-pos-1-D06-S522-comb', 'cy82-CD45-pos-3-D04-S40-comb',
        'cy82-CD45-neg-2-E02-S242-comb', 'cy82-CD45-neg-1-B09-S213-comb',
        'cy53-1-CD45-neg-C01-S313-comb', 'cy53-1-CD45-neg-D07-S331-comb',
        'cy53-1-CD45-neg-F09-S357-comb', 'cy53-1-CD45-neg-B04-S304-comb',
        'cy82-CD45-neg-3-C08-S800-comb',
        'cy79-p3-CD45-neg-PDL1-neg-E09-S249-comb', 'Cy59_16', 'Cy59_18',
        'Cy59_2', 'Cy59_52', 'Cy59_55', 'Cy59_7', 'Cy59_95',
        'CY80-neg-C11-S35-comb', 'CY80-neg-G06-S78-comb',
        'cy84-Primary-CD45-neg-G03-S1035-comb', 'cy78-CD45-neg-3-G01-S745-comb',
        'cy80_CD_90_pos_A04_S868_comb', 'cy80_CD_90_pos_F08_S932_comb',
        'cy80_CD_90_pos_E04_S916_comb', 'cy80_CD_90_pos_E09_S921_comb',
        'cy80_CD_90_pos_D10_S910_comb', 'cy80_CD_90_pos_C11_S899_comb',
        'cy80_CD_90_pos_B04_S880_comb', 'cy80_CD_90_pos_F09_S933_comb',
        'cy80_CD_90_pos_F07_S931_comb', 'cy80_CD_90_pos_C05_S893_comb',
        'cy80_CD_90_pos_B02_S878_comb', 'cy80_CD_90_pos_G06_S942_comb',
        'cy80_CD_90_pos_D03_S903_comb', 'cy80_CD_90_pos_B06_S882_comb',
        'cy80_CD_90_pos_E05_S917_comb', 'cy80_CD_90_pos_C02_S890_comb',
        'cy80_CD_90_pos_A02_S866_comb', 'cy80_CD_90_pos_H08_S956_comb',
        'cy94_cd45neg_cd90pos_H06_S378_comb',
        'cy94_cd45neg_cd90pos_C10_S322_comb',
        'cy94_cd45neg_cd90pos_G05_S365_comb',
        'cy94_cd45neg_cd90pos_F11_S359_comb',
        'cy94_cd45neg_cd90pos_D07_S331_comb',
        'cy94_cd45neg_cd90pos_H08_S380_comb',
        'cy94_cd45neg_cd90pos_A08_S296_comb',
        'cy94_cd45neg_cd90pos_A07_S295_comb',
        'cy94_cd45neg_cd90pos_C05_S317_comb',
        'cy94_cd45neg_cd90pos_B07_S307_comb', 'CY88CD45POS_7_B01_S205_comb',
        'CY88CD45_150813_D07_S331_comb', 'CY94_CD45NEG_CD90POS_2_B10_S22_comb',
        'CY94_CD45NEG_CD90POS_2_H07_S91_comb',
        'CY94_CD45NEG_CD90POS_2_F06_S66_comb',
        'CY94_CD45NEG_CD90POS_2_G06_S78_comb',
        'CY94_CD45NEG_CD90POS_2_C11_S35_comb',
        'CY94_CD45NEG_CD90POS_2_C06_S30_comb'],
       dtype='object'),
 'Endo': Index(['Cy80_II_CD45_C12_S900_comb', 'Cy80_II_CD45_G10_S946_comb',
        'Cy80_II_CD45_F08_S932_comb', 'Cy80_II_CD45_C03_S891_comb',
        'Cy80_II_CD45_A04_S868_comb', 'Cy80_II_CD45_A08_S872_comb',
        'Cy81_Bulk_CD45_F10_S166_comb', 'Cy80_II_CD45_B04_S880_comb',
        'cy79-p3-CD45-neg-PD1-pos-AS-C3-R1-C07-S319-comb',
        'cy80-CD45-neg-G11-S947-comb', 'cy80-CD45-neg-G07-S943-comb',
        'cy80-CD45-neg-C06-S894-comb', 'cy80-CD45-neg-E12-S924-comb',
        'cy80-CD45-neg-F12-S936-comb', 'cy80-CD45-neg-F10-S934-comb',
        'cy80-CD45-neg-H05-S953-comb', 'cy80-CD45-neg-A04-S868-comb',
        'cy81-Bulk-CD45-neg-E03-S147-comb', 'cy80-CD45-neg-F07-S931-comb',
        'cy80-CD45-neg-B06-S882-comb', 'cy80-CD45-neg-C05-S893-comb',
        'cy80-CD45-neg-D05-S905-comb', 'cy53-1-CD45-neg-D12-S336-comb',
        'cy53-1-CD45-neg-E02-S338-comb', 'cy53-1-CD45-neg-A10-S298-comb',
        'cy53-1-CD45-neg-C04-S316-comb', 'cy53-1-CD45-neg-H05-S377-comb',
        'cy53-1-CD45-neg-H03-S375-comb', 'cy53-1-CD45-neg-B11-S311-comb',
        'cy53-1-CD45-neg-H07-S379-comb', 'cy53-1-CD45-neg-E03-S339-comb',
        'cy53-1-CD45-neg-D08-S332-comb', 'cy53-1-CD45-neg-G07-S367-comb',
        'cy79-p3-CD45-neg-PDL1-neg-A12-S204-comb', 'CY80-neg-D06-S42-comb',
        'CY80-neg-G04-S76-comb', 'CY80-neg-B06-S18-comb',
        'CY80-neg-E03-S51-comb', 'CY80-neg-C10-S34-comb',
        'CY80-neg-E01-S49-comb', 'CY80-neg-C04-S28-comb',
        'CY80-neg-C02-S26-comb', 'CY80-neg-H09-S93-comb',
        'CY80-neg-H10-S94-comb', 'cy84-Primary-CD45-neg-A12-S972-comb',
        'CY89NEG_C11_S35_comb', 'cy94_cd45neg_cd90pos_G04_S364_comb',
        'cy94_cd45neg_cd90pos_H04_S376_comb',
        'cy94_cd45neg_cd90pos_A10_S298_comb',
        'cy94_cd45neg_cd90pos_F04_S352_comb',
        'cy94_cd45neg_cd90pos_A01_S289_comb',
        'cy94_cd45neg_cd90pos_D03_S327_comb',
        'cy94_cd45neg_cd90pos_F02_S350_comb',
        'cy94_cd45neg_cd90pos_F09_S357_comb',
        'cy94_cd45neg_cd90pos_A09_S297_comb',
        'CY94_CD45NEG_CD90POS_2_F08_S68_comb',
        'CY94_CD45NEG_CD90POS_2_A08_S8_comb',
        'CY94_CD45NEG_CD90POS_2_E05_S53_comb',
        'CY94_CD45NEG_CD90POS_2_G10_S82_comb',
        'CY94_CD45NEG_CD90POS_2_D05_S41_comb',
        'CY94_CD45NEG_CD90POS_2_B12_S24_comb',
        'CY94_CD45NEG_CD90POS_2_B06_S18_comb',
        'CY94_CD45NEG_CD90POS_2_G07_S79_comb',
        'CY94_CD45NEG_CD90POS_2_F01_S61_comb',
        'CY94_CD45NEG_CD90POS_2_A03_S3_comb'],
       dtype='object'),
 'Macro': Index(['Cy74_CD45_H08_S476_comb', 'Cy74_CD45_A06_S390_comb',
        'Cy74_CD45_C07_S415_comb', 'Cy71_CD45_A09_S489_comb',
        'Cy71_CD45_D02_S518_comb', 'cy82-CD45-pos-1-G02-S554-comb',
        'cy82-CD45-pos-3-B12-S24-comb', 'cy53-1-CD45-pos-2-B03-S975-comb',
        'cy53-1-CD45-pos-1-A05-S5-comb', 'cy53-1-CD45-pos-2-B12-S984-comb',
        ...
        'CY84_PRIM_POS_All_6_A11_S395_comb',
        'CY94_CD45NEG_CD90POS_2_E06_S54_comb',
        'CY84_PRIM_POS_All_8_A02_S98_comb', 'CY84_PRIM_POS_All_6_D12_S432_comb',
        'CY84_PRIM_POS_All_6_E07_S439_comb',
        'CY84_PRIM_POS_All_7_B04_S208_comb',
        'CY84_PRIM_POS_All_7_E12_S252_comb',
        'CY84_PRIM_POS_All_8_C07_S127_comb',
        'CY84_PRIM_POS_All_7_A07_S199_comb',
        'CY84_PRIM_POS_All_7_D01_S229_comb'],
       dtype='object', length=125),
 'Mal': Index(['Cy71_CD45_D08_S524_comb', 'Cy81_FNA_CD45_B01_S301_comb',
        'Cy80_II_CD45_B07_S883_comb', 'Cy81_Bulk_CD45_B10_S118_comb',
        'Cy71_CD45_B05_S497_comb', 'Cy80_II_CD45_C09_S897_comb',
        'Cy81_FNA_CD45_E05_S341_comb', 'Cy81_Bulk_CD45_E10_S154_comb',
        'Cy80_II_CD45_H07_S955_comb', 'Cy81_FNA_CD45_D09_S333_comb',
        ...
        'CY88CD45POS_7_C07_S223_comb', 'CY88CD45_150813_B05_S305_comb',
        'CY88CD45POS_2_B11_S407_comb', 'CY94_CD45NEG_CD90POS_2_G09_S81_comb',
        'CY94_CD45NEG_CD90POS_2_B09_S21_comb',
        'CY84_PRIM_POS_All_6_D06_S426_comb',
        'CY84_PRIM_POS_All_6_G01_S457_comb',
        'CY94_CD45NEG_CD90POS_2_D04_S40_comb',
        'CY84_PRIM_POS_All_6_E04_S436_comb',
        'CY94_CD45NEG_CD90POS_2_D06_S42_comb'],
       dtype='object', length=1257),
 'NK': Index(['CY58_1_CD45_F08_S1028_comb', 'CY58_1_CD45_A05_S965_comb',
        'CY58_1_CD45_D01_S997_comb', 'cy80-Cd45-pos-Pd1-neg-S293-E05-S293-comb',
        'cy82-CD45-pos-3-A07-S7-comb', 'cy82-CD45-pos-3-B08-S20-comb',
        'cy53-1-CD45-pos-2-A04-S964-comb', 'cy53-1-CD45-pos-1-D01-S37-comb',
        'cy53-1-CD45-pos-2-B08-S980-comb', 'cy53-1-CD45-pos-1-F04-S64-comb',
        'cy80-CD45-pos-PD1-pos-H01-S181-comb', 'cy53-1-CD45-pos-1-A02-S2-comb',
        'cy58-1-CD45-pos-C04-S604-comb', 'cy53-1-CD45-neg-G06-S366-comb',
        'cy53-1-CD45-pos-1-A08-S8-comb', 'cy72-CD45-pos-H08-S956-comb',
        'cy53-1-CD45-pos-2-G09-S1041-comb', 'cy53-1-CD45-pos-2-D03-S999-comb',
        'cy53-1-CD45-pos-2-A06-S966-comb', 'cy74-CD45-pos-A10-S682-comb',
        'cy79-p3-CD45-pos-PD1-neg-E09-S153-comb', 'Cy67-CD45pos-S2-C4_S28',
        'cy80-CD45-neg-E09-S441-comb', 'cy80-CD45-neg-D01-S421-comb',
        'cy84_Primary_CD45_pos_E08_S440_comb',
        'cy84_Primary_CD45_pos_C07_S415_comb',
        'cy84_Primary_CD45_pos_G03_S459_comb',
        'cy84_Primary_CD45_pos_C11_S419_comb',
        'cy60_1_cd_45_pos_3_E10_S346_comb', 'cy60_1_cd_45_pos_3_A03_S291_comb',
        'cy60_1_cd_45_pos_3_B01_S301_comb', 'cy60_1_cd_45_pos_3_G01_S361_comb',
        'cy60_1_cd_45_pos_4_E08_S56_comb', 'cy60_1_cd_45_pos_4_C11_S35_comb',
        'cy88_cd_45_pos_F02_S446_comb', 'cy60_1_cd_45_pos_3_A07_S295_comb',
        'cy60_1_cd_45_pos_4_C07_S31_comb', 'cy88_cd_45_pos_3_F05_S641_comb',
        'cy88_cd_45_pos_D01_S421_comb', 'cy88_cd_45_pos_3_F11_S647_comb',
        'cy60_1_cd_45_pos_4_G03_S75_comb', 'cy88_cd_45_pos_H05_S473_comb',
        'cy60_1_cd_45_pos_3_C04_S316_comb', 'CY89FNA_A03_S195_comb',
        'CY88CD45POS_2_F07_S451_comb', 'CY88CD45_150813_D05_S329_comb',
        'CY88CD45POS_2_G06_S462_comb', 'CY94CD45POS_1_E04_S148_comb',
        'CY88CD45POS_2_F09_S453_comb', 'CY84_PRIM_POS_All_7_E07_S247_comb',
        'CY84_PRIM_POS_All_7_B06_S210_comb',
        'CY84_PRIM_POS_All_8_E10_S154_comb'],
       dtype='object'),
 'T': Index(['CY58_1_CD45_B02_S974_comb', 'Cy72_CD45_D09_S717_comb',
        'Cy74_CD45_A03_S387_comb', 'Cy74_CD45_F09_S453_comb',
        'CY58_1_CD45_D03_S999_comb', 'Cy72_CD45_C01_S697_comb',
        'Cy71_CD45_D07_S523_comb', 'Cy74_CD45_D04_S424_comb',
        'Cy74_CD45_C11_S419_comb', 'Cy81_FNA_CD45_D07_S235_comb',
        ...
        'CY75_1_CD45_CD8_8__S384_comb', 'CY75_1_CD45_CD8_7__S265_comb',
        'CY75_1_CD45_CD8_3__S127_comb', 'CY75_1_CD45_CD8_1__S61_comb',
        'CY75_1_CD45_CD8_1__S12_comb', 'CY75_1_CD45_CD8_1__S25_comb',
        'CY75_1_CD45_CD8_7__S223_comb', 'CY75_1_CD45_CD8_1__S65_comb',
        'CY75_1_CD45_CD8_1__S93_comb', 'CY75_1_CD45_CD8_1__S76_comb'],
       dtype='object', length=2064),
 'unspec': Index(['Cy81_FNA_CD45_C12_S228_comb', 'CY58_1_CD45_F01_S1021_comb',
        'Cy81_Bulk_CD45_B11_S119_comb', 'Cy72_CD45_E02_S722_comb',
        'Cy81_Bulk_CD45_B07_S115_comb', 'Cy72_CD45_H07_S763_comb',
        'Cy81_Bulk_CD45_D09_S141_comb', 'Cy74_CD45_E06_S438_comb',
        'Cy71_CD45_H05_S569_comb', 'Cy71_CD45_H07_S571_comb',
        ...
        'CY94_CD45NEG_CD90POS_2_E09_S57_comb',
        'CY94_CD45NEG_CD90POS_2_C01_S25_comb',
        'CY84_PRIM_POS_All_7_F10_S262_comb',
        'CY84_PRIM_POS_All_6_A01_S385_comb',
        'CY84_PRIM_POS_All_7_E03_S243_comb', 'CY84_PRIM_POS_All_8_A01_S97_comb',
        'CY84_PRIM_POS_All_8_E11_S155_comb',
        'CY94_CD45NEG_CD90POS_2_D08_S44_comb',
        'CY84_PRIM_POS_All_7_A01_S193_comb', 'CY75_1_CD45_CD8_7__S274_comb'],
       dtype='object', length=506)}

What do we do with this?

Handling groups

The most straightforward way is to get a group and work with it.


In [51]:
t_cells = exp_g.get_group('T')
print(t_cells.shape)


(2064, 23686)

That's the right size: 2064 rows, one per T cell, which matches the value_counts() result above.


In [52]:
print('Average CD3D expression in T cells: {0:.2f}'.format(t_cells['CD3D'].mean()))


Average CD3D expression in T cells: 6.51

A straightforward extension that calculates the average for every group is:


In [53]:
for g_ind in exp_g.groups.keys():
    group = exp_g.get_group(g_ind)
    print('Average CD3D expression in {2:s} cells is {0:.2f} (n={1:d} cells)'.format(
        group['CD3D'].mean(), group.shape[0], g_ind))


Average CD3D expression in B cells is 0.07 (n=515 cells)
Average CD3D expression in CAF cells is 0.07 (n=61 cells)
Average CD3D expression in Endo cells is 0.00 (n=65 cells)
Average CD3D expression in Macro cells is 0.08 (n=125 cells)
Average CD3D expression in Mal cells is 0.10 (n=1257 cells)
Average CD3D expression in NK cells is 0.15 (n=52 cells)
Average CD3D expression in T cells is 6.51 (n=2064 cells)
Average CD3D expression in unspec cells is 1.45 (n=506 cells)
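The same loop can also be written by iterating over the groupby object itself, which yields (name, sub-DataFrame) pairs. The sketch below uses a made-up four-cell table; `toyT`, `labels` and `geneA` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Made-up expression values for one gene in four hypothetical cells
toyT = pd.DataFrame({"geneA": [5.0, 0.0, 7.0, 1.0]},
                    index=["c1", "c2", "c3", "c4"])
labels = np.array(["T", "B", "T", "B"])

means = {}
# Iterating over a groupby object yields one (name, group) pair per group
for name, group in toyT.groupby(labels):
    means[name] = group["geneA"].mean()
    print("{0}: mean={1:.2f}, n={2:d}".format(name, means[name], group.shape[0]))
```

This avoids the separate `get_group` lookup for each key.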

A more elegant way to achieve this:


In [54]:
exp_g.aggregate({'CD3D': ['count', 'mean', 'min', 'max']})


Out[54]:
CD3D
count mean min max
B 515 0.072389 0.0 6.5381
CAF 61 0.072125 0.0 3.7842
Endo 65 0.000000 0.0 0.0000
Macro 125 0.080136 0.0 3.9692
Mal 1257 0.101038 0.0 7.8954
NK 52 0.152696 0.0 7.2221
T 2064 6.513088 0.0 10.1230
unspec 506 1.446838 0.0 10.0610
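When only one gene is of interest, a possible variation is to select the column from the grouped object first and then aggregate, which gives a table with a single level of column names. A minimal sketch on made-up data (`toyT`, `labels` and `geneA` are hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up expression values: cells as rows, genes as columns
toyT = pd.DataFrame(
    {"geneA": [5.0, 0.0, 7.0, 1.0], "geneB": [0.0, 3.0, 0.0, 4.0]},
    index=["c1", "c2", "c3", "c4"],
)
labels = np.array(["T", "B", "T", "B"])

# Select one column of the grouped data, then compute several statistics
summary = toyT.groupby(labels)["geneA"].agg(["count", "mean", "min", "max"])
print(summary)
```

The dictionary form used above is the better fit once several genes need different sets of statistics.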

What if we wanted to see a few different marker genes?

Here's a rough (and imprecise) guideline:

  • CD14: macrophages
  • CD20 (aka MS4A1): B cells
  • CD56 (aka NCAM1): NK cells

In [55]:
exp_g.aggregate({
    'CD3D': ['count', 'mean', 'min', 'max'],
    'CD14': ['mean', 'min', 'max'],
    'MS4A1': ['mean', 'min', 'max'],
    'NCAM1': ['mean', 'min', 'max'],})


Out[55]:
CD3D CD14 MS4A1 NCAM1
count mean min max mean min max mean min max mean min max
B 515 0.072389 0.0 6.5381 0.010099 0.0 3.1032 6.475399 0.0 9.28330 0.022270 0.0 3.884400
CAF 61 0.072125 0.0 3.7842 0.351683 0.0 4.4998 0.194591 0.0 6.44500 0.142036 0.0 4.739900
Endo 65 0.000000 0.0 0.0000 0.662045 0.0 5.7231 0.067491 0.0 3.88920 0.455900 0.0 3.343300
Macro 125 0.080136 0.0 3.9692 7.191868 0.0 10.6250 0.001027 0.0 0.10836 0.000442 0.0 0.055196
Mal 1257 0.101038 0.0 7.8954 0.118596 0.0 9.6615 0.034413 0.0 4.91650 0.265723 0.0 3.937100
NK 52 0.152696 0.0 7.2221 0.093625 0.0 4.8685 0.344963 0.0 7.83110 2.333908 0.0 5.722400
T 2064 6.513088 0.0 10.1230 0.055831 0.0 8.7551 0.250117 0.0 7.58650 0.030788 0.0 5.229600
unspec 506 1.446838 0.0 10.0610 0.427557 0.0 8.2257 1.752007 0.0 8.49240 0.186851 0.0 5.480200

What if we wanted to do something more complicated?

For example, each of these marker genes has a minimum expression of zero, even within its own cell population.

How many cells actually have zero expression of their markers?


In [56]:
# Count the cells in which a gene has exactly zero expression
f = lambda x: sum(x==0)

In [57]:
exp_g.aggregate({
    'CD3D': ['count',f],
    'CD14': f,
    'MS4A1': f,
    'NCAM1': f,})


Out[57]:
CD3D CD14 MS4A1 NCAM1
count <lambda> <lambda> <lambda> <lambda>
B 515 499.0 512.0 16.0 505.0
CAF 61 59.0 54.0 57.0 58.0
Endo 65 65.0 54.0 63.0 49.0
Macro 125 119.0 4.0 123.0 124.0
Mal 1257 1220.0 1206.0 1225.0 1062.0
NK 52 50.0 51.0 48.0 20.0
T 2064 181.0 2024.0 1892.0 2008.0
unspec 506 373.0 453.0 338.0 455.0
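The `<lambda>` column headers are not very informative. One option, sketched below on made-up data, is to pass a named function, whose name then appears in the header. The sketch also shows the fraction of zeros per group, computed as the mean of a boolean table. `n_zero`, `toyT` and `labels` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Made-up expression values: cells as rows, genes as columns
toyT = pd.DataFrame(
    {"geneA": [5.0, 0.0, 7.0, 1.0], "geneB": [0.0, 3.0, 0.0, 4.0]},
    index=["c1", "c2", "c3", "c4"],
)
labels = np.array(["T", "B", "T", "B"])

def n_zero(x):
    """Return the number of entries in x that are equal to zero."""
    return int((x == 0).sum())

# The function name is used as the column label in the result
zeros = toyT.groupby(labels).agg({"geneA": ["count", n_zero],
                                  "geneB": [n_zero]})
print(zeros)

# Fraction of zero entries per group: the mean of True/False values
frac_zero = toyT.eq(0).groupby(labels).mean()
print(frac_zero)
```

Fractions are easier to compare across groups of very different sizes than raw counts.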

Output

Finally, let's see how we can write Pandas data frames to files.

For example, let's write the CD3D expression of T cells into an Excel file called T-cells.xlsx in the appropriate format.

Any ideas?


In [60]:
t_cells['CD3D'].to_excel('T-cells.xlsx')

In [61]:
# 'open' is a macOS command; on Linux try xdg-open, on Windows try start
!open T-cells.xlsx
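If an Excel file is not strictly required, a delimited text file is a possible alternative that needs no extra packages (`to_excel` relies on an Excel writer engine such as openpyxl being installed). The sketch below writes a made-up Series to CSV and reads it back; it uses an in-memory buffer in place of a real file, and `expr` and its values are hypothetical.

```python
import io
import pandas as pd

# Made-up expression values for three hypothetical cells
expr = pd.Series([5.0, 0.0, 7.0], index=["c1", "c2", "c3"], name="geneA")

# Write to an in-memory text buffer; a file path would work the same way
buf = io.StringIO()
expr.to_csv(buf, header=True)

# Read it back, using the first column as the index
buf.seek(0)
back = pd.read_csv(buf, index_col=0)["geneA"]
print(back)
```

Passing a file name such as `'T-cells.csv'` in place of `buf` would write to disk.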