Three metadata files for this analysis:
tara_metadata_SRF.tsv -- generated by singlecell_tara_stats.ipynb
genome_metadata.tsv -- generated here
og_metadata.tsv -- generated here
In [2]:
import pandas as pd
/Users/luke/anaconda/envs/qiime190/lib/python2.7/site-packages/pandas/computation/__init__.py:19: UserWarning: The installed version of numexpr 2.4.4 is not supported in pandas and will be not be used
UserWarning)
In [114]:
# read tara_metadata_SRF.tsv
df_tara_metadata = pd.read_csv('/Users/luke/singlecell/notebooks/tara_metadata_SRF.tsv', sep='\t', index_col=0)
In [115]:
df_tara_metadata
Out[115]:
| Sample label [TARA_station#_environmental-feature_size-fraction] | PANGAEA Sample ID | Sample label [TARA_station#_environmental-feature_size-fraction].1 | Mean_Date [YY/MM/DD hh:mm]* | Mean_Lat* | Mean_Long* | Mean_Depth [m]* | Mean_Temperature [deg C]* | Mean_Salinity [PSU]* | Mean_Oxygen [umol/kg]* | Mean_Nitrates[umol/L]* | ... | OG.Richness | OG.Evenness | FC - heterotrophs [cells/mL] | FC - autotrophs [cells/mL] | FC - bacteria [cells/mL] | FC - picoeukaryotes [cells/mL] | minimum generation time [h] | category_latitude | category_temperature | category_redsea |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TARA_100_SRF_0.22-3 | TARA_B100000963 | TARA_100_SRF_0.22-3 | 4/15/11 13:10 | -12.99 | -95.99 | 5.5 | 25.3 | 35.8 | 200.2 | 4.6 | ... | 14283.0 | 0.749 | 464267.0 | 155789.0 | 620056.0 | 7780.0 | 14.0 | tropical | tropical | False |
| TARA_102_SRF_0.22-3 | TARA_B100000900 | TARA_102_SRF_0.22-3 | 4/21/11 20:16 | -5.25 | -85.16 | 5.5 | 24.9 | 34.7 | 206.0 | 11.7 | ... | 13566.0 | 0.750 | 1032721.0 | 192335.0 | 1225056.0 | 19091.0 | 12.0 | tropical | tropical | False |
| TARA_109_SRF_0.22-3 | TARA_B100000925 | TARA_109_SRF_0.22-3 | 5/12/11 13:38 | 1.99 | -84.58 | 5.4 | 27.6 | 33.4 | 198.6 | 1.2 | ... | 15162.0 | 0.745 | 1053096.0 | 0.0 | 1053096.0 | 0.0 | 11.0 | tropical | tropical | False |
| TARA_110_SRF_0.22-3 | TARA_B100001109 | TARA_110_SRF_0.22-3 | 5/21/11 12:37 | -2.01 | -84.59 | 5.5 | 23.9 | 35.0 | 190.8 | 6.7 | ... | 15814.0 | 0.743 | 646867.0 | 189912.0 | 836779.0 | 12656.0 | 13.0 | tropical | tropical | False |
| TARA_111_SRF_0.22-3 | TARA_B100000575 | TARA_111_SRF_0.22-3 | 5/31/11 14:12 | -16.96 | -100.63 | 5.9 | 22.8 | 36.0 | 208.9 | 2.6 | ... | 14494.0 | 0.744 | 498513.0 | 112358.0 | 610871.0 | 5839.0 | 13.0 | tropical | tropical | False |
| TARA_112_SRF_0.22-3 | TARA_B100000941 | TARA_112_SRF_0.22-3 | 6/14/11 16:54 | -23.28 | -129.40 | 5.4 | 24.2 | 36.5 | 202.2 | -0.9 | ... | 16502.0 | 0.733 | 493328.0 | 118066.0 | 611394.0 | 321.0 | 15.0 | tropical | tropical | False |
| TARA_122_SRF_0.22-3 | TARA_B100001115 | TARA_122_SRF_0.22-3 | 7/26/11 17:34 | -8.99 | -139.21 | 5.9 | 26.5 | 35.4 | 186.2 | 4.0 | ... | 16069.0 | 0.736 | NaN | 150974.0 | 150974.0 | 5770.0 | 13.0 | tropical | tropical | False |
| TARA_123_SRF_0.22-3 | TARA_B100000683 | TARA_123_SRF_0.22-3 | 7/31/11 17:24 | -8.91 | -140.28 | 5.5 | 26.6 | 35.4 | 189.8 | 3.9 | ... | 19169.0 | 0.731 | 721664.0 | 31112.0 | 752776.0 | 10279.0 | 10.0 | tropical | tropical | False |
| TARA_124_SRF_0.22-3 | TARA_B100000674 | TARA_124_SRF_0.22-3 | 8/5/11 0:27 | -9.11 | -140.58 | 9.9 | 26.5 | 35.4 | 190.7 | 5.1 | ... | 17533.0 | 0.738 | 705265.0 | 250340.0 | 955605.0 | 10669.0 | 13.0 | tropical | tropical | False |
| TARA_125_SRF_0.22-3 | TARA_B100001121 | TARA_125_SRF_0.22-3 | 8/8/11 17:38 | -8.91 | -142.56 | 5.5 | 26.8 | 35.4 | 187.3 | 5.0 | ... | 17303.0 | 0.734 | 800561.0 | 146848.0 | 947408.0 | 5808.0 | 14.0 | tropical | tropical | False |
| TARA_128_SRF_0.22-3 | TARA_B100000609 | TARA_128_SRF_0.22-3 | 9/4/11 18:04 | 0.00 | -153.68 | 5.4 | 26.1 | 35.1 | 179.9 | 2.5 | ... | 18140.0 | 0.732 | 435093.0 | 179847.0 | 614940.0 | 7872.0 | 12.0 | tropical | tropical | False |
| TARA_132_SRF_0.22-3 | TARA_B100001248 | TARA_132_SRF_0.22-3 | 10/4/11 17:49 | 31.52 | -159.00 | 5.5 | 25.2 | 35.2 | 197.7 | -1.0 | ... | 14422.0 | 0.741 | 416576.0 | 121337.0 | 537913.0 | 932.0 | 13.0 | subtropical | tropical | False |
| TARA_133_SRF_0.22-3 | TARA_B100001093 | TARA_133_SRF_0.22-3 | 10/18/11 18:58 | 35.41 | -127.74 | 5.5 | 19.2 | 33.1 | 224.4 | -0.7 | ... | 15279.0 | 0.742 | 1000349.0 | 169828.0 | 1170177.0 | 6068.0 | 13.0 | subtropical | temperate | False |
| TARA_137_SRF_0.22-3 | TARA_B100001287 | TARA_137_SRF_0.22-3 | 12/2/11 14:10 | 14.20 | -116.63 | 5.4 | 26.4 | 33.9 | 195.1 | 3.2 | ... | 13939.0 | 0.747 | 755425.0 | 273894.0 | 1029319.0 | 10134.0 | 12.0 | tropical | tropical | False |
| TARA_138_SRF_0.22-3 | TARA_B100001989 | TARA_138_SRF_0.22-3 | 12/10/11 13:58 | 6.33 | -102.94 | 5.4 | 26.6 | 33.4 | 196.9 | -1.2 | ... | 17717.0 | 0.728 | 394198.0 | 222835.0 | 617033.0 | 1567.0 | 15.0 | tropical | tropical | False |
| TARA_140_SRF_0.22-3 | TARA_B100002019 | TARA_140_SRF_0.22-3 | 12/21/11 16:25 | 7.41 | -79.30 | 5.4 | 26.6 | 28.9 | 205.3 | -3.8 | ... | 17614.0 | 0.735 | 867923.0 | 264990.0 | 1132914.0 | 5808.0 | 13.0 | tropical | tropical | False |
| TARA_141_SRF_0.22-3 | TARA_B100001939 | TARA_141_SRF_0.22-3 | 12/30/11 12:56 | 9.85 | -80.04 | 5.4 | 27.1 | 34.3 | 195.5 | -1.7 | ... | 16828.0 | 0.743 | 439099.0 | 139450.0 | 578549.0 | 2277.0 | 10.0 | tropical | tropical | False |
| TARA_142_SRF_0.22-3 | TARA_B100002051 | TARA_142_SRF_0.22-3 | 1/9/12 12:41 | 25.51 | -88.38 | 5.4 | 25.0 | 36.2 | 194.3 | -2.2 | ... | 17734.0 | 0.735 | 436017.0 | 89629.0 | 525647.0 | 1597.0 | 11.0 | subtropical | tropical | False |
| TARA_145_SRF_0.22-3 | TARA_B100001142 | TARA_145_SRF_0.22-3 | 2/2/12 12:02 | 39.23 | -70.03 | 5.5 | 14.1 | 35.2 | 233.9 | 3.5 | ... | 14527.0 | 0.747 | 374730.0 | 22789.0 | 397520.0 | 7749.0 | 10.0 | subtropical | temperate | False |
| TARA_146_SRF_0.22-3 | TARA_B100001540 | TARA_146_SRF_0.22-3 | 2/15/12 12:59 | 34.68 | -71.30 | 11.7 | 19.1 | 36.5 | 214.4 | 0.9 | ... | 15917.0 | 0.739 | 338145.0 | 27459.0 | 365604.0 | 4165.0 | 11.0 | subtropical | temperate | False |
| TARA_148_SRF_0.22-3 | TARA_B100001741 | TARA_148_SRF_0.22-3 | 2/24/12 11:41 | 31.70 | -64.25 | 6.1 | 20.4 | 36.6 | 212.6 | NaN | ... | 15943.0 | 0.741 | 318391.0 | 45701.0 | 364093.0 | 1911.0 | 10.0 | subtropical | tropical | False |
| TARA_149_SRF_0.22-3 | TARA_B100001758 | TARA_149_SRF_0.22-3 | 3/1/12 11:55 | 34.10 | -49.89 | 5.5 | 18.7 | 36.4 | 220.2 | -1.2 | ... | 15146.0 | 0.742 | 379018.0 | 54719.0 | 433737.0 | 7375.0 | 10.0 | subtropical | temperate | False |
| TARA_150_SRF_0.22-3 | TARA_B100001769 | TARA_150_SRF_0.22-3 | 3/5/12 11:02 | 35.91 | -37.26 | 5.5 | 17.6 | 36.3 | 228.4 | -0.3 | ... | 14688.0 | 0.743 | 514205.0 | 146022.0 | 660227.0 | 7940.0 | 11.0 | subtropical | temperate | False |
| TARA_151_SRF_0.22-3 | TARA_B100001564 | TARA_151_SRF_0.22-3 | 3/9/12 10:13 | 36.16 | -29.01 | 5.4 | 17.3 | 36.2 | 232.1 | 0.3 | ... | 14772.0 | 0.742 | 377257.0 | 66886.0 | 444143.0 | 3569.0 | 9.0 | subtropical | temperate | False |
| TARA_152_SRF_0.22-3 | TARA_B100001173 | TARA_152_SRF_0.22-3 | 3/19/12 10:42 | 43.69 | -16.85 | 5.4 | 14.3 | 36.0 | 243.1 | 3.0 | ... | 14843.0 | 0.748 | 555269.0 | 104028.0 | 659297.0 | 7551.0 | 11.0 | temperate | temperate | False |
| TARA_018_SRF_0.22-1.6 | TARA_A100000164 | TARA_018_SRF_0.22-1.6 | 11/2/09 8:18 | 35.76 | 14.26 | 5.4 | 21.4 | 37.9 | 207.8 | NaN | ... | 14161.0 | 0.747 | 633618.0 | 39180.0 | 672798.0 | 947.0 | 14.0 | subtropical | tropical | False |
| TARA_023_SRF_0.22-1.6 | TARA_E500000075 | TARA_023_SRF_0.22-1.6 | 11/18/09 8:23 | 42.21 | 17.71 | 5.5 | 17.6 | 38.2 | 220.0 | NaN | ... | 14477.0 | 0.745 | 746759.0 | 57854.0 | 804613.0 | 1499.0 | 18.0 | temperate | temperate | False |
| TARA_025_SRF_0.22-1.6 | TARA_E500000178 | TARA_025_SRF_0.22-1.6 | 11/23/09 9:14 | 39.39 | 19.39 | 5.5 | 18.3 | 38.2 | 218.0 | NaN | ... | 14408.0 | 0.749 | 264404.0 | 34000.0 | 298404.0 | 1123.0 | 21.0 | subtropical | temperate | False |
| TARA_030_SRF_0.22-1.6 | TARA_A100001015 | TARA_030_SRF_0.22-1.6 | 12/15/09 10:40 | 33.92 | 32.89 | 5.4 | 20.5 | 39.4 | 207.6 | -0.7 | ... | 12612.0 | 0.754 | 232380.0 | 36418.0 | 268797.0 | 1225.0 | 15.0 | subtropical | tropical | False |
| TARA_031_SRF_0.22-1.6 | TARA_A100001388 | TARA_031_SRF_0.22-1.6 | 1/9/10 7:25 | 27.16 | 34.83 | 5.4 | 25.1 | 40.0 | 188.8 | -0.1 | ... | 14134.0 | 0.740 | 473351.0 | 204723.0 | 678074.0 | 1192.0 | 13.0 | subtropical | tropical | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| TARA_036_SRF_0.22-1.6 | TARA_Y100000022 | TARA_036_SRF_0.22-1.6 | 3/12/10 6:19 | 20.82 | 63.50 | 5.5 | 25.6 | 36.5 | 210.9 | NaN | ... | 12504.0 | 0.754 | 2216959.0 | 231702.0 | 2448661.0 | 4807.0 | 12.0 | tropical | tropical | False |
| TARA_038_SRF_0.22-1.6 | TARA_Y100000287 | TARA_038_SRF_0.22-1.6 | 3/15/10 3:44 | 19.04 | 64.49 | 5.5 | 26.2 | 36.6 | 199.9 | 0.9 | ... | 13741.0 | 0.749 | 1626279.0 | 285872.0 | 1912151.0 | 8040.0 | 11.0 | tropical | tropical | False |
| TARA_004_SRF_0.22-1.6 | TARA_Y200000002 | TARA_004_SRF_0.22-1.6 | 9/15/09 10:15 | 36.55 | -6.57 | 10.0 | 20.5 | 36.6 | NaN | NaN | ... | 14591.0 | 0.745 | 839531.0 | 54593.0 | 894124.0 | 1222.0 | 15.0 | subtropical | tropical | False |
| TARA_041_SRF_0.22-1.6 | TARA_B100000282 | TARA_041_SRF_0.22-1.6 | 3/30/10 2:56 | 14.60 | 69.98 | 5.5 | 29.1 | 36.0 | 187.4 | -1.3 | ... | 16594.0 | 0.731 | 491899.0 | 249308.0 | 741207.0 | 818.0 | 14.0 | tropical | tropical | False |
| TARA_042_SRF_0.22-1.6 | TARA_B100000123 | TARA_042_SRF_0.22-1.6 | 4/4/10 3:13 | 6.03 | 73.89 | 5.4 | 30.0 | 34.6 | 189.3 | -1.5 | ... | 15541.0 | 0.735 | 808564.0 | 188746.0 | 997310.0 | 909.0 | 14.0 | tropical | tropical | False |
| TARA_045_SRF_0.22-1.6 | TARA_B100000161 | TARA_045_SRF_0.22-1.6 | 4/13/10 3:21 | 0.00 | 71.64 | 5.5 | 30.5 | 35.1 | 185.2 | 2.6 | ... | 14553.0 | 0.736 | 393599.0 | 305984.0 | 699583.0 | 818.0 | 13.0 | tropical | tropical | False |
| TARA_048_SRF_0.22-1.6 | TARA_B100000242 | TARA_048_SRF_0.22-1.6 | 4/19/10 7:54 | -9.39 | 66.42 | 5.4 | 29.8 | 34.2 | 187.3 | -2.2 | ... | 13177.0 | 0.744 | 313030.0 | 148147.0 | 461176.0 | 566.0 | 13.0 | tropical | tropical | False |
| TARA_052_SRF_0.22-1.6 | TARA_B100000212 | TARA_052_SRF_0.22-1.6 | 5/17/10 3:56 | -16.96 | 53.99 | 5.5 | 27.9 | 34.6 | 191.7 | NaN | ... | 14573.0 | 0.739 | 425006.0 | 206037.0 | 631044.0 | 3913.0 | 15.0 | tropical | tropical | False |
| TARA_056_SRF_0.22-3 | TARA_B000000609 | TARA_056_SRF_0.22-3 | 6/26/10 7:12 | -15.34 | 43.29 | 5.5 | 27.3 | 35.0 | 193.1 | NaN | ... | 13543.0 | 0.744 | 409420.0 | 207207.0 | 616626.0 | 634.0 | 13.0 | tropical | tropical | False |
| TARA_057_SRF_0.22-3 | TARA_B000000565 | TARA_057_SRF_0.22-3 | 6/27/10 12:07 | -17.02 | 42.74 | 5.6 | 27.0 | 35.1 | 190.1 | NaN | ... | 14117.0 | 0.749 | NaN | NaN | NaN | NaN | 10.0 | tropical | tropical | False |
| TARA_062_SRF_0.22-3 | TARA_B000000532 | TARA_062_SRF_0.22-3 | 7/2/10 10:17 | -21.38 | 39.56 | 5.4 | 25.1 | 35.3 | 199.9 | -0.2 | ... | 13750.0 | 0.746 | 371432.0 | 149622.0 | 521054.0 | 1949.0 | 12.0 | tropical | tropical | False |
| TARA_064_SRF_0.22-3 | TARA_B100000401 | TARA_064_SRF_0.22-3 | 7/7/10 4:56 | -29.50 | 37.99 | 5.5 | 22.2 | 35.3 | 210.0 | NaN | ... | 13216.0 | 0.748 | 570751.0 | 403151.0 | 973902.0 | 7078.0 | 11.0 | subtropical | tropical | False |
| TARA_065_SRF_0.22-3 | TARA_B000000437 | TARA_065_SRF_0.22-3 | 7/12/10 6:46 | -35.19 | 26.29 | 5.9 | 21.8 | 35.4 | 207.0 | NaN | ... | 13735.0 | 0.748 | 538371.0 | 66905.0 | 605276.0 | 6590.0 | 12.0 | subtropical | tropical | False |
| TARA_066_SRF_0.22-3 | TARA_B000000475 | TARA_066_SRF_0.22-3 | 7/15/10 12:26 | -34.94 | 17.94 | 5.4 | 15.0 | 35.3 | 238.9 | 2.5 | ... | 13082.0 | 0.752 | 564191.0 | 32703.0 | 596894.0 | 15397.0 | 11.0 | subtropical | temperate | False |
| TARA_067_SRF_0.22-3 | TARA_B100000497 | TARA_067_SRF_0.22-3 | 9/7/10 6:35 | -32.23 | 17.71 | 5.5 | 12.8 | 34.8 | 249.4 | 0.4 | ... | 14423.0 | 0.755 | 1840174.0 | 113647.0 | 1953821.0 | 46597.0 | 13.0 | subtropical | temperate | False |
| TARA_068_SRF_0.22-3 | TARA_B100000475 | TARA_068_SRF_0.22-3 | 9/14/10 9:14 | -31.03 | 4.69 | 5.4 | 16.8 | 35.7 | 231.9 | NaN | ... | 16906.0 | 0.738 | 439221.0 | 192777.0 | 631998.0 | 18473.0 | 10.0 | subtropical | temperate | False |
| TARA_007_SRF_0.22-1.6 | TARA_A200000113 | TARA_007_SRF_0.22-1.6 | 9/23/09 15:06 | 37.02 | 1.95 | 7.5 | 23.8 | 37.5 | NaN | NaN | ... | 15495.0 | 0.746 | 1004684.0 | 50586.0 | 1055270.0 | 813.0 | 14.0 | subtropical | tropical | False |
| TARA_070_SRF_0.22-3 | TARA_B100000459 | TARA_070_SRF_0.22-3 | 9/21/10 7:05 | -20.44 | -3.19 | 5.4 | 19.8 | 36.4 | 215.7 | 0.2 | ... | 16493.0 | 0.737 | 572821.0 | 117261.0 | 690082.0 | 9740.0 | 9.0 | tropical | temperate | False |
| TARA_072_SRF_0.22-3 | TARA_B100000424 | TARA_072_SRF_0.22-3 | 10/5/10 8:10 | -8.78 | -17.91 | 5.8 | 25.0 | 36.4 | 199.1 | 0.3 | ... | 17255.0 | 0.730 | NaN | NaN | NaN | NaN | 16.0 | tropical | tropical | False |
| TARA_076_SRF_0.22-3 | TARA_B100000513 | TARA_076_SRF_0.22-3 | 10/16/10 10:07 | -20.94 | -35.20 | 5.5 | 23.3 | 37.1 | 206.2 | 0.0 | ... | 17695.0 | 0.733 | 418710.0 | 191229.0 | 609939.0 | 2215.0 | 16.0 | tropical | tropical | False |
| TARA_078_SRF_0.22-3 | TARA_B100000524 | TARA_078_SRF_0.22-3 | 11/4/10 10:24 | -30.14 | -43.29 | 5.6 | 19.9 | 36.3 | 221.5 | -0.4 | ... | 15863.0 | 0.736 | 447396.0 | 213507.0 | 660903.0 | 5498.0 | 14.0 | subtropical | temperate | False |
| TARA_082_SRF_0.22-3 | TARA_B100000768 | TARA_082_SRF_0.22-3 | 12/6/10 10:25 | -47.18 | -58.30 | 5.5 | 7.3 | 34.0 | 305.0 | 17.1 | ... | 12924.0 | 0.761 | 10639.0 | 0.0 | 10639.0 | 0.0 | 10.0 | temperate | polar | False |
| TARA_084_SRF_0.22-3 | TARA_B100000780 | TARA_084_SRF_0.22-3 | 1/3/11 11:10 | -60.23 | -60.65 | 5.9 | 1.8 | 33.7 | 338.3 | 24.4 | ... | 12175.0 | 0.766 | 265149.0 | 0.0 | 265149.0 | 1506.0 | 9.0 | temperate | polar | False |
| TARA_085_SRF_0.22-3 | TARA_B100000787 | TARA_085_SRF_0.22-3 | 1/6/11 10:26 | -62.03 | -49.54 | 5.9 | 0.7 | 34.4 | 343.4 | 27.5 | ... | 10111.0 | 0.776 | 411065.0 | 0.0 | 411065.0 | 0.0 | 8.0 | temperate | polar | False |
| TARA_009_SRF_0.22-1.6 | TARA_X000000950 | TARA_009_SRF_0.22-1.6 | 9/28/09 13:36 | 39.07 | 5.86 | 5.8 | 23.9 | 38.0 | NaN | NaN | ... | 15315.0 | 0.750 | 901463.0 | 39330.0 | 940793.0 | 1231.0 | 17.0 | subtropical | tropical | False |
| TARA_093_SRF_0.22-3 | TARA_B100001063 | TARA_093_SRF_0.22-3 | 3/12/11 11:49 | -34.05 | -73.08 | 5.3 | 18.0 | 34.3 | 244.9 | -1.5 | ... | 17398.0 | 0.751 | 1495550.0 | 223179.0 | 1718729.0 | 8666.0 | 9.0 | subtropical | temperate | False |
| TARA_094_SRF_0.22-3 | TARA_B100001057 | TARA_094_SRF_0.22-3 | 3/18/11 14:16 | -32.78 | -87.09 | 5.4 | 21.1 | 34.7 | 219.2 | -0.4 | ... | 16192.0 | 0.741 | NaN | NaN | NaN | NaN | 17.0 | subtropical | tropical | False |
| TARA_096_SRF_0.22-3 | TARA_B100000989 | TARA_096_SRF_0.22-3 | 3/24/11 13:07 | -29.72 | -101.16 | 5.5 | 23.8 | 35.8 | 204.1 | -0.4 | ... | 14496.0 | 0.741 | 384614.0 | 0.0 | 384614.0 | 948.0 | 16.0 | subtropical | tropical | False |
| TARA_098_SRF_0.22-3 | TARA_B100001027 | TARA_098_SRF_0.22-3 | 4/3/11 18:10 | -25.83 | -111.78 | 5.6 | 25.1 | 36.4 | 200.5 | -1.6 | ... | 14079.0 | 0.742 | 387346.0 | 0.0 | 387346.0 | 436.0 | 15.0 | subtropical | tropical | False |
| TARA_099_SRF_0.22-3 | TARA_B100000886 | TARA_099_SRF_0.22-3 | 4/9/11 14:09 | -21.15 | -104.79 | 5.4 | 23.8 | 36.1 | 204.0 | -1.7 | ... | 13751.0 | 0.746 | NaN | NaN | NaN | NaN | 15.0 | tropical | tropical | False |
63 rows × 41 columns
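In the rows displayed above, `FC - bacteria [cells/mL]` appears to equal the sum of the heterotroph and autotroph counts to within about one cell/mL, with a lone missing heterotroph count seemingly treated as zero (TARA_122). The following is a minimal sketch of how that relationship and the missing values could be checked. It uses three rows copied from the output rather than the full table, the names `df_fc`, `fc_sum`, and `inconsistent` are illustrative, and the `min_count` argument assumes a reasonably recent pandas.

```python
import numpy as np
import pandas as pd

het = "FC - heterotrophs [cells/mL]"
aut = "FC - autotrophs [cells/mL]"
bac = "FC - bacteria [cells/mL]"

# Three rows copied from the displayed output (not the full 63-row table).
df_fc = pd.DataFrame(
    {
        het: [464267.0, np.nan, np.nan],
        aut: [155789.0, 150974.0, np.nan],
        bac: [620056.0, 150974.0, np.nan],
    },
    index=["TARA_100_SRF_0.22-3", "TARA_122_SRF_0.22-3", "TARA_057_SRF_0.22-3"],
)

# Number of missing values in each flow cytometry column.
n_missing = df_fc.isnull().sum()

# Row-wise sum that skips a single NaN; min_count=1 keeps the result NaN
# when both inputs are missing.
fc_sum = df_fc[[het, aut]].sum(axis=1, min_count=1)

# Flag rows whose reported bacteria count differs from the sum by more
# than 1 cell/mL (a tolerance assumed here to absorb rounding).
inconsistent = (df_fc[bac] - fc_sum).abs() > 1.0

print(n_missing)
print(inconsistent)
```

On the real table, the same three statements after the toy frame could be run against `df_tara_metadata` directly.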
In [5]:
# Prochlorococcus
# get the Red Sea-only (RS-only) OG list ...
df_rsonly = pd.read_csv('/Users/luke/singlecell/notebooks/table_pro_RSonly.tsv', sep='\t')
df_rsonly['Red_Sea_only'] = True
# ... and add it to the table of all OGs (the left merge keeps every OG in df_all)
df_all = pd.read_csv('/Users/luke/singlecell/notebooks/table_pro_all.tsv', sep='\t')
df_og_metadata = df_all.merge(df_rsonly.loc[:,['Ortholog group','Red_Sea_only']], how='left', on='Ortholog group')
df_og_metadata.drop_duplicates(inplace=True)
df_og_metadata.index = df_og_metadata['Ortholog group']
df_og_metadata.drop(['Ortholog group'], axis=1, inplace=True)
# OGs absent from the RS-only list come out of the merge as NaN; mark them False
# (assign the result back rather than calling fillna in place on a column selection)
df_og_metadata['Red_Sea_only'] = df_og_metadata['Red_Sea_only'].fillna(False)
df_og_metadata['genus'] = 'Prochlorococcus'
df_og_metadata_proch = df_og_metadata
In [6]:
# Pelagibacter (the table_sar_* files)
# get the Red Sea-only (RS-only) OG list ...
df_rsonly = pd.read_csv('/Users/luke/singlecell/notebooks/table_sar_RSonly.tsv', sep='\t')
df_rsonly['Red_Sea_only'] = True
# ... and add it to the table of all OGs (the left merge keeps every OG in df_all)
df_all = pd.read_csv('/Users/luke/singlecell/notebooks/table_sar_all.tsv', sep='\t')
df_og_metadata = df_all.merge(df_rsonly.loc[:,['Ortholog group','Red_Sea_only']], how='left', on='Ortholog group')
df_og_metadata.drop_duplicates(inplace=True)
df_og_metadata.index = df_og_metadata['Ortholog group']
df_og_metadata.drop(['Ortholog group'], axis=1, inplace=True)
# OGs absent from the RS-only list come out of the merge as NaN; mark them False
# (assign the result back rather than calling fillna in place on a column selection)
df_og_metadata['Red_Sea_only'] = df_og_metadata['Red_Sea_only'].fillna(False)
df_og_metadata['genus'] = 'Pelagibacter'
df_og_metadata_pelag = df_og_metadata
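The Prochlorococcus and Pelagibacter cells repeat the same steps with different input files, so the shared logic could be wrapped in one function. The sketch below is an illustration, not the notebook's method: `build_og_metadata` is a hypothetical helper, it takes DataFrames instead of file paths so it can be run on toy data, and it uses `Series.isin` in place of the left merge plus `fillna`, which should produce the same True/False flag. The toy OG names (`og1` to `og3`) are made up.

```python
import pandas as pd


def build_og_metadata(df_all, df_rsonly, genus, key="Ortholog group"):
    """Flag Red Sea-only OGs and label every row with a genus.

    Hypothetical helper: df_all is the table of all OGs, df_rsonly the
    table of Red Sea-only OGs, both with a `key` column.
    """
    out = df_all.drop_duplicates().copy()
    # True where the OG also appears in the Red Sea-only table, else False.
    out["Red_Sea_only"] = out[key].isin(df_rsonly[key])
    out["genus"] = genus
    return out.set_index(key)


# Toy inputs with made-up OG names, standing in for the *_all and *_RSonly tables.
toy_all = pd.DataFrame(
    {"Ortholog group": ["og1", "og2", "og3"], "Number of genomes": [5, 2, 1]}
)
toy_rsonly = pd.DataFrame({"Ortholog group": ["og3"], "Number of genomes": [1]})

toy_meta = build_og_metadata(toy_all, toy_rsonly, "Prochlorococcus")
print(toy_meta)
```

With the real files, each genus would reduce to one call on the two frames returned by `pd.read_csv(..., sep='\t')`.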
In [7]:
# combine and write to og_metadata.tsv
df_og_metadata = pd.concat([df_og_metadata_pelag, df_og_metadata_proch])
df_og_metadata.to_csv('/Users/luke/singlecell/notebooks/og_metadata.tsv', sep='\t')
In [8]:
df_og_metadata
Out[8]:
| Ortholog group | Number of genomes | Example accession | Description | Red_Sea_only | genus |
| --- | --- | --- | --- | --- | --- |
| pelag10000 | 75 | AAA024N17_00354 | GDP-perosamine synthase | False | Pelagibacter |
| pelag10001 | 71 | AAA288G21_00327 | Cold shock-like protein CspLA | False | Pelagibacter |
| pelag10002 | 55 | HIMB083_00082 | Sarcosine oxidase subunit alpha | False | Pelagibacter |
| pelag10003 | 55 | HIMB083_00539 | 3-succinoylsemialdehyde-pyridine dehydrogenase | False | Pelagibacter |
| pelag10004 | 52 | HIMB083_00335 | Ammonia channel | False | Pelagibacter |
| pelag10005 | 48 | AAA280B11_00334 | Sulfate/thiosulfate import ATP-binding protein... | False | Pelagibacter |
| pelag10006 | 46 | AAA024N17_00229 | Acetylornithine aminotransferase | False | Pelagibacter |
| pelag10007 | 46 | AAA280B11_00519 | Putative peroxiredoxin | False | Pelagibacter |
| pelag10008 | 46 | AAA280B11_00536 | Peptide methionine sulfoxide reductase MsrB | False | Pelagibacter |
| pelag10009 | 45 | AAA024N17_00179 | Release factor glutamine methyltransferase | False | Pelagibacter |
| pelag10010 | 45 | AAA024N17_00228 | Inner membrane transport permease YadH | False | Pelagibacter |
| pelag10011 | 44 | Pelagibacter_AAA795-C10_00011 | Daunorubicin/doxorubicin resistance ATP-bindin... | False | Pelagibacter |
| pelag10012 | 44 | AAA024N17_00178 | hypothetical protein | False | Pelagibacter |
| pelag10013 | 44 | AAA024N17_00206 | Penicillin-binding protein 2 | False | Pelagibacter |
| pelag10014 | 44 | AAA024N17_00207 | Rod shape-determining protein RodA | False | Pelagibacter |
| pelag10015 | 44 | AAA280P20_00391 | Thymidylate synthase ThyX | False | Pelagibacter |
| pelag10016 | 43 | AAA288N07_00669 | Signal recognition particle receptor FtsY | False | Pelagibacter |
| pelag10017 | 43 | AAA024N17_00095 | UDP-N-acetylglucosamine 1-carboxyvinyltransferase | False | Pelagibacter |
| pelag10018 | 43 | AAA024N17_00121 | Pyruvate, phosphate dikinase | False | Pelagibacter |
| pelag10019 | 43 | AAA024N17_00256 | Cytochrome b/c1 | False | Pelagibacter |
| pelag10020 | 43 | AAA024N17_00269 | 2-oxoglutarate dehydrogenase E1 component | False | Pelagibacter |
| pelag10021 | 43 | QL1_00137 | Dihydrolipoyllysine-residue succinyltransferas... | False | Pelagibacter |
| pelag10022 | 43 | AAA280B11_00771 | Protein translocase subunit SecY | False | Pelagibacter |
| pelag10023 | 43 | AAA280P20_00520 | Protein translocase subunit SecA | False | Pelagibacter |
| pelag10024 | 42 | HIMB114_00223 | Aconitate hydratase 1 | False | Pelagibacter |
| pelag10025 | 42 | AAA024N17_00079 | Lipoprotein signal peptidase | False | Pelagibacter |
| pelag10026 | 42 | AAA024N17_00086 | ATP phosphoribosyltransferase | False | Pelagibacter |
| pelag10027 | 42 | AAA024N17_00254 | Ribosome-binding ATPase YchF | False | Pelagibacter |
| pelag10028 | 42 | AAA024N17_00255 | Cytochrome b/c1 | False | Pelagibacter |
| pelag10029 | 42 | AAA024N17_00257 | Ubiquinol-cytochrome c reductase iron-sulfur s... | False | Pelagibacter |
| ... | ... | ... | ... | ... | ... |
| proch20409 | 1 | Prochlorococcus_AAA795-J16_01544 | UDP-4-amino-4-deoxy-L-arabinose--oxoglutarate ... | True | Prochlorococcus |
| proch20410 | 1 | Prochlorococcus_AAA795-J16_01545 | hypothetical protein | True | Prochlorococcus |
| proch20411 | 1 | Prochlorococcus_AAA795-J16_01547 | hypothetical protein | True | Prochlorococcus |
| proch20412 | 1 | Prochlorococcus_AAA795-J16_01550 | hypothetical protein | True | Prochlorococcus |
| proch20413 | 1 | Prochlorococcus_AAA795-J16_01648 | hypothetical protein | True | Prochlorococcus |
| proch20414 | 1 | Prochlorococcus_AAA795-M23_00140 | hypothetical protein | True | Prochlorococcus |
| proch20415 | 1 | Prochlorococcus_AAA795-M23_00149 | hypothetical protein | True | Prochlorococcus |
| proch20416 | 1 | Prochlorococcus_AAA795-M23_00293 | hypothetical protein | True | Prochlorococcus |
| proch20417 | 1 | Prochlorococcus_AAA795-M23_00318 | hypothetical protein | True | Prochlorococcus |
| proch20418 | 1 | Prochlorococcus_AAA795-M23_00405 | hypothetical protein | True | Prochlorococcus |
| proch20419 | 1 | Prochlorococcus_AAA795-M23_00526 | hypothetical protein | True | Prochlorococcus |
| proch20420 | 1 | Prochlorococcus_AAA795-M23_00566 | hypothetical protein | True | Prochlorococcus |
| proch20421 | 1 | Prochlorococcus_AAA795-M23_01055 | Glucosylglycerol-phosphate synthase | True | Prochlorococcus |
| proch20422 | 1 | Prochlorococcus_AAA795-M23_01060 | hypothetical protein | True | Prochlorococcus |
| proch20423 | 1 | Prochlorococcus_AAA795-M23_01063 | hypothetical protein | True | Prochlorococcus |
| proch20424 | 1 | Prochlorococcus_AAA795-M23_01182 | hypothetical protein | True | Prochlorococcus |
| proch20425 | 1 | Prochlorococcus_AAA795-M23_01207 | UvrABC system protein B | True | Prochlorococcus |
| proch20426 | 1 | Prochlorococcus_AAA795-M23_01473 | hypothetical protein | True | Prochlorococcus |
| proch20427 | 1 | Prochlorococcus_AAA795-M23_01474 | hypothetical protein | True | Prochlorococcus |
| proch20428 | 1 | Prochlorococcus_AAA795-M23_01483 | hypothetical protein | True | Prochlorococcus |
| proch20429 | 1 | Prochlorococcus_AAA795-M23_01489 | hypothetical protein | True | Prochlorococcus |
| proch20430 | 1 | Prochlorococcus_AAA795-M23_01493 | hypothetical protein | True | Prochlorococcus |
| proch20431 | 1 | Prochlorococcus_AAA795-M23_01494 | hypothetical protein | True | Prochlorococcus |
| proch20432 | 1 | Prochlorococcus_AAA795-M23_01495 | hypothetical protein | True | Prochlorococcus |
| proch20433 | 1 | Prochlorococcus_AAA795-M23_01500 | hypothetical protein | True | Prochlorococcus |
| proch20434 | 1 | Prochlorococcus_AAA795-M23_01510 | hypothetical protein | True | Prochlorococcus |
| proch20435 | 1 | Prochlorococcus_AAA795-M23_01511 | hypothetical protein | True | Prochlorococcus |
| proch20436 | 1 | Prochlorococcus_AAA795-M23_01521 | hypothetical protein | True | Prochlorococcus |
| proch20437 | 1 | Prochlorococcus_AAA795-M23_01523 | hypothetical protein | True | Prochlorococcus |
| proch20438 | 1 | Prochlorococcus_AAA795-M23_01524 | hypothetical protein | True | Prochlorococcus |
15711 rows × 5 columns
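With 15711 rows, the combined table is easier to sanity-check through a per-genus summary than by scrolling. A minimal sketch, assuming the `Red_Sea_only` and `genus` columns shown above; the five-row `toy_og` frame and its OG names are made up for illustration, and the counts it prints say nothing about the real data.

```python
import pandas as pd

# Made-up stand-in for df_og_metadata with the same two columns.
toy_og = pd.DataFrame(
    {
        "Red_Sea_only": [False, False, True, True, True],
        "genus": [
            "Pelagibacter",
            "Pelagibacter",
            "Pelagibacter",
            "Prochlorococcus",
            "Prochlorococcus",
        ],
    },
    index=pd.Index(["ogA", "ogB", "ogC", "ogD", "ogE"], name="Ortholog group"),
)

# Per genus: how many OGs are flagged Red Sea-only, and how many OGs in total.
summary = toy_og.groupby("genus")["Red_Sea_only"].agg(["sum", "count"])
summary.columns = ["n_red_sea_only", "n_ogs"]

# OG identifiers should be unique after concatenating the two genera.
index_is_unique = toy_og.index.is_unique

print(summary)
```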
In [99]:
# read genome_metadata.tsv
df_genome_metadata = pd.read_csv('/Users/luke/singlecell/notebooks/genome_metadata.tsv', sep='\t', index_col=0)
In [100]:
df_genome_metadata
Out[100]:
| strain | genus | orthomcl-v3 | orthomcl-v4 | jellyfish | clade | red_sea | latitude | longitude | depth_meters | reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AAA024N17 | Pelagibacter | NaN | S001 | NaN | IIIb | False | NaN | NaN | NaN | NaN |
| AAA280B11 | Pelagibacter | NaN | S002 | NaN | IIIb | False | NaN | NaN | NaN | NaN |
| AAA280P20 | Pelagibacter | NaN | S003 | NaN | IIIb | False | NaN | NaN | NaN | NaN |
| AAA288G21 | Pelagibacter | NaN | S004 | NaN | Ic | False | NaN | NaN | NaN | NaN |
| AAA288N07 | Pelagibacter | NaN | S005 | NaN | Ic | False | NaN | NaN | NaN | NaN |
| AAA288E13 | Pelagibacter | NaN | NaN | NaN | Ic | False | NaN | NaN | NaN | NaN |
| AAA240E13 | Pelagibacter | NaN | NaN | NaN | Ic | False | NaN | NaN | NaN | NaN |
| HIMB058 | Pelagibacter | NaN | S006 | NaN | II | False | NaN | NaN | NaN | NaN |
| HIMB083 | Pelagibacter | NaN | S007 | Pelub83DRAFT | Ia | False | NaN | NaN | NaN | NaN |
| HIMB114 | Pelagibacter | SARE | S008 | HIMB114 | IIIa | False | 21.46 | -157.79 | NaN | NaN |
| HIMB122 | Pelagibacter | NaN | S009 | NaN | Ia | False | NaN | NaN | NaN | NaN |
| HIMB1321 | Pelagibacter | NaN | S010 | NaN | Ia | False | NaN | NaN | NaN | NaN |
| HIMB140 | Pelagibacter | NaN | S011 | HIMB140 | Ia | False | NaN | NaN | NaN | NaN |
| HIMB4 | Pelagibacter | NaN | S012 | HIMB4 | Ia | False | NaN | NaN | NaN | NaN |
| HIMB5 | Pelagibacter | SARF | S013 | CP003809 | Ia | False | 21.46 | -157.79 | NaN | NaN |
| HIMB59 | Pelagibacter | SARG | NaN | HIMB59b | V | False | 21.46 | -157.79 | NaN | NaN |
| HTCC1002 | Pelagibacter | SARA | S014 | Ga0076703 | Ia | False | 44.65 | -124.05 | 10.0 | NaN |
| HTCC1013 | Pelagibacter | NaN | S015 | HTCC1013 | Ia | False | NaN | NaN | NaN | NaN |
| HTCC1016 | Pelagibacter | NaN | S016 | NaN | Ia | False | NaN | NaN | NaN | NaN |
| HTCC1040 | Pelagibacter | NaN | S017 | NaN | Ia | False | NaN | NaN | NaN | NaN |
| HTCC1062 | Pelagibacter | SARB | S018 | Ga0076388 | Ia | False | NaN | NaN | NaN | Giovannoni et al., 2005 |
| HTCC7211 | Pelagibacter | SARC | S019 | HTCC7211 | Ia | False | NaN | NaN | NaN | NaN |
| HTCC7214 | Pelagibacter | NaN | S020 | NaN | Ia | False | 32.10 | -64.30 | NaN | NaN |
| HTCC7217 | Pelagibacter | NaN | S021 | NaN | Ia | False | NaN | NaN | NaN | NaN |
| HTCC8051 | Pelagibacter | NaN | S022 | HTC8051 | Ia | False | NaN | NaN | NaN | NaN |
| HTCC9022 | Pelagibacter | NaN | S023 | HTCC9022 | Ia | False | NaN | NaN | NaN | NaN |
| HTCC9565 | Pelagibacter | SARH | S024 | HTCC9565 | Ia | False | NaN | NaN | NaN | NaN |
| IMCC9063 | Pelagibacter | SARD | S025 | NC_015380 | IIIa | False | 79.00 | 11.31 | 0.0 | Oh et al. (2011) |
| Pelagibacter_SCGC_AAA795-A08 | Pelagibacter | SA08 | SA08 | Pelagibacter_SCGC_AAA795-A08_contigs | Ia | True | 19.75 | 40.05 | 0.0 | This study |
| Pelagibacter_SCGC_AAA795-A20 | Pelagibacter | SA20 | SA20 | Pelagibacter_SCGC_AAA795-A20_contigs | Ia | True | 19.75 | 40.05 | 0.0 | This study |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| MIT9201 | Prochlorococcus | NaN | P117 | Ga0062500 | HLII | False | -12.00 | -145.42 | 0.0 | Moore et al. 1999 |
| MIT9202 | Prochlorococcus | PROC | P118 | Ga0076803 | HLII | False | -12.00 | -145.42 | 79.0 | Moore et al. 1999 |
| MIT9211 | Prochlorococcus | PROD | P119 | Ga0076499 | LLII/III | False | 0.00 | -139.97 | 83.0 | Kettler et al. 2007 |
| MIT9215 | Prochlorococcus | PROE | P120 | Ga0065692 | HLII | False | 0.00 | -139.97 | 0.0 | Kettler et al. 2007 |
| MIT9301 | Prochlorococcus | PROF | P121 | Ga0076500 | HLII | False | NaN | NaN | 90.0 | Kettler et al. 2007 |
| MIT9302 | Prochlorococcus | NaN | P122 | Ga0062504 | HLII | False | 34.76 | -66.19 | 100.0 | Moore et al. 1998 |
| MIT9303 | Prochlorococcus | PROG | P123 | Ga0076501 | LLIV | False | 34.76 | -66.19 | 100.0 | Kettler et al. 2007 |
| MIT9311 | Prochlorococcus | NaN | P124 | Ga0062462 | HLII | False | 37.51 | -64.24 | 135.0 | Rocap et al., 2002 |
| MIT9312 | Prochlorococcus | PROH | P125 | NC_007577 | HLII | False | 37.50 | -68.23 | 135.0 | Kettler et al. 2007 |
| MIT9313 | Prochlorococcus | PROI | P126 | NC_005071 | LLIV | False | 37.50 | -68.23 | 135.0 | Rocap et al. 2003, Kettler et al. 2007 |
| MIT9314 | Prochlorococcus | NaN | P127 | Ga0062474 | HLII | False | 37.51 | -64.24 | 180.0 | Rocap et al., 2002 |
| MIT9321 | Prochlorococcus | NaN | P128 | Ga0062467 | HLII | False | 1.00 | -92.00 | 50.0 | Rocap et al., 2002 |
| MIT9322 | Prochlorococcus | NaN | P129 | Ga0062463 | HLII | False | 0.27 | -93.00 | 0.0 | Rocap et al., 2002 |
| MIT9401 | Prochlorococcus | NaN | P130 | Ga0062470 | HLII | False | 35.50 | -70.40 | 0.0 | Rocap et al., 2002 |
| MIT9515 | Prochlorococcus | PROJ | P131 | Ga0067215 | HLI | False | NaN | NaN | 15.0 | Kettler et al. 2007 |
| NATL1A | Prochlorococcus | PROK | P132 | Ga0067212 | LLI | False | 38.54 | -39.85 | 30.0 | Kettler et al. 2007 |
| NATL2A | Prochlorococcus | PROL | P133 | NC_007335 | LLI | False | 38.98 | -40.55 | 10.0 | Kettler et al. 2007 |
| PAC1 | Prochlorococcus | NaN | P134 | Ga0062468 | LLI | False | 22.75 | -158.00 | 100.0 | Parpais et al. 1996, Penno et al. 2000 |
| Prochlorococcus_SCGC_AAA795-F05 | Prochlorococcus | PF05 | PF05 | Prochlorococcus_SCGC_AAA795-F05_contigs | HLII | True | 19.75 | 40.05 | 0.0 | This study |
| Prochlorococcus_SCGC_AAA795-I06 | Prochlorococcus | PI06 | PI06 | Prochlorococcus_SCGC_AAA795-I06_contigs | HLII | True | 19.75 | 40.05 | 0.0 | This study |
| Prochlorococcus_SCGC_AAA795-I15 | Prochlorococcus | PI15 | PI15 | Prochlorococcus_SCGC_AAA795-I15_contigs | HLII | True | 19.75 | 40.05 | 0.0 | This study |
| Prochlorococcus_SCGC_AAA795-J16 | Prochlorococcus | PJ16 | PJ16 | Prochlorococcus_SCGC_AAA795-J16_contigs | HLII | True | 19.75 | 40.05 | 0.0 | This study |
| Prochlorococcus_SCGC_AAA795-M23 | Prochlorococcus | PM23 | PM23 | Prochlorococcus_SCGC_AAA795-M23_contigs | HLII | True | 19.75 | 40.05 | 0.0 | This study |
| PRS50 | Prochlorococcus | PR50 | NaN | NaN | HLII | True | 19.75 | 40.05 | 0.0 | Shibl et al. (unpublished) |
| SB | Prochlorococcus | NaN | P135 | Ga0062469 | HLII | False | 35.00 | 138.30 | 40.0 | Shimada et al. 1995 |
| SS120 | Prochlorococcus | PROM | P136 | Ga0076506 | LLII/III | False | 28.99 | -64.35 | 120.0 | Dufresne et al. 2003, Kettler et al. 2007 |
| SS2 | Prochlorococcus | NaN | P137 | Ga0062475 | LLII/III | False | 28.98 | -64.35 | 120.0 | Rocap et al., 2002 |
| SS35 | Prochlorococcus | NaN | P138 | Ga0062478 | LLII/III | False | 28.98 | -64.35 | 120.0 | Rocap et al., 2002 |
| SS51 | Prochlorococcus | NaN | P139 | Ga0062477 | LLII/III | False | 28.98 | -64.35 | 120.0 | Rocap et al., 2002 |
| SS52 | Prochlorococcus | NaN | P140 | Ga0062476 | LLII/III | False | 28.98 | -64.35 | 120.0 | Rocap et al., 2002 |
200 rows × 10 columns
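Many genomes in this table lack coordinates, and the `red_sea` flag separates the Red Sea genomes from the reference genomes. A minimal sketch of two checks that follow from that, using five rows copied from the output above (a subset of the columns only); the names `toy_genomes`, `n_red_sea`, and `with_coords` are illustrative.

```python
import numpy as np
import pandas as pd

# Five rows copied from the displayed output, restricted to five columns.
toy_genomes = pd.DataFrame(
    {
        "genus": ["Pelagibacter"] * 3 + ["Prochlorococcus"] * 2,
        "clade": ["IIIb", "IIIa", "Ia", "HLII", "HLII"],
        "red_sea": [False, False, True, False, True],
        "latitude": [np.nan, 21.46, 19.75, -12.00, 19.75],
        "longitude": [np.nan, -157.79, 40.05, -145.42, 40.05],
    },
    index=pd.Index(
        [
            "AAA024N17",
            "HIMB114",
            "Pelagibacter_SCGC_AAA795-A08",
            "MIT9201",
            "Prochlorococcus_SCGC_AAA795-F05",
        ],
        name="strain",
    ),
)

# Number of Red Sea genomes per genus (True counts as 1 when summed).
n_red_sea = toy_genomes.groupby("genus")["red_sea"].sum()

# Keep only genomes that have both coordinates, e.g. before mapping them.
with_coords = toy_genomes.dropna(subset=["latitude", "longitude"])

print(n_red_sea)
print(with_coords.index.tolist())
```

The same two statements could be applied to `df_genome_metadata` once it is loaded.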
Content source: cuttlefishh/papers