Summary:

This notebook is for visualizing antibiotic resistance gene tables generated by ABRicate and SRST2.

Example Use Case:

In this example, the complete Shakya et al. 2013 metagenome is compared with small, medium, and large subsamples of itself after conservative or aggressive read filtering and assembly with SPAdes or MEGAHIT. The datasets are named according to their metagenome content, relative degree of read filtering, and, where appropriate, the assembler used. SRST2 is suited to analysis of antibiotic resistance genes (ARGs) in reads, while ABRicate is suited to analysis of ARGs in contigs.

  • SRR606249 = Accession number for the complete Shakya et al. 2013 metagenome
  • subset50 = 50% of the complete Shakya et al. 2013 metagenome
  • subset25 = 25% of the complete Shakya et al. 2013 metagenome
  • subset10 = 10% of the complete Shakya et al. 2013 metagenome
  • pe.trim2 = Conservative read filtering
  • pe.trim30 = Aggressive read filtering
  • megahit = MEGAHIT assembly
  • spades = SPAdes assembly
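
The naming scheme above can be decoded programmatically, which may help when grouping results by subsample, filtering level, or assembler. The following is a minimal sketch: the regular expression is inferred from the file names shown in this notebook, and `parse_dataset_name` is a hypothetical helper, not part of `antibiotic_res`.

```python
import os
import re

# Hypothetical pattern inferred from the file names used in this notebook.
NAME_PATTERN = re.compile(
    r"^(?P<accession>SRR\d+)"
    r"(?:_subset(?P<subset>\d+))?"
    r"_1\.trim(?P<trim>\d+)"
    r"(?:_(?P<assembler>megahit|spades))?"
)

# Mapping taken from the legend above; other trim values are reported as unknown.
FILTERING = {"2": "conservative", "30": "aggressive"}


def parse_dataset_name(path):
    """Split a result file name into accession, subset size, filtering, and assembler."""
    name = os.path.basename(path)
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized dataset name: {name}")
    fields = match.groupdict()
    return {
        "accession": fields["accession"],
        # No subset tag means the complete metagenome.
        "percent_of_reads": int(fields["subset"]) if fields["subset"] else 100,
        "filtering": FILTERING.get(fields["trim"], "unknown"),
        # Read-level (SRST2) inputs carry no assembler tag, so this is None.
        "assembler": fields["assembler"],
    }


print(parse_dataset_name("SRR606249_subset10_1.trim30_megahit_abricate.tab"))
```
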

Objectives:

  • Create table with all of the genes found
  • Count the total number of genes found for each dataset
  • Count the number of unique genes found per dataset
  • Compare unique genes found using a presence/absence table
  • Compare results from reads and assemblies

In [1]:
from antibiotic_res import *

Analysis of antibiotic resistance genes in contigs using ABRicate


In [23]:
concat_abricate_files('*tab').to_csv('concatenated_abricate_results.txt')
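
The implementation of `concat_abricate_files` is not shown in this notebook. As a rough sketch of what such a step could look like, the hypothetical function below reads every tab-separated file matching a glob pattern and records the source file in a `filename` column, which is the column the later summaries appear to group by.

```python
import glob
import os

import pandas as pd


def concat_result_tables(pattern):
    """Read all tab-separated files matching `pattern` into one DataFrame.

    Hypothetical stand-in for concat_abricate_files; the real helper may differ.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):
        frame = pd.read_csv(path, sep="\t")
        # Remember which file each row came from.
        frame["filename"] = os.path.basename(path)
        frames.append(frame)
    if not frames:
        raise FileNotFoundError(f"No files match pattern: {pattern}")
    return pd.concat(frames, ignore_index=True)
```

Calling `concat_result_tables('*tab')` in a directory of ABRicate outputs would return one row per reported hit across all files.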

In [3]:
calc_total_genes_abricate()


Out[3]:
GENE
filename
SRR606249_1.trim30_spades_abricate.tab 15
SRR606249_subset10_1.trim2_spades_abricate.tab 15
SRR606249_subset25_1.trim2_spades_abricate.tab 15
SRR606249_1.trim2_spades_abricate.tab 14
SRR606249_subset25_1.trim30_megahit_abricate.tab 14
SRR606249_subset25_1.trim30_spades_abricate.tab 14
SRR606249_1.trim30_megahit_abricate.tab 13
SRR606249_subset25_1.trim2_megahit_abricate.tab 13
SRR606249_subset10_1.trim30_spades_abricate.tab 12
SRR606249_subset50_1.trim2_spades_abricate.tab 12
SRR606249_subset50_1.trim2_megahit_abricate.tab 11
SRR606249_subset50_1.trim30_spades_abricate.tab 11
SRR606249_1.trim2_megahit_abricate.tab 10
SRR606249_subset10_1.trim2_megahit_abricate.tab 10
SRR606249_subset10_1.trim30_megahit_abricate.tab 9
SRR606249_subset50_1.trim30_megahit_abricate.tab 8

In [4]:
calculate_unique_genes_abricate()


Out[4]:
GENE
filename
SRR606249_1.trim2_spades_abricate.tab 12
SRR606249_1.trim30_spades_abricate.tab 12
SRR606249_subset25_1.trim30_spades_abricate.tab 12
SRR606249_subset25_1.trim2_spades_abricate.tab 11
SRR606249_subset25_1.trim30_megahit_abricate.tab 11
SRR606249_1.trim30_megahit_abricate.tab 10
SRR606249_subset25_1.trim2_megahit_abricate.tab 10
SRR606249_subset50_1.trim2_spades_abricate.tab 10
SRR606249_1.trim2_megahit_abricate.tab 9
SRR606249_subset50_1.trim30_spades_abricate.tab 9
SRR606249_subset10_1.trim2_spades_abricate.tab 8
SRR606249_subset10_1.trim30_spades_abricate.tab 7
SRR606249_subset50_1.trim2_megahit_abricate.tab 7
SRR606249_subset50_1.trim30_megahit_abricate.tab 7
SRR606249_subset10_1.trim2_megahit_abricate.tab 5
SRR606249_subset10_1.trim30_megahit_abricate.tab 5
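
The totals in Out[3] are higher than the unique counts in Out[4], presumably because the same gene can be reported more than once per file (for example, on several contigs). A minimal pandas sketch of both summaries follows; the toy rows and the function name `summarize_gene_counts` are illustrative assumptions, not the code behind `calc_total_genes_abricate` or `calculate_unique_genes_abricate`.

```python
import pandas as pd

# Toy rows standing in for a concatenated ABRicate table (values are illustrative).
hits = pd.DataFrame({
    "filename": ["a_spades.tab", "a_spades.tab", "a_spades.tab", "a_megahit.tab"],
    "GENE": ["oqxB_1", "oqxB_1", "vat(A)_1", "oqxB_1"],
})


def summarize_gene_counts(table):
    """Return total hits and distinct genes per file, most hits first."""
    summary = table.groupby("filename")["GENE"].agg(["count", "nunique"])
    summary.columns = ["total_hits", "unique_genes"]
    return summary.sort_values("total_hits", ascending=False)


summary = summarize_gene_counts(hits)
print(summary)
```
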

In [5]:
create_abricate_presence_absence_gene_table()


Out[5]:
SRR606249_subset10_1.trim30_megahit_abricate.tab SRR606249_1.trim2_megahit_abricate.tab SRR606249_subset25_1.trim30_spades_abricate.tab SRR606249_subset25_1.trim2_spades_abricate.tab SRR606249_1.trim30_spades_abricate.tab SRR606249_subset25_1.trim2_megahit_abricate.tab SRR606249_subset25_1.trim30_megahit_abricate.tab SRR606249_subset50_1.trim30_spades_abricate.tab SRR606249_subset10_1.trim30_spades_abricate.tab SRR606249_subset50_1.trim2_megahit_abricate.tab SRR606249_1.trim30_megahit_abricate.tab SRR606249_subset10_1.trim2_spades_abricate.tab SRR606249_subset10_1.trim2_megahit_abricate.tab SRR606249_1.trim2_spades_abricate.tab SRR606249_subset50_1.trim30_megahit_abricate.tab SRR606249_subset50_1.trim2_spades_abricate.tab
qepA_1 False False False False False False False True False False False False False False True False
blaTEM-116_4 False False False False True False False False False False False False False True False False
blaOXA-48_2 False True False False True True True False False False True False False False False True
catB7_1 True False True False False False True False True False False True True False False False
lsa(A)_2 True True True True True True True True True True True True True True True True
blaOXA-181_1 False True True True True True True True False True True False False True True True
vat(A)_1 True True True True True True True True True True True True True True True True
vat(F)_1 False False False False True False False False False False False False False True False False
tet(O)_3 False False False False True False False True False False False False False True False True
msr(D)_2 False True True True True False False True False False True False False True False True
aph(6)-Ic_1 False True False True True False False False False True True False False True False True
vat(B)_1 False True True True True True True True True True True True False True True True
car(A)_1 False True True True True True True True False True True True False True True True
tet(33)_2 False False True True False True True False False False False False False False False False
blaOXA-54_1 False False True True False False False False True False True True False False False False
oqxB_1 True True True True True True True True True True True True True True True True
cepA_1 True False True False False True True False True False False True True False False False
otr(C)_1 False False True True False False True False False False False False False False False False
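
A presence/absence table like the one above can be derived from a long table of hits with a cross-tabulation. The sketch below uses `pandas.crosstab` on toy data; the function name `presence_absence` is hypothetical, and the helper in `antibiotic_res` may be implemented differently.

```python
import pandas as pd

# Toy rows standing in for a concatenated ABRicate table (values are illustrative).
hits = pd.DataFrame({
    "filename": ["a_spades.tab", "a_spades.tab", "a_spades.tab", "a_megahit.tab"],
    "GENE": ["oqxB_1", "oqxB_1", "vat(A)_1", "oqxB_1"],
})


def presence_absence(table):
    """Return a boolean genes-by-files table: True where the gene was reported."""
    counts = pd.crosstab(table["GENE"], table["filename"])
    # Any count above zero means the gene is present in that file.
    return counts > 0


pa_table = presence_absence(hits)
print(pa_table)
```
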

In [6]:
np.version.version


Out[6]:
'1.14.5'

In [7]:
interactive_map_abricate()



In [8]:
interactive_table_abricate()



In [24]:
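# Read the file written by concat_abricate_files(...).to_csv(...) above;
# it is comma-separated despite the .txt extension.
df = pd.read_csv('concatenated_abricate_results.txt')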
qgrid.show_grid(df, show_toolbar=True)


Analysis of SRST2 results


In [9]:
concat_srst2_txt("srst2/*results.txt")


Out[9]:
DB LsaA_MLS MphD_MLS Sample TEM-1D_Bla allele annotation clusterid coverage depth diffs divergence filename gene length maxMAF seqid uncertainty
0 ARGannot.r1 NaN NaN SRR606249_subset10 NaN MphD_1613 no;no;MphD;MLS;NC_017312;2292413-2291580;834 228.0 90.528 3.798 5snp79holes 0.662 srst2/SRR606249_subset10_1.trim2.fq.gz__fullge... MphD_MLS 834.0 0.250 1613.0 edge0.0
1 ARGannot.r1 NaN NaN SRR606249_subset50 NaN LsaA_298 no;no;LsaA;MLS;AY225127;41-1537;1497 33.0 100.000 17.648 26snp 1.737 srst2/SRR606249_subset50_1.trim30.fq.gz__fullg... LsaA_MLS 1497.0 0.071 298.0 NaN
2 ARGannot.r1 NaN NaN SRR606249_subset50 NaN MphD_1613 no;no;MphD;MLS;NC_017312;2292413-2291580;834 228.0 100.000 17.692 6snp 0.719 srst2/SRR606249_subset50_1.trim30.fq.gz__fullg... MphD_MLS 834.0 0.040 1613.0 NaN
3 NaN LsaA_298* MphD_1613* SRR606249_subset50 NaN NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_subset50_1.trim30.fq.gz__genes... NaN NaN NaN NaN NaN
4 NaN NaN NaN SRR606249_subset10 NaN NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_subset10_1.trim30.fq.gz__genes... NaN NaN NaN NaN NaN
5 NaN LsaA_298* MphD_1613* SRR606249 TEM-116_967*? NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_1.trim30.fq.gz__genes__ARGanno... NaN NaN NaN NaN NaN
6 NaN LsaA_298* MphD_1613* SRR606249 TEM-116_967*? NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_1.trim2.fq.gz__genes__ARGannot... NaN NaN NaN NaN NaN
7 ARGannot.r1 NaN NaN SRR606249_subset25 NaN LsaA_298 no;no;LsaA;MLS;AY225127;41-1537;1497 33.0 100.000 8.917 26snp 1.737 srst2/SRR606249_subset25_1.trim2.fq.gz__fullge... LsaA_MLS 1497.0 0.500 298.0 NaN
8 ARGannot.r1 NaN NaN SRR606249_subset25 NaN MphD_1613 no;no;MphD;MLS;NC_017312;2292413-2291580;834 228.0 100.000 9.616 6snp 0.719 srst2/SRR606249_subset25_1.trim2.fq.gz__fullge... MphD_MLS 834.0 0.167 1613.0 NaN
9 ARGannot.r1 NaN NaN SRR606249_subset25 NaN LsaA_298 no;no;LsaA;MLS;AY225127;41-1537;1497 33.0 100.000 8.010 26snp 1.737 srst2/SRR606249_subset25_1.trim30.fq.gz__fullg... LsaA_MLS 1497.0 0.125 298.0 NaN
10 ARGannot.r1 NaN NaN SRR606249_subset25 NaN MphD_1613 no;no;MphD;MLS;NC_017312;2292413-2291580;834 228.0 100.000 8.970 6snp 0.719 srst2/SRR606249_subset25_1.trim30.fq.gz__fullg... MphD_MLS 834.0 0.000 1613.0 NaN
11 NaN NaN MphD_1613*? SRR606249_subset10 NaN NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_subset10_1.trim2.fq.gz__genes_... NaN NaN NaN NaN NaN
12 ARGannot.r1 NaN NaN SRR606249_subset50 NaN LsaA_298 no;no;LsaA;MLS;AY225127;41-1537;1497 33.0 100.000 19.349 26snp 1.737 srst2/SRR606249_subset50_1.trim2.fq.gz__fullge... LsaA_MLS 1497.0 0.393 298.0 NaN
13 ARGannot.r1 NaN NaN SRR606249_subset50 NaN MphD_1613 no;no;MphD;MLS;NC_017312;2292413-2291580;834 228.0 100.000 19.503 6snp 0.719 srst2/SRR606249_subset50_1.trim2.fq.gz__fullge... MphD_MLS 834.0 0.083 1613.0 NaN
14 NaN LsaA_298* MphD_1613* SRR606249_subset25 NaN NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_subset25_1.trim30.fq.gz__genes... NaN NaN NaN NaN NaN
15 NaN LsaA_298* MphD_1613* SRR606249_subset50 NaN NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_subset50_1.trim2.fq.gz__genes_... NaN NaN NaN NaN NaN
16 ARGannot.r1 NaN NaN SRR606249 NaN LsaA_298 no;no;LsaA;MLS;AY225127;41-1537;1497 33.0 100.000 28.247 26snp 1.737 srst2/SRR606249_1.trim30.fq.gz__fullgenes__ARG... LsaA_MLS 1497.0 0.056 298.0 NaN
17 ARGannot.r1 NaN NaN SRR606249 NaN TEM-116_967 no;no;TEM-116;Bla;AY425988;6-866;861 205.0 92.683 2.320 1snp63holes 0.125 srst2/SRR606249_1.trim30.fq.gz__fullgenes__ARG... TEM-1D_Bla 861.0 0.250 967.0 edge1.0
18 ARGannot.r1 NaN NaN SRR606249 NaN MphD_1613 no;no;MphD;MLS;NC_017312;2292413-2291580;834 228.0 100.000 27.141 6snp 0.719 srst2/SRR606249_1.trim30.fq.gz__fullgenes__ARG... MphD_MLS 834.0 0.029 1613.0 NaN
19 ARGannot.r1 NaN NaN SRR606249 NaN LsaA_298 no;no;LsaA;MLS;AY225127;41-1537;1497 33.0 100.000 31.116 26snp 1.737 srst2/SRR606249_1.trim2.fq.gz__fullgenes__ARGa... LsaA_MLS 1497.0 0.434 298.0 NaN
20 ARGannot.r1 NaN NaN SRR606249 NaN TEM-116_967 no;no;TEM-116;Bla;AY425988;6-866;861 205.0 93.148 2.675 1snp59holes 0.125 srst2/SRR606249_1.trim2.fq.gz__fullgenes__ARGa... TEM-1D_Bla 861.0 0.333 967.0 edge1.0
21 ARGannot.r1 NaN NaN SRR606249 NaN MphD_1613 no;no;MphD;MLS;NC_017312;2292413-2291580;834 228.0 100.000 29.399 6snp 0.719 srst2/SRR606249_1.trim2.fq.gz__fullgenes__ARGa... MphD_MLS 834.0 0.045 1613.0 NaN
22 NaN LsaA_298* MphD_1613* SRR606249_subset25 NaN NaN NaN NaN NaN NaN NaN NaN srst2/SRR606249_subset25_1.trim2.fq.gz__genes_... NaN NaN NaN NaN NaN

In [15]:
calc_total_genes_srst2()#.to_csv('')

In [11]:
calculate_unique_genes_srst2()#.to_csv('')


Out[11]:
gene
filename
srst2/SRR606249_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt 3
srst2/SRR606249_1.trim30.fq.gz__fullgenes__ARGannot.r1__results.txt 3
srst2/SRR606249_subset25_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt 2
srst2/SRR606249_subset25_1.trim30.fq.gz__fullgenes__ARGannot.r1__results.txt 2
srst2/SRR606249_subset50_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt 2
srst2/SRR606249_subset50_1.trim30.fq.gz__fullgenes__ARGannot.r1__results.txt 2
srst2/SRR606249_subset10_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt 1
srst2/SRR606249_1.trim2.fq.gz__genes__ARGannot.r1__results.txt 0
srst2/SRR606249_1.trim30.fq.gz__genes__ARGannot.r1__results.txt 0
srst2/SRR606249_subset10_1.trim2.fq.gz__genes__ARGannot.r1__results.txt 0
srst2/SRR606249_subset10_1.trim30.fq.gz__genes__ARGannot.r1__results.txt 0
srst2/SRR606249_subset25_1.trim2.fq.gz__genes__ARGannot.r1__results.txt 0
srst2/SRR606249_subset25_1.trim30.fq.gz__genes__ARGannot.r1__results.txt 0
srst2/SRR606249_subset50_1.trim2.fq.gz__genes__ARGannot.r1__results.txt 0
srst2/SRR606249_subset50_1.trim30.fq.gz__genes__ARGannot.r1__results.txt 0

In [12]:
create_srst2_presence_absence_gene_table()


Out[12]:
srst2/SRR606249_subset50_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt srst2/SRR606249_1.trim30.fq.gz__fullgenes__ARGannot.r1__results.txt srst2/SRR606249_subset50_1.trim30.fq.gz__fullgenes__ARGannot.r1__results.txt srst2/SRR606249_subset25_1.trim30.fq.gz__fullgenes__ARGannot.r1__results.txt srst2/SRR606249_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt srst2/SRR606249_subset10_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt srst2/SRR606249_subset25_1.trim2.fq.gz__fullgenes__ARGannot.r1__results.txt
LsaA_MLS True True True True True False True
MphD_MLS True True True True True True True
TEM-1D_Bla False True False False True False False

In [13]:
interactive_map_srst2()



In [14]:
interactive_table_srst2()


Conclusions:

We analyzed and compared predicted antibiotic resistance genes (ARGs) in reads and contigs. To determine whether quality filtering and sequencing depth affected detection of ARGs, we compared conservative and aggressive trimming across the complete metagenome and its subsamples. A greater number of genes was detected following assembly than in the reads. Three genes, vat(F)_1, tet(O)_3, and blaTEM-116_4, were detected only in SPAdes assemblies.
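
Assembler-specific genes can be checked directly against a presence/absence table. The sketch below is an illustration only: `genes_exclusive_to` is a hypothetical helper, and the small table is an excerpt (three genes, four complete-metagenome assemblies) of the ABRicate presence/absence output in Out[5].

```python
import pandas as pd

# Excerpt of the ABRicate presence/absence table (Out[5]) for the complete metagenome.
table = pd.DataFrame(
    {
        "SRR606249_1.trim2_megahit_abricate.tab": [False, True, True],
        "SRR606249_1.trim30_megahit_abricate.tab": [False, True, True],
        "SRR606249_1.trim2_spades_abricate.tab": [True, True, False],
        "SRR606249_1.trim30_spades_abricate.tab": [True, True, True],
    },
    index=["vat(F)_1", "oqxB_1", "blaOXA-48_2"],
)


def genes_exclusive_to(pa_table, tag):
    """List genes found in at least one column containing `tag` and in no other column."""
    tagged = [col for col in pa_table.columns if tag in col]
    others = [col for col in pa_table.columns if tag not in col]
    in_tagged = pa_table[tagged].any(axis=1)
    in_others = pa_table[others].any(axis=1)
    return sorted(pa_table.index[in_tagged & ~in_others])


print(genes_exclusive_to(table, "spades"))
```

Applied to the full table, the same call would list every gene seen in SPAdes assemblies but in no MEGAHIT assembly.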