This workflow creates a hybrid phylogenetic tree. It can be run in place with no modifications. All of the files needed to generate the tree are publicly available from their respective databases. The workflow is specific to the Silva and UNITE databases; however, the commands may be adapted to create hybrid trees from other marker genes.
ghost-tree git@5f5d5b868fa951cecc7731ecc82f8d2798359c82
SUMACLUST 1.0.01
MUSCLE 3.8.31
FastTree 2.1.8
scikit-bio 0.2.3
In [6]:
# Silva files
!wget 'http://www.arb-silva.de/fileadmin/silva_databases/release_119/Exports/SILVA_119_SSURef_Nr99_tax_silva_full_align_trunc.fasta.gz'
!wget 'http://www.arb-silva.de/fileadmin/silva_databases/release_119/Exports/taxonomy/tax_slv_ssu_nr_119.acc_taxid'
!wget 'http://www.arb-silva.de/fileadmin/silva_databases/release_119/Exports/taxonomy/tax_slv_ssu_nr_119.txt'
!gunzip SILVA_119_SSURef_Nr99_tax_silva_full_align_trunc.fasta.gz
In [7]:
# UNITE files
!wget 'https://github.com/qiime/its-reference-otus/raw/master/taxonomy/97_otu_taxonomy.txt.gz'
!wget 'https://github.com/qiime/its-reference-otus/raw/master/rep_set/97_otus.fasta.gz'
!gunzip 97_otu_taxonomy.txt.gz
!gunzip 97_otus.fasta.gz
In [1]:
silva_aligned = 'SILVA_119_SSURef_Nr99_tax_silva_full_align_trunc.fasta'
silva_accession = 'tax_slv_ssu_nr_119.acc_taxid'
silva_taxonomy = 'tax_slv_ssu_nr_119.txt'
silva_fungi_only = 'silva_fungi_only.txt'
silva_fungi_filtered = 'silva_fungi_only_filtered.txt'
ITS_seqs = '97_otus.fasta'
ITS_otu_map_80 = 'ITS_otu_map_80.txt'
ITS_tax = '97_otu_taxonomy.txt'
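Before starting the long-running steps, it can help to confirm that the downloads and `gunzip` calls produced the expected files. The following is a minimal sketch, not part of ghost-tree; the helper name `missing_files` is hypothetical, and the file names are the ones assigned to the variables above.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# File names copied from the variable assignments above.
required_inputs = [
    'SILVA_119_SSURef_Nr99_tax_silva_full_align_trunc.fasta',
    'tax_slv_ssu_nr_119.acc_taxid',
    'tax_slv_ssu_nr_119.txt',
    '97_otus.fasta',
    '97_otu_taxonomy.txt',
]

missing = missing_files(required_inputs)
if missing:
    print('Missing input files: ' + ', '.join(missing))
else:
    print('All input files are present.')
```

If anything is reported as missing, rerunning the corresponding `wget` or `gunzip` cell should be enough.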
In [3]:
!time ghost-tree silva extract-fungi $silva_aligned $silva_accession $silva_taxonomy $silva_fungi_only
In [4]:
!time ghost-tree filter-alignment-positions $silva_fungi_only 0.9 0.8 $silva_fungi_filtered
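To give a feel for what position filtering does, here is a toy sketch in plain Python. It assumes that the first numeric argument (0.9) is a maximum gap frequency per alignment column; this reading is an assumption on our part and should be checked against the ghost-tree help text. The second argument (0.8) is not illustrated. The function `drop_gappy_columns` is hypothetical and is not ghost-tree's implementation.

```python
def drop_gappy_columns(seqs, max_gap_frequency, gap_chars='-.'):
    """Keep alignment columns whose gap fraction is at most max_gap_frequency.

    seqs is a list of equal-length aligned sequences (strings).
    """
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError('all aligned sequences must have the same length')
    keep = []
    for i in range(length):
        gaps = sum(1 for s in seqs if s[i] in gap_chars)
        if gaps / float(len(seqs)) <= max_gap_frequency:
            keep.append(i)
    return [''.join(s[i] for i in keep) for s in seqs]


# Toy alignment: the third column is all gaps, so it exceeds the 0.9 threshold.
toy_alignment = ['AC-GU', 'A--GU', 'AC-G-', 'A--GU']
filtered = drop_gappy_columns(toy_alignment, 0.9)
print(filtered)
```

With a threshold of 0.9, only columns that are almost entirely gaps are removed, so most of the alignment is retained.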
In [2]:
!time ghost-tree extensions group-extensions $ITS_seqs 0.8 $ITS_otu_map_80
Steps involved in ghost-tree scaffold hybrid-tree:
The ghost-tree scaffold hybrid-tree command uses FastTree, which ignores non-nucleotide characters. FastTree generates warnings that these characters are being ignored. These warnings do not affect the creation of the tree, but they can slow down the IPython notebook to the point where it crashes. This is not a problem when the command is run directly from the command line, so we recommend running the following command there rather than in the notebook.
This is an open issue on GitHub: issue #25.
In [ ]:
#ghost-tree scaffold hybrid-tree ITS_otu_map_80.txt 97_otu_taxonomy.txt 97_otus.fasta silva_fungi_only_filtered.txt ghost-tree-output2.nwk
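If the command has to be launched from Python rather than from a terminal, one possible workaround is to send its output to a log file so that the FastTree warnings are not rendered in the notebook. The sketch below uses only the standard library; the helper name `run_with_log` and the log file name `hybrid_tree.log` are hypothetical, and we have not verified that this avoids the slowdown described above. The ghost-tree call is left commented out, as in the cell above.

```python
import subprocess


def run_with_log(args, log_path):
    """Run a command, writing stdout and stderr to log_path.

    args is a list of command-line tokens. Returns the exit code.
    """
    with open(log_path, 'w') as log:
        return subprocess.call(args, stdout=log, stderr=log)


# Arguments copied from the command above; uncomment to run it.
# exit_code = run_with_log(
#     ['ghost-tree', 'scaffold', 'hybrid-tree',
#      'ITS_otu_map_80.txt', '97_otu_taxonomy.txt', '97_otus.fasta',
#      'silva_fungi_only_filtered.txt', 'ghost-tree-output2.nwk'],
#     'hybrid_tree.log')
```

A nonzero exit code means the command failed, in which case the log file is the place to look for the error message.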