ipyrad-analysis toolkit: Astral species trees

Once you've run treeslider you now have a distribution of gene trees that you can use as input to a multi-locu species tree or network inference program. Here I demonstrate how to input your data to astral.

Required software


In [1]:
# conda install ipyrad -c bioconda
# conda install toytree -c eaton-lab

In [2]:
import pandas as pd
import toytree

Short Tutorial:


In [4]:
# load the tree table from CSV
tree_table = pd.read_csv(
    "./analysis-treeslider/test.tree_table.csv",
    index_col=0,
)


---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-4-6a21e2a50b73> in <module>
      2 tree_table = pd.read_csv(
      3     "./analysis-treeslider/test.tree_table.csv",
----> 4     index_col=0,
      5 )

~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    683         )
    684 
--> 685         return _read(filepath_or_buffer, kwds)
    686 
    687     parser_f.__name__ = name

~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    455 
    456     # Create the parser.
--> 457     parser = TextFileReader(fp_or_buf, **kwds)
    458 
    459     if chunksize or iterator:

~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
    893             self.options["has_index_names"] = kwds["has_index_names"]
    894 
--> 895         self._make_engine(self.engine)
    896 
    897     def close(self):

~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
   1133     def _make_engine(self, engine="c"):
   1134         if engine == "c":
-> 1135             self._engine = CParserWrapper(self.f, **self.options)
   1136         else:
   1137             if engine == "python":

~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
   1915         kwds["usecols"] = self.usecols
   1916 
-> 1917         self._reader = parsers.TextReader(src, **kwds)
   1918         self.unnamed_cols = self._reader.unnamed_cols
   1919 

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

FileNotFoundError: [Errno 2] File b'./analysis-treeslider/test.tree_table.csv' does not exist: b'./analysis-treeslider/test.tree_table.csv'

In [59]:
# examine top of table
tree_table.head()


Out[59]:
scaffold start end sites snps samples missing tree
0 2 0 2000000 13263 155 9 0.0 (sagr:0.00343708,(oleo:0.00266064,(mini:0.0020...
1 2 2000000 4000000 10544 112 9 0.0 (fusi-N:0.00441769,reference:0.0186764,((bran:...
2 2 4000000 6000000 5544 46 9 0.0 (virg:0.00297301,fusi-N:0.00243431,(oleo:0.003...
3 2 6000000 8000000 12777 138 9 0.0 (fusi-N:0.00283693,(bran:0.00363545,reference:...
4 2 8000000 10000000 14441 166 9 0.0 (bran:0.00446094,reference:0.0119105,(fusi-S:0...

Write the trees column to a file


In [11]:
outfile = open("trees.nwk", "w")
outfile.write("\n".join(tree_table.tree.tolist()))
outfile.close()

Get Astral

The following command will download and unzip Astral into the current directory. A binary to call the program astral will be located in the Astral directory.


In [12]:
%%bash
wget -q https://github.com/smirarab/ASTRAL/raw/master/Astral.5.6.3.zip 
unzip -qo Astral.5.6.3

Run Astral

This will write the results (stdout) to a file called astral.tre, and it will write other information printed to the screen (stderr) to a file called astral.err. By default astral runs 100 bootstrap replicates by gene-tree resampling. To turn on gene and site resampling would require inputting the bootstrap files, which we did not currently save (we could...).


In [13]:
%%bash
java -jar Astral/astral.5.6.3.jar -i trees.nwk > astral.tre 2>astral.err

Plot astral species tree


In [61]:
tre = toytree.rtree.coaltree(10)
tre.


Out[61]:
<toytree.Toytree.ToyTree at 0x7f0474bb2160>

In [71]:
tre.rotate_node(names=["r0", "r1", "r2"])


Out[71]:
<toytree.Toytree.ToyTree at 0x7f047506c358>

In [68]:
tre.draw(node_labels=True);


r0r1r2r3r4r5r6r7r8r9idx: 0 name: r0 dist: 0.0048 support: 100 height: 0.00000idx: 1 name: r1 dist: 0.0048 support: 100 height: 0.00001idx: 2 name: r2 dist: 0.0067 support: 100 height: 0.00002idx: 3 name: r3 dist: 0.0550 support: 100 height: 0.00003idx: 4 name: r4 dist: 0.1530 support: 100 height: 0.00004idx: 5 name: r5 dist: 0.1547 support: 100 height: 0.00005idx: 6 name: r6 dist: 0.0084 support: 100 height: 0.00006idx: 7 name: r7 dist: 0.0084 support: 100 height: 0.00007idx: 8 name: r8 dist: 0.0336 support: 100 height: 0.00008idx: 9 name: r9 dist: 0.0375 support: 100 height: 0.00009idx: 10 name: 10 dist: 0.0020 support: 100 height: 0.004810idx: 11 name: 11 dist: 0.0483 support: 100 height: 0.006711idx: 12 name: 12 dist: 0.0980 support: 100 height: 0.055012idx: 13 name: 13 dist: 0.0253 support: 100 height: 0.008413idx: 14 name: 14 dist: 0.0016 support: 100 height: 0.153014idx: 15 name: 15 dist: 0.0039 support: 100 height: 0.033615idx: 16 name: 16 dist: 0.0405 support: 100 height: 0.154716idx: 17 name: 17 dist: 0.1577 support: 100 height: 0.037517idx: 18 name: 18 dist: 0.0001 support: 100 height: 0.195218

In [60]:
tre = toytree.tree("astral.tre").root('reference')
tre.draw(node_labels="support", tip_labels_align=True);


---------------------------------------------------------------------------
NewickError                               Traceback (most recent call last)
<ipython-input-60-40b2dc83ebdf> in <module>
----> 1 tre = toytree.tree("astral.tre").root('reference')
      2 tre.draw(node_labels="support", tip_labels_align=True);

~/Documents/toytree/toytree/Toytree.py in __init__(self, newick, tree_format, fixed_order)
     54         # prase a str, URL, or file
     55         elif isinstance(newick, (str, bytes)):
---> 56             self.treenode = TreeParser(newick, tree_format).treenodes[0]
     57 
     58         # make an empty tree

~/Documents/toytree/toytree/TreeParser.py in __init__(self, intree, tree_format, multitree, debug)
     64         # parse intree
     65         if not self.debug:
---> 66             self._run()
     67 
     68 

~/Documents/toytree/toytree/TreeParser.py in _run(self)
     80 
     81             # parse newick strings to treenodes list
---> 82             self.get_treenodes()
     83 
     84             # apply names from tdict

~/Documents/toytree/toytree/TreeParser.py in get_treenodes(self)
    148 
    149             # extract one tree
--> 150             self.treenodes.append(extractor.newick_from_string())
    151 
    152         else:

~/Documents/toytree/toytree/TreeParser.py in newick_from_string(self)
    251                         closing_internal = closing_internal.rstrip(";")
    252                         # read internal node data and go up one level
--> 253                         self.apply_node_data(closing_internal, "internal")
    254                         self.current_parent = self.current_parent.up
    255         return self.root

~/Documents/toytree/toytree/TreeParser.py in apply_node_data(self, subnw, node_type)
    306                     self.current_node.add_feature(fname, fvalue)
    307         else:
--> 308             raise NewickError("Unexpected newick format {}".format(subnw))
    309 
    310 

NewickError: Unexpected newick format '[q1=0.4761904761904762;q2=0.2619047619047619;q3=0.2619047619047619;f1=13.333333333333334;f2=7.333333333333334;f3=7.333333333333334;pp1=0.8444981624253073;pp2=0.07775091878734611;pp3=0.07775091878734611;QC=6;EN=28.0]':0.2102954088363608

Plot astral species tree with quartet supports

Not yet supported in toytree, what a wonky file format...