Load libraries
In [1]:
using LightXML
using DataArrays
using DataFrames
Read in the data, which were downloaded from NCBI via the link given in the Gire et al. paper.
In [2]:
xdoc = parse_file("ebola-sle-2014.gbc.xml");
I start by identifying the root element, which is an INSDSet.
In [3]:
xroot = root(xdoc)
println(name(xroot));
INSDSet
I extract all the sequence records (the INSDSeq elements) and their accession numbers as arrays, the latter using a comprehension.
In [4]:
sequences = get_elements_by_tagname(xroot, "INSDSeq")
accessions = [content(find_element(s,"INSDSeq_primary-accession")) for s in sequences];
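The comprehension above assumes every record has an INSDSeq_primary-accession child. A minimal sketch of the same extraction with a guard for missing children is shown here, assuming a recent Julia 1.x and the current LightXML API. The inline XML, the record EXAMPLE01, and the helper `child_text` are made up for illustration and are not part of the original notebook.

```julia
using LightXML

# A tiny, made-up INSDSet document standing in for the real NCBI download.
# The second record deliberately lacks a primary accession.
xml = """
<INSDSet>
  <INSDSeq>
    <INSDSeq_locus>KM034549</INSDSeq_locus>
    <INSDSeq_primary-accession>KM034549</INSDSeq_primary-accession>
  </INSDSeq>
  <INSDSeq>
    <INSDSeq_locus>EXAMPLE01</INSDSeq_locus>
  </INSDSeq>
</INSDSet>
"""

xdoc = parse_string(xml)
xroot = root(xdoc)

# get_elements_by_tagname only inspects the direct children of xroot.
records = get_elements_by_tagname(xroot, "INSDSeq")

# Hypothetical helper: find_element returns `nothing` when the child tag is
# absent, and content(nothing) would throw, so guard the lookup.
function child_text(elem, tag)
    child = find_element(elem, tag)
    return child === nothing ? nothing : content(child)
end

accessions = [child_text(r, "INSDSeq_primary-accession") for r in records]
println(accessions)

# Release the underlying libxml2 document once the strings are copied out.
free(xdoc)
```

Records without the tag come back as `nothing`, so they can be filtered out or reported rather than aborting the whole comprehension.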
In [5]:
numseq = length(sequences)
Out[5]:
249
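When a record count looks larger than expected, one possible sanity check is whether any accession appears more than once. This check is not part of the original analysis. The sketch below uses only base Julia and a made-up list, `example_accessions`, standing in for the real `accessions` array.

```julia
# Hypothetical accession list; in the notebook this would be `accessions`.
example_accessions = ["ACC001", "ACC002", "ACC001"]

# Tally how many times each accession appears.
counts = Dict{String,Int}()
for acc in example_accessions
    counts[acc] = get(counts, acc, 0) + 1
end

# Keep only the accessions seen more than once.
duplicates = sort([acc for (acc, n) in counts if n > 1])

println("records: ", length(example_accessions))
println("unique accessions: ", length(counts))
println("duplicated: ", duplicates)
```

If the number of unique accessions equals the number of records, duplicates do not explain the surplus and the extra records are distinct entries.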
That is far more sequences than we have annotations for.
Let's look at the first entry.
In [6]:
sequences[1]
Out[6]:
<INSDSeq>
<INSDSeq_locus>KM034549</INSDSeq_locus>
<INSDSeq_length>18835</INSDSeq_length>
<INSDSeq_strandedness>single</INSDSeq_strandedness>
<INSDSeq_moltype>cRNA</INSDSeq_moltype>
<INSDSeq_topology>linear</INSDSeq_topology>
<INSDSeq_division>VRL</INSDSeq_division>
<INSDSeq_update-date>15-DEC-2014</INSDSeq_update-date>
<INSDSeq_create-date>30-JUN-2014</INSDSeq_create-date>
<INSDSeq_definition>Zaire ebolavirus isolate Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM095B, complete genome</INSDSeq_definition>
<INSDSeq_primary-accession>KM034549</INSDSeq_primary-accession>
<INSDSeq_accession-version>KM034549.1</INSDSeq_accession-version>
<INSDSeq_other-seqids>
<INSDSeqid>gb|KM034549.1|</INSDSeqid>
<INSDSeqid>gi|661348595</INSDSeqid>
</INSDSeq_other-seqids>
<INSDSeq_project>PRJNA257197</INSDSeq_project>
<INSDSeq_source>Zaire ebolavirus</INSDSeq_source>
<INSDSeq_organism>Zaire ebolavirus</INSDSeq_organism>
<INSDSeq_taxonomy>Viruses; ssRNA viruses; ssRNA negative-strand viruses; Mononegavirales; Filoviridae; Ebolavirus</INSDSeq_taxonomy>
<INSDSeq_references>
<INSDReference>
<INSDReference_reference>1</INSDReference_reference>
<INSDReference_position>1..18835</INSDReference_position>
<INSDReference_authors>
<INSDAuthor>Gire,S.K.</INSDAuthor>
<INSDAuthor>Goba,A.</INSDAuthor>
<INSDAuthor>Andersen,K.G.</INSDAuthor>
<INSDAuthor>Sealfon,R.S.</INSDAuthor>
<INSDAuthor>Park,D.J.</INSDAuthor>
<INSDAuthor>Kanneh,L.</INSDAuthor>
<INSDAuthor>Jalloh,S.</INSDAuthor>
<INSDAuthor>Momoh,M.</INSDAuthor>
<INSDAuthor>Fullah,M.</INSDAuthor>
<INSDAuthor>Dudas,G.</INSDAuthor>
<INSDAuthor>Wohl,S.</INSDAuthor>
<INSDAuthor>Moses,L.M.</INSDAuthor>
<INSDAuthor>Yozwiak,N.L.</INSDAuthor>
<INSDAuthor>Winnicki,S.</INSDAuthor>
<INSDAuthor>Matranga,C.B.</INSDAuthor>
<INSDAuthor>Malboeuf,C.M.</INSDAuthor>
<INSDAuthor>Qu,J.</INSDAuthor>
<INSDAuthor>Gladden,A.D.</INSDAuthor>
<INSDAuthor>Schaffner,S.F.</INSDAuthor>
<INSDAuthor>Yang,X.</INSDAuthor>
<INSDAuthor>Jiang,P.P.</INSDAuthor>
<INSDAuthor>Nekoui,M.</INSDAuthor>
<INSDAuthor>Colubri,A.</INSDAuthor>
<INSDAuthor>Coomber,M.R.</INSDAuthor>
<INSDAuthor>Fonnie,M.</INSDAuthor>
<INSDAuthor>Moigboi,A.</INSDAuthor>
<INSDAuthor>Gbakie,M.</INSDAuthor>
<INSDAuthor>Kamara,F.K.</INSDAuthor>
<INSDAuthor>Tucker,V.</INSDAuthor>
<INSDAuthor>Konuwa,E.</INSDAuthor>
<INSDAuthor>Saffa,S.</INSDAuthor>
<INSDAuthor>Sellu,J.</INSDAuthor>
<INSDAuthor>Jalloh,A.A.</INSDAuthor>
<INSDAuthor>Kovoma,A.</INSDAuthor>
<INSDAuthor>Koninga,J.</INSDAuthor>
<INSDAuthor>Mustapha,I.</INSDAuthor>
<INSDAuthor>Kargbo,K.</INSDAuthor>
<INSDAuthor>Foday,M.</INSDAuthor>
<INSDAuthor>Yillah,M.</INSDAuthor>
<INSDAuthor>Kanneh,F.</INSDAuthor>
<INSDAuthor>Robert,W.</INSDAuthor>
<INSDAuthor>Massally,J.L.</INSDAuthor>
<INSDAuthor>Chapman,S.B.</INSDAuthor>
<INSDAuthor>Bochicchio,J.</INSDAuthor>
<INSDAuthor>Murphy,C.</INSDAuthor>
<INSDAuthor>Nusbaum,C.</INSDAuthor>
<INSDAuthor>Young,S.</INSDAuthor>
<INSDAuthor>Birren,B.W.</INSDAuthor>
<INSDAuthor>Grant,D.S.</INSDAuthor>
<INSDAuthor>Scheiffelin,J.S.</INSDAuthor>
<INSDAuthor>Lander,E.S.</INSDAuthor>
<INSDAuthor>Happi,C.</INSDAuthor>
<INSDAuthor>Gevao,S.M.</INSDAuthor>
<INSDAuthor>Gnirke,A.</INSDAuthor>
<INSDAuthor>Rambaut,A.</INSDAuthor>
<INSDAuthor>Garry,R.F.</INSDAuthor>
<INSDAuthor>Khan,S.H.</INSDAuthor>
<INSDAuthor>Sabeti,P.C.</INSDAuthor>
</INSDReference_authors>
<INSDReference_title>Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak</INSDReference_title>
<INSDReference_journal>Science 345 (6202), 1369-1372 (2014)</INSDReference_journal>
<INSDReference_xref>
<INSDXref>
<INSDXref_dbname>doi</INSDXref_dbname>
<INSDXref_id>10.1126/science.1259657</INSDXref_id>
</INSDXref>
</INSDReference_xref>
<INSDReference_pubmed>25214632</INSDReference_pubmed>
</INSDReference>
<INSDReference>
<INSDReference_reference>2</INSDReference_reference>
<INSDReference_position>1..18835</INSDReference_position>
<INSDReference_authors>
<INSDAuthor>Goba,A.</INSDAuthor>
<INSDAuthor>Khan,H.</INSDAuthor>
<INSDAuthor>Momoh,M.</INSDAuthor>
<INSDAuthor>Jalloh,S.</INSDAuthor>
<INSDAuthor>Fullah,M.</INSDAuthor>
<INSDAuthor>Kanneh,L.</INSDAuthor>
<INSDAuthor>Gevao,S.</INSDAuthor>
<INSDAuthor>Happi,C.</INSDAuthor>
<INSDAuthor>Gire,S.K.</INSDAuthor>
<INSDAuthor>Andersen,K.</INSDAuthor>
<INSDAuthor>Malboeuf,C.</INSDAuthor>
<INSDAuthor>Matranga,C.</INSDAuthor>
<INSDAuthor>Sealfon,R.</INSDAuthor>
<INSDAuthor>Wohl,S.</INSDAuthor>
<INSDAuthor>Gladden,A.</INSDAuthor>
<INSDAuthor>Yang,X.</INSDAuthor>
<INSDAuthor>Winnicki,S.</INSDAuthor>
<INSDAuthor>Park,D.</INSDAuthor>
<INSDAuthor>Qu,J.</INSDAuthor>
<INSDAuthor>Sabeti,P.</INSDAuthor>
<INSDAuthor>Garry,R.</INSDAuthor>
</INSDReference_authors>
<INSDReference_consortium>Viral Hemorrhagic Fever Consortium</INSDReference_consortium>
<INSDReference_title>Direct Submission</INSDReference_title>
<INSDReference_journal>Submitted (16-JUN-2014) Infectious Disease Initiative, Broad Institute of MIT and Harvard, 75 Ames St., Cambridge, MA 02142, USA</INSDReference_journal>
</INSDReference>
</INSDSeq_references>
<INSDSeq_comment>##Assembly-Data-START## Assembly Method Novoalign v. 3 Sequencing Technology Illumina ##Assembly-Data-END##</INSDSeq_comment>
<INSDSeq_feature-table>
<INSDFeature>
<INSDFeature_key>source</INSDFeature_key>
<INSDFeature_location>1..18835</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>1</INSDInterval_from>
<INSDInterval_to>18835</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>organism</INSDQualifier_name>
<INSDQualifier_value>Zaire ebolavirus</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>mol_type</INSDQualifier_name>
<INSDQualifier_value>viral cRNA</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>isolate</INSDQualifier_name>
<INSDQualifier_value>Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM095B</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>isolation_source</INSDQualifier_name>
<INSDQualifier_value>serum</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>host</INSDQualifier_name>
<INSDQualifier_value>Homo sapiens</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>taxon:186538</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>country</INSDQualifier_name>
<INSDQualifier_value>Sierra Leone</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>collection_date</INSDQualifier_name>
<INSDQualifier_value>25-May-2014</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>gene</INSDFeature_key>
<INSDFeature_location>11..2981</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11</INSDInterval_from>
<INSDInterval_to>2981</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>NP</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>11..2981</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11</INSDInterval_from>
<INSDInterval_to>2981</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>NP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>nucleoprotein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGAAATTGTTACTGTAATCATACCTGGTTTGTTTCAGAGCCATATCACCAAGATAGAGAACAACCTAGGTCTCCGGAGGGGGCAAGGGCATCAGTGTGCTCAGTTGAAAATCCCTTGTCAACATCTAGGCCTTATCACATCACAAGTTCCGCCTTAAACTCTGCAGGGTGATCCAACAACCTTAATAGCAACATTATTGTTAAAGGACAGCATTAGTTCACAGTCAAACAAGCAAGATTGAGAATTAACTTTGATTTTGAACCTGAACACCCAGAGGACTGGAGACTCAACAACCCTAAAGCCTGGGGTAAAACATTAGAAATAGTTTAAAGACAAATTGCTCGGAATCACAAAATTCCGAGTATGGATTCTCGTCCTCAGAAAGTCTGGATGACGCCGAGTCTCACTGAATCTGACATGGATTACCACAAGATCTTGACAGCAGGTCTGTCCGTTCAACAGGGGATTGTTCGGCAAAGAGTCATCCCAGTGTATCAAGTAAACAATCTTGAGGAAATTTGCCAACTTATCATACAGGCCTTTGAAGCTGGTGTTGATTTTCAAGAGAGTGCGGACAGTTTCCTTCTCATGCTTTGTCTTCATCATGCGTACCAAGGAGATTACAAACTTTTCTTGGAAAGTGGCGCAGTCAAGTATTTGGAAGGGCACGGGTTCCGTTTTGAAGTCAAGAAGCGTGATGGAGTGAAGCGCCTTGAGGAATTGCTGCCAGCAGTATCTAGTGGGAGAAACATTAAGAGAACACTTGCTGCCATGCCGGAAGAGGAGACGACTGAAGCTAATGCCGGTCAGTTCCTCTCCTTTGCAAGTCTATTCCTTCCGAAATTGGTAGTAGGAGAAAAGGCTTGCCTTGAGAAGGTTCAAAGGCAAATTCAAGTACATGCAGAGCAAGGACTGATACAATATCCAACAGCTTGGCAATCAGTAGGACACATGATGGTGATTTTCCGTTTGATGCGAACAAATTTTTTGATCAAATTTCTTCTAATACACCAAGGGATGCACATGGTTGCCGGACATGATGCCAACGATGCTGTGATTTCAAATTCAGTGGCTCAAGCTCGTTTTTCAGGTCTATTGATTGTCAAAACAGTACTTGATCATATCCTACAAAAGACAGAACGAGGAGTTCGTCTCCATCCTCTTGCAAGGACCGCCAAGGTAAAAAATGAGGTGAACTCCTTCAAGGCTGCACTCAGCTCCCTGGCCAAGCATGGAGAGTATGCTCCTTTCGCCCGACTTTTGAACCTTTCTGGAGTAAATAATCTTGAGCATGGTCTTTTCCCTCAACTGTCGGCAATTGCACTCGGAGTCGCCACAGCCCACGGGAGCACCCTCGCAGGAGTAAATGTTGGAGAACAGTATCAACAGCTCAGAGAGGCAGCCACTGAGGCTGAGAAGCAACTCCAACAATATGCGGAGTCTCGTGAACTTGACCATCTTGGACTTGATGATCAGGAAAAGAAAATTCTTATGAACTTCCATCAGAAAAAGAACGAAATCAGCTTCCAGCAAACAAACGCGATGGTAACTCTAAGAAAAGAGCGCCTGGCCAAGCTGACAGAAGCTATCACTGCTGCATCACTGCCCAAAACAAGTGGACATTACGATGATGATGACGACATTCCCTTTCCAGGACCCATCAATGATGACGACAATCCTGGCCATCAAGATGATGATCCGACTGACTCACAGGATACGACCATTCCCGATGTGGTAGTTGACCCCGATGATGGAGGCTACGGCGAATACCAAAGTTACTCGGAAAACGGCATGAGTGCACCAGATGACTTGGTCCTATTCGATCTAGACGAGGACGACGAGGACACCAAGCCAGTGCCTAACAGATCGACCAAGGGTGGACAACAGAAAAACAGTCAAAAGGGCCAGCATACAGAGGGCAGACAGAC
ACAATCCACGCCAACTCAAAACGTCACAGGCCCTCGCAGAACAATCCACCATGCCAGTGCTCCACTCACGGACAATGACAGAAGAAACGAACCCTCCGGCTCAACCAGCCCTCGCATGCTGACCCCAATCAACGAAGAGGCAGACCCACTGGACGATGCCGACGACGAGACGTCTAGCCTTCCGCCCTTAGAGTCAGATGATGAAGAACAGGACAGGGACGGAACTTCTAACCGCACACCCACTGTCGCCCCACCGGCTCCCGTATACAGAGATCACTCCGAAAAGAAAGAACTCCCGCAAGATGAACAACAAGATCAGGACCACATTCAAGAGGCCAGGAACCAAGACAGTGACAACACCCAGCCAGAACATTCTTTTGAGGAGATGTATCGCCACATTCTAAGATCACAGGGGCCATTTGATGCCGTTTTGTATTATCATATGATGAAGGATGAGCCTGTAGTTTTCAGTACCAGTGATGGTAAAGAGTACACGTATCCGGACTCCCTTGAAGAGGAATATCCACCATGGCTCACTGAAAAAGAGGCCATGAATGATGAGAATAGATTTGTTACACTGGATGGTCAACAATTTTATTGGCCAGTAATGAATCACAGGAATAAATTCATGGCAATCCTGCAACATCATCAGTGAATGAGCATGTAATAATGGGATGATTTAATCGACAAATAGCTAACATTAAATAGTCAAGGAACGCAAACAGGAAGAATTTTTGATGTCTAAGGTGTGAATTATTATCACAATAAAAGTGATTCTTAGTTTTGAATTTAAAGCTAGCTTATTATTACTAGCCGTTTTTCAAAGTTCAATTTGAGTCTTAATGCAAATAAGCGTTAAGCCACAGTTATAGCCATAATGGTAACTCAATATCTTAGCCAGCGATTTATCTAAATTAAATTACATTATGCTTTTATAACTTACCTACTAGCCTGCCCAACATTTACACGATCGTTTTATAATTAAGAAAAAA</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>11..22</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11</INSDInterval_from>
<INSDInterval_to>22</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>other</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>L</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative transcription start signal</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>425..2644</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>425</INSDInterval_from>
<INSDInterval_to>2644</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>NP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>involved in encapsidation of genomic RNA</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>nucleoprotein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11797.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348596</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MDSRPQKVWMTPSLTESDMDYHKILTAGLSVQQGIVRQRVIPVYQVNNLEEICQLIIQAFEAGVDFQESADSFLLMLCLHHAYQGDYKLFLESGAVKYLEGHGFRFEVKKRDGVKRLEELLPAVSSGRNIKRTLAAMPEEETTEANAGQFLSFASLFLPKLVVGEKACLEKVQRQIQVHAEQGLIQYPTAWQSVGHMMVIFRLMRTNFLIKFLLIHQGMHMVAGHDANDAVISNSVAQARFSGLLIVKTVLDHILQKTERGVRLHPLARTAKVKNEVNSFKAALSSLAKHGEYAPFARLLNLSGVNNLEHGLFPQLSAIALGVATAHGSTLAGVNVGEQYQQLREAATEAEKQLQQYAESRELDHLGLDDQEKKILMNFHQKKNEISFQQTNAMVTLRKERLAKLTEAITAASLPKTSGHYDDDDDIPFPGPINDDDNPGHQDDDPTDSQDTTIPDVVVDPDDGGYGEYQSYSENGMSAPDDLVLFDLDEDDEDTKPVPNRSTKGGQQKNSQKGQHTEGRQTQSTPTQNVTGPRRTIHHASAPLTDNDRRNEPSGSTSPRMLTPINEEADPLDDADDETSSLPPLESDDEEQDRDGTSNRTPTVAPPAPVYRDHSEKKELPQDEQQDQDHIQEARNQDSDNTQPEHSFEEMYRHILRSQGPFDAVLYYHMMKDEPVVFSTSDGKEYTYPDSLEEEYPPWLTEKEAMNDENRFVTLDGQQFYWPVMNHRNKFMAILQHHQ</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>2970..2981</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>2970</INSDInterval_from>
<INSDInterval_to>2981</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>NP</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>gene</INSDFeature_key>
<INSDFeature_location>2987..4362</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>2987</INSDInterval_from>
<INSDInterval_to>4362</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP35</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>2987..4362</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>2987</INSDInterval_from>
<INSDInterval_to>4362</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP35</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>polymerase complex protein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>GATGAAGATTAAAACCTTCATCATCCTTACGTCAATTGAATTCTCTAGCACTAGAAGCTTATTGTCTTCAATGTAAAAGAAAAGCTGGCCTAACAAGATGACAACTAGAACAAAGGGCAGGGGCCATACTGTGGCCACGACTCAAAACGACAGAATGCCAGGCCCTGAGCTTTCGGGCTGGATCTCTGAGCAGCTAATGACCGGAAGGATTCCTGTAAACGACATCTTCTGTGATATTGAGAACAATCCAGGATTATGCTACGCATCCCAAATGCAACAAACGAAGCCAAACCCGAAGATGCGCAACAGTCAAACCCAAACGGACCCAATTTGCAATCATAGTTTTGAGGAGGTAGTACAAACATTGGCTTCATTGGCTACTGTTGTGCAACAACAAACCATCGCATCAGAATCATTAGAACAACGCATTACGAGTCTTGAGAATGGTCTAAAGCCAGTTTATGATATGGCAAAAACAATCTCCTCATTGAACAGGGTTTGTGCTGAGATGGTTGCAAAATATGATCTTCTGGTGATGACAACCGGTCGGGCAACAGCAACCGCTGCGGCAACTGAGGCTTATTGGGCTGAACATGGTCAACCACCACCTGGACCATCACTTTATGAAGAAAGTGCGATTCGGGGTAAGATTGAATCTAGAGATGAGACTGTCCCTCAAAGTGTTAGGGAGGCATTCAACAATCTAGACAGTACCACTTCACTAACTGAGGAAAATTTTGGGAAACCTGACATTTCGGCAAAGGATTTGAGAAACATTATGTATGATCACTTGCCTGGTTTTGGAACTGCTTTCCACCAATTAGTACAAGTGATTTGTAAATTGGGAAAAGATAGCAATTCATTGGACATTATTCATGCTGAGTTCCAGGCCAGCCTGGCTGAAGGAGACTCCCCTCAATGTGCCCTAATTCAAATTACAAAAAGAGTTCCAATCTTCCAAGATGCTGCTCCACCTGTCATCCACATCCGCTCTCGAGGTGACATTCCCCGAGCTTGCCAGAAGAGCTTGCGTCCAGTCCCACCATCACCCAAGATTGATCGAGGTTGGGTATGTGTTTTTCAGCTTCAAGATGGTAAAACACTTGGACTCAAAATTTGAGCCAATCTCTTTTCCCTCCGAAAGAGGCAACTAATAGCAGAGGCTTCAACTGCTGAACTATAGGGTATGTTACATTAATGATACACTTGTGAGTATCAGCCCTAGATAATATAAGTCAATTAAACAACCAAGATAAAATTGTTCATATCCCGCTAGCAGCTTTAAAGATAAATGTAATAGGAGCTATACCTCTGACAGTATTATAATTAATTGTTATTAAGTAACCCAAACCAAAAATGATGAAGATTAAGAAAAA</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>2987..2998</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>2987</INSDInterval_from>
<INSDInterval_to>2998</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>other</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP35</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative transcription start signal</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>3084..4106</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>3084</INSDInterval_from>
<INSDInterval_to>4106</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP35</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>involved in encapsidation of genomic RNA</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>polymerase complex protein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11798.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348597</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MTTRTKGRGHTVATTQNDRMPGPELSGWISEQLMTGRIPVNDIFCDIENNPGLCYASQMQQTKPNPKMRNSQTQTDPICNHSFEEVVQTLASLATVVQQQTIASESLEQRITSLENGLKPVYDMAKTISSLNRVCAEMVAKYDLLVMTTGRATATAAATEAYWAEHGQPPPGPSLYEESAIRGKIESRDETVPQSVREAFNNLDSTTSLTEENFGKPDISAKDLRNIMYDHLPGFGTAFHQLVQVICKLGKDSNSLDIIHAEFQASLAEGDSPQCALIQITKRVPIFQDAAPPVIHIRSRGDIPRACQKSLRPVPPSPKIDRGWVCVFQLQDGKTLGLKI</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>gene</INSDFeature_key>
<INSDFeature_location>4345..5849</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>4345</INSDInterval_from>
<INSDInterval_to>5849</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP40</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>4345..5849</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>4345</INSDInterval_from>
<INSDInterval_to>5849</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP40</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>matrix protein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>GATGAAGATTAAGAAAAACCTACCTCGACTGAGAGAGTGTTTTTTCATTAACCTTCATCTTGTAAACGTTGAGCAAAATTGTTAAAAATATGAGGCGGGTTATATTGCCTACTGCTCCTCCTGAATATATGGAGGCCATATACCCTGCCAGGTCAAATTCAACAATTGCTAGGGGTGGCAACAGCAATACAGGCTTCCTGACACCGGAGTCAGTCAATGGAGACACTCCATCGAATCCACTCAGGCCAATTGCTGATGACACCATCGACCATGCCAGCCACACACCAGGCAGTGTGTCATCAGCATTCATCCTCGAAGCTATGGTGAATGTCATATCGGGCCCCAAAGTGCTAATGAAGCAAATTCCAATTTGGCTTCCTCTAGGTGTCGCTGATCAAAAGACCTACAGCTTTGACTCAACTACGGCCGCCATCATGCTTGCTTCATATACTATCACCCATTTCGGCAAGGCAACCAATCCGCTTGTCAGAGTCAATCGGCTGGGTCCTGGAATCCCGGATCACCCCCTCAGGCTCCTGCGAATTGGAAACCAGGCTTTCCTCCAGGAGTTCGTTCTTCCACCAGTCCAACTACCCCAGTATTTCACCTTTGATTTGACAGCACTCAAACTGATCACTCAACCACTGCCTGCTGCAACATGGACCGATGACACTCCAACTGGATCAAATGGAGCGTTGCGTCCAGGAATTTCATTTCATCCAAAACTTCGCCCCATTCTTTTACCCAACAAAAGTGGGAAGAAGGGGAACAGTGCCGATCTAACATCTCCGGAGAAAATCCAAGCAATAATGACTTCACTCCAGGACTTTAAGATCGTTCCAATTGATCCAACCAAAAATATCATGGGTATCGAAGTGCCAGAAACTCTGGTCCACAAGCTGACCGGTAAGAAGGTGACTTCCAAAAATGGACAACCAATCATCCCTGTTCTTTTGCCAAAGTACATTGGGTTGGACCCGGTGGCTCCAGGAGACCTCACCATGGTAATCACACAGGATTGTGACACGTGTCATTCTCCTGCAAGTCTTCCAGCTGTGGTTGAGAAGTAATTGCAATAATTGACTCAGATCCAGTTTTACAGAATCTTCTCAGGGATAGTGATAACATCTTTTTAATAATCCGTCTACTAGAAGAGATACTTCTAATTGATCAATATACTAAAGGTGCTTTACACCATTGTCTCTTTTCTCTCCTAAATGTAGAGCTTAACAAAAGACTCATAATATACCTGTTTTTAAAAGATTGATTGATGAAAGATCATGACTAATAACATTACAAACAATCCTACTATAATCAATACGGTGATTCAAATGTCAATCTTTCTCATTGCACATACTCTTTGTCCTTATCCTCAAATTGCCTACATGCTTACATCTGAGGACAGCCAGTGTGACTTGGATTGGAGATGTGGAGGAAAAATCGGGGCCCATTTCTAAGTTGTTCACAATCTAAGTACAGACATTGCTCTTCTAATTAAGAAAAAA</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>4345..4356</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>4345</INSDInterval_from>
<INSDInterval_to>4356</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>other</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP40</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative transcription start signal</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>4352..4362</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>4352</INSDInterval_from>
<INSDInterval_to>4362</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP35</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>4434..5414</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>4434</INSDInterval_from>
<INSDInterval_to>5414</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP40</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>matrix protein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11799.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348598</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MRRVILPTAPPEYMEAIYPARSNSTIARGGNSNTGFLTPESVNGDTPSNPLRPIADDTIDHASHTPGSVSSAFILEAMVNVISGPKVLMKQIPIWLPLGVADQKTYSFDSTTAAIMLASYTITHFGKATNPLVRVNRLGPGIPDHPLRLLRIGNQAFLQEFVLPPVQLPQYFTFDLTALKLITQPLPAATWTDDTPTGSNGALRPGISFHPKLRPILLPNKSGKKGNSADLTSPEKIQAIMTSLQDFKIVPIDPTKNIMGIEVPETLVHKLTGKKVTSKNGQPIIPVLLPKYIGLDPVAPGDLTMVITQDCDTCHSPASLPAVVEK</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>5838..5849</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5838</INSDInterval_from>
<INSDInterval_to>5849</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP40</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>gene</INSDFeature_key>
<INSDFeature_location>5855..8260</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5855</INSDInterval_from>
<INSDInterval_to>8260</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>5855..5866</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5855</INSDInterval_from>
<INSDInterval_to>5866</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>other</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative transcription start signal</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>join(<5994..6878,6878..>8023)</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5994</INSDInterval_from>
<INSDInterval_to>6878</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
<INSDInterval>
<INSDInterval_from>6878</INSDInterval_from>
<INSDInterval_to>8023</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_operator>join</INSDFeature_operator>
<INSDFeature_partial5 value="true"/>
<INSDFeature_partial3 value="true"/>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>virion spike glycoprotein precursor</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>ribosomal slippage</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>ATGGGTGTTACAGGAATATTGCAGTTACCTCGTGATCGATTCAAGAGGACATCATTCTTTCTTTGGGTAATTATCCTTTTCCAAAGAACATTTTCCATCCCGCTTGGAGTTATCCACAATAGTACATTACAGGTTAGTGATGTCGACAAACTAGTTTGTCGTGACAAACTGTCATCCACAAATCAATTGAGATCAGTTGGACTGAATCTCGAGGGGAATGGAGTGGCAACTGACGTGCCATCTGTGACTAAAAGATGGGGCTTCAGGTCCGGTGTCCCACCAAAGGTGGTCAATTATGAAGCTGGTGAATGGGCTGAAAACTGCTACAATCTTGAAATCAAAAAACCTGACGGGAGTGAGTGTCTACCAGCAGCGCCAGACGGGATTCGGGGCTTCCCCCGGTGCCGGTATGTGCACAAAGTATCAGGAACGGGACCATGTGCCGGAGACTTTGCCTTCCACAAAGAGGGTGCTTTCTTCCTGTATGATCGACTTGCTTCCACAGTTATCTACCGAGGAACGACTTTCGCTGAAGGTGTCGTTGCATTTCTGATACTGCCCCAAGCTAAGAAGGACTTCTTCAGCTCACACCCCTTGAGAGAGCCGGTCAATGCAACGGAGGACCCGTCGAGTGGCTATTATTCTACCACAATTAGATATCAGGCTACCGGTTTTGGAACTAATGAGACAGAGTACTTGTTCGAGGTTGACAATTTGACCTACGTCCAACTTGAATCAAGATTCACACCACAGTTTCTGCTCCAGCTGAATGAGACAATATATGCAAGTGGGAAGAGGAGCAACACCACGGGAAAACTAATTTGGAAGGTCAACCCCGAAATTGATACAACAATCGGGGAGTGGGCCTTCTGGGAAACTAAAAAAAACCTCACTAGAAAAATTCGCAGTGAAGAGTTGTCTTTCACAGCTGTATCAAACGGACCCAAAAACATCAGTGGTCAGAGTCCGGCGCGAACTTCTTCCGACCCAGAGACCAACACAACAAATGAAGACCACAAAATCATGGCTTCAGAAAATTCCTCTGCAATGGTTCAAGTGCACAGTCAAGGAAGGAAAGCTGCAGTGTCGCATCTGACAACCCTTGCCACAATCTCCACGAGTCCTCAACCTCCCACAACCAAAACAGGTCCGGACAACAGCACCCATAATACACCCGTGTATAAACTTGACATCTCTGAGGCAACTCAAGTTGGACAACATCACCGTAGAGCAGACAACGACAGCACAGCCTCCGACACTCCCCCCGCCACGACCGCAGCCGGACCCTTAAAAGCAGAGAACACCAACACGAGTAAGAGCGCTGACTCCCTGGACCTCGCCACCACGACAAGCCCCCAAAACTACAGCGAGACTGCTGGCAACAACAACACTCATCACCAAGATACCGGAGAAGAGAGTGCCAGCAGCGGGAAGCTAGGCTTAATTACCAATACTATTGCTGGAGTAGCAGGACTGATCACAGGCGGGAGAAGGACTCGAAGAGAAGTAATTGTCAATGCTCAACCCAAATGCAACCCCAATTTACATTACTGGACTACTCAGGATGAAGGTGCTGCAATCGGATTGGCCTGGATACCATATTTCGGGCCAGCAGCCGAAGGAATTTACACAGAGGGGCTAATGCACAACCAAGATGGTTTAATCTGTGGGTTGAGGCAGCTGGCCAACGAAACGACTCAAGCTCTCCAACTGTTCCTGAGAGCCACAACTGAGCTGCGAACCTTTTCAATCCTCAACCGTAAGGCAATTGACTTCCTGCTGCAGCGATGGGGTGGCACATGCCACATTTTGGGACCGGACTGCTGTATCGAACCACATGATTGGACCAAGAACATAACAGACAAAATTGATCAGATTATTCATGATTTTGTTGATAAAACCCTTCCGGACCAGGGGGACAATGACAATTGGTGGACAGGATGGAGACAATGGATACCGGCAGGTATTGGAGTTACAGG
TGTTATAATTGCAGTTATCGCTTTATTCTGTATATGCAAATTTGTCTTTTAG</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>join(5994..6878,6878..8023)</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5994</INSDInterval_from>
<INSDInterval_to>6878</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
<INSDInterval>
<INSDInterval_from>6878</INSDInterval_from>
<INSDInterval_to>8023</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_operator>join</INSDFeature_operator>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>ribosomal_slippage</INSDQualifier_name>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>additional A residue inserted during transcription; encodes two disulfide linked subunits GP1 and GP2; receptor binding and fusion</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>virion spike glycoprotein precursor</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11800.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348599</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSVTKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFFLYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYASGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKNLTRKIRSEELSFTAVSNGPKNISGQSPARTSSDPETNTTNEDHKIMASENSSAMVQVHSQGRKAAVSHLTTLATISTSPQPPTTKTGPDNSTHNTPVYKLDISEATQVGQHHRRADNDSTASDTPPATTAAGPLKAENTNTSKSADSLDLATTTSPQNYSETAGNNNTHHQDTGEESASSGKLGLITNTIAGVAGLITGGRRTRREVIVNAQPKCNPNLHYWTTQDEGAAIGLAWIPYFGPAAEGIYTEGLMHNQDGLICGLRQLANETTQALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPHDWTKNITDKIDQIIHDFVDKTLPDQGDNDNWWTGWRQWIPAGIGVTGVIIAVIALFCICKFVF</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location><5994..>7088</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5994</INSDInterval_from>
<INSDInterval_to>7088</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_partial5 value="true"/>
<INSDFeature_partial3 value="true"/>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>sGP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>ATGGGTGTTACAGGAATATTGCAGTTACCTCGTGATCGATTCAAGAGGACATCATTCTTTCTTTGGGTAATTATCCTTTTCCAAAGAACATTTTCCATCCCGCTTGGAGTTATCCACAATAGTACATTACAGGTTAGTGATGTCGACAAACTAGTTTGTCGTGACAAACTGTCATCCACAAATCAATTGAGATCAGTTGGACTGAATCTCGAGGGGAATGGAGTGGCAACTGACGTGCCATCTGTGACTAAAAGATGGGGCTTCAGGTCCGGTGTCCCACCAAAGGTGGTCAATTATGAAGCTGGTGAATGGGCTGAAAACTGCTACAATCTTGAAATCAAAAAACCTGACGGGAGTGAGTGTCTACCAGCAGCGCCAGACGGGATTCGGGGCTTCCCCCGGTGCCGGTATGTGCACAAAGTATCAGGAACGGGACCATGTGCCGGAGACTTTGCCTTCCACAAAGAGGGTGCTTTCTTCCTGTATGATCGACTTGCTTCCACAGTTATCTACCGAGGAACGACTTTCGCTGAAGGTGTCGTTGCATTTCTGATACTGCCCCAAGCTAAGAAGGACTTCTTCAGCTCACACCCCTTGAGAGAGCCGGTCAATGCAACGGAGGACCCGTCGAGTGGCTATTATTCTACCACAATTAGATATCAGGCTACCGGTTTTGGAACTAATGAGACAGAGTACTTGTTCGAGGTTGACAATTTGACCTACGTCCAACTTGAATCAAGATTCACACCACAGTTTCTGCTCCAGCTGAATGAGACAATATATGCAAGTGGGAAGAGGAGCAACACCACGGGAAAACTAATTTGGAAGGTCAACCCCGAAATTGATACAACAATCGGGGAGTGGGCCTTCTGGGAAACTAAAAAAACCTCACTAGAAAAATTCGCAGTGAAGAGTTGTCTTTCACAGCTGTATCAAACGGACCCAAAAACATCAGTGGTCAGAGTCCGGCGCGAACTTCTTCCGACCCAGAGACCAACACAACAAATGAAGACCACAAAATCATGGCTTCAGAAAATTCCTCTGCAATGGTTCAAGTGCACAGTCAAGGAAGGAAAGCTGCAGTGTCGCATCTGA</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>5994..7088</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5994</INSDInterval_from>
<INSDInterval_to>7088</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>sGP secreted as an anti-parallel oriented homodimer; small non-structural secreted glycoprotein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>sGP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11801.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348600</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSVTKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFFLYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYASGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKTSLEKFAVKSCLSQLYQTDPKTSVVRVRRELLPTQRPTQQMKTTKSWLQKIPLQWFKCTVKEGKLQCRI</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>join(<5994..6877,6879..>6888)</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5994</INSDInterval_from>
<INSDInterval_to>6877</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
<INSDInterval>
<INSDInterval_from>6879</INSDInterval_from>
<INSDInterval_to>6888</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_operator>join</INSDFeature_operator>
<INSDFeature_partial5 value="true"/>
<INSDFeature_partial3 value="true"/>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>ssGP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>ribosomal slippage</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>ATGGGTGTTACAGGAATATTGCAGTTACCTCGTGATCGATTCAAGAGGACATCATTCTTTCTTTGGGTAATTATCCTTTTCCAAAGAACATTTTCCATCCCGCTTGGAGTTATCCACAATAGTACATTACAGGTTAGTGATGTCGACAAACTAGTTTGTCGTGACAAACTGTCATCCACAAATCAATTGAGATCAGTTGGACTGAATCTCGAGGGGAATGGAGTGGCAACTGACGTGCCATCTGTGACTAAAAGATGGGGCTTCAGGTCCGGTGTCCCACCAAAGGTGGTCAATTATGAAGCTGGTGAATGGGCTGAAAACTGCTACAATCTTGAAATCAAAAAACCTGACGGGAGTGAGTGTCTACCAGCAGCGCCAGACGGGATTCGGGGCTTCCCCCGGTGCCGGTATGTGCACAAAGTATCAGGAACGGGACCATGTGCCGGAGACTTTGCCTTCCACAAAGAGGGTGCTTTCTTCCTGTATGATCGACTTGCTTCCACAGTTATCTACCGAGGAACGACTTTCGCTGAAGGTGTCGTTGCATTTCTGATACTGCCCCAAGCTAAGAAGGACTTCTTCAGCTCACACCCCTTGAGAGAGCCGGTCAATGCAACGGAGGACCCGTCGAGTGGCTATTATTCTACCACAATTAGATATCAGGCTACCGGTTTTGGAACTAATGAGACAGAGTACTTGTTCGAGGTTGACAATTTGACCTACGTCCAACTTGAATCAAGATTCACACCACAGTTTCTGCTCCAGCTGAATGAGACAATATATGCAAGTGGGAAGAGGAGCAACACCACGGGAAAACTAATTTGGAAGGTCAACCCCGAAATTGATACAACAATCGGGGAGTGGGCCTTCTGGGAAACTAAAAAACCTCACTAG</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>join(5994..6877,6879..6888)</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>5994</INSDInterval_from>
<INSDInterval_to>6877</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
<INSDInterval>
<INSDInterval_from>6879</INSDInterval_from>
<INSDInterval_to>6888</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_operator>join</INSDFeature_operator>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>ribosomal_slippage</INSDQualifier_name>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>one A residue is deleted or two additional A residues are inserted at the editing site during transcription; second non-structural secreted glycoprotein; secreted in a monomeric form</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>ssGP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11802.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348601</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQVSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSVTKRWGFRSGVPPKVVNYEAGEWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFFLYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTIRYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYASGKRSNTTGKLIWKVNPEIDTTIGEWAFWETKKPH</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>misc_feature</INSDFeature_key>
<INSDFeature_location>7481..7495</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>7481</INSDInterval_from>
<INSDInterval_to>7495</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>cleavage site; precursor GP is cleaved by subtilisin-like cellularprotease furin into subunits GP1 and GP2 that are linked by a disulfide bond</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>misc_feature</INSDFeature_key>
<INSDFeature_location>7745..7825</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>7745</INSDInterval_from>
<INSDInterval_to>7825</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>immunosuppressive motif</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>gene</INSDFeature_key>
<INSDFeature_location>8243..9695</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>8243</INSDInterval_from>
<INSDInterval_to>9695</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP30</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>8243..9695</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>8243</INSDInterval_from>
<INSDInterval_to>9695</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP30</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>VP30</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>GATGAAGATTAAGAAAAAGGTAATCTTTCGATTATCTTTAGTCTTCATCCTTGATTCTACAATCATGACAGTTGTCTTTAATGAAAAAGGAAAAAAGCCTTTTTATTAAGTTGTAATAATCAGATCTGCAAACCGGTAGAATTTAGTTGTAACCTAACACACACAAAGCATTGGTAAAAAAGTCAATAGAAATTTAAACAGTGAGTGCAGACAACTCTTAAATGGAAGCTTCATATGAGAGAGGACGCCCCCGAGCTGCCAGACAGCATTCAAGGGATGGACACGACCACCATGTTCGAGCACGATCATCATCCAGAGAGAATTATCGAGGTGAGTACCGTCAATCAAGGAGCGCCTCACAAGTGCGCGTTCCTACTGTATTTCATAAGAAGAGAGTTGAACCATTAACAGTTCCTCCAGCACCTAAAGACATATGTCCGACCTTGAAAAAAGGATTTTTGTGTGACAGTAGTTTTTGCAAAAAAGACCACCAGTTAGAAAGTTTAACTGATAGGGAATTACTCCTACTAATCGCCCGTAAGACTTGTGGATCAGTAGAACAACAATTAAATATAACTGCACCCAAGGACTCGCGCTTAGCAAATCCAACGGCTGATGATTTCCAGCAAGAGGAAGGTCCAAAAATTACCTTGTTGACACTGATCAAGACGGCAGAACACTGGGCGAGACAAGACATCCGAACCATAGAGGATTCCAAATTAAGGGCATTGTTAACTCTATGTGCTGTGATGACGAGGAAATTCTCAAAATCCCAGCTGAGTCTTTTGTGTGAGACACACCTAAGGCGCGAAGGGCTTGGGCAAGATCAGGCAGAACCCGTTCTCGAAGTATATCAACGATTACACAGTGATAAAGGAGGCAGTTTTGAAGCTGCACTATGGCAACAATGGGACCGACAATCCCTAATTATGTTTATCACTGCATTCTTGAATATCGCTCTCCAGTTACCGTGTGAAAGTTCTGCTGTCGTTGTTTCAGGGTTAAGAACATTGGTTCCTCAATCAGATAATGAGGAAGCTTCAACCAACCCGGGGACATGCTCATGGTCTGATGAGGGTACCCCTTAATAAGGCTGACTAAAACACTATATAACCTTCTACTTGATCACAATACTCCGTATACCTATCATCATATATTTAATCAAGACGATATCCTTTAAAACTTATTCAGTACTATAATCACTCTCATTTCAAATTGATAAGATATGCATAATTGCCTTAATATATAAAGAGGTATGATATAACCCAAACATTGACCAAAGAAAATCATAATCTCGTATCGCTCGCAATATAACCTGCCAAGCATACCTCTTGCACAAAGTGATTCTTGTACACAAATAATGTTTGACTCTACAGGAGGTAGCAACGATCCATCTCATCAAAAAATAAGTATTTTATGATTTACTAATGATCTCTTAAAATATTAAGAAAAA</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>8243..8254</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>8243</INSDInterval_from>
<INSDInterval_to>8254</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>other</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP30</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative transcription start signal</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>8250..8260</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>8250</INSDInterval_from>
<INSDInterval_to>8260</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>GP</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>8464..9330</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>8464</INSDInterval_from>
<INSDInterval_to>9330</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP30</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>minor nucleoprotein; polymerase complex protein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>VP30</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11803.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348602</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MEASYERGRPRAARQHSRDGHDHHVRARSSSRENYRGEYRQSRSASQVRVPTVFHKKRVEPLTVPPAPKDICPTLKKGFLCDSSFCKKDHQLESLTDRELLLLIARKTCGSVEQQLNITAPKDSRLANPTADDFQQEEGPKITLLTLIKTAEHWARQDIRTIEDSKLRALLTLCAVMTRKFSKSQLSLLCETHLRREGLGQDQAEPVLEVYQRLHSDKGGSFEAALWQQWDRQSLIMFITAFLNIALQLPCESSAVVVSGLRTLVPQSDNEEASTNPGTCSWSDEGTP</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>9685..9695</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>9685</INSDInterval_from>
<INSDInterval_to>9695</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP30</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>gene</INSDFeature_key>
<INSDFeature_location>9840..11473</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>9840</INSDInterval_from>
<INSDInterval_to>11473</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>9840..11473</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>9840</INSDInterval_from>
<INSDInterval_to>11473</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>GATGAAGATTAATGCGGAGGTCTGATGAGAATAAACCTTATTATTCAGATTAGGCCCCAAGAGGCATTCTTCATCTCCTTTTAGCAAAATACTATTTCAGGATAGTCCAGCTAGTGACACGTCTTTTAGCTGTATACCAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGAGCTAAAGTGGTCTGTACACATCTCATACATTGTATTAGGGGCAATAATATCTAATTGAACTTAGCCATTTAAAATTTAGTGCATAAATCTGGGCTAACTCCACCAGGTCAACTCCATTGGCTGAAAAGAAGCCCACCTACAACGAACATTACTTTGAGCGCCCTCACAATTAAAAAATAAGAGCGTCGTTCCAACAATCGAGCGCAAGGTTACAAGGTTGAACTGAGAGTGTCTAGACAACAAAATATCGATACTCCAGACACCAAGCAAGACCTGAGAAAAAACCATGGCCAAAGCTACGGGACGATACAATCTAATATCGCCCAAAAAGGACCTGGAGAAAGGGGTTGTCTTAAGCGACCTCTGTAACTTCTTAGTTAGTCAAACTATTCAAGGGTGGAAAGTTTATTGGGCTGGTATTGAGTTTGATGTGACTCACAAAGGAATGGCCCTATTGCATAGACTGAAAACTAATGACTTTGCCCCTGCATGGTCAATGACAAGGAACCTATTTCCCCATTTATTTCAAAATCCGAATTCCACTATTGAATCACCGCTGTGGGCACTGAGAGTCATCCTTGCAGCAGGGATACAGGACCAGTTAATTGACCAGTCTTTGATTGAACCCTTAGCAGGAGCCCTTGGTCTGATCTCTGATTGGCTGCTAACAACCAACACTAACCATTTCAACATGCGAACACAACGTGTCAAGGAACAATTGAGCCTAAAAATGCTGTCGTTGATTCGATCCAATATTCTCAAGTTTATTAACAAATTGGATGCTCTACATGTCGTGAACTACAATGGATTATTGAGCAGTATTGAAATTGGAACTCAAAATCATACAATCATCATAACTCGAACTAACATGGGTTTTCTGGTGGAGCTCCAAGAACCCGACAAATCGGCAATGAACCGCAAGAAGCCTGGGCCGGCGAAATTTTCCCTCCTTCATGAGTCCACACTGAAAGCATTTACACAAGGGTCCTCGACACGAATGCAAAGTTTAATTCTTGAATTCAATAGCTCTCTTGCTATCTAACTAAGATGGAATACTTCATATTGGGCTAACTCATATATGCTGACTCAATAGTTAACTTGACATCTCTGCCTTCATAATCAGATATATAAGCATAATAAATAAATACTCATATTTCTTGATAATTTGTTTAACCACAGATAAATCCTCACTGTAAGCCAGCTTCCAAGTTGACACCCTTACAAAAACCAGGACTCAGAATCCCTCAAATAAGAGATTCCAAGACAACATCATAGAATTGCTTTATTATATTAATAAGCATTTTATCACTAGAAATCCAATATACGAAATGGTTAATTGTAACTAAACCCGCAGGTCATGTGTGTTAGGTTTCACAAATTATATATATTACTAACTCCATACTCGTAACTAACATTAGATAAGTAGGTTAAGAAAAAAGCTTGAGGAAGATTAAGAAAAA</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>9840..9851</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>9840</INSDInterval_from>
<INSDInterval_to>9851</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>other</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative transcription start signal</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>10300..11055</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>10300</INSDInterval_from>
<INSDInterval_to>11055</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>membrane-associated protein</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11804.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348603</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MAKATGRYNLISPKKDLEKGVVLSDLCNFLVSQTIQGWKVYWAGIEFDVTHKGMALLHRLKTNDFAPAWSMTRNLFPHLFQNPNSTIESPLWALRVILAAGIQDQLIDQSLIEPLAGALGLISDWLLTTNTNHFNMRTQRVKEQLSLKMLSLIRSNILKFINKLDALHVVNYNGLLSSIEIGTQNHTIIITRTNMGFLVELQEPDKSAMNRKKPGPAKFSLLHESTLKAFTQGSSTRMQSLILEFNSSLAI</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>11440..11451</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11440</INSDInterval_from>
<INSDInterval_to>11451</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>gene</INSDFeature_key>
<INSDFeature_location>11456..18237</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11456</INSDInterval_from>
<INSDInterval_to>18237</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>L</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>mRNA</INSDFeature_key>
<INSDFeature_location>11456..18237</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11456</INSDInterval_from>
<INSDInterval_to>18237</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>L</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>polymerase</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transcription</INSDQualifier_name>
<INSDQualifier_value>GAGGAAGATTAAGAAAAACTGCTTATTGGGTCTTTCCGTGTTTTAGATGAAGCAGTTGACATTCTTCCTCTTGATATTAAATGGCTACACAACATACCCAATACCCAGACGCCAGGTTATCATCACCAATTGTATTGGACCAATGTGACCTTGTCACTAGAGCTTGCGGGTTGTATTCATCATACTCCCTTAATCCGCAACTACGCAACTGTAAACTCCCGAAACATATATACCGTTTAAAATATGATGTAACTGTTACCAAGTTCTTAAGTGATGTACCAGTGGCGACATTGCCCATAGATTTCATAGTCCCAATTCTTCTCAAGGCACTATCAGGCAATGGGTTCTGTCCTGTTGAGCCGCGGTGCCAACAGTTCTTAGATGAAATTATTAAGTACACAATGCAAGATGCTCTCTTCCTGAAATATTATCTCAAAAATGTGGGTGCTCAAGAAGACTGTGTTGATGACCACTTTCAAGAAAAAATCTTATCTTCAATTCAGGGCAATGAATTTTTACATCAAATGTTTTTCTGGTATGACCTGGCTATTTTAACTCGAAGGGGTAGATTAAATCGAGGAAACTCTAGATCAACGTGGTTTGTTCATGATGATTTAATAGACATCTTAGGCTATGGGGACTATGTTTTTTGGAAGATCCCAATTTCACTGTTACCACTGAACACACAAGGAATCCCCCATGCTGCTATGGATTGGTATCAGACATCAGTATTCAAAGAAGCGGTTCAAGGGCATACACACATTGTTTCTGTTTCTACTGCCGATGTCTTGATAATGTGCAAAGATTTAATTACATGTCGATTCAACACAACTCTAATCTCAAAAATAGCAGAGGTTGAGGACCCAGTTTGCTCTGATTATCCCAATTTTAAGATTGTGTCTATGCTTTACCAGAGCGGAGATTACTTACTCTCCATATTAGGGTCTGATGGGTATAAAATCATTAAGTTTCTCGAACCATTGTGCTTGGCTAAAATTCAATTGTGCTCAAAGTACACCGAGAGGAAGGGCCGATTCTTAACACAAATGCATTTAGCTGTAAATCACACCCTGGAAGAAATTACAGAAATACGTGCACTAAAGCCTTCACAGGCTCACAAGATCCGTGAATTCCATAGAACATTGATAAGGCTGGAGATGACGCCACAACAACTTTGTGAGCTATTTTCCATACAAAAACACTGGGGGCATCCTGTGCTACATAGTGAAACAGCAATCCAAAAAGTTAAAAAACATGCTACGGTGCTAAAAGCATTACGCCCTATCGTGATTTTCGAGACATATTGTGTTTTTAAATATAGCATTGCAAAACATTATTTTGATAGTCAAGGATCTTGGTACAGTGTTACCTCAGATAGAAATCTAACACCAGGTCTTAATTCTTATATCAAAAGAAATCAATTCCCTCCGTTGCCAATGATTAAAGAACTGCTATGGGAATTTTACCACCTTGACCATCCTCCACTTTTCTCAACCAAAATTATTAGTGACTTAAGTATTTTTATAAAAGACAGAGCTACTGCAGTAGAAAGGACATGCTGGGATGCAGTATTCGAGCCTAATGTTCTGGGATATAATCCACCTCACAAATTCAGTACCAAACGTGTACCGGAACAATTTTTAGAGCAAGAAAACTTTTCTATTGAGAATGTTCTTTCCTACGCGCAAAAACTCGAGTATCTACTACCACAATATCGGAATTTTTCTTTCTCATTGAAAGAGAAAGAGTTGAATGTAGGTAGAACTTTCGGAAAATTGCCTTATCCGACTCGCAATGTTCAAACACTTTGTGAAGCTCTGTTAGCTGATGGTCTTGCTAAAGCATTTCCTAGCAATATGATGGTAGTTACGGAACGTGAACAAAAAGAAAGCTTATTGCATCAAGCATCATGGCACCACACAAGTGATGATTTCGGTGAGCATGCCACAGTTAGAGGGAGTAGCTTTGTAACTGATTTA
GAGAAATACAATCTTGCATTTAGGTATGAGTTTACAGCACCTTTTATAGAATATTGCAACCGTTGCTATGGTGTTAAGAATGTTTTTAATTGGATGCATTATACAATCCCACAGTGTTATATGCATGTCAGTGATTATTATAATCCACCGCATAACCTCACACTGGAAAATCGAAACAACCCCCCTGAAGGGCCTAGTTCATACAGGGGTCATATGGGAGGGATTGAAGGACTGCAACAAAAACTCTGGACAAGTATTTCATGTGCTCAAATTTCTTTAGTTGAAATTAAGACTGGTTTTAAGTTGCGCTCAGCTGTGATGGGTGACAATCAGTGCATTACCGTTTTATCAGTCTTCCCCTTAGAGACTGATGCAGGCGAGCAGGAACAGAGCGCCGAGGACAATGCAGCGAGGGTGGCCGCCAGCCTAGCAAAAGTTACAAGTGCCTGTGGAATCTTTTTAAAACCTGATGAAACATTTGTACATTCAGGTTTTATCTATTTTGGAAAAAAACAATATTTGAATGGGGTCCAATTGCCTCAGTCCCTTAAAACGGCTACAAGAATGGCACCATTGTCTGATGCAATTTTTGATGATCTTCAAGGGACCCTGGCTAGTATAGGTACTGCTTTTGAGCGATCCATCTCTGAGACACGACATATCTTTCCTTGCAGAATAACCGCAGCTTTCCATACGTTCTTTTCGGTGAGAATCTTGCAATATCATCACCTCGGATTTAATAAAGGTTTTGACCTTGGACAGTTAACACTCGGCAAACCTCTGGATTTCGGAACAATATCATTGGCACTAGCGGTACCGCAGGTGCTTGGAGGGTTATCCTTCTTGAATCCTGAGAAATGTTTCTACCGGAATCTAGGAGATCCAGTTACCTCAGGTTTATTCCAGTTAAAAACTTATCTCCGAATGATTGAGATGGATGATTTATTCTTACCTTTAATTGCGAAGAACCCTGGGAACTGCACTGCCATTGACTTTGTGCTAAATCCTAGCGGATTAAATGTTCCTGGGTCGCAAGACTTAACTTCATTTCTGCGCCAGATTGTACGTAGGACTATCACCCTAAGTGCGAAAAACAAACTTATTAATACCTTATTTCATGCATCAGCTGACTTCGAAGACGAAATGGTTTGTAAGTGGCTCTTATCATCAACTCCTGTTATGAGTCGTTTCGCAGCCGATATATTTTCACGCACGCCGAGCGGGAAGCGATTGCAAATTCTAGGATACTTGGAAGGAACACGCACATTATTAGCCTCTAAGATCATCAACAATAATACAGAGACGCCGGTTTTGGACAGACTGAGGAAGATAACATTGCAAAGGTGGAGTCTATGGTTTAGTTATCTTGATCATTGTGATAATATCCTGGCGGAGGCTTTAACCCAAATAACTTGCACAGTTGATTTAGCACAGATCCTGAGGGAATATTCATGGGCACATATTTTAGAGGGGAGACCTCTTATTGGAGCCACACTCCCATGTATGATTGAGCAATTCAAAGTGGTTTGGCTGAAACCCTACGAACAATGTCCGCAGTGTTCAAATGCCAAGCAACCTGGTGGGAAACCATTCGTGTCAGTAGCAGTCAAGAAACATATTGTTAGTGCATGGCCAAATGCATCCCGAATAAGCTGGACTATCGGGGATGGAATCCCATACATTGGATCAAGGACAGAAGATAAGATAGGGCAACCTGCTATTAAACCAAAATGTCCTTCCGCAGCCTTAAGAGAGGCCATTGAATTGGCGTCCCGTTTAACATGGGTAACTCAAGGCAGTTCGAACAGTGACTTGCTAATAAAACCATTTTTGGAAGCACGAGTAAATTTAAGTGTTCAAGAAATACTTCAAATGACCCCTTCACATTACTCGGGAAATATTGTTCATAGGTACAACGATCAATACAGTCCTCATTCTTTCATGGCCAATCGTATGAGTAACTCAGCAACGCGATTGATTGTTTCTACAAACACTTTAGG
TGAGTTTTCAGGAGGTGGCCAATCGGCACGCGACAGCAATATTATTTTCCAGAATGTTATAAATTATGCAGTTGCACTGTTCGATATTAAATTTAGAAACACTGAGGCTACAGATATCCAGTATAATCGTGCTCACCTTCATCTAACTAAGTGTTGCACCCGGGAGGTACCAGCTCAGTACTTAACATACACATCTACATTGGATTTAGATTTAACAAGATACCGAGAAAATGAATTGATTTATGACAATAATCCTCTAAAAGGAGGACTCAATTGCAATATCTCATTTGATAACCCATTTTTCCAAGGCAAACAGCTGAACATTATAGAAGATGACCTTATTCGACTGCCTCACTTATCTGGATGGGAGCTAGCTAAGACCATCATGCAATCAATTATTTCAGATAGCAATAATTCGTCTACAGACCCAATTAGCAGTGGAGAAACAAGATCATTCACTACCCATTTCTTAACTTATCCCAAGATAGGACTTCTGTACAGTTTTGGGGCCTTTGTAAGTTATTATCTTGGCAATACAATTCTTCGGACTAAGAAATTAACACTTGACAATTTTTTATATTACTTAACTACCCAAATTCATAATCTACCACATCGCTCATTGCGAATACTTAAGCCAACATTCAAACATGCAAGCGTTATGTCACGATTAATGAGTATTGATCCCCATTTTTCTATTTACATAGGCGGTGCTGCAGGTGACAGAGGACTCTCAGATGCGGCCAGGTTATTTTTGAGAACGTCCATTTCATCTTTTCTTACATTTGTAAAGGAATGGATAATTAATCGCGGAACAATTGTCCCTTTATGGATAGTATATCCATTAGAGGGTCAAAATCCAACACCTGTTAATAATTTCCTCCATCAGATCGTAGAACTGCTGGTGCATGATTCATCAAGACACCAGGCTTTTAAAACTACCATAAATGATCATGTACATCCTCACGACAATCTTGTTTACACATGTAAGAGTACAGCCAGCAATTTCTTCCATGCGTCATTGGCGTACTGGAGGAGCAGGCACAGAAACAGCAACCGAAAAGACTTGACAAGAAACTCTTCAACTGGATCAAGCACAAACAACAGTGATGGTCATATTAAGAGAAGTCAAGAACAAACCACCAGAGATCCACATGATGGCACTGAACGGAGTCTAGTCCTGCAAATGAGCCATGAAATAAAAAGAACGACAATTCCACAAGAGAACACGCACCAGGGTCCGTCGTTCCAGTCATTTCTAAGTGACTCTGCTTGCGGTACAGCAAACCCAAAACTAAATTTCGATAGATCGAGACACAATGTGAAATCTCAGGATCATAACTCAGCATCCAAGAGGGAAGGTCATCAAATAATCTCACATCGTCTAGTCCTACCTTTCTTTACATTATCTCAAGGGACACGCCAATTAACGTCATCCAATGAGTCACAAACCCAAGATGAGATATCAAAGTACTTACGGCAATTGAGATCCGTCATTGATACCACAGTTTATTGTAGGTTTACCGGTATAGTCTCGTCCATGCATTACAAACTTGATGAGGTCCTTTGGGAAATAGAGAATTTTAAGTCGGCTGTGACGCTGGCAGAGGGAGAAGGTGCTGGTGCCTTACTATTGATTCAGAAATACCAAGTTAAGACCTTATTTTTCAACACGCTAGCTACTGAGTCCAGTATAGAGTCAGAAATAGTATCAGGAATGACTACTCCTAGGATGCTTCTACCTGTTATGTCAAAATTCCATAATGACCAAATTGAGATTATTCTTAACAACTCAGCAAGCCAAATAACAGACATAACAAATCCTACTTGGTTTAAAGACCAAAGAGCAAGGCTACCTAGGCAAGTCGAGGTTATAACCATGGATGCAGAGACGACAGAGAATATAAACAGATCGAAATTGTACGAAGCTGTACATAAATTGATCTTACACCATGTTGATCCCAGCGTATTGAAAGCAGTGGTCCTTAAAGTCTTTCTAAGTG
ATACCGAGGGTATGTTATGGCTAAATGATAATCTAGCCCCGTTTTTTGCCACTGGGTATTTAATTAAGCCAATAACGTCAAGTGCCAGGTCTAGTGAGTGGTATCTTTGTCTGACGAACTTCTTATCAACTACACGTAAGATGCCACACCAAAACCATCTCAGTTGTAAGCAGGTAATACTTACGGCATTGCAACTGCAAATTCAACGGAGCCCATACTGGCTAAGTCATTTAACTCAGTATGCTGACTGCGATTTACATTTAAGCTATATCCGCCTTGGTTTTCCATCATTAGAGAAAGTACTATACCACAGGTATAACCTTGTCGATTCAAAAAGAGGTCCACTAGTCTCTGTCACTCAGCACTTAGCACATCTTAGGGCAGAGATTCGAGAATTGACCAATGATTATAATCAACAGCGACAAAGTCGGACTCAAACATATCACTTTATTCGTACTGCAAAAGGACGAATCACAAAACTAGTCAATGATTATTTAAAATTCTTTCTTATTGTACAAGCATTAAAACATAATGGGACATGGCAAGCTGAGTTTAAGAAATTACCAGAGTTGATTAGTGTGTGCAATAGGTTCTATCATATTAGAGATTGTAATTGTGAAGAACGTTTCTTAGTTCAAACCTTATATTTACATAGAATGCAGGATTCTGAAGTTAAGCTTATCGAAAGGCTGACAGGGCTTCTGAGTTTATTTCCAGATGGTCTCTACAGGTTCGATTGAATAACCGTGCATAGTATTTTGATACTTGTAAAGGTTGGTTATCAACATACAGATTATAAAAAA</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>11456..11467</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11456</INSDInterval_from>
<INSDInterval_to>11467</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>other</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>L</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative transcription start signal</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>11463..11473</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11463</INSDInterval_from>
<INSDInterval_to>11473</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>VP24</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>putative</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>CDS</INSDFeature_key>
<INSDFeature_location>11536..18174</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>11536</INSDInterval_from>
<INSDInterval_to>18174</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>L</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>note</INSDQualifier_name>
<INSDQualifier_value>involved in synthesis of viral RNAs and transcriptional RNA editing</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>codon_start</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>transl_table</INSDQualifier_name>
<INSDQualifier_value>1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>product</INSDQualifier_name>
<INSDQualifier_value>polymerase</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>protein_id</INSDQualifier_name>
<INSDQualifier_value>AIE11805.1</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>db_xref</INSDQualifier_name>
<INSDQualifier_value>GI:661348604</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>translation</INSDQualifier_name>
<INSDQualifier_value>MATQHTQYPDARLSSPIVLDQCDLVTRACGLYSSYSLNPQLRNCKLPKHIYRLKYDVTVTKFLSDVPVATLPIDFIVPILLKALSGNGFCPVEPRCQQFLDEIIKYTMQDALFLKYYLKNVGAQEDCVDDHFQEKILSSIQGNEFLHQMFFWYDLAILTRRGRLNRGNSRSTWFVHDDLIDILGYGDYVFWKIPISLLPLNTQGIPHAAMDWYQTSVFKEAVQGHTHIVSVSTADVLIMCKDLITCRFNTTLISKIAEVEDPVCSDYPNFKIVSMLYQSGDYLLSILGSDGYKIIKFLEPLCLAKIQLCSKYTERKGRFLTQMHLAVNHTLEEITEIRALKPSQAHKIREFHRTLIRLEMTPQQLCELFSIQKHWGHPVLHSETAIQKVKKHATVLKALRPIVIFETYCVFKYSIAKHYFDSQGSWYSVTSDRNLTPGLNSYIKRNQFPPLPMIKELLWEFYHLDHPPLFSTKIISDLSIFIKDRATAVERTCWDAVFEPNVLGYNPPHKFSTKRVPEQFLEQENFSIENVLSYAQKLEYLLPQYRNFSFSLKEKELNVGRTFGKLPYPTRNVQTLCEALLADGLAKAFPSNMMVVTEREQKESLLHQASWHHTSDDFGEHATVRGSSFVTDLEKYNLAFRYEFTAPFIEYCNRCYGVKNVFNWMHYTIPQCYMHVSDYYNPPHNLTLENRNNPPEGPSSYRGHMGGIEGLQQKLWTSISCAQISLVEIKTGFKLRSAVMGDNQCITVLSVFPLETDAGEQEQSAEDNAARVAASLAKVTSACGIFLKPDETFVHSGFIYFGKKQYLNGVQLPQSLKTATRMAPLSDAIFDDLQGTLASIGTAFERSISETRHIFPCRITAAFHTFFSVRILQYHHLGFNKGFDLGQLTLGKPLDFGTISLALAVPQVLGGLSFLNPEKCFYRNLGDPVTSGLFQLKTYLRMIEMDDLFLPLIAKNPGNCTAIDFVLNPSGLNVPGSQDLTSFLRQIVRRTITLSAKNKLINTLFHASADFEDEMVCKWLLSSTPVMSRFAADIFSRTPSGKRLQILGYLEGTRTLLASKIINNNTETPVLDRLRKITLQRWSLWFSYLDHCDNILAEALTQITCTVDLAQILREYSWAHILEGRPLIGATLPCMIEQFKVVWLKPYEQCPQCSNAKQPGGKPFVSVAVKKHIVSAWPNASRISWTIGDGIPYIGSRTEDKIGQPAIKPKCPSAALREAIELASRLTWVTQGSSNSDLLIKPFLEARVNLSVQEILQMTPSHYSGNIVHRYNDQYSPHSFMANRMSNSATRLIVSTNTLGEFSGGGQSARDSNIIFQNVINYAVALFDIKFRNTEATDIQYNRAHLHLTKCCTREVPAQYLTYTSTLDLDLTRYRENELIYDNNPLKGGLNCNISFDNPFFQGKQLNIIEDDLIRLPHLSGWELAKTIMQSIISDSNNSSTDPISSGETRSFTTHFLTYPKIGLLYSFGAFVSYYLGNTILRTKKLTLDNFLYYLTTQIHNLPHRSLRILKPTFKHASVMSRLMSIDPHFSIYIGGAAGDRGLSDAARLFLRTSISSFLTFVKEWIINRGTIVPLWIVYPLEGQNPTPVNNFLHQIVELLVHDSSRHQAFKTTINDHVHPHDNLVYTCKSTASNFFHASLAYWRSRHRNSNRKDLTRNSSTGSSTNNSDGHIKRSQEQTTRDPHDGTERSLVLQMSHEIKRTTIPQENTHQGPSFQSFLSDSACGTANPKLNFDRSRHNVKSQDHNSASKREGHQIISHRLVLPFFTLSQGTRQLTSSNESQTQDEISKYLRQLRSVIDTTVYCRFTGIVSSMHYKLDEVLWEIENFKSAVTLAEGEGAGALLLIQKYQVKTLFFNTLATESSIESEIVSGMTTPRMLLPVMSKFHNDQIEIILNNSASQITDITNPTWFKDQRARLPRQVEVITMDAETTENINRSKLYEAVHKLILHHVDPSVLKAVVLKVFLSDTEGMLWLNDNLA
PFFATGYLIKPITSSARSSEWYLCLTNFLSTTRKMPHQNHLSCKQVILTALQLQIQRSPYWLSHLTQYADCDLHLSYIRLGFPSLEKVLYHRYNLVDSKRGPLVSVTQHLAHLRAEIRELTNDYNQQRQSRTQTYHFIRTAKGRITKLVNDYLKFFLIVQALKHNGTWQAEFKKLPELISVCNRFYHIRDCNCEERFLVQTLYLHRMQDSEVKLIERLTGLLSLFPDGLYRFD</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
<INSDFeature>
<INSDFeature_key>regulatory</INSDFeature_key>
<INSDFeature_location>18227..18237</INSDFeature_location>
<INSDFeature_intervals>
<INSDInterval>
<INSDInterval_from>18227</INSDInterval_from>
<INSDInterval_to>18237</INSDInterval_to>
<INSDInterval_accession>KM034549.1</INSDInterval_accession>
</INSDInterval>
</INSDFeature_intervals>
<INSDFeature_quals>
<INSDQualifier>
<INSDQualifier_name>regulatory_class</INSDQualifier_name>
<INSDQualifier_value>polyA_signal_sequence</INSDQualifier_value>
</INSDQualifier>
<INSDQualifier>
<INSDQualifier_name>gene</INSDQualifier_name>
<INSDQualifier_value>L</INSDQualifier_value>
</INSDQualifier>
</INSDFeature_quals>
</INSDFeature>
</INSDSeq_feature-table>
<INSDSeq_sequence>gaataactatgaggaagattaataattttcctctcattgaaatttatatcggaatttaaattgaaattgttactgtaatcatacctggtttgtttcagagccatatcaccaagatagagaacaacctaggtctccggagggggcaagggcatcagtgtgctcagttgaaaatcccttgtcaacatctaggccttatcacatcacaagttccgccttaaactctgcagggtgatccaacaaccttaatagcaacattattgttaaaggacagcattagttcacagtcaaacaagcaagattgagaattaactttgattttgaacctgaacacccagaggactggagactcaacaaccctaaagcctggggtaaaacattagaaatagtttaaagacaaattgctcggaatcacaaaattccgagtatggattctcgtcctcagaaagtctggatgacgccgagtctcactgaatctgacatggattaccacaagatcttgacagcaggtctgtccgttcaacaggggattgttcggcaaagagtcatcccagtgtatcaagtaaacaatcttgaggaaatttgccaacttatcatacaggcctttgaagctggtgttgattttcaagagagtgcggacagtttccttctcatgctttgtcttcatcatgcgtaccaaggagattacaaacttttcttggaaagtggcgcagtcaagtatttggaagggcacgggttccgttttgaagtcaagaagcgtgatggagtgaagcgccttgaggaattgctgccagcagtatctagtgggagaaacattaagagaacacttgctgccatgccggaagaggagacgactgaagctaatgccggtcagttcctctcctttgcaagtctattccttccgaaattggtagtaggagaaaaggcttgccttgagaaggttcaaaggcaaattcaagtacatgcagagcaaggactgatacaatatccaacagcttggcaatcagtaggacacatgatggtgattttccgtttgatgcgaacaaattttttgatcaaatttcttctaatacaccaagggatgcacatggttgccggacatgatgccaacgatgctgtgatttcaaattcagtggctcaagctcgtttttcaggtctattgattgtcaaaacagtacttgatcatatcctacaaaagacagaacgaggagttcgtctccatcctcttgcaaggaccgccaaggtaaaaaatgaggtgaactccttcaaggctgcactcagctccctggccaagcatggagagtatgctcctttcgcccgacttttgaacctttctggagtaaataatcttgagcatggtcttttccctcaactgtcggcaattgcactcggagtcgccacagcccacgggagcaccctcgcaggagtaaatgttggagaacagtatcaacagctcagagaggcagccactgaggctgagaagcaactccaacaatatgcggagtctcgtgaacttgaccatcttggacttgatgatcaggaaaagaaaattcttatgaacttccatcagaaaaagaacgaaatcagcttccagcaaacaaacgcgatggtaactctaagaaaagagcgcctggccaagctgacagaagctatcactgctgcatcactgcccaaaacaagtggacattacgatgatgatgacgacattccctttccaggacccatcaatgatgacgacaatcctggccatcaagatgatgatccgactgactcacaggatacgaccattcccgatgtggtagttgaccccgatgatggaggctacggcgaataccaaagttactcggaaaacggcatgagtgcaccagatgacttggtcctattcgatctagacgaggacgacgaggacaccaagccagtgcctaacagatcgaccaagggtggacaacagaaaaacagtcaaaagggccagcatacagagggca
gacagacacaatccacgccaactcaaaacgtcacaggccctcgcagaacaatccaccatgccagtgctccactcacggacaatgacagaagaaacgaaccctccggctcaaccagccctcgcatgctgaccccaatcaacgaagaggcagacccactggacgatgccgacgacgagacgtctagccttccgcccttagagtcagatgatgaagaacaggacagggacggaacttctaaccgcacacccactgtcgccccaccggctcccgtatacagagatcactccgaaaagaaagaactcccgcaagatgaacaacaagatcaggaccacattcaagaggccaggaaccaagacagtgacaacacccagccagaacattcttttgaggagatgtatcgccacattctaagatcacaggggccatttgatgccgttttgtattatcatatgatgaaggatgagcctgtagttttcagtaccagtgatggtaaagagtacacgtatccggactcccttgaagaggaatatccaccatggctcactgaaaaagaggccatgaatgatgagaatagatttgttacactggatggtcaacaattttattggccagtaatgaatcacaggaataaattcatggcaatcctgcaacatcatcagtgaatgagcatgtaataatgggatgatttaatcgacaaatagctaacattaaatagtcaaggaacgcaaacaggaagaatttttgatgtctaaggtgtgaattattatcacaataaaagtgattcttagttttgaatttaaagctagcttattattactagccgtttttcaaagttcaatttgagtcttaatgcaaataagcgttaagccacagttatagccataatggtaactcaatatcttagccagcgatttatctaaattaaattacattatgcttttataacttacctactagcctgcccaacatttacacgatcgttttataattaagaaaaaactaatgatgaagattaaaaccttcatcatccttacgtcaattgaattctctagcactagaagcttattgtcttcaatgtaaaagaaaagctggcctaacaagatgacaactagaacaaagggcaggggccatactgtggccacgactcaaaacgacagaatgccaggccctgagctttcgggctggatctctgagcagctaatgaccggaaggattcctgtaaacgacatcttctgtgatattgagaacaatccaggattatgctacgcatcccaaatgcaacaaacgaagccaaacccgaagatgcgcaacagtcaaacccaaacggacccaatttgcaatcatagttttgaggaggtagtacaaacattggcttcattggctactgttgtgcaacaacaaaccatcgcatcagaatcattagaacaacgcattacgagtcttgagaatggtctaaagccagtttatgatatggcaaaaacaatctcctcattgaacagggtttgtgctgagatggttgcaaaatatgatcttctggtgatgacaaccggtcgggcaacagcaaccgctgcggcaactgaggcttattgggctgaacatggtcaaccaccacctggaccatcactttatgaagaaagtgcgattcggggtaagattgaatctagagatgagactgtccctcaaagtgttagggaggcattcaacaatctagacagtaccacttcactaactgaggaaaattttgggaaacctgacatttcggcaaaggatttgagaaacattatgtatgatcacttgcctggttttggaactgctttccaccaattagtacaagtgatttgtaaattgggaaaagatagcaattcattggacattattcatgctgagttccaggccagcctggctgaaggagactcccctcaatgtgccctaattcaaattacaaaaagagttccaatcttccaagatgctgctccacctgtcatccacatccgctctcg
aggtgacattccccgagcttgccagaagagcttgcgtccagtcccaccatcacccaagattgatcgaggttgggtatgtgtttttcagcttcaagatggtaaaacacttggactcaaaatttgagccaatctcttttccctccgaaagaggcaactaatagcagaggcttcaactgctgaactatagggtatgttacattaatgatacacttgtgagtatcagccctagataatataagtcaattaaacaaccaagataaaattgttcatatcccgctagcagctttaaagataaatgtaataggagctatacctctgacagtattataattaattgttattaagtaacccaaaccaaaaatgatgaagattaagaaaaacctacctcgactgagagagtgttttttcattaaccttcatcttgtaaacgttgagcaaaattgttaaaaatatgaggcgggttatattgcctactgctcctcctgaatatatggaggccatataccctgccaggtcaaattcaacaattgctaggggtggcaacagcaatacaggcttcctgacaccggagtcagtcaatggagacactccatcgaatccactcaggccaattgctgatgacaccatcgaccatgccagccacacaccaggcagtgtgtcatcagcattcatcctcgaagctatggtgaatgtcatatcgggccccaaagtgctaatgaagcaaattccaatttggcttcctctaggtgtcgctgatcaaaagacctacagctttgactcaactacggccgccatcatgcttgcttcatatactatcacccatttcggcaaggcaaccaatccgcttgtcagagtcaatcggctgggtcctggaatcccggatcaccccctcaggctcctgcgaattggaaaccaggctttcctccaggagttcgttcttccaccagtccaactaccccagtatttcacctttgatttgacagcactcaaactgatcactcaaccactgcctgctgcaacatggaccgatgacactccaactggatcaaatggagcgttgcgtccaggaatttcatttcatccaaaacttcgccccattcttttacccaacaaaagtgggaagaaggggaacagtgccgatctaacatctccggagaaaatccaagcaataatgacttcactccaggactttaagatcgttccaattgatccaaccaaaaatatcatgggtatcgaagtgccagaaactctggtccacaagctgaccggtaagaaggtgacttccaaaaatggacaaccaatcatccctgttcttttgccaaagtacattgggttggacccggtggctccaggagacctcaccatggtaatcacacaggattgtgacacgtgtcattctcctgcaagtcttccagctgtggttgagaagtaattgcaataattgactcagatccagttttacagaatcttctcagggatagtgataacatctttttaataatccgtctactagaagagatacttctaattgatcaatatactaaaggtgctttacaccattgtctcttttctctcctaaatgtagagcttaacaaaagactcataatatacctgtttttaaaagattgattgatgaaagatcatgactaataacattacaaacaatcctactataatcaatacggtgattcaaatgtcaatctttctcattgcacatactctttgtccttatcctcaaattgcctacatgcttacatctgaggacagccagtgtgacttggattggagatgtggaggaaaaatcggggcccatttctaagttgttcacaatctaagtacagacattgctcttctaattaagaaaaaatcggcgatgaagattaagccgacagtgagcgtaatcttcatctctcttagattatttgtcttccagagtaggggtcatcaggtccttttcaattggataaccaaaataagcttcactagaaggatattgtgag
gcgacaacacaatgggtgttacaggaatattgcagttacctcgtgatcgattcaagaggacatcattctttctttgggtaattatccttttccaaagaacattttccatcccgcttggagttatccacaatagtacattacaggttagtgatgtcgacaaactagtttgtcgtgacaaactgtcatccacaaatcaattgagatcagttggactgaatctcgaggggaatggagtggcaactgacgtgccatctgtgactaaaagatggggcttcaggtccggtgtcccaccaaaggtggtcaattatgaagctggtgaatgggctgaaaactgctacaatcttgaaatcaaaaaacctgacgggagtgagtgtctaccagcagcgccagacgggattcggggcttcccccggtgccggtatgtgcacaaagtatcaggaacgggaccatgtgccggagactttgccttccacaaagagggtgctttcttcctgtatgatcgacttgcttccacagttatctaccgaggaacgactttcgctgaaggtgtcgttgcatttctgatactgccccaagctaagaaggacttcttcagctcacaccccttgagagagccggtcaatgcaacggaggacccgtcgagtggctattattctaccacaattagatatcaggctaccggttttggaactaatgagacagagtacttgttcgaggttgacaatttgacctacgtccaacttgaatcaagattcacaccacagtttctgctccagctgaatgagacaatatatgcaagtgggaagaggagcaacaccacgggaaaactaatttggaaggtcaaccccgaaattgatacaacaatcggggagtgggccttctgggaaactaaaaaaacctcactagaaaaattcgcagtgaagagttgtctttcacagctgtatcaaacggacccaaaaacatcagtggtcagagtccggcgcgaacttcttccgacccagagaccaacacaacaaatgaagaccacaaaatcatggcttcagaaaattcctctgcaatggttcaagtgcacagtcaaggaaggaaagctgcagtgtcgcatctgacaacccttgccacaatctccacgagtcctcaacctcccacaaccaaaacaggtccggacaacagcacccataatacacccgtgtataaacttgacatctctgaggcaactcaagttggacaacatcaccgtagagcagacaacgacagcacagcctccgacactccccccgccacgaccgcagccggacccttaaaagcagagaacaccaacacgagtaagagcgctgactccctggacctcgccaccacgacaagcccccaaaactacagcgagactgctggcaacaacaacactcatcaccaagataccggagaagagagtgccagcagcgggaagctaggcttaattaccaatactattgctggagtagcaggactgatcacaggcgggagaaggactcgaagagaagtaattgtcaatgctcaacccaaatgcaaccccaatttacattactggactactcaggatgaaggtgctgcaatcggattggcctggataccatatttcgggccagcagccgaaggaatttacacagaggggctaatgcacaaccaagatggtttaatctgtgggttgaggcagctggccaacgaaacgactcaagctctccaactgttcctgagagccacaactgagctgcgaaccttttcaatcctcaaccgtaaggcaattgacttcctgctgcagcgatggggtggcacatgccacattttgggaccggactgctgtatcgaaccacatgattggaccaagaacataacagacaaaattgatcagattattcatgattttgttgataaaacccttccggaccagggggacaatgacaattggtggacaggatggagacaatggataccggcaggtattggagttacaggtgttataattg
cagttatcgctttattctgtatatgcaaatttgtcttttagtctttcttcagattgtttcacggcaaaactcaacctcaaatcaatgaaactaggatttaattatatgaatcacttgaatctaagattacttgacaaatgataacataatacactggagcttcaaacatagccaatgtgattctaactcctttaaactcacagttaatcataaacaaggtttgacatcaatctagctatatctttaagaatgataaacttgatgaagattaagaaaaaggtaatctttcgattatctttagtcttcatccttgattctacaatcatgacagttgtctttaatgaaaaaggaaaaaagcctttttattaagttgtaataatcagatctgcaaaccggtagaatttagttgtaacctaacacacacaaagcattggtaaaaaagtcaatagaaatttaaacagtgagtgcagacaactcttaaatggaagcttcatatgagagaggacgcccccgagctgccagacagcattcaagggatggacacgaccaccatgttcgagcacgatcatcatccagagagaattatcgaggtgagtaccgtcaatcaaggagcgcctcacaagtgcgcgttcctactgtatttcataagaagagagttgaaccattaacagttcctccagcacctaaagacatatgtccgaccttgaaaaaaggatttttgtgtgacagtagtttttgcaaaaaagaccaccagttagaaagtttaactgatagggaattactcctactaatcgcccgtaagacttgtggatcagtagaacaacaattaaatataactgcacccaaggactcgcgcttagcaaatccaacggctgatgatttccagcaagaggaaggtccaaaaattaccttgttgacactgatcaagacggcagaacactgggcgagacaagacatccgaaccatagaggattccaaattaagggcattgttaactctatgtgctgtgatgacgaggaaattctcaaaatcccagctgagtcttttgtgtgagacacacctaaggcgcgaagggcttgggcaagatcaggcagaacccgttctcgaagtatatcaacgattacacagtgataaaggaggcagttttgaagctgcactatggcaacaatgggaccgacaatccctaattatgtttatcactgcattcttgaatatcgctctccagttaccgtgtgaaagttctgctgtcgttgtttcagggttaagaacattggttcctcaatcagataatgaggaagcttcaaccaacccggggacatgctcatggtctgatgagggtaccccttaataaggctgactaaaacactatataaccttctacttgatcacaatactccgtatacctatcatcatatatttaatcaagacgatatcctttaaaacttattcagtactataatcactctcatttcaaattgataagatatgcataattgccttaatatataaagaggtatgatataacccaaacattgaccaaagaaaatcataatctcgtatcgctcgcaatataacctgccaagcatacctcttgcacaaagtgattcttgtacacaaataatgtttgactctacaggaggtagcaacgatccatctcatcaaaaaataagtattttatgatttactaatgatctcttaaaatattaagaaaaactgacggaacataaattctttctgcttcaagttgtggaggaggtctatggtattcgctattgttatattacaatcaataacaagcttgtaaaaatattgttcttgtttcaggaggtatattgtgaccggaaaagctaaactaatgatgaagattaatgcggaggtctgatgagaataaaccttattattcagattaggccccaagaggcattcttcatctccttttagcaaaatactatttcaggatagtccagctagtgacacgtcttttagctgtataccagnnn
nnnnnnnnnnnnnnnnnnnnnnnnnnnnngagctaaagtggtctgtacacatctcatacattgtattaggggcaataatatctaattgaacttagccatttaaaatttagtgcataaatctgggctaactccaccaggtcaactccattggctgaaaagaagcccacctacaacgaacattactttgagcgccctcacaattaaaaaataagagcgtcgttccaacaatcgagcgcaaggttacaaggttgaactgagagtgtctagacaacaaaatatcgatactccagacaccaagcaagacctgagaaaaaaccatggccaaagctacgggacgatacaatctaatatcgcccaaaaaggacctggagaaaggggttgtcttaagcgacctctgtaacttcttagttagtcaaactattcaagggtggaaagtttattgggctggtattgagtttgatgtgactcacaaaggaatggccctattgcatagactgaaaactaatgactttgcccctgcatggtcaatgacaaggaacctatttccccatttatttcaaaatccgaattccactattgaatcaccgctgtgggcactgagagtcatccttgcagcagggatacaggaccagttaattgaccagtctttgattgaacccttagcaggagcccttggtctgatctctgattggctgctaacaaccaacactaaccatttcaacatgcgaacacaacgtgtcaaggaacaattgagcctaaaaatgctgtcgttgattcgatccaatattctcaagtttattaacaaattggatgctctacatgtcgtgaactacaatggattattgagcagtattgaaattggaactcaaaatcatacaatcatcataactcgaactaacatgggttttctggtggagctccaagaacccgacaaatcggcaatgaaccgcaagaagcctgggccggcgaaattttccctccttcatgagtccacactgaaagcatttacacaagggtcctcgacacgaatgcaaagtttaattcttgaattcaatagctctcttgctatctaactaagatggaatacttcatattgggctaactcatatatgctgactcaatagttaacttgacatctctgccttcataatcagatatataagcataataaataaatactcatatttcttgataatttgtttaaccacagataaatcctcactgtaagccagcttccaagttgacacccttacaaaaaccaggactcagaatccctcaaataagagattccaagacaacatcatagaattgctttattatattaataagcattttatcactagaaatccaatatacgaaatggttaattgtaactaaacccgcaggtcatgtgtgttaggtttcacaaattatatatattactaactccatactcgtaactaacattagataagtaggttaagaaaaaagcttgaggaagattaagaaaaactgcttattgggtctttccgtgttttagatgaagcagttgacattcttcctcttgatattaaatggctacacaacatacccaatacccagacgccaggttatcatcaccaattgtattggaccaatgtgaccttgtcactagagcttgcgggttgtattcatcatactcccttaatccgcaactacgcaactgtaaactcccgaaacatatataccgtttaaaatatgatgtaactgttaccaagttcttaagtgatgtaccagtggcgacattgcccatagatttcatagtcccaattcttctcaaggcactatcaggcaatgggttctgtcctgttgagccgcggtgccaacagttcttagatgaaattattaagtacacaatgcaagatgctctcttcctgaaatattatctcaaaaatgtgggtgctcaagaagactgtgttgatgaccactttcaagaaaaaatcttatcttcaattcagggcaatgaatttttacatcaaatg
tttttctggtatgacctggctattttaactcgaaggggtagattaaatcgaggaaactctagatcaacgtggtttgttcatgatgatttaatagacatcttaggctatggggactatgttttttggaagatcccaatttcactgttaccactgaacacacaaggaatcccccatgctgctatggattggtatcagacatcagtattcaaagaagcggttcaagggcatacacacattgtttctgtttctactgccgatgtcttgataatgtgcaaagatttaattacatgtcgattcaacacaactctaatctcaaaaatagcagaggttgaggacccagtttgctctgattatcccaattttaagattgtgtctatgctttaccagagcggagattacttactctccatattagggtctgatgggtataaaatcattaagtttctcgaaccattgtgcttggctaaaattcaattgtgctcaaagtacaccgagaggaagggccgattcttaacacaaatgcatttagctgtaaatcacaccctggaagaaattacagaaatacgtgcactaaagccttcacaggctcacaagatccgtgaattccatagaacattgataaggctggagatgacgccacaacaactttgtgagctattttccatacaaaaacactgggggcatcctgtgctacatagtgaaacagcaatccaaaaagttaaaaaacatgctacggtgctaaaagcattacgccctatcgtgattttcgagacatattgtgtttttaaatatagcattgcaaaacattattttgatagtcaaggatcttggtacagtgttacctcagatagaaatctaacaccaggtcttaattcttatatcaaaagaaatcaattccctccgttgccaatgattaaagaactgctatgggaattttaccaccttgaccatcctccacttttctcaaccaaaattattagtgacttaagtatttttataaaagacagagctactgcagtagaaaggacatgctgggatgcagtattcgagcctaatgttctgggatataatccacctcacaaattcagtaccaaacgtgtaccggaacaatttttagagcaagaaaacttttctattgagaatgttctttcctacgcgcaaaaactcgagtatctactaccacaatatcggaatttttctttctcattgaaagagaaagagttgaatgtaggtagaactttcggaaaattgccttatccgactcgcaatgttcaaacactttgtgaagctctgttagctgatggtcttgctaaagcatttcctagcaatatgatggtagttacggaacgtgaacaaaaagaaagcttattgcatcaagcatcatggcaccacacaagtgatgatttcggtgagcatgccacagttagagggagtagctttgtaactgatttagagaaatacaatcttgcatttaggtatgagtttacagcaccttttatagaatattgcaaccgttgctatggtgttaagaatgtttttaattggatgcattatacaatcccacagtgttatatgcatgtcagtgattattataatccaccgcataacctcacactggaaaatcgaaacaacccccctgaagggcctagttcatacaggggtcatatgggagggattgaaggactgcaacaaaaactctggacaagtatttcatgtgctcaaatttctttagttgaaattaagactggttttaagttgcgctcagctgtgatgggtgacaatcagtgcattaccgttttatcagtcttccccttagagactgatgcaggcgagcaggaacagagcgccgaggacaatgcagcgagggtggccgccagcctagcaaaagttacaagtgcctgtggaatctttttaaaacctgatgaaacatttgtacattcaggttttatctattttggaaaaaaacaatatttgaatggggtccaattgcctcagtccct
taaaacggctacaagaatggcaccattgtctgatgcaatttttgatgatcttcaagggaccctggctagtataggtactgcttttgagcgatccatctctgagacacgacatatctttccttgcagaataaccgcagctttccatacgttcttttcggtgagaatcttgcaatatcatcacctcggatttaataaaggttttgaccttggacagttaacactcggcaaacctctggatttcggaacaatatcattggcactagcggtaccgcaggtgcttggagggttatccttcttgaatcctgagaaatgtttctaccggaatctaggagatccagttacctcaggtttattccagttaaaaacttatctccgaatgattgagatggatgatttattcttacctttaattgcgaagaaccctgggaactgcactgccattgactttgtgctaaatcctagcggattaaatgttcctgggtcgcaagacttaacttcatttctgcgccagattgtacgtaggactatcaccctaagtgcgaaaaacaaacttattaataccttatttcatgcatcagctgacttcgaagacgaaatggtttgtaagtggctcttatcatcaactcctgttatgagtcgtttcgcagccgatatattttcacgcacgccgagcgggaagcgattgcaaattctaggatacttggaaggaacacgcacattattagcctctaagatcatcaacaataatacagagacgccggttttggacagactgaggaagataacattgcaaaggtggagtctatggtttagttatcttgatcattgtgataatatcctggcggaggctttaacccaaataacttgcacagttgatttagcacagatcctgagggaatattcatgggcacatattttagaggggagacctcttattggagccacactcccatgtatgattgagcaattcaaagtggtttggctgaaaccctacgaacaatgtccgcagtgttcaaatgccaagcaacctggtgggaaaccattcgtgtcagtagcagtcaagaaacatattgttagtgcatggccaaatgcatcccgaataagctggactatcggggatggaatcccatacattggatcaaggacagaagataagatagggcaacctgctattaaaccaaaatgtccttccgcagccttaagagaggccattgaattggcgtcccgtttaacatgggtaactcaaggcagttcgaacagtgacttgctaataaaaccatttttggaagcacgagtaaatttaagtgttcaagaaatacttcaaatgaccccttcacattactcgggaaatattgttcataggtacaacgatcaatacagtcctcattctttcatggccaatcgtatgagtaactcagcaacgcgattgattgtttctacaaacactttaggtgagttttcaggaggtggccaatcggcacgcgacagcaatattattttccagaatgttataaattatgcagttgcactgttcgatattaaatttagaaacactgaggctacagatatccagtataatcgtgctcaccttcatctaactaagtgttgcacccgggaggtaccagctcagtacttaacatacacatctacattggatttagatttaacaagataccgagaaaatgaattgatttatgacaataatcctctaaaaggaggactcaattgcaatatctcatttgataacccatttttccaaggcaaacagctgaacattatagaagatgaccttattcgactgcctcacttatctggatgggagctagctaagaccatcatgcaatcaattatttcagatagcaataattcgtctacagacccaattagcagtggagaaacaagatcattcactacccatttcttaacttatcccaagataggacttctgtacagttttggggcctttgtaagttattatcttggcaatacaattcttcgga
ctaagaaattaacacttgacaattttttatattacttaactacccaaattcataatctaccacatcgctcattgcgaatacttaagccaacattcaaacatgcaagcgttatgtcacgattaatgagtattgatccccatttttctatttacataggcggtgctgcaggtgacagaggactctcagatgcggccaggttatttttgagaacgtccatttcatcttttcttacatttgtaaaggaatggataattaatcgcggaacaattgtccctttatggatagtatatccattagagggtcaaaatccaacacctgttaataatttcctccatcagatcgtagaactgctggtgcatgattcatcaagacaccaggcttttaaaactaccataaatgatcatgtacatcctcacgacaatcttgtttacacatgtaagagtacagccagcaatttcttccatgcgtcattggcgtactggaggagcaggcacagaaacagcaaccgaaaagacttgacaagaaactcttcaactggatcaagcacaaacaacagtgatggtcatattaagagaagtcaagaacaaaccaccagagatccacatgatggcactgaacggagtctagtcctgcaaatgagccatgaaataaaaagaacgacaattccacaagagaacacgcaccagggtccgtcgttccagtcatttctaagtgactctgcttgcggtacagcaaacccaaaactaaatttcgatagatcgagacacaatgtgaaatctcaggatcataactcagcatccaagagggaaggtcatcaaataatctcacatcgtctagtcctacctttctttacattatctcaagggacacgccaattaacgtcatccaatgagtcacaaacccaagatgagatatcaaagtacttacggcaattgagatccgtcattgataccacagtttattgtaggtttaccggtatagtctcgtccatgcattacaaacttgatgaggtcctttgggaaatagagaattttaagtcggctgtgacgctggcagagggagaaggtgctggtgccttactattgattcagaaataccaagttaagaccttatttttcaacacgctagctactgagtccagtatagagtcagaaatagtatcaggaatgactactcctaggatgcttctacctgttatgtcaaaattccataatgaccaaattgagattattcttaacaactcagcaagccaaataacagacataacaaatcctacttggtttaaagaccaaagagcaaggctacctaggcaagtcgaggttataaccatggatgcagagacgacagagaatataaacagatcgaaattgtacgaagctgtacataaattgatcttacaccatgttgatcccagcgtattgaaagcagtggtccttaaagtctttctaagtgataccgagggtatgttatggctaaatgataatctagccccgttttttgccactgggtatttaattaagccaataacgtcaagtgccaggtctagtgagtggtatctttgtctgacgaacttcttatcaactacacgtaagatgccacaccaaaaccatctcagttgtaagcaggtaatacttacggcattgcaactgcaaattcaacggagcccatactggctaagtcatttaactcagtatgctgactgcgatttacatttaagctatatccgccttggttttccatcattagagaaagtactataccacaggtataaccttgtcgattcaaaaagaggtccactagtctctgtcactcagcacttagcacatcttagggcagagattcgagaattgaccaatgattataatcaacagcgacaaagtcggactcaaacatatcactttattcgtactgcaaaaggacgaatcacaaaactagtcaatgattatttaaaattctttcttattgtacaagcattaaaacataatgggacatggcaagct
gagtttaagaaattaccagagttgattagtgtgtgcaataggttctatcatattagagattgtaattgtgaagaacgtttcttagttcaaaccttatatttacatagaatgcaggattctgaagttaagcttatcgaaaggctgacagggcttctgagtttatttccagatggtctctacaggttcgattgaataaccgtgcatagtattttgatacttgtaaaggttggttatcaacatacagattataaaaaactcataaattgctctcatacatcatcttgatctgatttcaataaataactatttagataacgaaaggagtccttacattatacactatatttggcctctctccctgcgtgataatcaaaaaattcacaatacagcatgtgtgacatattactgctgcaatgagtctaacgcaacataataaactccgcactctttataattaagctttaacgataggtctgggctcatattgttattgatatagtaatgttgtatcaatatcttgccagatggaatagtgctttggttgataacacgacttcttaaaacaaaactgatctttaagattaagttttttataattgtcattgctttaatttgtcgatttaaaaatggtgatagccttaatctttgtgtaaaataagagattaggtgtaataactttaacatttttgtctagtaagctactattccattcagaatgataaaattaaaagaaaagacatgactgtaaaatcagaaataccttctttacaatatagcagactagataataatcttcgtgttaatgataattaaggcattgaccacgctcatcagaaggctcactagaataaac</INSDSeq_sequence>
<INSDSeq_xrefs>
<INSDXref>
<INSDXref_dbname>BioProject</INSDXref_dbname>
<INSDXref_id>PRJNA257197</INSDXref_id>
</INSDXref>
<INSDXref>
<INSDXref_dbname>BioSample</INSDXref_dbname>
<INSDXref_id>SAMN02951952</INSDXref_id>
</INSDXref>
</INSDSeq_xrefs>
</INSDSeq>
To extract all the information about organism, host, sampling time, etc., that is held in the list of INSDQualifiers, I loop through all the sequences and generate a dictionary with accession as the key and a dictionary of qualifiers as the value.
I start by initialising an empty dictionary, with strings (the accessions) as keys and dictionaries of strings (the qualifiers) as values.
In [7]:
seq_dict=Dict{ASCIIString,Dict{ASCIIString,ASCIIString}}()
Out[7]:
Dict{ASCIIString,Dict{ASCIIString,ASCIIString}} with 0 entries
Extracting the information uses a mixture of find_element and get_elements_by_tagname to locate the right elements, and finally content to extract the text of each qualifier.
In [8]:
for i in 1:numseq
    s=sequences[i]
    accession=content(find_element(s, "INSDSeq_primary-accession"))
    feature_table=find_element(s,"INSDSeq_feature-table")
    features=get_elements_by_tagname(feature_table,"INSDFeature")
    # only the first feature is used; it holds the organism, host, etc. qualifiers
    feature_quals=get_elements_by_tagname(features[1], "INSDFeature_quals")
    qualifiers=get_elements_by_tagname(feature_quals[1], "INSDQualifier")
    qualifier_dict=Dict{ASCIIString,ASCIIString}()
    for q in qualifiers
        n=find_element(q,"INSDQualifier_name")
        v=find_element(q,"INSDQualifier_value")
        # skip qualifiers that have a name but no value element
        if v!=nothing
            qualifier_dict[content(n)]=content(v)
        end
    end
    seq_dict[accession]=qualifier_dict
end;
Here is an example of the features for the first accession.
In [9]:
seq_dict[accessions[1]]
Out[9]:
Dict{ASCIIString,ASCIIString} with 8 entries:
"organism" => "Zaire ebolavirus"
"isolation_source" => "serum"
"host" => "Homo sapiens"
"mol_type" => "viral cRNA"
"collection_date" => "25-May-2014"
"isolate" => "Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM095B"
"db_xref" => "taxon:186538"
"country" => "Sierra Leone"
To flatten the dictionary, I first make a dictionary of all qualifier names, counting the number of sequences in which each one is found.
In [10]:
fn_dict=Dict{ASCIIString,Int64}()
for acc in keys(seq_dict)
    features=seq_dict[acc]
    for k in keys(features)
        # get returns 0 the first time a qualifier name is seen
        current_count=get(fn_dict,k,0)
        fn_dict[k]=current_count+1
    end
end
fn_dict
Out[10]:
Dict{ASCIIString,Int64} with 9 entries:
"organism" => 249
"isolation_source" => 165
"host" => 249
"collected_by" => 150
"mol_type" => 249
"collection_date" => 249
"isolate" => 249
"db_xref" => 249
"country" => 249
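The counting idiom used here, get with a default of 0 followed by an increment, can be tried in isolation. The following is a minimal sketch written for current Julia (1.x), where String takes the place of ASCIIString; the toy dictionary and its keys "A1" and "A2" are made up for illustration and are not part of the data.

```julia
# Toy stand-in for seq_dict: accession => qualifiers.
# The accessions "A1" and "A2" are hypothetical.
toy = Dict(
    "A1" => Dict("host" => "Homo sapiens", "country" => "Sierra Leone"),
    "A2" => Dict("host" => "Homo sapiens"),
)

counts = Dict{String,Int}()
for (acc, quals) in toy
    for k in keys(quals)
        # get falls back to 0 when k has not been seen yet
        counts[k] = get(counts, k, 0) + 1
    end
end

println(counts)
```

A qualifier whose count is lower than the number of sequences, as with isolation_source and collected_by above, is one that will need a missing-value marker when the data are flattened.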
I extract the names of the qualifiers as a list, which will be used below to construct a DataFrame.
In [11]:
feature_names=collect(keys(fn_dict))
Out[11]:
9-element Array{ASCIIString,1}:
"organism"
"isolation_source"
"host"
"collected_by"
"mol_type"
"collection_date"
"isolate"
"db_xref"
"country"
I then loop through each feature name and, for each sequence, look up its value, falling back to NA where the feature is absent. The values are collected into a DataArray, which is then added to the DataFrame as a column.
In [12]:
df=DataFrame(accession=accessions)
numfeatures=length(feature_names)
for i in 1:numfeatures
    key=feature_names[i]
    dv=DataArray(ASCIIString[],Bool[])
    for j in 1:numseq
        acc=accessions[j]
        f=seq_dict[acc]
        val=get(f,key,NA) # NA is the default
        push!(dv,val)
    end
    df[symbol(key)]=dv
end;
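The same "look up, or fall back to a missing-value marker" pattern can be sketched without DataArrays. This example assumes current Julia (1.x), where the built-in missing value plays the role that NA plays in this notebook; it builds plain column vectors rather than a DataFrame, and the two-sequence dictionary is a cut-down illustration based on the first two accessions, not the full data.

```julia
# Cut-down illustration: two accessions with a subset of their qualifiers.
accessions = ["KM034549", "KM034550"]
seq_dict = Dict(
    "KM034549" => Dict("host" => "Homo sapiens", "isolation_source" => "serum"),
    "KM034550" => Dict("host" => "Homo sapiens", "isolation_source" => "serum"),
)
feature_names = ["host", "isolation_source", "collected_by"]

# One column per qualifier name; absent qualifiers become `missing`.
columns = Dict{String,Vector{Union{Missing,String}}}()
for key in feature_names
    columns[key] = Union{Missing,String}[get(seq_dict[acc], key, missing)
                                         for acc in accessions]
end

for key in feature_names
    println(key, ": ", columns[key])
end
```

Keeping the columns in accession order, rather than in dictionary order, is what lets them line up row by row with the accession column.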
I now have a DataFrame that has the features in a flat format.
In [13]:
head(df)
Out[13]:
|   | accession | organism | isolation_source | host | collected_by | mol_type | collection_date | isolate | db_xref | country |
|---|-----------|----------|------------------|------|--------------|----------|-----------------|---------|---------|---------|
| 1 | KM034549 | Zaire ebolavirus | serum | Homo sapiens | NA | viral cRNA | 25-May-2014 | Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM095B | taxon:186538 | Sierra Leone |
| 2 | KM034550 | Zaire ebolavirus | serum | Homo sapiens | NA | viral cRNA | 25-May-2014 | Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM095 | taxon:186538 | Sierra Leone |
| 3 | KM034551 | Zaire ebolavirus | serum | Homo sapiens | NA | viral cRNA | 26-May-2014 | Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM096 | taxon:186538 | Sierra Leone |
| 4 | KM034552 | Zaire ebolavirus | serum | Homo sapiens | NA | viral cRNA | 26-May-2014 | Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM098 | taxon:186538 | Sierra Leone |
| 5 | KM034553 | Zaire ebolavirus | serum | Homo sapiens | NA | viral cRNA | 27-May-2014 | Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3670.1 | taxon:186538 | Sierra Leone |
| 6 | KM034554 | Zaire ebolavirus | serum | Homo sapiens | NA | viral cRNA | 27-May-2014 | Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3676.1 | taxon:186538 | Sierra Leone |
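Extract the patient ID from the isolate name in the DataFrame: it is the part after the last hyphen, with any suffix after a period removed.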
In [14]:
ids = [x |>                     # select
       (x)->split(x,"-") |>     # split on hyphen
       last |>                  # take last
       (x)->split(x,".") |>     # split on period
       first for x in df[:isolate]]
df[:ids] = ids;
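To make the string handling concrete, here is a minimal sketch of the same two splits applied to two isolate names taken from the table above. It is written for current Julia (1.x) and uses only Base functions; the helper name patient_id is hypothetical, since the notebook does this inline in a comprehension.

```julia
# Hypothetical helper: the notebook performs these steps inline.
function patient_id(isolate)
    tail = last(split(isolate, "-"))   # text after the final hyphen, e.g. "G3670.1"
    return first(split(tail, "."))     # drop anything after the first period
end

isolates = ["Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM095B",
            "Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3670.1"]

ids = [patient_id(s) for s in isolates]
for (name, id) in zip(isolates, ids)
    println(name, " -> ", id)
end
```

Taking the last piece after splitting on the hyphen matters because the isolate name itself contains an earlier hyphen (in "H.sapiens-wt").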
Load in the annotations, obtained from the Sabeti/Garry labs and available here, and select on the basis of IDs.
In [15]:
annot = readtable("ebola-data.csv")
Out[15]:
Patient_ID Diagnosis Age Gender Village Chiefdom District Outcome Date_of_Outcome Admitted_at_report Pre_admission_date Date_of_admission Date_of_discharge Temperature Systolic_pressure Diastolic_pressure Hearth_rate Respiratory_rate Days_since_onset Oxygen_saturation Bleeding_gums Bleeding_nose Blood_in_stool Blood_in_vomit Bleeding_injection Bleeding_hematoma Blood_in_sputum Blood_in_urine Vaginal_bleeding No_bleeding Abdominal_pain Joint_pain Muscle_pain Back_pain Side_pain Retrosternal_pain Other_pain No_pain Fever Conjunctivitis Edema Inflammation Rash Headache Sore_throat Vomit Cough Diarrhea Weakness Dizziness Hearing Convulsions Confusion Jaundice Other_symptoms No_symptoms Antimalarials Ceftriaxone Paracetamol Metronidazole Artemisinin_Combination_Therapy Ciprofloxacin Ampicillin Omeprazole Date_of_metabolic_panel_1 Alanine_Aminotransferase_U_L_day_1 Albumin_g_L_day_1 Alkaline_Phosphatase_U_L_day_1 Aspartate_Aminotransferase_U_L_day_1 Calcium_mmol_L_day_1 Chloride_mmol_L_day_1 Creatinine_umol_L_day_1 Glucose_mmol_L_day_1 Potassium_mmol_L_day_1 Sodium_mmol_L_day_1 Total_Bilirubin_umol_L_day_1 Total_Carbon_Dioxide_mmol_L_day_1 Total_Protein_g_L_day_1 Blood_Urea_Nitrogen_mmol_urea_L_day_1 Date_of_metabolic_panel_2 Alanine_Aminotransferase_U_L_day_2 Albumin_g_L_day_2 Alkaline_Phosphatase_U_L_day_2 Aspartate_Aminotransferase_U_L_day_2 Calcium_mmol_L_day_2 Chloride_mmol_L_day_2 Creatinine_umol_L_day_2 Glucose_mmol_L_day_2 Potassium_mmol_L_day_2 Sodium_mmol_L_day_2 Total_Bilirubin_umol_L_day_2 Total_Carbon_Dioxide_mmol_L_day_2 Total_Protein_g_L_day_2 Blood_Urea_Nitrogen_mmol_urea_L_day_2 Date_of_metabolic_panel_3 Alanine_Aminotransferase_U_L_day_3 Albumin_g_L_day_3 Alkaline_Phosphatase_U_L_day_3 Aspartate_Aminotransferase_U_L_day_3 Calcium_mmol_L_day_3 Chloride_mmol_L_day_3 Creatinine_umol_L_day_3 Glucose_mmol_L_day_3 Potassium_mmol_L_day_3 Sodium_mmol_L_day_3 Total_Bilirubin_umol_L_day_3 Total_Carbon_Dioxide_mmol_L_day_3 Total_Protein_g_L_day_3 
Blood_Urea_Nitrogen_mmol_urea_L_day_3 Date_of_metabolic_panel_4 Alanine_Aminotransferase_U_L_day_4 Albumin_g_L_day_4 Alkaline_Phosphatase_U_L_day_4 Aspartate_Aminotransferase_U_L_day_4 Calcium_mmol_L_day_4 Chloride_mmol_L_day_4 Creatinine_umol_L_day_4 Glucose_mmol_L_day_4 Potassium_mmol_L_day_4 Sodium_mmol_L_day_4 Total_Bilirubin_umol_L_day_4 Total_Carbon_Dioxide_mmol_L_day_4 Total_Protein_g_L_day_4 Blood_Urea_Nitrogen_mmol_urea_L_day_4 Date_of_metabolic_panel_5 Alanine_Aminotransferase_U_L_day_5 Albumin_g_L_day_5 Alkaline_Phosphatase_U_L_day_5 Aspartate_Aminotransferase_U_L_day_5 Calcium_mmol_L_day_5 Chloride_mmol_L_day_5 Creatinine_umol_L_day_5 Glucose_mmol_L_day_5 Potassium_mmol_L_day_5 Sodium_mmol_L_day_5 Total_Bilirubin_umol_L_day_5 Total_Carbon_Dioxide_mmol_L_day_5 Total_Protein_g_L_day_5 Blood_Urea_Nitrogen_mmol_urea_L_day_5 First_measured_viral_load_log_units_ Maximum_measured_viral_load_log_units_ Minimum_measured_viral_load_log_units_ Averaged_viral_load_log_units_ Date_of_qPCR_1 EBOV_copies_mL_plasma_log_units_day_1 Date_of_qPCR_2 EBOV_copies_mL_plasma_log_units_day_2 Date_of_qPCR_3 EBOV_copies_mL_plasma_log_units_day_3 Date_of_qPCR_4 EBOV_copies_mL_plasma_log_units_day_4 Date_of_qPCR_5 EBOV_copies_mL_plasma_log_units_day_5 Date_of_qPCR_6 EBOV_copies_mL_plasma_log_units_day_6 SNP_572 SNP_800 SNP_1024 SNP_1288 SNP_1492 SNP_1849 SNP_2124 SNP_2185 SNP_2341 SNP_2364 SNP_2497 SNP_2931 SNP_3116 SNP_3388 SNP_3638 SNP_4340 SNP_4505 SNP_4709 SNP_4759 SNP_4976 SNP_5461 SNP_6175 SNP_6283 SNP_6909 SNP_8280 SNP_8928 SNP_9390 SNP_9536 SNP_9923 SNP_10005 SNP_10218 SNP_10252 SNP_10268 SNP_10509 SNP_10743 SNP_10801 SNP_11142 SNP_11811 SNP_11943 SNP_12878 SNP_12885 SNP_13856 SNP_13923 SNP_14019 SNP_14232 SNP_15599 SNP_15660 SNP_15963 SNP_16054 SNP_16455 SNP_16750 SNP_17142 SNP_17985 SNP_18412 SNP_18895 Allele_Frequency_10218 Cluster _mutations_from_cluster Sub_cluster _mutations_from_sub_cluster 1 EM-095 Positive 42.0 Female Koindu Kissi Teng Kailahun NA NA Yes NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 5.17520455919 5.17520455919 5.17520455919 5.17520455919 2014-05-27 5.17520455919 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No NA Cluster 1 0 NA NA 2 EM-95B Positive NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 6.12450740145 6.12450740145 6.12450740145 6.12450740145 NA 6.12450740145 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 3 EM-099 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-05-27 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 4 EM-100 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-02 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 5 EM-101 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-02 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 6 EM-102 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-02 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7 EM-103 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-02 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8 EM-105 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-02 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 9 EM-108 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-02 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 10 EM-109 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-02 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 11 EM-112 Positive 65.0 Female Njala Jawie Kailahun Died 2014-06-03 No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.74822576799 7.74822576799 7.74822576799 7.74822576799 2014-06-03 7.74822576799 NA NA NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.750850993741 Cluster 3 0 NA NA 12 EM-114 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-03 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 13 EM-117 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-03 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 14 EM-118 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-03 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 15 EM-121 Positive 44.0 Male Foindu Kissi Kama Kailahun Died 2014-06-06 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.10069330882 7.10069330882 7.10069330882 
7.10069330882 2014-06-04 7.10069330882 NA NA NA NA NA NA NA NA NA NA No Yes No No Yes No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 1 NA NA 16 EM-122 Positive 11.0 Female Daru Jawie Kailahun Died 2014-06-08 No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-07 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 17 EM-123 Positive 46.0 Female Daru Jawie Kailahun Died 2014-06-09 Yes 2014-06-05 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-06 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 18 EM-124 Positive 35.0 Female Daru Jawie Kailahun Died 2014-06-22 Yes 2014-06-05 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-10 263.0 23.0 335.0 638.0 1.62 95.0 720.0 5.7 NA 124.0 12 18.0 68 27.0 2014-06-11 187.0 21 
217 351.0 1.75 105 736 5.8 2.0 127 14 16 60 33.7 2014-06-12 137 22 164 212.0 1.83 102 754.0 4.3 2.1 127 20 20 60 36.3 2014-06-13 103 22 144 117.0 1.84 99 826 3.7 2.0 125 16 23 65 41.3 2014-06-14 89 22 117 93 1.77 100 748 2.8 2.3 125 12 22 64 40.3 5.16284676401 5.16284676401 2.57931013071 3.74939867228 2014-06-06 5.16284676401 NA 4.31662430974 NA 3.87158337583 NA 2.57931013071 NA 2.81662878109 2014-06-18 NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.868852459016 Cluster 3 0 NA NA 19 EM-125 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-06 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 20 EM-127 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-06 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 21 G-3670 Positive 20.0 Female Koindu Kissi Teng 
Kailahun Discharged 2014-07-08 Yes 2014-05-26 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.69901470087 7.69901470087 7.69901470087 7.69901470087 2014-05-27 7.69901470087 2014-06-06 NA NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No Yes No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 2 NA NA 22 G-3676 Positive 45.0 Female Buedu Kissi Teng Kailahun Died 2014-05-30 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.17133670769 8.80761074826 8.17133670769 8.48947372797 2014-05-27 8.17133670769 NA 8.80761074826 NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No 0.0 Cluster 1 0 NA NA 23 G-3677 Positive 50.0 Female Koindu Kissi Teng Kailahun Died 2014-05-27 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 9.14473079255 
9.14473079255 9.11630138184 9.13051608719 2014-05-26 9.14473079255 2014-05-27 9.11630138184 NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 0 NA NA 24 G-3678 Negative NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-05-29 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 25 G-3679 Positive 15.0 Female Nyummdu Kissi Teng Kailahun NA NA Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.26223169357 7.317157612 7.26223169357 7.28969465279 2014-05-26 7.26223169357 2014-05-28 7.317157612 NA NA NA NA NA NA NA NA No Yes No No No No No No Yes No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No Yes No 0.0 Cluster 2 2 Sub-cluster a 0 26 G-3680 Positive 8.0 Female Nyummdu Kissi Teng Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.57103639539 8.57103639539 8.57103639539 8.57103639539 2014-05-28 8.57103639539 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No Yes No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No NA Cluster 1 1 NA NA 27 G-3681 Positive 55.0 Female Kolosu Kissi Teng Kailahun Died NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 5.39286642314 5.39286642314 5.39286642314 5.39286642314 2014-05-28 5.39286642314 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 28 G-3682 Positive 54.0 Female Kolosu Kissi Teng Kailahun Died NA Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 5.92024778036 9.20736818601 5.92024778036 7.56380798319 2014-05-27 5.92024778036 2014-05-28 9.20736818601 NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes 
No No No Yes No No No 0.0 Cluster 2 0 NA NA 29 G-3683 Positive 57.0 Female Fokoma Kissi Teng Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.10849805276 7.10849805276 7.10849805276 7.10849805276 2014-05-28 7.10849805276 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No NA Cluster 1 1 NA NA 30 G-3686 Positive 27.0 Female Buedu Kissi Tongi Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 9.64337116822 9.64337116822 9.64337116822 9.64337116822 2014-05-29 9.64337116822 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No 0.0 Cluster 1 2 NA NA &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip 
In [16]:
# Strip hyphens from Patient_ID to build the `ids` column used as the join key
annot[:ids] = [replace(x, "-", "") for x in annot[:Patient_ID]]
Out[16]:
213-element Array{Any,1}:
"EM095"
"EM95B"
"EM099"
"EM100"
"EM101"
"EM102"
"EM103"
"EM105"
"EM108"
"EM109"
"EM112"
"EM114"
"EM117"
⋮
"G3832"
"G3833"
"G3842"
"G3852"
"G3853"
"G3854"
"G3844"
"G3849"
"G3858"
"G3859"
"G3835"
"G3836"
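The merge in the next cell is an inner join on `ids`, so it may help to see what that implies before reading its output. The sketch below is only an illustration: it uses invented ids and accessions rather than the real data, plain Julia collections instead of DataFrames, and Julia 1.x syntax for `replace` (the notebook itself uses the older form). The names `annot_ids`, `seq_rows`, `normalize_id`, and `inner_join_ids` are hypothetical.

```julia
# Toy data only: these ids and accessions are invented for illustration.
annot_ids = ["P-001", "P-002", "P-003"]            # annotation side, one entry per patient
seq_rows  = [("P001", "ACC1"), ("P001", "ACC2"),   # sequence side as (id, accession)
             ("P003", "ACC3"), ("P999", "ACC4")]

# Strip the hyphen so both sides use the same key format (Julia 1.x `replace` syntax).
normalize_id(s) = replace(s, "-" => "")

# Hypothetical helper: keep every sequence row whose id also occurs among the
# normalized annotation ids, i.e. one output row per matching pair.
function inner_join_ids(ids, rows)
    wanted = Set(normalize_id.(ids))
    return [(id, acc) for (id, acc) in rows if id in wanted]
end

merged = inner_join_ids(annot_ids, seq_rows)
println(merged)
```

Ids present on only one side ("P002" and "P999" here) drop out, while an id with several sequences yields several rows. The same effect shows up in the real merge, where patient EM-124 matches four isolates and its annotation row is therefore repeated four times.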
Merge sequences and annotations
In [17]:
bigdf = join(annot,df,on=:ids,kind=:inner)
Out[17]:
Patient_ID Diagnosis Age Gender Village Chiefdom District Outcome Date_of_Outcome Admitted_at_report Pre_admission_date Date_of_admission Date_of_discharge Temperature Systolic_pressure Diastolic_pressure Hearth_rate Respiratory_rate Days_since_onset Oxygen_saturation Bleeding_gums Bleeding_nose Blood_in_stool Blood_in_vomit Bleeding_injection Bleeding_hematoma Blood_in_sputum Blood_in_urine Vaginal_bleeding No_bleeding Abdominal_pain Joint_pain Muscle_pain Back_pain Side_pain Retrosternal_pain Other_pain No_pain Fever Conjunctivitis Edema Inflammation Rash Headache Sore_throat Vomit Cough Diarrhea Weakness Dizziness Hearing Convulsions Confusion Jaundice Other_symptoms No_symptoms Antimalarials Ceftriaxone Paracetamol Metronidazole Artemisinin_Combination_Therapy Ciprofloxacin Ampicillin Omeprazole Date_of_metabolic_panel_1 Alanine_Aminotransferase_U_L_day_1 Albumin_g_L_day_1 Alkaline_Phosphatase_U_L_day_1 Aspartate_Aminotransferase_U_L_day_1 Calcium_mmol_L_day_1 Chloride_mmol_L_day_1 Creatinine_umol_L_day_1 Glucose_mmol_L_day_1 Potassium_mmol_L_day_1 Sodium_mmol_L_day_1 Total_Bilirubin_umol_L_day_1 Total_Carbon_Dioxide_mmol_L_day_1 Total_Protein_g_L_day_1 Blood_Urea_Nitrogen_mmol_urea_L_day_1 Date_of_metabolic_panel_2 Alanine_Aminotransferase_U_L_day_2 Albumin_g_L_day_2 Alkaline_Phosphatase_U_L_day_2 Aspartate_Aminotransferase_U_L_day_2 Calcium_mmol_L_day_2 Chloride_mmol_L_day_2 Creatinine_umol_L_day_2 Glucose_mmol_L_day_2 Potassium_mmol_L_day_2 Sodium_mmol_L_day_2 Total_Bilirubin_umol_L_day_2 Total_Carbon_Dioxide_mmol_L_day_2 Total_Protein_g_L_day_2 Blood_Urea_Nitrogen_mmol_urea_L_day_2 Date_of_metabolic_panel_3 Alanine_Aminotransferase_U_L_day_3 Albumin_g_L_day_3 Alkaline_Phosphatase_U_L_day_3 Aspartate_Aminotransferase_U_L_day_3 Calcium_mmol_L_day_3 Chloride_mmol_L_day_3 Creatinine_umol_L_day_3 Glucose_mmol_L_day_3 Potassium_mmol_L_day_3 Sodium_mmol_L_day_3 Total_Bilirubin_umol_L_day_3 Total_Carbon_Dioxide_mmol_L_day_3 Total_Protein_g_L_day_3 
Blood_Urea_Nitrogen_mmol_urea_L_day_3 Date_of_metabolic_panel_4 Alanine_Aminotransferase_U_L_day_4 Albumin_g_L_day_4 Alkaline_Phosphatase_U_L_day_4 Aspartate_Aminotransferase_U_L_day_4 Calcium_mmol_L_day_4 Chloride_mmol_L_day_4 Creatinine_umol_L_day_4 Glucose_mmol_L_day_4 Potassium_mmol_L_day_4 Sodium_mmol_L_day_4 Total_Bilirubin_umol_L_day_4 Total_Carbon_Dioxide_mmol_L_day_4 Total_Protein_g_L_day_4 Blood_Urea_Nitrogen_mmol_urea_L_day_4 Date_of_metabolic_panel_5 Alanine_Aminotransferase_U_L_day_5 Albumin_g_L_day_5 Alkaline_Phosphatase_U_L_day_5 Aspartate_Aminotransferase_U_L_day_5 Calcium_mmol_L_day_5 Chloride_mmol_L_day_5 Creatinine_umol_L_day_5 Glucose_mmol_L_day_5 Potassium_mmol_L_day_5 Sodium_mmol_L_day_5 Total_Bilirubin_umol_L_day_5 Total_Carbon_Dioxide_mmol_L_day_5 Total_Protein_g_L_day_5 Blood_Urea_Nitrogen_mmol_urea_L_day_5 First_measured_viral_load_log_units_ Maximum_measured_viral_load_log_units_ Minimum_measured_viral_load_log_units_ Averaged_viral_load_log_units_ Date_of_qPCR_1 EBOV_copies_mL_plasma_log_units_day_1 Date_of_qPCR_2 EBOV_copies_mL_plasma_log_units_day_2 Date_of_qPCR_3 EBOV_copies_mL_plasma_log_units_day_3 Date_of_qPCR_4 EBOV_copies_mL_plasma_log_units_day_4 Date_of_qPCR_5 EBOV_copies_mL_plasma_log_units_day_5 Date_of_qPCR_6 EBOV_copies_mL_plasma_log_units_day_6 SNP_572 SNP_800 SNP_1024 SNP_1288 SNP_1492 SNP_1849 SNP_2124 SNP_2185 SNP_2341 SNP_2364 SNP_2497 SNP_2931 SNP_3116 SNP_3388 SNP_3638 SNP_4340 SNP_4505 SNP_4709 SNP_4759 SNP_4976 SNP_5461 SNP_6175 SNP_6283 SNP_6909 SNP_8280 SNP_8928 SNP_9390 SNP_9536 SNP_9923 SNP_10005 SNP_10218 SNP_10252 SNP_10268 SNP_10509 SNP_10743 SNP_10801 SNP_11142 SNP_11811 SNP_11943 SNP_12878 SNP_12885 SNP_13856 SNP_13923 SNP_14019 SNP_14232 SNP_15599 SNP_15660 SNP_15963 SNP_16054 SNP_16455 SNP_16750 SNP_17142 SNP_17985 SNP_18412 SNP_18895 Allele_Frequency_10218 Cluster _mutations_from_cluster Sub_cluster _mutations_from_sub_cluster ids accession organism isolation_source host collected_by mol_type 
collection_date isolate db_xref country 1 EM-095 Positive 42.0 Female Koindu Kissi Teng Kailahun NA NA Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 5.17520455919 5.17520455919 5.17520455919 5.17520455919 2014-05-27 5.17520455919 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No NA Cluster 1 0 NA NA EM095 KM034550 Zaire ebolavirus serum Homo sapiens NA viral cRNA 25-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM095 taxon:186538 Sierra Leone 2 EM-112 Positive 65.0 Female Njala Jawie Kailahun Died 2014-06-03 No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.74822576799 7.74822576799 7.74822576799 7.74822576799 2014-06-03 7.74822576799 NA NA NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.750850993741 Cluster 3 0 NA NA EM112 KM233039 Zaire ebolavirus NA Homo sapiens NA viral cRNA 03-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM112 taxon:186538 Sierra Leone 3 EM-121 Positive 44.0 Male Foindu Kissi Kama Kailahun Died 2014-06-06 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.10069330882 7.10069330882 7.10069330882 7.10069330882 2014-06-04 7.10069330882 NA NA NA NA NA NA NA NA NA NA No Yes No No Yes No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 1 NA NA EM121 KM233044 Zaire ebolavirus NA Homo sapiens NA viral cRNA 04-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM121 taxon:186538 Sierra Leone 4 EM-124 Positive 35.0 Female Daru Jawie Kailahun Died 2014-06-22 Yes 2014-06-05 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-10 263.0 23.0 335.0 638.0 1.62 95.0 720.0 5.7 NA 124.0 12 18.0 68 27.0 2014-06-11 187.0 21 217 351.0 1.75 105 736 5.8 2.0 127 14 16 60 33.7 2014-06-12 137 22 164 212.0 1.83 102 754.0 4.3 2.1 127 20 20 60 36.3 2014-06-13 103 22 144 117.0 1.84 99 826 3.7 2.0 125 16 23 65 41.3 2014-06-14 89 22 117 93 1.77 100 748 2.8 2.3 125 12 22 64 40.3 5.16284676401 5.16284676401 2.57931013071 3.74939867228 2014-06-06 5.16284676401 NA 4.31662430974 NA 3.87158337583 NA 2.57931013071 NA 2.81662878109 2014-06-18 NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.868852459016 Cluster 3 0 NA NA EM124 KM233045 Zaire ebolavirus NA Homo sapiens NA viral cRNA 04-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM124.1 taxon:186538 Sierra Leone 5 EM-124 Positive 35.0 Female Daru Jawie Kailahun Died 2014-06-22 Yes 2014-06-05 NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-10 263.0 23.0 335.0 638.0 1.62 95.0 720.0 5.7 NA 124.0 12 18.0 68 27.0 2014-06-11 187.0 21 217 351.0 1.75 105 736 5.8 2.0 127 14 16 60 33.7 2014-06-12 137 22 164 212.0 1.83 102 754.0 4.3 2.1 127 20 20 60 36.3 2014-06-13 103 22 144 117.0 1.84 99 826 3.7 2.0 125 16 23 65 41.3 2014-06-14 89 22 117 93 1.77 100 748 2.8 2.3 125 12 22 64 40.3 5.16284676401 5.16284676401 2.57931013071 3.74939867228 2014-06-06 5.16284676401 NA 4.31662430974 NA 3.87158337583 NA 2.57931013071 NA 2.81662878109 2014-06-18 NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.868852459016 Cluster 3 0 NA NA EM124 KM233046 Zaire ebolavirus NA Homo sapiens NA viral cRNA 06-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM124.2 taxon:186538 Sierra Leone 6 EM-124 Positive 35.0 Female Daru Jawie Kailahun Died 2014-06-22 Yes 2014-06-05 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-10 263.0 23.0 335.0 638.0 1.62 95.0 720.0 5.7 NA 124.0 12 18.0 68 27.0 2014-06-11 187.0 21 217 351.0 1.75 105 736 5.8 2.0 127 14 16 60 33.7 2014-06-12 137 22 164 212.0 1.83 102 754.0 4.3 2.1 127 20 20 60 36.3 2014-06-13 103 22 144 117.0 1.84 99 826 3.7 2.0 125 16 23 65 41.3 2014-06-14 89 22 117 93 1.77 100 748 2.8 2.3 125 12 22 64 40.3 5.16284676401 5.16284676401 2.57931013071 3.74939867228 2014-06-06 5.16284676401 NA 4.31662430974 NA 3.87158337583 NA 2.57931013071 NA 2.81662878109 2014-06-18 NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.868852459016 Cluster 3 0 NA NA EM124 KM233047 Zaire ebolavirus NA Homo sapiens NA viral 
cRNA 08-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM124.3 taxon:186538 Sierra Leone 7 EM-124 Positive 35.0 Female Daru Jawie Kailahun Died 2014-06-22 Yes 2014-06-05 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-10 263.0 23.0 335.0 638.0 1.62 95.0 720.0 5.7 NA 124.0 12 18.0 68 27.0 2014-06-11 187.0 21 217 351.0 1.75 105 736 5.8 2.0 127 14 16 60 33.7 2014-06-12 137 22 164 212.0 1.83 102 754.0 4.3 2.1 127 20 20 60 36.3 2014-06-13 103 22 144 117.0 1.84 99 826 3.7 2.0 125 16 23 65 41.3 2014-06-14 89 22 117 93 1.77 100 748 2.8 2.3 125 12 22 64 40.3 5.16284676401 5.16284676401 2.57931013071 3.74939867228 2014-06-06 5.16284676401 NA 4.31662430974 NA 3.87158337583 NA 2.57931013071 NA 2.81662878109 2014-06-18 NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.868852459016 Cluster 3 0 NA NA EM124 KM233048 Zaire ebolavirus NA Homo sapiens NA viral cRNA 09-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-EM124.4 taxon:186538 Sierra Leone 8 G-3670 Positive 20.0 Female Koindu Kissi Teng Kailahun Discharged 2014-07-08 Yes 2014-05-26 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.69901470087 7.69901470087 7.69901470087 7.69901470087 2014-05-27 7.69901470087 2014-06-06 NA NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No Yes No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 2 NA NA G3670 
KM034553 Zaire ebolavirus serum Homo sapiens NA viral cRNA 27-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3670.1 taxon:186538 Sierra Leone 9 G-3676 Positive 45.0 Female Buedu Kissi Teng Kailahun Died 2014-05-30 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.17133670769 8.80761074826 8.17133670769 8.48947372797 2014-05-27 8.17133670769 NA 8.80761074826 NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No 0.0 Cluster 1 0 NA NA G3676 KM034554 Zaire ebolavirus serum Homo sapiens NA viral cRNA 27-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3676.1 taxon:186538 Sierra Leone 10 G-3676 Positive 45.0 Female Buedu Kissi Teng Kailahun Died 2014-05-30 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.17133670769 8.80761074826 8.17133670769 8.48947372797 2014-05-27 8.17133670769 NA 8.80761074826 NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No 0.0 Cluster 1 0 NA NA G3676 KM034555 Zaire ebolavirus serum Homo sapiens NA viral cRNA 06-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3676.2 taxon:186538 Sierra Leone 
11 G-3677 Positive 50.0 Female Koindu Kissi Teng Kailahun Died 2014-05-27 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 9.14473079255 9.14473079255 9.11630138184 9.13051608719 2014-05-26 9.14473079255 2014-05-27 9.11630138184 NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 0 NA NA G3677 KM034556 Zaire ebolavirus serum Homo sapiens NA viral cRNA 26-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3677.1 taxon:186538 Sierra Leone 12 G-3677 Positive 50.0 Female Koindu Kissi Teng Kailahun Died 2014-05-27 Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 9.14473079255 9.14473079255 9.11630138184 9.13051608719 2014-05-26 9.14473079255 2014-05-27 9.11630138184 NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 0 NA NA G3677 KM034557 Zaire ebolavirus serum Homo sapiens NA viral cRNA 27-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3677.2 taxon:186538 Sierra Leone 13 G-3679 Positive 15.0 Female Nyummdu Kissi Teng Kailahun NA NA Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.26223169357 7.317157612 7.26223169357 7.28969465279 2014-05-26 7.26223169357 2014-05-28 7.317157612 NA NA NA NA NA NA NA NA No Yes No No No No No No Yes No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No Yes No 0.0 Cluster 2 2 Sub-cluster a 0 G3679 KM034558 Zaire ebolavirus serum Homo sapiens NA viral cRNA 28-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3679.1 taxon:186538 Sierra Leone 14 G-3680 Positive 8.0 Female Nyummdu Kissi Teng Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.57103639539 8.57103639539 8.57103639539 8.57103639539 2014-05-28 8.57103639539 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No Yes No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No NA Cluster 1 1 NA NA G3680 KM034559 Zaire ebolavirus serum Homo sapiens NA viral cRNA 28-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3680.1 taxon:186538 Sierra Leone 15 G-3682 Positive 54.0 Female Kolosu Kissi Teng Kailahun Died NA Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 5.92024778036 9.20736818601 5.92024778036 7.56380798319 2014-05-27 5.92024778036 2014-05-28 9.20736818601 NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 0 NA NA G3682 KM034560 Zaire ebolavirus serum Homo sapiens NA viral cRNA 28-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3682.1 taxon:186538 Sierra Leone 16 G-3683 Positive 57.0 Female Fokoma Kissi Teng Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 7.10849805276 7.10849805276 7.10849805276 7.10849805276 2014-05-28 7.10849805276 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No NA Cluster 1 1 NA NA G3683 KM034561 Zaire ebolavirus serum Homo sapiens NA viral cRNA 28-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3683.1 taxon:186538 Sierra Leone 17 G-3686 Positive 27.0 Female Buedu Kissi Tongi Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA 9.64337116822 9.64337116822 9.64337116822 9.64337116822 2014-05-29 9.64337116822 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No No 0.0 Cluster 1 2 NA NA G3686 KM034562 Zaire ebolavirus serum Homo sapiens NA viral cRNA 28-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3686.1 taxon:186538 Sierra Leone 18 G-3687 Positive 38.0 Female Buedu Kissi Tongi Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 5.30610721301 5.30610721301 5.30610721301 5.30610721301 2014-05-28 5.30610721301 NA NA NA NA NA NA NA NA NA NA No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No No No No No Yes No No No No No No No Yes No No No No No No No No No NA Cluster 1 1 NA NA G3687 KM034563 Zaire ebolavirus serum Homo sapiens NA viral cRNA 28-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3687.1 taxon:186538 Sierra Leone 19 G-3707 Positive 38.0 Female Daru Jawie Kailahun Died 2014-06-06 Yes 2014-05-31 2014-05-31 NA 36.7 110 70 98 22 NA 100 No No No No No No No No No Yes No No No No No No No Yes Yes No No No No Yes No No No Yes No Yes No No No No None No No Yes Yes Yes No No No No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 6.91162410985 6.91162410985 6.91162410985 6.91162410985 2014-05-31 6.91162410985 NA NA NA NA NA NA NA NA NA NA No Yes No No No No No No 
No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.832594235033 Cluster 3 0 NA NA G3707 KM233049 Zaire ebolavirus NA Homo sapiens NA viral cRNA 31-May-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3707 taxon:186538 Sierra Leone 20 G-3713 Positive 37.0 Female Dambu Njaluahun Kailahun Died 2014-06-11 Yes 2014-06-04 2014-06-04 2014-06-11 38.4 100 60 96 24 7 97 No No No No No No No No No Yes No No No No No No Neck pain No Yes Yes Yes No No Yes No Yes No No Yes Yes No No No No Nausea No Yes Yes Yes No No No No No 2014-06-04 42.0 39.0 56.0 221.0 1.9 99.0 89.0 6.8 3.6 125.0 8 21.0 71 2.7 2014-06-09 740.0 30 900 2000.01 1.58 98 1012 7.5 3.2 123 16 8 72 31.3 2014-06-10 607 27 912 2000.01 1.51 103 1343.0 4.7 NA 128 26 7 69 45.2 2014-06-11 402 19 843 2000.01 1.16 113 1472 4.4 3.3 127 32 5 56 49.7 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.43990696049 8.43990696049 7.88132542633 8.21449792522 2014-06-07 NA 2014-06-09 8.43990696049 NA 7.88132542633 NA 8.32226138883 NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No Yes Yes No Yes No No No Yes No No No 1.0 Cluster 3 1 NA NA G3713 KM233050 Zaire ebolavirus NA Homo sapiens NA viral cRNA 09-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3713.2 taxon:186538 Sierra Leone 21 G-3713 Positive 37.0 Female Dambu Njaluahun Kailahun Died 2014-06-11 Yes 2014-06-04 2014-06-04 2014-06-11 38.4 100 60 96 24 7 97 No No No No No No No No No Yes No No No No No No Neck pain No Yes Yes Yes No No Yes No Yes No No Yes Yes No No No No Nausea No Yes Yes Yes No No No No No 2014-06-04 42.0 39.0 56.0 221.0 1.9 99.0 89.0 6.8 3.6 125.0 8 21.0 71 2.7 2014-06-09 740.0 30 900 2000.01 1.58 98 1012 7.5 3.2 123 16 8 72 31.3 2014-06-10 607 27 912 2000.01 1.51 103 1343.0 4.7 NA 128 26 7 69 45.2 2014-06-11 402 19 843 2000.01 1.16 113 1472 4.4 3.3 127 32 5 56 49.7 NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.43990696049 8.43990696049 7.88132542633 8.21449792522 2014-06-07 NA 2014-06-09 8.43990696049 NA 7.88132542633 NA 8.32226138883 NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No Yes Yes No Yes No No No Yes No No No 1.0 Cluster 3 1 NA NA G3713 KM233051 Zaire ebolavirus NA Homo sapiens NA viral cRNA 11-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3713.3 taxon:186538 Sierra Leone 22 G-3713 Positive 37.0 Female Dambu Njaluahun Kailahun Died 2014-06-11 Yes 2014-06-04 2014-06-04 2014-06-11 38.4 100 60 96 24 7 97 No No No No No No No No No Yes No No No No No No Neck pain No Yes Yes Yes No No Yes No Yes No No Yes Yes No No No No Nausea No Yes Yes Yes No No No No No 2014-06-04 42.0 39.0 56.0 221.0 1.9 99.0 89.0 6.8 3.6 125.0 8 21.0 71 2.7 2014-06-09 740.0 30 900 2000.01 1.58 98 1012 7.5 3.2 123 16 8 72 31.3 2014-06-10 607 27 912 2000.01 1.51 103 1343.0 4.7 NA 128 26 7 69 45.2 2014-06-11 402 19 843 2000.01 1.16 113 1472 4.4 3.3 127 32 5 56 49.7 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.43990696049 8.43990696049 7.88132542633 8.21449792522 2014-06-07 NA 2014-06-09 8.43990696049 NA 7.88132542633 NA 8.32226138883 NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No Yes Yes No Yes No No No Yes No No No 1.0 Cluster 3 1 NA NA G3713 KM233052 Zaire ebolavirus NA Homo sapiens NA viral cRNA 13-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3713.4 taxon:186538 Sierra Leone 23 G-3724 Positive 45.0 Female Benduma Jawie Kailahun Died 2014-06-07 Yes 2014-06-05 2014-06-05 2014-06-07 39.9 100 60 109 24 8 94 No No No No No No No No No Yes No No No No No No Sacral Pain No Yes Yes Yes No Yes Yes No No Yes No Yes No Yes No No No Semiconscious No No Yes No No No No No No 2014-06-05 95.0 28.0 682.0 1421.0 1.9 106.0 108.0 6.5 2.9 124.0 15 17.0 70 6.5 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 10.563047275 10.563047275 10.563047275 10.563047275 2014-06-05 10.563047275 NA NA NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No Yes No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 1.0 Cluster 3 1 NA NA G3724 KM233053 Zaire ebolavirus NA Homo sapiens NA viral cRNA 05-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3724 taxon:186538 Sierra Leone 24 G-3729 Positive 28.0 Female Njala Jawie Kailahun NA NA No NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-07 19.0 34.0 124.0 28.0 1.67 89.0 28.0 1.7 2.4 122.0 11 19.0 75 2.1 2014-06-09 2000.01 26 1694 2000.01 1.65 87 661 4.0 8.51 135 19 NA 84 19.4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 8.83863473858 8.83863473858 8.83863473858 8.83863473858 2014-06-07 8.83863473858 NA NA NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.126352395672 Cluster 2 0 NA NA G3729 KM233054 Zaire ebolavirus NA Homo sapiens NA viral cRNA 07-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3729 taxon:186538 Sierra Leone 25 G-3734 Negative 39.0 Female Kailahun Luawa Kailahun NA NA Yes NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 2014-06-07 221.0 45.0 210.0 1251.0 2.16 120.0 145.0 2.1 6.3 157.0 8 16.0 81 19.9 2014-06-10 80.0 35 75 62.0 2.29 107 87 8.4 4.0 129 15 21 78 6.1 NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No No No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 0.0 Cluster 2 0 NA NA G3734 KM233055 Zaire ebolavirus NA Homo sapiens NA viral cRNA 07-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3734.1 taxon:186538 Sierra Leone 26 G-3735 Positive 60.0 Male Daru Jawie Kailahun Died 2014-06-09 Yes 2014-06-07 NA 2014-06-09 37.4 130 90 118 22 7 96 No No No No No No No No No Yes Yes Yes No Yes No No No No Yes No No No No Yes Yes Yes No No Yes Yes No No No No Poor appetite No No Yes No No No No No No 2014-06-07 818.0 38.0 420.0 2000.01 1.98 101.0 237.0 6.0 5.4 137.0 11 21.0 68 13.5 2014-06-09 1503.0 26 1195 2000.01 1.99 102 718 5.0 6.4 137 25 14 58 30.6 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 9.10959044772 9.37277158662 9.10959044772 9.24118101717 2014-06-07 9.10959044772 2014-06-09 9.37277158662 NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 1.0 Cluster 3 0 NA NA G3735 KM233056 Zaire ebolavirus NA Homo sapiens NA viral cRNA 07-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3735.1 taxon:186538 Sierra Leone 27 G-3735 Positive 60.0 Male Daru Jawie Kailahun Died 2014-06-09 Yes 2014-06-07 NA 2014-06-09 37.4 130 90 118 22 7 96 No No No No No No No No No Yes Yes Yes No Yes No No No No Yes No No No No Yes Yes Yes No No Yes Yes No No No No Poor appetite No No Yes No No No No No No 2014-06-07 818.0 38.0 420.0 2000.01 1.98 101.0 237.0 6.0 5.4 137.0 11 21.0 68 13.5 2014-06-09 1503.0 26 1195 2000.01 1.99 102 718 5.0 6.4 137 25 14 58 30.6 NA NA NA NA NA NA NA NA NA NA NA NA NA 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 9.10959044772 9.37277158662 9.10959044772 9.24118101717 2014-06-07 9.10959044772 2014-06-09 9.37277158662 NA NA NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No No No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No 1.0 Cluster 3 0 NA NA G3735 KM233057 Zaire ebolavirus NA Homo sapiens NA viral cRNA 09-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3735.2 taxon:186538 Sierra Leone 28 G-3750 Positive 50.0 Male Daru Jawie Kailahun Discharged 2014-06-14 Yes 2014-06-11 2014-06-11 2014-06-14 36.2 120 70 105 20 3 95 No No No No No No No No No Yes No No No No No No No Yes Yes No No No No No No No No Yes No No No No No No None No Yes No No No No No No No 2014-06-10 136.0 32.0 141.0 472.0 1.82 105.0 260.0 6.9 3.0 128.0 12 15.0 74 10.4 2014-06-11 211.0 27 122 400.0 1.98 108 188 8.6 3.3 129 9 23 75 8.4 2014-06-13 117 24 65 124.0 1.77 109 120.0 5.8 3.4 134 8 30 60 4.8 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 4.2746688236 4.2746688236 2.26851636001 3.47961128794 2014-06-10 4.2746688236 NA 3.89564868023 NA 2.26851636001 NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No Yes No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No NA Cluster 3 1 NA NA G3750 KM233058 Zaire ebolavirus NA Homo sapiens NA viral cRNA 10-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3750.1 taxon:186538 Sierra Leone 29 G-3750 Positive 50.0 Male Daru Jawie Kailahun Discharged 2014-06-14 Yes 2014-06-11 2014-06-11 2014-06-14 36.2 120 70 105 20 3 95 No No No No No No No No No Yes No No No No No No No Yes Yes No No No No No No No No Yes No No No No No No None No Yes No No No No No No No 2014-06-10 136.0 32.0 141.0 472.0 1.82 105.0 260.0 6.9 3.0 128.0 12 15.0 74 10.4 2014-06-11 211.0 27 122 400.0 
1.98 108 188 8.6 3.3 129 9 23 75 8.4 2014-06-13 117 24 65 124.0 1.77 109 120.0 5.8 3.4 134 8 30 60 4.8 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 4.2746688236 4.2746688236 2.26851636001 3.47961128794 2014-06-10 4.2746688236 NA 3.89564868023 NA 2.26851636001 NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No Yes No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No NA Cluster 3 1 NA NA G3750 KM233059 Zaire ebolavirus NA Homo sapiens NA viral cRNA 12-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3750.2 taxon:186538 Sierra Leone 30 G-3750 Positive 50.0 Male Daru Jawie Kailahun Discharged 2014-06-14 Yes 2014-06-11 2014-06-11 2014-06-14 36.2 120 70 105 20 3 95 No No No No No No No No No Yes No No No No No No No Yes Yes No No No No No No No No Yes No No No No No No None No Yes No No No No No No No 2014-06-10 136.0 32.0 141.0 472.0 1.82 105.0 260.0 6.9 3.0 128.0 12 15.0 74 10.4 2014-06-11 211.0 27 122 400.0 1.98 108 188 8.6 3.3 129 9 23 75 8.4 2014-06-13 117 24 65 124.0 1.77 109 120.0 5.8 3.4 134 8 30 60 4.8 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 4.2746688236 4.2746688236 2.26851636001 3.47961128794 2014-06-10 4.2746688236 NA 3.89564868023 NA 2.26851636001 NA NA NA NA NA NA No Yes No No No No No No No No No No No No No No No No No No No No No No No Yes No No No No Yes No Yes No No No No Yes No No No No No No No Yes No Yes No No No Yes No No No NA Cluster 3 1 NA NA G3750 KM233060 Zaire ebolavirus NA Homo sapiens NA viral cRNA 14-Jun-2014 Ebola virus/H.sapiens-wt/SLE/2014/Makona-G3750.3 taxon:186538 Sierra Leone &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip &vellip 
In [18]:
size(bigdf)
Out[18]:
(92,226)
The annotations can now be written to a tab-separated file with a header row.
In [19]:
writetable("ebola-sle-2014.txt", df, separator = '\t', header = true)
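As a quick sanity check, a tab-separated file can be read back and its dimensions compared with what was written. The sketch below is only an illustration: it assumes a current Julia (1.x) with the `DelimitedFiles` standard library rather than the DataFrames version used in this notebook, and the three-column table, its column names and the temporary file are hypothetical stand-ins for `df` (the accessions and dates are taken from the output above).

```julia
using DelimitedFiles

# Hypothetical stand-in for a small annotation table: a header row plus two data rows.
header = ["accession" "country" "collection_date"]
rows = ["KM233045" "Sierra Leone" "04-Jun-2014";
        "KM233046" "Sierra Leone" "06-Jun-2014"]

# Write header and rows as tab-separated text to a temporary file.
path = tempname()
open(path, "w") do io
    writedlm(io, vcat(header, rows), '\t')
end

# Read the file back; with header=true, readdlm returns (data, header).
data, hdr = readdlm(path, '\t', String; header=true)
size(data)
```

Passing the tab explicitly as the delimiter matters here, because values such as "Sierra Leone" contain spaces.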
I make a dictionary that maps each accession number to its sequence...
In [20]:
# Pull the raw sequence string out of each INSDSeq record
seqstrings = [content(find_element(s, "INSDSeq_sequence")) for s in sequences];
# Map accession number => sequence
seqdict = Dict{ASCIIString,ASCIIString}()
for i in 1:numseq
    seqdict[accessions[i]] = seqstrings[i]
end
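The same mapping can be built in one step by zipping the two lists together. This is a minimal sketch assuming a current Julia (1.x), where `ASCIIString` no longer exists and plain `String` is used instead; the short sequences are made up for illustration and are not real genome data.

```julia
# Toy stand-ins: two accessions from the table above and made-up short sequences.
accessions = ["KM233044", "KM233045"]
seqstrings = ["acgt", "ttga"]   # hypothetical, not real genome data

# Pair each accession with its sequence in one step.
seqdict = Dict(zip(accessions, seqstrings))

# A Dict keeps only the last value stored for a repeated key,
# so comparing sizes detects duplicate accessions.
@assert length(seqdict) == length(accessions)

seqdict["KM233045"]
```

The length check is worth keeping whenever the keys come from an external file, since a repeated accession would otherwise overwrite an earlier sequence without any warning.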
...then I write the sequences for the accessions in bigdf out to a FASTA file.
In [21]:
# Write one FASTA record (header line, then sequence) per row of bigdf
f = open("ebola-sle-2014.fasta", "w")
for i in 1:size(bigdf, 1)
    acc = bigdf[:accession][i]
    @printf(f, ">%s\n%s\n", acc, seqdict[acc])
end
close(f)
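A slightly more defensive variant of the loop above wraps the file handle in a `do` block, so the file is closed even if an error occurs, and skips accessions that have no entry in the dictionary. This is a sketch under assumptions: `write_fasta` is a hypothetical helper name, the code targets a current Julia (1.x), and the sequences are toy values.

```julia
# Hypothetical helper: write one FASTA record per accession that has a sequence.
function write_fasta(path, accs, seqdict)
    open(path, "w") do io
        for acc in accs
            # Skip accessions without a sequence instead of raising a KeyError.
            haskey(seqdict, acc) || continue
            println(io, ">", acc)
            println(io, seqdict[acc])
        end
    end
end

seqdict = Dict("KM233044" => "acgt", "KM233045" => "ttga")  # toy sequences
path = tempname()
write_fasta(path, ["KM233044", "KM233045", "missing"], seqdict)
lines = readlines(path)
```

Whether to skip a missing accession silently or to stop with an error is a design choice; for a data release, failing loudly may be the safer default.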
Content source: molecular-epidemiology/molepi-data