MentaLiST detects and reconstructs putative novel alleles, while also calling non-present loci, allowing its use on wgMLST schemes. Below we will show how to include novel alleles in the analysis.
First, let's recap the basic MentaLiST workflow. If you are already familiar with MentaLiST, you can skip to the section 'Updating an MLST scheme with the detected novel alleles'.
In [1]:
# depending on how you installed mentalist, you might have to add it and julia to the PATH:
PATH=$PATH:/rhome/pfeijao/sfu/MentaLiST/src:/rhome/pfeijao/bin
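Before going further, it can help to confirm that the shell actually finds both programs after the PATH change. The snippet below is only a minimal sketch: check_tool is a hypothetical helper name introduced here (it is not part of MentaLiST), and it relies on nothing more than the standard `command -v` builtin.

```shell
# Hypothetical helper: report whether a program can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: NOT found on PATH"
    fi
}

check_tool mentalist
check_tool julia
```

If either line reports "NOT found", adjust the PATH entries above to match your own installation.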
In this example we will call MLST alleles on an M. tuberculosis WGS sample. Follow the 'Basic Usage' tutorial to download and create the MLST database and to obtain the FASTQ files for this sample. To run the MLST caller:
In [2]:
# Create the working folder under /tmp (if needed) and move into it:
mkdir -p /tmp/mentalist_quick_start
cd /tmp/mentalist_quick_start
Then, we run MentaLiST on the M. tuberculosis MLST scheme, setting the --kt parameter to 10 (the default is 6) to search more aggressively for novel alleles.
In [3]:
mentalist call -o sample.call --db mtb_cgmlst.db --kt 10 --output_votes --output_special -1 SRR6152708_1.fastq.gz -2 SRR6152708_2.fastq.gz
[ Info: Opening kmer database ...
[ Info: Finished the JLD load, building alleles list...
[ Info: Decompressing weight list...
[ Info: Building kmer index ...
[ Info: Sample: SRR6152708. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Writing output ...
[ Info: Done.
In [4]:
ls -l sample.call*
-rw-rw-r--. 1 pfeijao pfeijao 27602 Feb 14 10:48 sample.call
-rw-rw-r--. 1 pfeijao pfeijao 27643 Feb 14 10:48 sample.call.byvote
-rw-rw-r--. 1 pfeijao pfeijao 128763 Feb 14 10:48 sample.call.coverage.txt
-rw-rw-r--. 1 pfeijao pfeijao 89361 Feb 14 10:48 sample.call.novel.fa
-rw-rw-r--. 1 pfeijao pfeijao 5838 Feb 14 10:48 sample.call.novel.txt
-rw-rw-r--. 1 pfeijao pfeijao 205768 Feb 14 10:48 sample.call.special_cases.fa
-rw-rw-r--. 1 pfeijao pfeijao 719 Feb 14 10:48 sample.call.ties.txt
-rw-rw-r--. 1 pfeijao pfeijao 1209964 Feb 14 10:48 sample.call.votes.txt
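Among the files listed above, sample.call.novel.fa is a FASTA file. A quick way to see how many records it holds, and how long each sequence is, could look like the sketch below. The helper names count_fasta_records and fasta_lengths are made up for this example, and the code assumes ordinary FASTA layout (header lines start with '>', sequences may be wrapped over several lines).

```shell
# Hypothetical helper: count FASTA records (header lines starting with '>').
count_fasta_records() {
    # grep -c exits non-zero when there are no matches; "|| true" keeps that harmless.
    grep -c '^>' "$1" || true
}

# Hypothetical helper: print "name<TAB>length" for every record,
# summing the lengths of wrapped sequence lines.
fasta_lengths() {
    awk '/^>/ { if (name != "") print name "\t" len
                name = substr($1, 2); len = 0; next }
         { len += length($0) }
         END { if (name != "") print name "\t" len }' "$1"
}

# Only run on the real output if it is present in the current folder.
if [ -f sample.call.novel.fa ]; then
    echo "records in sample.call.novel.fa: $(count_fasta_records sample.call.novel.fa)"
    fasta_lengths sample.call.novel.fa | head -n 5
fi
```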
In [5]:
# The 'main' file is sample.call, with the allele calls. Let's show the first 12 columns (the sample name plus the first 11 loci):
cut -f1-12 sample.call | column -ts $'\t'
Sample      Rv0014c  Rv0015c  Rv0016c  Rv0017c  Rv0018c  Rv0019c  Rv0021c  Rv0022c  Rv0023  Rv0024  Rv0025
SRR6152708  5        2        1        1        2        1        1        1        1       N       1
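Because the call file is a wide, tab-separated table, it can be easier to inspect in "long" form, with one locus per line. The sketch below shows one way to do that with awk. The function names calls_long and non_numeric_calls are hypothetical, and the code assumes what the output above suggests: the first row is a header of locus names and the first column is the sample name.

```shell
# Hypothetical helper: turn the wide call table into
# "sample<TAB>locus<TAB>call" lines.
calls_long() {
    awk -F'\t' 'NR == 1 { for (i = 2; i <= NF; i++) locus[i] = $i; next }
                { for (i = 2; i <= NF; i++) print $1 "\t" locus[i] "\t" $i }' "$1"
}

# Hypothetical helper: keep only calls that are not a plain allele number
# (for example the N reported for Rv0024 above). Reads long-form lines on stdin.
non_numeric_calls() {
    awk -F'\t' '$3 !~ /^[0-9]+$/'
}

# Only run on the real output if it is present in the current folder.
if [ -f sample.call ]; then
    calls_long sample.call | non_numeric_calls | head
fi
```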
The file sample.call.coverage.txt has a description of the call for each locus. Several types of call are possible:
In [6]:
head -n12 sample.call.coverage.txt | grep Called
SRR6152708 Rv0014c 1.0 43 Called allele 5.
SRR6152708 Rv0015c 1.0 18 Called allele 2.
SRR6152708 Rv0016c 1.0 45 Called allele 1.
SRR6152708 Rv0017c 1.0 45 Called allele 1.
SRR6152708 Rv0018c 1.0 33 Called allele 2.
SRR6152708 Rv0019c 1.0 39 Called allele 1.
SRR6152708 Rv0021c 1.0 35 Called allele 1.
SRR6152708 Rv0022c 1.0 32 Called allele 1.
SRR6152708 Rv0023 1.0 57 Called allele 1.
SRR6152708 Rv0025 1.0 55 Called allele 1.
In [7]:
grep "Not present" sample.call.coverage.txt
SRR6152708 Rv1417 0.0 0 Not present; allele 58 is the best covered but below threshold with 188/435 missing kmers.
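The "188/435 missing kmers" part of a message like the one above can be turned into a fraction, which makes it easier to judge how far a locus is from being covered. This is only an illustrative sketch: missing_kmers is a hypothetical helper, and it assumes that the locus name is the second whitespace-separated field and that the message contains the text "X/Y missing kmers" exactly as printed above.

```shell
# Hypothetical helper: for every line that mentions "X/Y missing kmers",
# print "locus<TAB>missing<TAB>total<TAB>fraction".
missing_kmers() {
    awk 'match($0, "[0-9]+/[0-9]+ missing kmers") {
             # Split e.g. "188/435 missing kmers" into 188, 435, missing, kmers.
             split(substr($0, RSTART, RLENGTH), part, "[/ ]")
             if (part[2] > 0)
                 printf "%s\t%d\t%d\t%.3f\n", $2, part[1], part[2], part[1] / part[2]
         }' "$1"
}

# Only run on the real output if it is present in the current folder.
if [ -f sample.call.coverage.txt ]; then
    missing_kmers sample.call.coverage.txt
fi
```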
MentaLiST looks for a novel allele that has 100% coverage, using existing alleles as templates for building it. Some novel alleles among the first 100 calls:
In [8]:
head -n 100 sample.call.coverage.txt | grep Novel
SRR6152708 Rv0024 1.0 42 Novel, 1 mutation from allele 98: Del of len 1 at pos 719
SRR6152708 Rv0035 1.0 46 Novel, 2 mutations from allele 227: Subst C->G at pos 47, Subst A->T at pos 76
SRR6152708 Rv0045c 1.0 61 Novel, 4 mutations from allele 62: Subst C->T at pos 318, Del of len 2 at pos 650, Subst A->G at pos 652
SRR6152708 Rv0063 1.0 55 Novel, 1 mutation from allele 140: Ins of base G at pos 334
SRR6152708 Rv0101 1.0 50 Novel, 2 mutations from allele 1541: Subst A->G at pos 5360, Subst A->G at pos 6088
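The "Novel" lines above can be condensed into a small table of locus, number of mutations and template allele. The following is a sketch under stated assumptions: novel_summary is a hypothetical helper, and it relies on the message layout shown above ("Novel, N mutation(s) from allele A: ...") with whitespace-separated fields.

```shell
# Hypothetical helper: print "locus<TAB>mutations<TAB>template_allele"
# for every line whose description starts with "Novel,".
novel_summary() {
    awk '$5 == "Novel," {
             tmpl = $10            # e.g. "98:" in "... from allele 98: Del of ..."
             sub(/:$/, "", tmpl)   # drop the trailing colon
             print $2 "\t" $6 "\t" tmpl
         }' "$1"
}

# Only run on the real output if it is present in the current folder.
if [ -f sample.call.coverage.txt ]; then
    novel_summary sample.call.coverage.txt | head
fi
```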
In [9]:
grep Multiple sample.call.coverage.txt
SRR6152708 Rv0471c 1.0 43 Multiple possible alleles:1, 118 with depth 43, 39 and votes 0, -724. Most voted (1) is chosen on call file.
SRR6152708 Rv1318c 1.0 36 Multiple possible alleles:302, 1 with depth 36, 36 and votes 4427, 0. Most voted (302) is chosen on call file.
SRR6152708 Rv1319c 1.0 35 Multiple possible alleles:8, 3 with depth 35, 35 and votes 3930, 3402. Most voted (8) is chosen on call file.
SRR6152708 Rv1911c 1.0 26 Multiple possible alleles:1, 118 with depth 26, 26 and votes 0, -244. Most voted (1) is chosen on call file.
SRR6152708 Rv2319c 1.0 28 Multiple possible alleles:7, 1 with depth 28, 30 and votes 101, 0. Most voted (7) is chosen on call file.
In [10]:
grep Partially sample.call.coverage.txt
SRR6152708 Rv0275c 0.986 0 Partially covered allele or novel allele; Best allele 26 has 10/696 missing kmers, and no novel was found. Gaps on positions: (635, 665)
SRR6152708 Rv0581 0.973 0 Partially covered allele or novel allele; Best allele 1 has 5/186 missing kmers, and no novel was found. Gaps on positions: (115, 119)
SRR6152708 Rv0860 0.987 0 Partially covered allele or novel allele; Best allele 331 has 28/2133 missing kmers, and no novel was found. Gaps on positions: (2096, 2133)
SRR6152708 Rv1860 0.9526 0 Partially covered allele or novel allele; Best allele 152 has 45/948 missing kmers, and no novel was found. Gaps on positions: (855, 901)
SRR6152708 Rv1999c 0.9575 0 Partially covered allele or novel allele; Best allele 5 has 55/1293 missing kmers, and no novel was found. Gaps on positions: (1182, 1236)
SRR6152708 Rv2249c 0.998 0 Partially covered allele or novel allele; Best allele 1 has 3/1521 missing kmers, and no novel was found. Gaps on positions: (1, 3)
SRR6152708 Rv3017c 0.775 0 Partially covered allele or novel allele; Best allele 1 has 75/333 missing kmers, and no novel was found. Gaps on positions: (1, 75)
SRR6152708 Rv3394c 0.9614 0 Partially covered allele or novel allele; Best allele 85 has 60/1545 missing kmers, and no novel was found. Gaps on positions: (1462, 1541)
SRR6152708 Rv3795 0.9995 0 Partially covered allele or novel allele; Best allele 1 has 2/3267 missing kmers, and no novel was found. Gaps on positions: (28, 29)
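Instead of grepping for each call type separately, the types seen so far (Called, Not present, Novel, Multiple, Partially covered) can be tallied in one pass. This is a minimal sketch, not a MentaLiST feature: call_type_counts is a hypothetical helper, and it assumes that the description starts at the fifth whitespace-separated field, as in the lines shown above. Anything it does not recognize is counted as "other".

```shell
# Hypothetical helper: count how many loci fall into each call type,
# based on the first word of the description (field 5).
call_type_counts() {
    awk '{
             if      ($5 == "Called")    type = "called"
             else if ($5 == "Novel,")    type = "novel"
             else if ($5 == "Not")       type = "not_present"
             else if ($5 == "Multiple")  type = "multiple"
             else if ($5 == "Partially") type = "partial"
             else                        type = "other"
             count[type]++
         }
         END { for (type in count) print type "\t" count[type] }' "$1" | sort
}

# Only run on the real output if it is present in the current folder.
if [ -f sample.call.coverage.txt ]; then
    call_type_counts sample.call.coverage.txt
fi
```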
In [11]:
# novel alleles found:
cat sample.call.novel.fa
>Rv0024_N1 Seen in 1 sample(s).
GTGAATACAGCGAGGTCGAGCTGTTGAGTCGCGCTCATCAACTGTTCGCCGGAGACAGTCGGCGACCGGGGTTGGATGCGGGCACCACACCCTACGGGGATCTGCTGTCTCGGGCTGCCG
ACCTGAATGTGGGTGCGGGCCAGCGCCGGTATCAACTCGCCGTGGACCACAGCCGGGCGGCCTTGCTGTCTGCTGCGCGAACCGATGCCGCGGCCGGGGCCGTCATCACCGGCGCTCAAC
GGGATCGGGCATGGGCCCGGCGGTCGACCGGAACCGTTCTCGACGAGGCTCGCTCGGATACCACCGTTACTGCGGTTATGCCGATAGCCCAGCGCGAAGCCATACGCCGTCGTGTGGCGC
GGCTGCGCGCGCAACGAGCCCATGTGCTGACGGCGCGACGACGGGCACGACGGCACCTGGCGGCGCTGCGTGCGCTGCGGTACCGGGTGGCGCACGGCCCGGGGGTCGCGCTGGCCAAAC
TTCGGCTGCCGTCGCCGAGCGGTCGCGCCGGCATCGCGGTCCACGCCGCGCTGTCGCGACTTGGCCGTCCCTATGTCTGGGGCGCAACGGGGCCCAACCAGTTCGACTGTTCCGGTTTGG
TCCAGTGGGCCTACGCCCAGGCGGGTGTTCACCTGGATCGCACCACCTATCAACAGATCAACGAGGGGATCCCGGTGCCGCGCTCACAGGTCCGGCCGGGCGATCTGGTCTTCCCGCACC
CCGGGCACGTGCAGCTGGCGATCGGCAACAATCTGGTCGTCGAGGCGCCCCATGCGGGCGCGTCGGTTCGGGTCAGCTCGCTGGGCAACAACGTGCAGATTCGGCGACCGCTGAGTGGCA
GATAA
>Rv0035_N1 Seen in 1 sample(s).
ATGACGGCGGCCTTGCTTTCACCAGCCATCGCCTGGCAGCAGATCTGGGCTTGCACGGACCGCACGCTGACGATCTCTTGCGAGGATTCCGAGGTAATCAGCTATCAGGACCTCATCGCG
CGCGCGGCGGCATGCATCCCCCCGCTACGGCGTCTTGACCTCAAACGCGGTGAACCCGTGCTGATCACCGCCCACACCAACCTGGAATTCCTGTCCTGCTTTTTGGGCCTCATGCTCCAT
GGCGCTGTGCCGGTACCCATCCCGCCGCGGGAGGCACTGAAGACCACCGAGCGTTTCATGACTCGGCTCGGCCCACTGCTGCGCCATCACCGCGTGCTGATCTGCACACCGGCCGAACAC
GACGAGATACGCGCTGCCGCCAGCACCGACTGCCAGATCAGCAGATTTACTGCCCTAGCCGAGGCTGGCGACGAGCAGTTCGGCCGCGCCACGGCCCAGCAACTCGCCGACACCGCCACC
GCCGACTGGCCGCTATGCACCCTCGACGACGACGCCTACGTCCAATACACCTCTGGCAGCACCGCAGCACCACGCGGAGTGGTCATCACCTACCGCAACCTGCTGTCCAACATGCGCGCA
ATGGCCGTGGGCTCACAATTCCAGCACGGCGATGTCATGGGCAGCTGGCTGCCCTTGCACCATGACATGGGGCTGGTGGGCAGCCTATTCGCCGCACTCTTCAACAGTGTCAGCGCGGTA
TTCACCACGCCACACCGGTTTCTGTATGACCCGTTGGGATTCCTCAGACTGCTCACCAGCTCCGGGGCTACCCACACGTTCATGCCTAACTTCGCTCTGGAGTGGCTGATCAACGCCTAC
CACAGGCGCGGCGCCGACATCGAAGGCATCGACCTACACAAAATGCGCCGCTTGATCATCGCCTCCGAACCCGTCCATGCCGAGGGCATGCGGAGATTCGCCGCCACCTTCGCCGGCGTC
GGACTTGCCCCCACGGCCCTGGGTTCGGGCTATGGCCTGGCCGAAGCGACCGTCGCCGTGTCAATGTCAGCGCCCAACACGGGATTCCGCACCGAAACCCACGCCGCCGCGGAGGTCGTC
ACCGGCGGCCGAGTGCTGCCTGGCTACGAGGTGCGCATTGACGCCGCACCAGGTGCCCGGGCCGGAACGATCAAACTGCGCGGCGACAGCGTGGCCGCCAAAGCCTATGTGGGCGGGAAG
AAGCTGGACGCGCTCGACGAGGAAGGCTTCTGCGACACCCACGACTTGGGTTTTCTTGTAGACGACGAAATCGTCATCCTTGGCCGGCAGGACGAGGTGTTCATTGTCCACGGAGAAAAC
AGATTCCCCTACGACATCGAGTTCATCATTCGCGGGGAATCCGAGCAGCACCGGACCAAAGTCGCATGTTTCGGGGTCAACGAACGCGTCGTGGTTGTGTTGGAAAGCCCATTGGACAGC
ATCATCGACAAGGCCGAAGCCGACCGACTGAGATGTCAAGTCGTTGCCGCGACTGGGCTGCAGTTGGATGAACTGATCACGGTTCGGCGCGGCGCGATTCCCACCACCACCAGCGGCAAG
CTCAAACGACGCGCCGTCGCGCAGGCTTATCGAGACGGCACACTGCCCCGTCTTGCCACCCACGCGTGGACGGCGGATCCCGATAGCGCTCCCAAAACGACCCGGTCCAGCCTGGAAGGC
GCCCACTGA
>Rv0045c_N1 Seen in 1 sample(s).
TCAGCGTGTGTCGAGCACCCCGCGCACGATCTCGATCAGGGCGCGCGGTTGGTCACTTTGCACCGAGTGGCCTGACTTCTCGACGATGTGAACGCCACGGAAATGCGTTGCACGCCTGTG
GAGTTCGGCGGTGTCCTGGTCGGTGACGAAGCCCGACGAGCCGCCGCGCACGAGTGTGATCGGCGCGGACAGGGCGTCGACGTCGTCCCAGAGCCCTGCGAAATCTCCGAACGTGCGGAT
CGCGTCATAGCGCCACACCCAGTTGCCGTTGTCCAGCCGGCGGGAGTTGTGGAACACGCCGCGGCGCAACGACTTGATATCGCGGTGCGGGGCCGCGGCGATCGTTAGGTCCAGCATGGC
CTGAAAGCTGGGGAATTCCCGCTCGCCGTGCATCAGCGCCACCGTGCCGCGCTGCTCGGCGGTCAGCTCGGCGTGCCGTTGCAATGCCGACGGGGTGACGTCGACGAGAACGAGTTCGCC
GACCAGGTCGGGTGCCATCGCGGCCAGCCGTATCGCAGTCAACCCGCCCAGCGACATGCCGACCACGAATTCGGCACCCGGCGCAAGCTCGCGTAGCACCGGCGCCAAGGTCTCGGAGTT
GAGCTGCGGCGAGTAATTGCCGTCCTCCCGCCAAGCGGAATGGCCGTGCTGGAAGGTCCACCGCCAGCGCCGGCTCACCCAGGCCGACGATCACGGTGTCCCAGGTATGGGCGTTCTGTC
CGCCGCCGTGCAGAAAGATCACCCGCGGCGCAGAGCCGCCCCAGCGCAGCGCGCTGATGGCTCCCGCTTGGACCCGCTCGACTTCAGGCAGTGGACCATTGACACCGGCCTGCTCAGCGT
TCTCAGCCAGCAGGGCAAACTCGTCCAGTCCGGTCAGTTCGTCGTCAGATAGCAC
>Rv0063_N1 Seen in 1 sample(s).
TTGGCGCGTGAGATCTCACGCCAGACGTTTCTGCGGGGTGCCGCCGGAGCGTTGGCCGCCGGCGCGGTCTTCGGCTCGGTCCGGGCTACCGCGGATCCGGCTGCCTCTGGCTGGGAGGCT
CTTTCTTCCGCCCTCGGAGGGAAAGTGCTACAACCGGACGACGGTCCCCAATTCGCAACGGCCAAGCAGGTTTTCAACACCAACTACAACGGCTATACGCCGGCGGTGATCGTTACCCCG
ACATCGCAGCTGGACGTGCAGAAGGCGATGGCGTTCGCTGCCGCGAACAACCTCAAGGTGGCCCCACGCGGTGGCGGGCACTCCTACGTGGGGGCGTCCACGGCCAACGGCGCCATGGTG
CTCGACCTACGTCAGCTACCTGGGGACATCAACTACGACGCCACCACCGGGCGGGTCACGGTGACGCCCGCCACCGGTTTGTACGCCATGCACCAGGTGTTGGCCGCGGCCGGCCGGGGC
ATCCCGACCGGCACCTGCCCGACGGTCGGTGTCGCGGGACACGCGCTGGGCGGCGGGCTGGGCGCCAATTCCCGGCACGCCGGCCTGCTCTGTGACCAATTGACGTCGGCGTCGGTGGTG
CTGCCCAGCGGCCAGGCGGTCACCGCGTCCGCCACCGACCACCCCGACCTGTTCTGGGCGTTGCGCGGTGGCGGTGGCGGCAACTTCGGCGTGACAACCTCGCTGACCTTCGCGACGTTC
CCCAGCGGGGACCTCGACGTCGTGAACCTCAATTTCCCACCGCAGTCGTTCGCGCAGGTTCTGGTCGGTTGGCAGAATTGGCTGCGAACCGCCGACCGAGGCAGCTGGGCACTGGCCGAT
GCCACCGTCGACCCGCTGGGCACGCATTGCCGCATCCTTGCGACCTGCCCGGCCGGGTCGGGCGGCAGCGTGGCGGCCGCCATCGTTTCGGCCGTCGGAACGCAACCGACCGGCACCGAA
AACCACACGTTCAACTATCTGGACCTGGTCAGATATCTGGCCGTCGGGAACCTCAACCCGTCGCCGCTGGGATATGTCGGCGGATCCGATGTCTTCACGACGATCACTCCGGCGACCGCC
CAGGGAATCGCCTCGGCGGTCGACGCCTTTCCGCGTGGAGCGGGCCGCATGTTGGCGATCATGCACGCCCTCGACGGCGCGCTCGCCACTGTGTCACCGGGGGCCACGGCCTTCCCGTGG
CGTCGGCAGTCGGCGCTGGTGCAGTGGTACGTCGAAACATCCGGCTCCCCGTCGGAAGCGACTAGCTGGCTCAACACCGCACATCAAGCGGTGCGAGCGTATTCGGTTGGCGGCTATGTG
AACTATCTCGAGGTAAACCAACCGCCGGCACGTTACTTTGGCCCGAATCTGTCCCGGCTGAGCGCAGTACGTCAGAAGTATGACCCCAGCCGGGTCCATGTTCTCCGGGCTGAACTTCTA
G
>Rv0101_N1 Seen in 1 sample(s).
GTGCACCGAGTTCGGTTAAGCCGCTCGCAGCGCAACCTCTACAACGGCGTGCGCCAGGATAACAATCCCGCGTTATATCTGATCGGCAAGAGCTATCGGTTCCGCCGGTTGGAGCTGGCG
AGATTCCTGGCCGCTCTGCACGCAACGGTACTGGACAACCCCGTGCAACTTTGCGTCCTGGAGAATTCGGGGGCAGACTATCCGGATCTGGTGCCGCGGCTACGGTTCGGCGACATCGTG
CGGGTGGGGTCAGCCGATGAGCACCTGCAGAGCACATGGTGTTCGGGCATCCTGGGCAAGCCACTGGTGCGGCATACGGTGCACACCGACCCGAACGGGTATGTGACCGGTCTGGACGTT
CACACCCACCACATCCTGCTGGACGGCGGCGCGACCGGGACGATCGAAGCTGACCTGGCGCGTTACCTGACCACCGACCCGGCGGGCGAAACCCCCAGTGTCGGTGCGGGTCTAGCCAAG
CTCAGGGAGGCGCACCGTCGTGAGACGGCCAAGGTGGAAGAATCGCGGGGGCGCCTGTCGGCTGTCGTGCAGCGTGAACTCGCCGACGAAGCATACCACGGCGGGCACGGGCACAGCGTT
AGCGACGCTCCCGGGACCGCGGCCAAGGGCGTCCTGCACGAATCGGCAACGATCTGCGGCAACGCGTTTGATGCCATCCTGACCCTTTCGGAAGCGCAGCGGGTCCCGCTTAATGTGCTG
GTGGCTGCGGCGGCCGTCGCGGTGGACGCGAGCCTTCGGCAGAACACCGAAACCCTCTTGGTGCACACGGTGGACAACCGGTTCGGAGATTCTGATCTGAATGTCGCGACCTGTTTGGTC
AATTCGGTTGCCCAGACCGTCCGGTTTCCCCCATTTGCGTCGGTGTCCGATGTCGTTCGAACGCTTGACCGCGGCTATGTCAAGGCGGTAAGACGCCGGTGGCTTCGTGAGGAGCATTAC
CGCCGAATGTATTTGGCGATCAACCGGACATCTCACGTGGAGGCGTTGACGCTAAATTTCATTCGCGAGCCATGCGCACCTGGCCTGCGCCCGTTCTTGTCGGAGGTCCCGATTGCCACG
GATATCGGTCCGGTCGAGGGCATGACGGTGGCGTCTGTTCTGGACGAAGAACAGCGCACACTGAACCTAGCCATCTGGAACCGAGCCGATCTGCCCGCGTGCAAGACACACCCCAAGGTC
GCGGAACGGATAGCGGCAGCGTTGGAATCGATGGCGGCGATGTGGGATCGGCCGATCGCCATGATCGTCAACGACTGGTTCGGGATCGGCCCGGACGGGACTCGCTGCCAAGGCGATTGG
CCAGCCCGTCAGCCGTCGACGCCCGCGTGGTTTCTCGATTCCGCAAGGGGCGTCCACCAATTTCTCGGCAGGCGCCGCTTCGTCTACCCGTGGGTCGCGTGGTTGGTGCAACGCGGCGCC
GCACCGGGTGATGTTCTGGTGTTCACCGACGACGACACCGACAAGACCATTGACCTGCTCATCGCGTGTCACCTTGCGGGTTGCGGGTACAGCGTCTGCGACACCGCTGACGAAATTTCC
GTGCGGACCAATGCGATTACCGAGCACGGCGATGGCATCTTGGTGACAGTGGTCGACGTGGCCGCCACCCAGCTGGCGGTTGTCGGCCATGACGAGCTGCGGAAGGTCGTTGACGAGCGC
GTCACACAGGTGACACACGACGCACTGCTGGCCACCAAGACCGCCTACATCATGCCGACCTCGGGAACTACCGGACAACCCAAGCTGGTGCGAATCTCACACGGCTCGCTCGCGGTTTTC
TGTGATGCGATCAGCCGCGCCTACGGTTGGGGAGCCCACGACACCGTTCTGCAGTGCGCTCCGTTGACATCGGACATCAGCGTCGAGGAGATTTTCGGTGGCGCGGCCTGTGGCGCGCGA
CTGGTGCGATCCGCGGCTATGAAAACCGGCGACCTGGCGGCGCTGGTTGACGATCTCGTCGCCCGCGAGACGACAATCGTCGACCTGCCGACCGCCGTCTGGCAGCTGTTGTGCGCCGAC
GGCGACGCCATTGACGCGATCGGCCGCTCGCGCCTGCGGCAGATCGTAATCGGCGGTGAAGCCATCCGCTGTAGCGCCGTGGACAAGTGGCTTGAATCGGCTGCTTCACAAGGGATCTCG
CTGCTCTCGAGCTATGGTCCAACAGAAGCCACGGTCGTCGCCACCTTCTTGCCGATCGTTTGCGACCAGACCACCATGGACGGCGCACTGCTCAGGCTCGGCCGGCCGATCCTACCGAAC
ACGGTGTTCCTCGCGTTCGGTGAAGTCGTCATTGTCGGGGATTTAGTCGCCGACGGCTACCTCGGGATCGACGGCGACGGCTTCGGCACCGTGACGGCCGCAGACGGTTCCCGACGCCGT
GCCTTTGCCACTGGCGACCGGGTGACCGTCGACGCCGAAGGATTTCCGGTCTTCTCCGGACGCAAAGACGCCGTCGTCAAGATCTCCGGCAAGCGTGTCGATATCGCTGAGGTAACCAGG
CGCATCGCCGAAGACCCCGCGGTGTCAGATGTCGCCGTCGAGTTGCACAGCGGAAGCCTCGGAGTGTGGTTCAAGAGCCAACGGACCCGCGAGGGCGAACAAGACGCTGCCGCGGCGACC
CGGATCAGGCTCGTCCTCGTGAGTCTGGGAGTGTCGTCGTTTTTCGTTGTCGGCGTGCCGAATATCCCGAGGAAGCCCAACGGGAAGATCGACAGCGACAACCTGCCGAGGCTGCCTCAG
TGGTCAGCTGCTGGGCTAAACACCGCCGAGACGGGTCAGCGAGCGGCCGGCCTCTCGCAGATCTGGAGCCGGCAGCTCGGCCGGGCAATCGGGCCGGACTCGTCGCTGCTTGGTGAGGGC
ATCGGCTCGTTGGATCTCATCAGAATACTGCCCGAGACGCGTAGGTATCTGGGGTGGCGCCTCTCGCTGCTGGATCTGATCGGTGCCGATACCGCCGCCAATCTGGCCGATTACGCGCCA
ACGCCCGACGCGCCGACGGGCGAAGATCGGTTTAGGCCGCTGGTGGCCGCGCAACGGCCCGCGGCGATTCCGTTGTCGTTTGCCCAGCGGCGACTATGGTTTCTCGACCAGTTACAGCGA
CCCGCTCCGGTCTACAACATGGCGGTGGCGTTGCGGCTGCGCGGGTATCTCGATACCGAGGCGTTGGGCGCGGCGGTCGCCGATGTCGTGGGCCGCCACGAAAGCCTACGGACGGTGTTT
CCGGCGGTCGACGGGGTCCCTCGGCAGCTGGTCATCGAAGCGCGGCGGGCAGATCTTGGCTGCGACATCGTCGATGCCACCGCATGGCCGGCTGACCGGCTGCAACGGGCCATCGAGGAG
GCGGCGCGCCACAGCTTCGATTTGGCAACCGAGATACCTTTGCGGACGTGGCTTTTCCGGATCGCCGACGACGAACATGTGCTGGTGGCGGTTGCACACCATATCGCCGCCGACGGCTGG
TCGGTGGCTCCGCTGACGGCCGATCTGAGTGCGGCATATGCCAGCCGTTGTGCGGGTCGGGCACCGGACTGGGCGCCATTGCCAGTGCAGTATGTCGATTACACGCTGTGGCAGCGGGAA
ATCCTCGGTGATCTCGACGACAGCGACAGCCCGATCGCCGCGCAGCTGGCCTACTGGGAAAATGCGTTGGCCGGTATGCCGGAACGGCTGCGGCTGCCCACCGCTCGGCCCTATCCACCG
GTTGCCGATCAGCGCGGCGCCAGTTTGGTGGTGGATTGGCCGGCGTCGGTGCAACAGCAGGTGCGTCGGATCGCCCGCCAGCACAACGCGACCAGCTTCATGGTGGTAGCTGCCGGGCTT
GCCGTGCTGCTGTCGAAACTCAGCGGAAGCCCCGATGTGGCGGTCGGATTTCCCATCGCCGGCCGCAGCGATCCTGCGCTGGATAACTTGGTGGGCTTTTTTGTCAACACCTTGGTGTTG
CGGGTCAACCTGGCCGGTGATCCCAGCTTCGCCGAACTGCTGGGGCAGGTGCGAGCGCGCAGCCTGGCCGCCTACGAAAATCAAGACGTACCTTTCGAGGTGCTCGTTGATCGCCTCAAA
CCCACTCGAGCCCTGACCCATCACCCGCTGATCCAGGTGATGTTGGCCTGGCAGGACAATCCGGTTGGACAGCTGAATTTGGGTGATCTGCAGGCCACCCCGATGCCGATCGACACCCGC
ACCGCCCGCATGGACTTGGTGTTTTCGTTAGCGGAACGCTTCAGCGAGGGTAGCGAACCTGCCGGGATCGGCGGAGCGGTGGAATACCGCACCGATGTGTTTGAAGCCCAAGCAATCGAC
GTGCTTATCGAGCGGTTGCGGAAGGTGTTGGTGGCGGTGGCCGCTGCTCCGGAACGGACGGTGTCGTCGATCGATGCGCTGGATGGGACCGAGCGTGCCCGGTTGGATGAGTGGGGTAAC
CGCGCTGTGCTGACTGCGCCCGCGCCCACGCCGGTGTCGATCCCGCAGATGTTGGCCGCCCAGGTGGCACGTATCCCCGAAGCGGAGGCGGTGTGTTGCGGGGACGCGTCGATGACGTAT
CGGGAACTCGACGAGGCGTCCAACCGGTTAGCGCATCGGCTGGCAGGTTGTGGGGCCGGCCCGGGCGAGTGTGTGGCGCTGCTGTTCGAGCGGTGCGCGCCGGCGGTCGTGGCGATGGTG
GCAGTGCTCAAAACCGGGGCGGCGTATCTGCCGATCGATCCGGCGAATCCTCCGCCGCGGGTGGCGTTCATGCTCGGCGACGCGGTGCCCGTGGCCGCGGTCACCACGGCTGGGCTGCGC
TCCCGGTTGGCGGGACACGACTTGCCGATCATCGATGTCGTCGATGCTTTAGCGGCATATCCGGGCACGCCCCCACCCATGCCGGCCGCAGTGAACCTCGCCTACATCCTGTACACCTCG
GGCACTACCGGCGAGCCCAAAGGCGTGGGGATCACCCATCGCAACGTCACCAGGCTGTTCGCATCACTGCCGGCACGCTTGTCGGCGGCGCAGGTGTGGTCGCAGTGTCATTCCTATGGC
TTCGACGCCTCGGCGTGGGAGATCTGGGGCGCGTTGCTAGGTGGTGGGCGACTGGTGATCGTGCCCGAGTCGGTGGCGGCCTCGCCGAACGACTTTCATGGGCTGCTCGTGGCCGAACAC
GTCAGCGTGCTGACTCAGACTCCGGCTGCGGTGGCAATGTTGCCGACGCAGGGTTTGGAGTCGGTGGCGTTGGTGGTGGCCGGTGAGGCATGTCCGGCAGCGCTGGTGGATCGGTGGGCG
CCCGGGCGGGTGATGCTAAATGCTTATGGCCCAACCGAGACCACGATCTGTGCGGCGATAAGTGCGCCGTTGCGACCGGGTTCGGGGATGCCGCCGATTGGTGTTCCGGTGTCGGGGGCG
GCGTTGTTTGTGCTGGATAGCTGGTTGCGCCCGGTACCGGCCGGGGTGGCCGGAGAGTTGTACATTGCCGGTGCGGGCGTCGGTGTTGGGTATTGGCGTCGGGCGGGGCTGACCGCGTCA
CGGTTTGTGGCCTGCCCATTCGGCGGTTCCGGGGCACGCATGTATCGCACCGGGGATCTGGTGTGTTGGCGCGCCGATGGCCAGTTGGAGTTCCTGGGGCGCACCGACGATCAGGTCAAG
ATCCGCGGGTATCGCATCGAGCTCGGCGAGGTTGCGACCGCGCTGGCCGAGCTGGCTGGGGTAGGTCAAGCGGTTGTAATCGCCCGTGAAGACCGCCCTGGGGACAAGCGCCTAGTCGGG
TATGCCACCGAAATTGCCCCCGGGGCAGTGGACCCGGCCGGGCTGCGGGCGCAACTAGCCCAGCGATTGCCCGGTTACCTGGTGCCAGCCGCGGTGGTAGTGATCGATGCGCTTCCGTTG
ACGGTCAACGGCAAACTTGATCATCGTGCGTTGCCGGCACCGGAATACGGTGATACCAACGGATATCGCGCTCCGGCCGGGCCGGTTGAGAAGACCGTGGCCGGCATCTTTGCCCGGGTG
CTTGGGCTTGAGCGGGTCGGCGTCGACGACTCGTTCTTCGAGCTCGGCGGCGATTCGCTGGCGGCAATGCGGGTTATCGCCGCGATCGACACCACCCTAAACGCCGATCTGCCGGTGCGC
GCGTTGCTGCACGCGTCGTCGACGAGAGGTTTAAGCCAGCTGTTGGGGCGAGATGCCCGACCGACCAGCGATCCGCGCTTGGTGTCTGTGCACGGCGACAACCCCACCGAGGTGCATGCC
AGCGACCTCACGCTGGACCGGTTCATCGACGCCGACACGCTGGCCACCGCCGTCAACCTGCCGGGCCCGAGCCCCGAGCTACGGACGGTCCTGCTGACGGGCGCGACGGGTTTCCTCGGA
CGGTATCTGGTCCTTGAATTGCTGCGGCGGCTGGACGTCGACGGCAGGCTGATCTGTTTGGTGCGGGCGGAGTCCGACGAGGATGCGCGGCGTCGTCTGGAGAAGACCTTCGATAGCGGT
GACCCGGAATTGCTGCGGCACTTCAAGGAGCTTGCCGCCGACCGGCTGGAGGTCGTCGCAGGCGACAAGAGCGAACCCGACCTGGGCCTGGACCAACCGATGTGGCGGCGGCTGGCCGAA
ACCGTGGATTTGATTGTCGATTCCGCGGCGATGGTCAACGCGTTTCCCTACCACGAATTGTTCGGGCCCAACGTCGCGGGCACCGCCGAGCTGATCCGAATCGCGCTTACCACCAAGCTC
AAACCCTTCACCTACGTGTCAACCGCCGACGTGGGTGCTGCGATCGAGCCGTCGGCGTTCACCGAGGACGCCGACATCCGGGTAATCAGCCCCACCCGCACCGTCGACGGCGGCTGGGCT
GGCGGCTACGGCACCAGCAAGTGGGCCGGTGAGGTGCTGCTGCGCGAGGCCAACGACCTGTGCGCGCTGCCGGTCGCGGTGTTTCGCTGCGGGATGATCCTGGCCGACACCAGCTATGCC
GGACAGCTCAACATGTCGGACTGGGTCACCCGGATGGTGTTGAGCTTGATGGCTACCGGCATCGCGCCTCGTTCGTTCTACGAACCGGACTCCGAGGGCAATCGGCAACGCGCGCACTTC
GACGGGCTGCCAGTCACCTTCGTTGCCGAGGCGATCGCGGTGCTGGGCGCGCGGGTGGCCGGCTCATCGTTGGCGGGATTTGCGACCTATCACGTGATGAACCCGCACGACGACGGTATC
GGGCTCGATGAGTATGTGGACTGGCTGATTGAGGCCGGCTACCCGATACGCCGCATCGATGACTTTGCGGAGTGGTTGCAGCGGTTTGAGGCCAGCCTGGGCGCTCTGCCGGATCGGCAA
CGCCGGCACTCGGTGCTGCCGATGCTGCTGGCGAGCAATTCCCAGCGATTGCAGCCGCTTAAGCCGACCAGGGGGTGCTCCGCGCCGACCGACCGATTCCGTGCCGCGGTGCGAGCGGCG
AAAGTCGGCTCCGACAAGGACAATCCAGACATCCCGCACGTGTCGGCGCCGACCATCATCAACTACGTCACCAACCTACAACTGCTCGGACTGCTGTAG
>Rv0134_N1 Seen in 1 sample(s).
ATGATCGCTCTGCCCGCCTTGGAAGGTGTCGAACATCGGCACGTGGATGTGGCGGAAGGCGTCAGGATCCACGTTGCGGACGCCGGGCCGGCCGATGGTCCGGCGGTAATGCTGGTGCAC
GGCTTCCCGCAGAACTGGTGGGAGTGGCGCGACCTCATCGGCCCGCTGGCCGCCGACGGCAACCGGGTGCTGTGTCCCGACCTGCGCGGCGCGGGCTGGAGTTCGGCGCCCCGCTCGCGG
TATACCAAGACCGAGATGGCTGACGATCTGGCTGCGGTTTTGGACGGCCTGGGTGTGGCCAAGGTCAAGCTGGTGGCCCACGATTGGGGTGGGCCGGTCGCGTTCATCATGATGTTGCGC
CATCCCGAGAAGGTGACCGGGTTTTCGGCGTGAACACCGTGGCACCCTGGGTGAAGCGCGATCTTGGCATGCTCCGCAATATGTGGCGGTTCTGGTATCAGATCCCCATGTCGCTGCCGG
TGATCGGCCCGCGGGTGATCAGCGATCCTAAGGGCCGCTACTTCCGGCTGTTGACCGGGTGGGTCGGGGGCGGATTTCGGGTTCCCGATGACGACGTGCGCCTGTACTTGGACTGCATGC
GCGAGCCGGGGCACGCCGAGGCCGGATCGCGGTGGTATCGCACCTTTCAGACCAGGGAAATGCTGCGCTGGCTGCGCGGCGAGTACAACGACGCTCGGGTCGATGTCCCGGTCCGATGGC
TGCACGGCACCGGAGATCCGGTGATCACGCCCGACCTGCTGGACGGCTATGCCGAGCGGGCCAGCGATTTCGAGGTGGAGCTGGTCGACGGCGTGGGCCATTGGATCGTCGAGCAGCGAC
CCGAGCTGGTGCTCGACCGGGTGCGTGCGTTCCTAGCTGCGGGGACCGAGCAGCGCGATTGA
>Rv0165c_N1 Seen in 1 sample(s).
TCAGCCAGGGCCTCCGTCAGCCTGCGTGCCCCATCGGTGAACTGCCAGACGGTGTGCTCGATTACGGCGGCTGTGTCGCGGCGGCGCAGCGCGGCGATCAGCTGCCGATGACTGTTCACC
GCGTCCGCGCCCCATCGCGGGTCGGCCGCGAACACCTGCGCCGGCATATAGCGCGCGGCATTAAGCAGGAACCAGGCCAACTTGATCCGGCGGCTCGCTTTGTTGAAGACGCGGTGGAAC
GCGAACTCGATCGACGCGATGGTTTTGGCATCACCGGACCCGATAGCACCGGCCAGCGCATTGTTGATGCGGTCCAGCTCGTCGATCTCAACGTCGGTGATGTGAGCGGTGGCCGATGTG
GCAAGTTCTTGGGCAATGGTGGCCTGCAGCCAGAAAATGTCGTCGATGTCTTGGCGGGTCAACGGCAGCACCACGTGGCCGCGATGTGGCTCCAGCCCGACCATCCCCTCACCGCGCAGT
TTCAGCAGCGCCTCCCGCACCGGCGTGACGCTGACTCCGAGCTCGGCTGCCGTCTCGTCCAGACGGATGAACGTTCCAGAGCGCAGGGCGCCCGACATGATGGCGGCCCGCAGGTGGCCC
GCGACCTCGTCGGACAACTGTGCCCGGCGCAGGGGAAGCTGGCTCCGCGGCTTCGCCGATAGAGGTGCGTTCAC
>Rv0195_N1 Seen in 1 sample(s).
ATGGCACCGGTGAATGTCATTTCGGTCGCGGTGGTGGCGAGCGACCCGTTGACCCGCGATGGAGCTTTGGCCCGACTCTCGTCTCACCGGGAGCTCGACGTGCGCGCTTGGCAGGCTGGA
TGCGAAACCTCGGTCCTGCTCGTGCTGGCCACCACGATCACCGCGCCTCTTCTATGCCAGATCGAGGACGCGCAGAAGGATGGCCCCAGTCACGCGCCGAAACTGGTCGTCGTCGCCGAC
GAATTCTCCGCTGAACAAGTTTTCCGGATGATCAAGCTGGGGTTGACCGGGTTGTTGTATCGCAGCCAGAGCACGTTCGACTGCATCGTCGAGACAATCCGGTTGTCCGCCGAAGGCCGC
CTGCGACTCCCCGAACGTGTCCAGCGTTACCTGGTCGGCCGCATCAAGTCCACCCCGACCGCCGAACCTGACACACCGTGCGCCGCCGCTCTTGCCGAGCGTGAGGTGGCGGTGCTGCGT
CTGCTAGCGGACGGCTTGAGCACGCACCAAGTGGCGGTGCAGCTCAACTATTGCGAGCGCACGATCAAGAACATCGTTCATGACATAGTGACGCGGCTGAAGCTCCGCAACCGCACGCAT
GCCGTCGCACATGCGCTGCGCGCGGGCCTCATTTGA
>Rv0226c_N1 Seen in 1 sample(s).
TTAATCCTGGGCGCGGCTCGCCGGCGTGTCTTCGCCGTGGTGTAAGTGTCGGCGCACCCAATAGCCGGCCGCGCCAGCGCCGCCGACCAGCAGCATCGAAAGCCACGCCCAATGCGCGAG
CATTGTCGCTTTGAGGCGGGCCGACGATGCACCGGAGGTTTGGCCGCCGACCCGATAAAGAGCCAATTCGTCGTCGCGGTGCGCCGCTGCTAGCCGGCCGAGGGTGCGTGCGGCCGCGCC
CATGTCGCCGGCGCTGTCGGATTCGACGACCAGCCACCCGACGCCGGCCGCGGCCAAGGTTGACGGATGGGGCCCGGTGAGCAGCAGCTCCTGGACCGCCCGGGCGTGCGCGTCTTCGCC
GGGAACGGTCACCCCGGAAATGACCAGATCACCTGTGGTCAGCACATCGGCGCGAACCCAACGGGGGAGCGGATCGAGTACCGGTGCCGAACCGGACCACGAGAAGCGCCGCATGGTGCC
CGCGGGCAAGACCGCAACCGTCCGGGGATCGGCATTGATCGCCGCTGCCACCGCCGCCCAACCGGACGGGTAGTGCACAGGCGCAACCTTGCCCCACACCCCCCACGCCAAGTCAGCCAG
CGTTAGGACCAGCGCCAGACAGCAGACCACCGCCGCCGTTGCCGGTCGCAGCCAGCGTCGCAGCGTTAGCACCGTGCCCGCACCGGAGAGTGTGTATCCGGGTACCGCCAGCGCGACCCA
CTTCTGTCCGTCGCGCAGCACGCCCAGGCCGGGTGCGGCATCGACCACCACCCGTAGCGCGTGCAGACCTGGGCCGGTCGCAAGGACAGCCGGGACCATCACGGACACCGCCGCTAGTGT
CAGCAGCGGCACTGCCACGGGCCGGCGCGCCACAGTCGGTAGTCCGATCGCCACCATGGCGAGTAGTACGACGGCGGATGCCACTGCGAAAAGCGTTGTCCGCGAGCTAGGTACGGCCTC
GCCGTTCCAGATCCCACCGAGACTGGCCAAGCTGCCAAGCGTGCCCAGCCCCGGTTCGGCGCGTGGCGCGAACGCGGTAACCCCAAGCTGATTGGCTGCCGTGTGGCTGGTCAACGACGA
GCCCAGCGCCGACGCCGTCAGCCAGGGCAGCGCACCCACCAGCGCGGAGCCCAACGCCGCGACCCCACATTGCCAGCGCGGGCGGCCCGCGCCGGGCATCGCCACGCACACCACCGCAAC
TGTCGCGGCGAGTAGCAGCCCGGACGGGGTCAGGCCGGCCAGCGCAACCCAGAACGCCAGCCCAAAAAGCCCGAACCAACCCGCGCCAACCGTTGTTCGCATCGTTAACATCGCGGTCGC
AACCCAGGGCAGACACCCATAGCCGACCAGCAGGCTCCAATGGCCCTGCAAAAGTCGTTCGGCCACATAGGGATTCCAGATCGCCAGCGTGATCGCGACAAACTGGCCGGCTGCCCCCGC
TGCGGGTAGTGCCGTTGCGACCAGTCGGGCCGCGCCCCAGCCCGCCAGCCAAAGCCCCAGCAGCAGCAGCGCTTTCACCACGACGCCGCCGTCGACGAGGTGTGACGCCAAAGCGACCGC
GAAGTCCTGCGGAGTCGCCCGGGGCGCCGATGTCAGCCCTAGGGCGTTGGCCGACACATACGACCGTGGTGTGGACACTGCATCGCGCAGCAGTAGGTATCCGGGCCGCAGTAGCGGCGC
GGCCAACAGCAGCACCAAGACCAGCGCGTACCCCGGTCGGAACCAGCGCAC
>Rv0276_N1 Seen in 1 sample(s).
ATGGCGATTTCGCTGGTGGCTCACCAGCCCATCCCCCACGTCGAGCGTCGCATGGCCGACCCACCCCGTCTCCAGCTGGCCAGGCGCCGGCGATCGGCGGCCGGCCCCGGCGGTAACGAG
GACAGCTTGATGGGAGTGGCGCTGCTAGCCGGCCCGGCCAACGTGATCATGGAGTTGGCGATGCCGGGTGTCGGCTACGGCGTGTTGGAGAGCCGTGTCGAAAGCGGCCGGCTGGACCGC
CATCCGATCAAGCGGGCGCGCACCACCTTTACCTACGTTGCGGTGGCCGTTGCCGGCAGCGACGACCAGAAGGCGGCCTTTCGTCGCGCGGTGAATAAGGTTCACGCGCAGGTGTATTCG
ACTCCGGAGAGCCCGGTGTCCTACCACGCGTTCGATCCCGAACTACAGCTGTGGGTGGCGGCATGCCTCTATAAGGGCGGCGTCGACGTCTACCGCACCTTCGTCGGCGAGATGGACGAC
GAAGAGGCCGACCATCATTACCGCGCGGGCATGGCGATGGGCACCACGTTGCAGGTGCCGCCGCAGATGTGGCCACCGGATCGGGCGGCCTTCGACCGCTACTGGCGGCAATCACTGGAC
AGGGTGCACATCGATGACGTCGTTCGCGACTACCTGTATCCGATCGTGGCGCTCCGAATTCGCGGGATCGCACTGCCGGGTCCGCTGCGGCGGCTGTCGGAGGGTATCGCGCTGCTGATC
ACCACCGGTTTCCTGCCGCAGCGGTTTCGCGACGAGATGCGGTTGCCGTGGGACGCGACCAAGCAGCGGCGCTTTGACGCGCTCATGGCCGTGCTGCGCACGGTGAATCGCCTGATGCCG
CGGTTTGTCCGGGAGTTCCCGTTCAACCTGATGCTCTGGGACCTGGACCGGCGGATGAGGCGCGGGCGCCCGCTGGTGTAA
>Rv0290_N1 Seen in 1 sample(s).
ATGTCCGGCACCGTCATGCAGATCGTCCGCGTCGCCATTCTTGCGGACAGCAGGTTGACCGAGATGGCCCTGCCCGCGGAGTTGCCACTGCGCGAAATCCTGCCCGCGGTACAACGCTTG
GTGGTTCCCTCGGCGCAAAACGGCGATGGTGGCCAAGCCGACTCCGGCGCTGCCGTGCAACTGAGTTTGGCGCCCGTCGGCGGGCAGCCGTTTAGCTTGGATGCCAACCTGGACACCGTC
GGTGTCGTCGACGGTGATCTGTTGGTGTTGCAGCCGGTGCCCACCGGTCCGGCCGCGCCGGGCATCGTCGAAGACATCGCCGACGCCGCGATGATCTTTTCGACGTCGCGGTTAAAGCCC
TGGGGCATAGCGCATATCCAACGAGGAGCGCTGGCCGCGGTGATTGCCGTGGCTCTGCTGGCTACCGGTTTGACGGTGACCTATCGGGTTGCCACCGGTGTGCTGGCCGGGCTGCTGGCG
GTGGCCGGGATCGCGGTGGCTAGCGCGCTGGCCGGATTGTTGATCACCATCCGTTCGCCACGTTCGGGTATCGCGCTGTCGATCGCCGCGCTGGTCCCCATCGGCGCGGCCCTGGCGTTG
GCGGTGCCAGGAAAGTTCGGGCCGGCGCAGGTATTGCTGGGTGCAGCTGGGGTAGCCGCATGGTCGCTGATCGCGCTGATGATTCCCAGCACCGAACGGGAACGCGTCGTCGCCTTCTTC
ACCGCAGCGGCGGTGGTCGGGGCGTCGGTGGCGCTGGCGGCCGGTGCGCAATTGCTGTGGCAGCTGCCGTTGTTGAGCATCGGCTGCGGGCTGATTGTGGCGGCGCTGTTGGTCACCATC
CAGGCGGCTCAGCTTTCCGCACTGTGGGCGCGGTTCCCGTTGCCGGTGATCCCGGCGCCGGGGGATCCCACCCCGTCGGCCCCGCCGTTGCGCCTGCTGGAGGATTTGCCTCGGCGGGTG
CGGGTCAGTGACGCCCATCAAAGCGGCTTCATCGCCGCGGCCGTGCTGCTCAGCGTGTTGGGGTCGGTGGCCATCGCGGTGCGCCCAGAGGCGCTCAGCGTTGTGGGCTGGTATCTGGTG
GCGGCGACTGCGGCCGCGGCCACCCTGCGCGCGCGGGTGTGGGATTCGGCCGCATGCAAGGCGTGGCTGCTGGCTCAGCCCTATCTGGTAGCCGGGGTCCTGTTGGTGTTCTACACCGCG
ACCGGACGCTATGTCGCCGCGTTCGGCGCGGTGCTGGTGCTAGCCGTGCTCATGCTGGCCTGGGTTGTGGTGGCACTGAACCCGGGCATCGCTTCGCCGGAGAGCTACTCGCTGCCGCTG
CGCCGGCTGCTGGGTTTGGTCGCCGCCGGGCTGGATGTTTCGCTGATCCCCGTCATGGCCTACCTGGTCGGATTGTTCGCTTGGGTGCTCAACAGATGA
>Rv0551c_N1 Seen in 1 sample(s).
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCAC
CGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTT
CTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTC
GACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGG
TCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGA
TCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGAGGCG
CCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATT
CTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGG
TTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACT
TGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGC
TGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAG
CCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGC
GCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGA
TGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGT
GGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT
>Rv0654_N1 Seen in 1 sample(s).
ATGACCACCGCACAAGCCGCCGAATCCCAAAACCCATATCTCGAGGGCTTCCTGGCGCCGGTGAGCACCGAGGTAACTGCCACCGACCTGCCGGTCACCGGCCGCATTCCGGAACACCTC
GACGGGCGTTATCTGCGTAACGGCCCCAACCCGGTCGCGGAGGTCGACCCGGCCACCTACCACTGGTTCACCGGCGACGCCATGGTGCACGGAGTCGCGCTGCGCGACGGGAAGGCCCGC
TGGTATCGCAATCGCTGGGTCCGCACACCCGCGGTGTGCGCCGCCCTGGGCGAGCCCATTTCGGCCCGGCCTCACCCGCGCACCGGGATTATCGAGGGCGGTCCCAACACCAACGTGCTG
ACCCACGCCGGACGCACCCTGGCCTTGGTTGAGGCCGGCGTGGTCAACTACGAACTCACCGATGAGCTGGACACCGTGGGACCCTGTGACTTCGACGGCACCCTGCACGGCGGTTACACC
GCCCATCCGCAGCGTGATCCGCACACGGGTGAACTGCACGCGGTGTCCTACTCGTTCGCCCGCGGACACAGAGTGCAGTACTCGGTGATCGGCACCGACGGACACGCTCGTCGGACGGTT
GATATCGAGGTGGCGGGATCGCCGATGATGCACAGCTTCTCCCTGACCGACAACTACGTGGTGATCTACGACCTGCCGGTGACCTTCGACCCAATGCAGGTGGTGCCGGCGTCCGTGCCA
CGCTGGCTGCAACGGCCCGCCAGGTTGGTGATCCAGTCGGTCCTGGGCCGTGTCCGCATCCCCGACCCGATAGCGGCGTTGGGCAACCGGATGCAGGGTCACTCCGATCGCCTCCCGTAC
GCCTGGAACCCCAGCTACCCGGCGCGCGTCGGTGTCATGCCGCGCGAGGGTGGCAACGAGGACGTGCGGTGGTTCGACATCGAACCCTGCTACGTATACCACCCACTTAACGCCTACTCG
GAGTGCCGGAACGGCGCTGAGGTGCTGGTGTTGGACGTGGTGCGCTACTCACGGATGTTTGATCGCGACCGGCGGGGTCCCGGCGGTGACAGCCGGCCCTCGCTGGATCGCTGGACCATC
AACCTGGCGACCGGTGCGGTGACCGCCGAATGCCGCGACGATCGGGCGCAGGAGTTTCCCCGCATCAACGAGACTCTGGTGGGTGGGCCGCATCGCTTCGCCTACACCGTCGGCATCGAG
GGTGGGTTTCTCGTCGGCGCCGGCGCTGCGTTGTCGACTCCGCTGTATAAACAGGACTGCGTGACCGGGTCCAGCACGGTCGCCTCGCTCGATCCCGACCTGTTGATCGGCGAGATGGTG
TTCGTGCCGAACCCGTCGGCGCGTGCAGAAGATGACGGGATTCTCATGGGCTACGGCTGGCACCGCGGCCGCGACGAAGGCCAGCTGCTCTTGCTGGATGCCCAGACTCTCGAGTCGATC
GCCACCGTGCACCTGCCACAGCGTGTGCCGATGGGCTTCCACGGCAACTGGGCGCCGACCACCTGA
>Rv0739_N1 Seen in 1 sample(s).
TTGGTGCTGACGCGGCGCGCGCGCGTGAAGTGGCGCTGACACAGCACATTGGGGTATCCGCGGAGACCGATCGGGCCGTCGTCCCCAAGCTGCGCCAGGCCTATGACAGCCTGGTGTGCG
GTCGCCGCCGGCTTGGCGCCATTGGAGCCGAGATCGAGAACGCGGTGGCCCATCAGCGCGCGCTGGGCCTTGACACCCCGGCCGGTGCCCGTAACTTCTCCCGGTTTCTCGCCACCAAAG
CACACGACATCACGCGAGTGCTGGCAGCAACCGCCGCGGAATCCCAGGCCGGCGCGGCGCGGTTGCGATCCCTGGCTTCGTCCTATCAGGCTGTGGGATTTGGCCCCAAACCCCAGGAGC
CGCCTCCGGATCCAGTGCCATTTCCGCCCTACCAGCCGAAGGTGTGGGCGGCGTGCCGGGCGCGTGGCCAAGACCCGGACAAGGTCGTCAGGACGTTCCATCACGCGCCGATGAGCGCGA
GATTCCGCTCGCTACCGGCCGGAGACTCCGTGTTGTACTGCGGCAATGACAAGTACGGGCTGCTGCACATTCAGGCCAAGCATGGACGCCAATGGCACGATATTGCGGATGCACGATGGC
CGAGTGCAGGCAATTGGCGCTATCTCGCCGATTACGCAATCGGTGCCACACTGGCCTACCCGGAGCGAGTGGAGTACAACCAAGACAACGACACGTTCGCCGTATACCGGAGAATGTCGT
TGCCAGACGGCAGATACGTTTTCACAACCCGCGTCATTATTTCGGCACGCGACGGGAAGATCATTACGGCCTTCCCGCAGACGACGTGA
>Rv0757_N1 Seen in 1 sample(s).
ATGCGGAAAGGGGTTGATCTCGTGACGGCGGGAACCCCAGGCGAAAACACCACACCGGAGGCTCGTGTCCTCGTGGTCGATGATGAGGCCAACATCGTTGAACTGCTGTCGGTGAGCCTC
AAGTTCCAGGGCTTTGAAGTCTACACCGCGACCAACGGGGCACAGGCGCTGGATCGGGCCCGGGAAACCCGGCCGGACGCGGTGATCCTCGATGTGATGATGCCCGGGATGGACGGCTTT
GGGGTGCTGCGCCGGCTGCGCGCCGACGGCATCGATGCCCCGGCGTTGTTCCTGACGGCCCGTGACTCGCTACAGGACAAGATCGCGGGTCTGACCCTGGGTGGTGACGACTATGTGACA
AAGCCCTTCAGTCTGGAGGAGGTCGTGGCCAGGCTGCGGGTCATCCTGCGACGCGCGGGCAAGGGCAACAAGGAACCACGTAATGTTCGACTGACGTTCGCCGATATCGAGCTCGACGAG
GAGACCCACGAAGTGTGGAAGGCGGGCCAACCGGTGTCGCTGTCGCCCACCGAATTCACCCTGCTGCGCTATTTCGTGATCAACGCGGGCACCGTGCTGAGCAAGCCTAAGATTCTCGAC
CACGTTTGGCGCTACGACTTCGGTGGTGATGTCAACGTCGTCGAGTCCTACGTGTCGTATCTGCGCCGCAAGATCGACACTGGGGAGAAGCGGCTGCTGCACACGCTGCGCGGGGTGGGC
TACGTACTGCGGGAGCCTCGATGA
>Rv0818_N1 Seen in 1 sample(s).
TTGTTGGAGTTATTACTGCTGACCTCGGAGCTGTATCCGGATCCGGTCCTGCCGGCGCTGTCGCTGCTGCCCCACACCGTGCGGACGGCGCCGGCCGAGGCGTCTTCGTTGCTGGAGGCG
GGAAACGCAGACGCTGTGCTCGTCGACGCGCGCAACGACCTGTCGTCCGGGCGAGGCCTGTGCCGCCTGTTGAGCTCGACCGGCCGGTCGATCCCGGTACTGGCGGTGGTGAGCGAAGGC
GGGCTGGTGGCGGTCAGCGCTGACTGGGGGCTGGACGAGATCCTGCTGCTCAACACCGGGCCCGCTGAGATCGACGCCAGACTGCGGCTGGTGGTTGGCCGGCGCGGAGATCTGGCTGAC
CAGGAGAGTCTGGGCAAGGTGAGCCTGGGCGAGCTGGTGATCGACGAAGGCACCTACACCGCCCGGCTGCGTGGCCGCCCGCTGGATCTCACCTACAAAGAGTTCGAGCTGCTGAAATAC
CTGGCGCAGCATGCCGGCCGGGTGTTCACTCGGGCGCAGCTGCTGCACGAAGTATGGGGGTATGACTTCTTCGGGGGCACCCGGACTGTTGATGTGCACGTGCGGCGGTTGCGGGCCAAA
CTCGGCCCCGAGCATGAAGCGCTGATCGGCACGGTGCGCAACGTCGGATACAAAGCTGTTCGGCCGGCGCGCGGCCGACCGCCGGCCGCGGACCCCGACGACGAAGACGCCGATCCCGGC
CGGGATGGTATGCAAGAACCACTGGTCGACCCGTTGCGCAGTCAGTGA
>Rv0826_N1 Seen in 1 sample(s).
GTGACCCAAGATACGTCTGCTACCTGTCCGCTGACCAGCACCGTGCAGGATTCCTCGCCGGTTGCGGGCCAGCTTGGCAGGCCTATAGGGTTCCGCGGACTGGCCGGCGGTTGCCCCGTG
TCACCGCTGGGTTACGAATCGCCGCCGCTGCCGCTGGGGCCGGATTCGCTGACGTGGCGATACTTCGGTGACTGGCGCGGGATGCTGCAGGGACCGTGGGCGGGATCCATGCAGAATATG
CATCCGCAGCTGGGCGCGGCGGTCGAAGATCATTCGACGTTCTTCCGGGAACGCTGGCCACGGCTGCTGCGGTCGTTGTACCCGATCGGCGGAGTTGTCTTCGACGGCGATCGAGCCCCA
GTCACCGGTGTGCAGGTGCGTGACTACCACATCACCATCAAGGGTGTCGACGGTGCGGGCCGTCGCTACCACGCGTTGAATCCCGACGTCTTCTACTGGGCGCACGCCACCTTCTTTGTC
GGCACGTTGCATGTGGCCGAGCGGTTCTGCGGTGGCCTGACCGAGGCGCAGCGGCGCCAGCTATTTGACGAGCACGTCCAGTGGTACCGCATGTACGGCATGAGCATGCGGCCGGTGCCG
GCGACCTGGGAGGAGTTTCAGGACTACTGGGACCACATGTGCCGCAACGTGCTGGAGAACAACTTCGCGGCGCGTGCCGTGCTCGACCTGACCGAACTACCCAAACCGCCATTCGCCCAA
CGAGTTCCGGATTGGCTGTGGGCCGCGCCGCGCAAGTTGCTGGCCCGGTTCTTCGTCTGGCTGACCGTCGGACTCTACGATCCGCCCGTGCGCGAGCTGATGGGCTACCGGTGGTTGCGC
CGCGACGAATGGTTGCACCGCCGCTTTGGCGACATCGTCCGGCTCGTCTTTGCCTTGGTGGCATTCCGGTTTCGCAAGCACCCGCGGGCTCGCGCCGGCTGGGACCGTGCCACCGGCCGC
ATCCCCGCCGATGCGCCGCTAGTACAGACGCCCGCGCGCAACCTGCCGCCGCCCGACGAGCGTGACAACCCGACGCACTACTGCCCTAAGGTCTGA
>Rv0888_N1 Seen in 1 sample(s).
ATGGATTACGCCAAACGCATCGGCCAGGTTGGGGCGTTAGCCGTTGTCCTGGGGGTGGGGGCGGCGGTGACTACCCACGCGATCGGCTCTGCCGCGCCGACGGATCCGAGCTCCTCGAGC
ACCGATTCGCCGGTCGACGCGTGCTCGCCGTTGGGTGGGTCCGCCAGTTCGTTGGCTGCGATACCGGGCGCCAGTGTGCCACAGGTCGGCGTGCGACAGGTAGACCCCGGAAGCATCCCC
GATGACTTGCTCAATGCCCTGATCGACTTTCTGGCCGCGGTACGCAACGGGTTGGTGCCCATCATCGAAAACCGCACTCCGGTAGCGAATCCGCAACAAGTCAGCGTCCCTGAGGGGGGG
CACCGTCGGCCCGGTCCGGTTTGACGCCTGCGACCCCGATGGCAACCGGATGACCTTCGCGGTGCGCGAGCGCGGTGCACCCGGTGGACCCCAGCATGGCATCGTGACCGTCGACCAACG
AACGGCCAGCTTCATCTACACAGCCGATCCGGGTTTCGTTGGCACCGATACCTTCAGTGTGAACGTCAGCGATGACACCAGCCTGCACGTGCACGGTCTGGCGGGATACCTGGGTCCGTT
CCATGGGCACGACGACGTCGCCACCGTGACCGTGTTCGTCGGCAACACCCCGACCGACACCATCAGCGGCGACTTCAGCATGCTCACCTACAACATCGCGGGGCTGCCCTTCCCGCTATC
CAGCGCAATTCTGCCCCGGTTCTTCTACACCAAAGAGATTGGGAAGCGGCTCAACGCCTACTACGTCGCGAACGTCCAGGAGGATTTCGCCTACCACCAATTCCTCATCAAGAAATCCAA
GATGCCCAGCCAGACCCCGCCGGAGCCGCCTACCTTGCTGTGGCCTATCGGTGTGCCCTTCTCCGACGGGCTCAATACCCTCTCGGAGTTCAAGGTGCAGCGGCTGGACCGGCAGACATG
GTATGAGTGCACATCCGACAACTGCCTCACCTTGAAGGGCTTCACCTACAGCCAGATGCGGCTTCCCGGCGGTGACACGGTCGACGTCTACAACTTACATACCAACACCGGTGGAGGGCC
GACCACCAACGCCAACCTCGCGCAGGTCGCCAACTACATCCAGCAGAACTCGGCGGGCCGCGCGGTCATCGTCACCGGCGACTTCAACGCGCGGTACTCCGACGACCAAAGCGCTCTGTT
GCAATTTGCGCAGGTCAACGGGCTCACCGATGCCTGGGTGCAGGTAGAACACGGCCCCACCACACCGCCGTTCGCGCCCACTTGCATGGTCGGCAACGAGTGCGAGCTGCTCGACAAGAT
CTTCTATCGAAGCGGCCAGGGAGTGACGTTGCAGGCCGTCAGCTACGGCAACGAGGCGCCGAAATTCTTCAATTCCAAGGGTGAGCCACTGTCGGATCACAGCCCGGCGGTGGTCGGCTT
CCACTACGTCGCGGACAACGTGGCCGTACGGTGA
>Rv0908_N1 Seen in 1 sample(s).
ATGACCCGTTCGGCTTCGGCGACAGCCGGTTTGACCGATGCCGAAGTGGCGCAACGGGTCGCCGAAGGCAAGAGCAACGATATCCCGGAACGGGTCACCCGCACCGTCGGGCAGATCGTC
CGGGCCAACGTATTCACGCGGATCAACGCGATTCTGGGCGTTTTGCTGCTCATCGTCTTGGCGACGGGCTCGTTGATCAACGGGATGTTCGGCCTGCTCATCATCGCCAACAGCGTCATC
GGCATGGTCCAGGAGATCCGTGCCAAGCAGACGCTGGACAAACTCGCGATCATCGGACAGGCGAAACCGTTGGTGCGCAGGCAATCCGGAACGCGCACGCGGTCGACCAACGAGGTGGTG
CTGGACGACATCATCGAACTTGGGCCCGGGGACCAGGTTGTCGTCGACGGCGAGGTCGTCGAGGAGGAAAACTTGGAGATCGACGAATCATTGCTGACCGGCGAGGCCGACCCGATTGCC
AAAGACGCTGGCGATACCGTGATGTCGGGCAGTTTCGTCGTCTCCGGTGCCGGCGCCTACCGCGCCACCAAGGTCGGCAGCGAAGCATATGCAGCCAAACTGGCCGCCGAGGCCAGCAAG
TTCACCCTGGTGAAATCCGAATTGCGCAACGGCATCAACAGGATTCTGCAGTTCATCACTTACTTGTTGGTGCCGGCCGGCCTGCTGACCATCTACACCCAGTTGTTCACCACACACGTG
GGATGGCGGGAATCCGTGTTGCGGATGGTGGGCGCGCTGGTGCCGATGGTTCCCGAAGGCCTGGTGCTGATGACCTCGATCGCCTTCGCCGTCGGGGTGGTCAGGCTCGGCCAGCGTCAA
TGCCTGGTGCAAGAGTTGCCCGCCATCGAGGGGTTGGCGCGGGTGGACGTGGTCTGCGCCGACAAGACCGGCACACTGACCGAAAGTGGCATGCGGGTCTGCGAGGTCGAAGAGCTCGAC
GGGGCTGGTCGACAGGAAAGTGTCGCCGATGTGCTGGCCGCCCTGGCCGCCGCCGACGCCCGTCCCAACGCGAGCATGCAGGCAATCGCCGAGGCCTTTCACTCGCCGCCGGGCTGGGTC
GTGGCCGCGAACGCGCCTTTCAAGTCGGCCACCAAGTGGAGCGGCGTCTCCTTTCGCGATCACGGTAACTGGGTGATCGGCGCGCCCGACGTGCTGCTCGATCCGGCTTCGGTGGCGGCC
AGACAGGCCGAGCGGATCGGAGCGCAGGGATTGCGGGTGCTGCTGCTGGCTGCTGGCAGTGTGGCCGTCGACCATGCCCAAGCGCCGGGTCAGGTCACCCCGGTAGCGCTGGTTGTGCTG
GAGCAGAAGGTGCGGCCCGACGCCCGTGAAACGCTGGATTATTTTGCTGTTCAGAATGTTTCGGTCAAGGTGATCTCCGGTGACAACGCGGTGTCGGTTGGTGCGGTCGCCGACCGGCTC
GGGCTGCATGGCGAGGCGATGGATGCGCGTGCGCTGCCGACGGGCCGCGAAGAACTGGCCGACACACTGGACTCTTACACCAGTTTTGGCCGTGTGCGGCCGGACCAGAAGCGTGCGATC
GTGCATGCTCTGCAATCACACGGGCATACCGTGGCGATGACCGGCGACGGCGTCAACGACGTGCTTGCCCTCAAGGACGCTGATATCGGTGTGGCGATGGGCTCGGGCAGCCCGGCCTCG
CGTGCGGTGGCACAGATCGTGTTGCTGAACAACCGGTTTGCCACGCTGCCCCATGTGGTCGGCGAGGGGCGTCGGGTCATCGGCAATATCGAACGGGTCGCCAATCTATTCCTGACTAAG
ACGGTGTATTCCGTGTTGCTGGCGCTGCTGGTGGGTATTGAGTGCTTAATTGCCATACCGCTGCGGCGTGATCCGCTGTTGTTCCCGTTCCAGCCGATCCACGTCACCATCGCGGCCTGG
TTCACTATCGGGATCCCAGCGTTCATCCTGTCCTTGGCGCCCAACAACGAGCGGGCCTATCCGGGCTTCGTTCGGCGAGTTATGACGTCTGCGGTGCCGTTCGGACTAGTCATCGGTGTC
GCGACTTTCGTCACCTATCTGGCCGCTTACCAGGGTCGCTACGCCTCGTGGCAGGAGCAGGAACAGGCGTCGACCGCTGCGCTGATCACGTTGTTGATGACCGCGTTATGGGTGCTGGCG
GTGATCGCACGCCCCTATCAGTGGTGGCGACTGGCGCTGGTGCTTGTCTCCGGACTGGCCTATGTGGTGATCTTCAGCCTTCCGCTGGCGCGGGAGAAGTTCCTGCTGGATGCCTCGAAC
CTGGCGACGACGTCAATCGCGCTGGCGGTTGGCGTGGTGGGTGCGGCGACCATTGAGGCGATGTGGTGGATCCGAAGCAGGATGCTCGGTGTGAAACCGAGAGTGTGGCGATAA
>Rv1001_N1 Seen in 1 sample(s).
GTGGGTGTCGAATTGGGGTCAAATTCCGAGGTCGGCGCGCTAAGAGTGGTCATCCTGCACCGCCCGGGGGCCGAACTGCGCCGGCTCACACCGCGCAACACCGACCAGCTGCTGTTCGAC
GGCCTGCCCTGGGTATCCCGCGCGCAGGACGAGCACGACGAATTCGCCGAGCTGCTGGCTTCCCGCGGTGCGGAAGTGCTGTTGCTGTCGGACCTGTTGACTGAGGCACTACATCACAGC
GGGGCCGCCCGCATGCAGGGGATCGCCGCTGCCGTCGACGCACCGCGGCTGGGACTGCCGCTGGCGCAAGAGCTTTCGGCCTACCTGCGTAGTCTCGACCCAGGCAGGTTGGCGCATGTG
CTGACGGCCGGCATGACCTTCAACGAGCTCCCGTCGGACACGCGGACCGACGTGTCGTTGGTGTTGCGTATGCACCATGGCGGAGACTTCGTCATTGAGCCGTTGCCGAACCTGGTGTTC
ACCCGCGACTCGTCGATATGGATCGGGCCGCGGGTGGTGATCCCGTCGCTGGCATTACGGGCACGGGTGCGCGAAGCGTCGCTGACCGACCTCATCTATGCTCATCACCCGCGGTTCACC
GGTGTGCGGCGTGCCTATGAATCGCGCACCGCTCCGGTCGAGGGTGGCGACGTGTTGTTGCTCGCCCCGGGTGTGGTCGCTGTCGGAGTGGGCGAGCGGACTACACCAGCAGGCGCGGAA
GCATTGGCGCGCAGCCTTTTTGACGATGATCTTGCGCATACCGTGCTCGCCGTGCCGATCGCTCAGCAGCGCGCGCAAATGCATCTGGACACGGTGTGCACGATGGTCGACACCGATACG
ATGGTGATGTACGCCAACGTTGTCGACACGCTCGAGGCGTTCACGATCCAGCGCACACCCGACGGCGTGACCATCGGCGATGCGGCCCCGTTCGCGGAGGCGGCTGCCAAGGCGATGGGA
ATCGACAAGCTGCGGGTAATTCATACCGGAATGGACCCGTCGTCGCTGAACGCGAACAGTGGGACGACGGCAACAACACGTTGGCGTTGGCGCCCGGTGTCGTTGTCGCCTACGAGCGCA
ACGTACAGACCAACGCCCGCCTGCAGGACGCGGGCATCGAAGTGCTTACCATCGCCGGCTCCGAATTGGGTACCGGCCGTGGCGGGCCCCGCTGCATGTCCTGTCCGGCCGCCCGCGATC
CGCTTTAG
>Rv1097c_N1 Seen in 1 sample(s).
TCAGGACTTGTTGACCTTCAGCGCCTCAATGACCCTCTCGACGGTGGCGCGCGAGGTTGCATCACCGATGGGGGTGGCGCCCAGGAAGACGGTGACCGGCTTGGTGTCGACCGCGATGAT
CGTGACCGAATCACCTTTGACGTTGCGTGAACTGTCGGCGATTGTGATATCGGCGTCTACCCGGGCGGCCCTGACCCCGTCGACGGTGATCGACGACGTCTTGGTCGGGCCCAGGGTGGG
CGACGAGCCTGCGTAGCCGGGGCCGTCGGCCACGCATTGCATCAACTTCGATGCTTGCGCGGCGACGTCCGGTGGTGACGAAGTTGGTTATCGCAACCTCGGCTTGCATCATCCACTGGT
CGGCACCGGCCACCTCGTGGCCGACGCCCACCGCGTCGATGAGGTTCGGGTTCTGGTCGTCGGAGAACGCCGACCACCCGGGTGCCGCGCTGGTCGGGAACGACAGCTTACCCGCACTGA
TCGAATCGCCGATGGGCTGCACACCGCCGGACACATTTGGGGTACAACCGGTTGCGGTTTGCTGGGAAAACGGTTGCGACGTGGGAGCACTCGTCGCCGGAGAGGTTGCCGTGGTCGACT
TGTTGTCGCCGCGGAGGCCGATCACCAGGATCACCACCAGTAGGATGACACCCAGCACCGCGAGGCCGGCGAGGATCAGCCACGGTGTCTTCGATCCTGGCCCGGGCGGAGGTGGTCCTG
GCGGATAGGGCCCCGCCGGCCAGCCGGGCGGATACTGCTGGGGTGGGTAGGCCGGCGGATAGGAGCCGCCCTGCGGTTGGCCTCCCCAATACGGGTCCTGCCCATACGTATTCGGGCCGT
AGGGGTAGTTGCCGTAGGGGCCAGCGGGAGGAACCGTCAT
>Rv1128c_N1 Seen in 1 sample(s).
CTACGGGTCGCCTTCGTCGTCTGCCAGGAGCTTTTCCGGGTGATGGAACGTATTGACTCGAGGTTGGCCGTGGTCGAGATGTGGCGGCGGTAGCCACTCGGTGTCGCCGTGGGCGTTCTT
GCGGGTCGTCCAGCCACGTTCGGCTAACGGATGATGGCCACCGCAGCCGAGTGTCAGGTCATTGACGTCGGTGTTGCGGCACTGGGCGTACGGCGTGACATGATGGACTTCACAGTAATA
GCCGGGCACGTCGCAACCAGGTGCGCTGCAGCCACTGTCCTTGGCGTACAACATAATTCGCTGCGCCGGGGAGGCCAGGCGCTTGGTGTGGTAGAGCGCCAGGGCCTTGCCTCGATCGAA
TATCGCGAGGTAGTGGTTTGCGTGGCGGGCCAGCCGGATCACATCCGATATGGGCAAGATCGTACCCCCGCCGGTGAGCCCGCGCCGGCCGCGGCCTCCAAGTCCTTCAGCGTGGTGGTC
ACGATGATGCTGGCCGGTAATCCGTTGTGCTGGCCCAGATTGCCACTTGTCAACAAACTACGTAATCCGGCGTTGAGCGCGTCGTGGTTCCGCTGTGGGCAGCTGCGGGTGTCTCGCCGC
GCCTGCTCCTTCGAGGGCGCGCCGTTCACACACGGTGCCTTCTGCTCGGGGTTGCACATACCCGGGGCGGCCAGCTTGGCCCACACCGCCTCGATAGTGGCGCGCAGCTCGGGGGTCACA
TATCCGCTGAGCCGCGACATCCCATCGACATCTTGCTTTCCTAACGTCAAGCCGCGGCGGCGGGCGCGGTCCTCGTCGGTGTAGTCGCCATCGGGGTTGAGGCAGTCCATGATCCGCGCG
GCCAATTTGGCCAGCTGGTCGGGACGGTACTGGGTGGCCTGCTTAGCCAAGTCCCGTTCGGCCTTCTCCAGGGTCTTGAGGTCTACCCAGGATGGTAGGCGGTGCACGAAAGCACGGATT
ACTTCAACATGGCCGTCACCAATTAACCCGTGGCGCTGTGCCTTTGCGGTGGCGGTGAGTAGCGGTGGCAGCGGCTCGCCGGTCAGCGCACGGCGCTGGCCAAGGTCGGCGGCCTCGGCC
ACTCGCCGCTTGGCCTCGCTGCGGGTGATGCGCAACCGGTCGGCCAGCGTCAATCCCAGCTTGCCGCCCAGCTCCTCCTCGGTGGATTGTTCGCCGATCTGATTGATCAACGTGTGTTCG
ACGCTGGGCAGCTGGCGTCGCGCGGTCTCGCAGTGCTCCAGCAGCGCCAGGCGCTCCGGGGTGGTCAATGCGTCAAAGGTCAGCCCCAGCACGCGGGACAGCGCGGTAGCCAATGACGCG
AAGGCCTCCGTGATCTCCTCCCGAGTGGAACACAT
>Rv1145_N1 Seen in 1 sample(s).
ATGCTGCAGAGGATCGCTCGGCTCGCCATCGCTGCGCCGCGCCGAATCATCGGGTTTGCGGTCTTCGTCTTCATCGCCGCAGCGGTCTTCGGTGTTCCGGTGGCTGACAGCCTGTCGCCC
GGGGGTTTCCAAGATCCGCGATCGGAGTCGGCACGGGCAATCGAGGTGTTGACCGACAAGTTCGGCCAGAGCGGTCAGAAAATGCTGATCGTGGTTACGGCAGCCGCGGGCGCCGACAGC
CCACCTGCCCGCGAGGTCGGGACTGACATCGTCGAGGTGCTGCGGCGGTCGCCGTTGGTTTACAACGTGACCTCGCCGTGGACTGTGCCACCGACTGCCGCCGCCGACCTGCTCAGCACC
GACGGAAAATCGGGGTTGATCGTCGTCAACGTCAAAGGCGGCGAAAACGACGCGCAGAACCACGCCCAAACCCTGTCAGACGAAGTCGCCCATGACCGCGACGGCGTCACCGTCCGTGCC
GGCGGCTCGGCGATGGAGTACGCCCAGATCAATCGGCAGAACAAAGACGACCTGCTGGTGATGGAGTTGATCGCGATTCCGCTGAGCTTCCTGGTGCTGATCTGGGTGTTCGGTGGGCTG
TTGGCCGCCGGGCTGCCGATGGCCCAGGCCGTACTGGCCGTTGTGGGATCGATGGCCGTATTGCGACTCGTTACGTTTGCCACCGAGGTGTCGACCTTCGCGCTCAACCTGAGTACAGCG
TTGGGCCTCGCGTTGGCTATCGACTACACGCTGCTCATCGTCAGTCGCTATCGCGACGAGCTCGCCGAGGGCAGTGATCGAGACGAAGCACTGATCCGGACCATGGCGACTTCGGGGCGC
ACGGTGTTGTTTTCGGCGGTCACCGTGGCGCTGTCGATGTCGGCGACTGCGCTGTTCCCGATGTACTTTCTGA
>Rv1225c_N1 Seen in 1 sample(s).
CTAACACCCGAGCAACGGCGGCAGGCCGGCCACCGAGTCGATCACGTGGTGCGGCCGGGTCGCGCTGGCGCCGGCCAGCCAGCGATCCAGCGTTTGCTGGCGGAACTTGCCGGTGCGCAC
CAGCACACCCGTCATGCCCACCGCCTGGGCGGCCAGCACGTCGTTGTGCAGATCGTCGCCGATCATGACCATCTGCTGTGGATCGACACCGACGCGGTCGGCGGCCGCCAGGAATCCCTC
GGCCGCAGGCTTGCCGATGGCGGTGGCGGTCTTGCCGCAGGCCTGTTCCATTCCGGTCAGGTACATCCCGGTGTCGATGCGCAGCCCGTCGGTGGTGTTCCAGGTCATATTGCGGTGCAT
CGCCACCACCGGAACGCCGTCGAGCATCCACCCATAGACCCGGCTGAGCGTGCGGTGATCGAACTGGGGGCCGGGCACTGCCGAGCACGACGACGTCGGGGGCTTCGGGGCAATCCTCGG
GACCGATCTCGGTCGACAAGACGACGTCGATGCCGGGCAAGTCCTCGGTGATGTCGCCGTTGTTCACCAGGAAGCACCGCGCGCCGGGATAGGCGCCGTGCAGGTACTCGGCCGTCAGCA
CCCCGGCCGTGATCACGTCGTCGGCGGCGACGGGGATCCCCGCGGCACCCAGCGCCTCGGCGATCTGCCGGCGGGTGCGCGTCGTGGTGTTGGTCAGATACGCGCAGGCGATTCCCCGAT
GGGTCAGTTGCCGCACGGTCTCGGCGGCCCCGGGAATCGCGCGCCACGACAGCACCAGCACGCCGTCGATGTCGAACAGCACCGCCGCGGCCATCAGATGCGCCACGTCCAC
>Rv1258c_N1 Seen in 1 sample(s).
TCACTGAGCCGATCCTACGGGCCGATCGATGTCCGCTTGGGGCGCCAGATCCAGTTCGCGCAGCGCGGGCAGCCGGATCGCGACCAGCCCGGTGCACACGATGGGCAGTGCCAACGCGAG
AAACGTGGCATGCAGTCCAGCGGCGTCGGTCAGTGGACCGGCCAGCAACAGACCCAACGGGCCGGCGGCGTAGGCCAGCGACGTCATCACCCCGACTACCCGGCCGCGCAGATGCTGTGC
TGCCCGCGTCTGTATCACGTAGTTATAGATCGGCTGGATGGGTCCGTACACCAGGCCGACCACCGCGCACAACACCATGATGACCGGCAGTGGCGGCAGGAACGCGATGACCATCGATGC
CAAACCCAGGGTAAGAACCGCGGTCGACATGGTCACGCGACGGGGAACGCGGATAGCCAACACGGCATACCCCAGCGCTCCCACCAGGCCGCCGCCGGCGATCGCCATCAACGCCCAACC
CAGCTGCACCGGTTGCTGGTGGTCGGTGAAGTATTTCGGGAACAGCACGCTCTCCATCGGCAGATACAGCGCGGTGACGGTCAGGTCAATCATCCCGAGGGTGCGCAATACCCGCAGGTT
CCAGACGAAGCGCAGCCCCTCGGCGATCCCGGATACCAACCCTTGGGGCCGCGAGGTGTGGTGCGGCTTGCCGGCACCCTGCGAGTTGCAGGGCGGCAATCGCGAGGATGGACAACCCGA
ATGCCGTCGCGGTAATCCACATTGTGGTGATGCCGCCAACCGTCGCGATCATCAAGCCACCGATGGCCGGGCCGACAATAAAGGCCAGGTTGAGGATCGCCTCGTAGGCGCCGTTGATGC
GGTCCAACGACCAGCCTGCCCGAGCGGCGGCCTCGGGCAGCATCGAGTCACGAGCCGTCATGCCTGCCGGGCCGAAGGCGGCCGCCAGGGCGGCCAATACGGCCAGCACCAGCACGTTGA
CCGCGTCGCCGCCGTACCCCCACGCCACCAGGGGGACGCCGGCCACCGCCGCACCCGACAGCGCATCGGCCACCATCGACACCCGGCGACGCCCGAAGTAGTCGACCGCGGTGCCGGCGA
CCAGCGTGGCGAACAACAGCGGCAGCATGGTCGCACTGGCCACGATCGAGGCCTGCCCAGCGCTGCCCTCGCGCTGCAACACCAGCCACGGAAACGCGACTATCGAGACGCCATCACCCG
CGGCCGCCATCAGCGTTGCGAACAGGATCAGGAATGCCGGGCCGCGGTTGCTGTTTCTCAT
>Rv1269c_N1 Seen in 1 sample(s).
TTAGTTGCAGGCCCAGGTGTCGATGTAGCCGCCGCCGAGCTTGGTCAGGGCGTCCTTCATGGCGGCGGCCAAGGTGGGTCCAACTCCTCCCTGGTATGCCCTATCGTTGGCGGCGACGGC
GCCGCAGGCGGTGAAACTGGTGAGCACCTTGCAGTCGGAGTAGCCACACGACTTGACGGCGGTGGCTTCGGCAGCCGCCCGGGTTGGGTAGTCCCACGATCGGCCCCACGAGCCGTTGCC
GGAGTAGGCAATTGCGCCATAGACATCGGCGGCATTTGCTGGTGCGGGAGCCAGGGAGCCAGGGTGACGGTCGTCGCGGCGGCAGTGGCGACGCCGGCGACGGCCACCGCGAACCGTCGC
CGAAGAGTAATCATCGTCGTCAT
>Rv1326c_N1 Seen in 1 sample(s).
CTAGGCGGGCGTCAGCCACAGCGCCGAAGTGGGCGGCAGCACCAGCACCGCGGACGCCGGGCGGCCATGCCAGGGGTCGTCGGTGGCGTCCACGCCGCCGAGGTTGCCGATCCCTGAGCC
GTGGTAGATCGTCGCGTCGGTATTGAGCACCTCGCGCCAGCGGCCCGCGCGCGGCAGCCCGAGTCGATAGTCACGGTGTTCGGCACCTGCGAAATTGAACACGCAGGCCAGCACCGAGCC
GTCGCTGCCGTAGCGCATAAAGCTCAACACATTGTTGGCGGAGTCGTTGGCGTCGATCCAAGAATAGCCTTCGGGGGTGGTGTCTAAGCTCCACAGCGCCGGGTGGCATCGGTAGATGTC
GTTGATGTCGCGCACCAGCCGCTGAATCCCGTTGGAGAAGCCGTTTTCGTCGAGTTGGAACCAGTCCAGGCCGCGCTGCTCGGACCATTCGGCGCGTTGGCCGAATTCCTGACCCATGAA
CAGCAATTGCTTGCCGGGGTGTGCCCATTGGTAGGCAAGCAGGCTACGCAGGCCGGCGGCCTTGACGTGATTGTTGCCCGGCATCCGCCCCCACAGCGTGCCTTTGCCGTGCACCACCTC
GTCATGACTGAGCGGCAACACGTAATTTTCGCTGAACGCATACAGCATCGAGAACGTCATCTCGTGGTGGTGGTAGCTGCGGTACACCGGATCTCGGCTGACGTAGTCGAGCGTGTCGTG
CATCCAGCCCATGTTCCACTTCATCGAAAAGCCCAGGCCGCCAATGTTGGTCGGGCGGGTCACCCCAGGCCACGGCGTGGACTCCTCGGCGATGGTGACGATTCCCGGCGCGACCTTGTG
CGCCGTGGCGTTCATCTCCTGCAGGAACTGCACTGCTTCCAGGTTCTCCCGGCCGCCGTGGACGTTGGGGGTCCAGCCGCCCTCGGGTCGCGAGTAGTCTAGATAGAGCATTGAGGCCAC
CGCGTCCACCCGCAGGCCGTCGATGTGGAACTCCTGTAGCCAGTACAACGCATTGGCTACCAGAAAGTTGCGCACTTCCGGGCGGCCGAAGTCGAACACGTATGTGCCCCAATCCAGTTG
CTCGCCGCGTTTGGGATCGGAATGTTCGTAGAGCGGAGTGCCGTCGAACCGTCCCAGGGCCCACGCGTCCTTCGGGAAGTGCGCTGGGACCCAATCCACGATGACGCCGATGCCGGCCTG
GTGCAGGGCGTCGACCAGCGCCCGGAAGTCGTCGGGTGTGCCGAATCGTGATGTCGGCGCATAGTAGGACGTGACCTGATACCCCCATGATCCGGCGAATGGATGCTCGGCGACGGGCAA
CAGCTCCACATGGGTAAACCCTTGATCCACAATGTAATCCGTCAACTCACGAGCAAGCTGGCGGTAGCTGAGTCCAGGCCGCCACGAACCGAGATGGACTTCGTAGGTGCTCATCGCCTC
GTTCACCGGGTTGCGCAGCGCACGCCCAGCCATCCAGTCGTCGTCACCCCAGGTGTAGTCACTCGACGTCACCCGCGATGCGGTCTGCGGCGGCACCTCGGTGCCGAACGCGAACGGGTC
GGCCCGATCGGTAACCACGCCGTCGGCGCCGTGCACGCGGAACTTGTACAGACCGTCGCAAGGGAAGTCGGGCCAGAACAATTCCCATACCCCTGATGGGCCGAGCACCCGCATGGGGGC
TTCGTGGCCATTCCAACCGTTGAACTCGCCGATCAAGCTGACGCCCTTGGCGTTGGGCGCCCACACGGCGAACGACACGCCACTCACCACACCGTCGGCCGTGGTAAACGAGCGGGGGTG
GGCACCCAGGACTTCCCAAAGCCGTTCGTGGCGGCCCTCGGCGAACAGGTGCAGGTCGACCTCGCCCAGGGTGGGCAGGAATCGGTACGCATCGGCCACGGTGTGTGGCTCGCAACCTTC
ATAGGTCACCTGCAGGCGGTAGTCGATGAGGTCGACGAACGGCAATGCGACGGCAAACAGGCCAGAATCGAGGTGCTGCAACGAGAGCCGGTCCTTACCAACGAGCGCGACGACCTCGAC
GGCATGCGGACGGAACGCTCGGATGACGGTATGGTCGTCGTATTCGTGGGCGCCCAGGATGCCGTGCGGGTTGTGATGTGTACCCGCCACCAAGCGCGCCATTTCGGCCGGCTCGGGTGC
AAGGTGCTCCCCGGTGAGTTTCTCGGATCGACTCAT
>Rv1330c_N1 Seen in 1 sample(s).
TCAGGCCGGGATCGTGCGTGTCGGAATCGCCGGCTCGCCGGGTGCCAATTTCAGCCCGTCGGCTGGCAGGCTGCGCAACCCGGATGCCACCAGCTGCCGTGCCGCGGCCAGGCTGGTATC
GGCTACCGGTTGCCCGGCGCGGACCAGTGGCAGCGTCAAAACCCGGTGCGGCTCGACAATGACCGGCGGACGGCCCGCCGGATGCACGAGCTCCTCGGTGATGGTGCCCGTCGCACGGGA
GCGCCGCAGTGCCTCTTTGCGGCCGCCGGGGGATTCTTTGTAGCTGCTGCGCTTTTGCACCGGTACACCGTCTACCTCGACCAGTTTGTAGACCATGTTGGCGGTCGGCGCGCCCGACCC
GGTGACCAGCGACGTGCCCACGCCGTAGCTGTCGACGGGTTCACCGCGCAACGCGGCGATGCTGAACTCGTCAAGTTCGCCGGACACCACGATGCGCGTCCGGGTGGCTCCTAGCCGGTC
GAGCTGCTCCCGCGCTTGGCGGGCCAGTACCCCAAGCTCACCGGAATCGATGCGGATCGCGCCGAGCTCAGCGCCGGCGGCGGCAACGGCATTGGCCACACCGGTCGTGACGTCATAGGT
ATCCACCAGCAGCGTGGTACCGGGTCCCAGCGCTTCGACCTGGGCGCGGAATGCGGCTCGCTCGGCTAGTTCGGTGGGGCCGCCATGCTGGGCGTGCAACATGGTGAATGCGTGTGCCGC
GGTGCCGTGCGCGGGCACTCCGTAGCGTCGCTGCGCCGCCAAGTTGGATGACGCGGCGAAACCGGCGATATACGCCGCCCGGGCCGCTGCCACCGCGGCGCGTTCGTGGGTGCGCCGCGA
GCCCATCTCGATCAGTGGGCGCCCCCCGGCGGCGCTGACCATGCGCGCCGCTGCCGAGGCGATCGCTGTGTCGTGGTTGAAGATTGACAGCACCAGCGTTTCGAGCAGGACGCATTCGGC
GAAGCTGCCGCGTACCGAGAGCACCGGTGACCCGGGAAAATACAGCTCCCCCTCGGCATAGCCGTCGATATCGCCGCGGAACCGGAATTCGCGAAGATACCGCACCGTGGCCGGGTCGAG
GAATTGGGCCAGCAACTCGCACGCGTCAGCGTCGAACCTGAACTGCGGCAACGCTTCCAGCAACCGGCCGGTTCCGGCGACAACTCCGTAGCGACGGCCGGTGGGGAGTCGGCGAGCGAA
CACCTCGAATGTGGTGGGGCGATTGGCGCTGCCGTCGCGCAGGGCAGCCGCCAGCATGGTCAACTCGTACTTGTCGGTCAACAGCCCGGCTGGGTCTTGATTGTCGGGCTCTCCCTCTCG
CCGCCTGGCGGCTGGGGGTGGCCCCAC
>Rv1363c_N1 Seen in 1 sample(s).
TCACGGTACGAACTCAACTTTCGACATCTTGTACTGTCCCCCCTCTTCGGTCACGGTCACTTTGAGCCGCCACGGACGTGGTTCGTCTTTCGCCCCAGCGGAATTGGTGACCCGTGAAGT
CGCCGCGACGAGCACCACGGCGGAATGCTCGTTCATGGATTCGACGGCTGTCGCGTTCACCGTGCCTTCGGTGACCACTTTGGACTGTTCGACAACCTTGGTGAAATCGGCTGCCCGCTG
CTGGAAGTCATCCCTGAATTCGCCGGTGGAGCTGTCGATCACACGCGCGACGTCTTCTTTGGCCTTGTTGAAGTCCAGCGAGGTCATGTTGATGACACCTTGCTTGGCTCCGGCGGCGAA
CGCCGCGGCGCGCTGCTGGCGTTCGGTGGCCTCATGGTGTTGCCACACAATGTATCCGCTGAGCCCGGTGAAGCCGCAGATGATGACGACTGCGGCCGCCATGGCAATCGTGGACAGTCT
TGGTAACCGCACCCGCAACCGCCGTCGCCAGGATGCCGACCGTGCGGCCTCCTGGTCTGCGGCCTCATAGTCGTCATAGTCGTCATAGTCTTCGGCGTCTTCCCAGTCTGCATACTCCTC
GGGGACGTTCTCGTCCTCGGCTGGGGCCATCGCCAGCGCCTCACGCTTCAACCGGGCGGCACGGGCACGGGCCCGCGCCGCGGCGGCCAGCGCTTCGGCTTCGGCGGCTTCGGCTTCGGC
GGCCAACGCCATCGCGTCGGCTTGCGATGTCCCCGCGTCCGACGGTGGTTCGGTTGTCTCAGCCAT
>Rv1413_N1 Seen in 1 sample(s).
GTGGCCACCATCGGGGAAGTCGAGGTATTCGTCGACCACGGCGCCGACGACGTATTCATCACCTACCCATTGTGGATCGACACACGCCAAGCCGACCGGCTCCGTCAGCTGGCTGACCGC
GCTCGCATCGCTGTCGGTGCGGGCACCGCCGAGGGCGCTTCGAACACCGGCGCACGGCTCGCAGACGCCGCTGGCGCGATCGATGTTCTCATCGAAATCGACAGTGGCCATCACCGCAGC
GGCGTCCGTGCCGAACAAGTGTTGGAGGTCGCCCACGCCGTCGGTGAGGCTGGGCTTCACCTGGTGGGGGTGTTCACCTTCCCCGGTCACAGTTATGCGCCAGGTAAACCCGGCGAAGCC
GGCGAGCAAGAGCGGCGCGCTCTCAACGACGCGGCGAACGCGCTGGTCGCGGTGGGCTTCCCGATCAGCTGCCGCAGCGGTGGGTCCACTCCCACCGCATTGCTCACCGCCGCGGACGGG
GCCTCCGAGACGTCCCGGCGTCTATGTGCTCGGTGA
>Rv1420_N1 Seen in 1 sample(s).
GTGCCAGATCCCGCAACGTATCGCCCCGCGCCCGGGTCCATCCCGGTCGAGCCGGGCGTGTACCGATTCCGGGACCAGCATGGGCGAGTCATCTACGTCGGCAAGGCCAAGAGCCTGCGT
AGCCGGCTGACGTCCTATTTTGCCGACGTGGCCAGCCTAGCGCCGCGGACCCGGCAGCTGGTGACCACCGCGGCCAAGGTCGAATGGACGGCCGTGGGGACCGAGGTTGAGGCACTGCAG
CTGGAATACACCTGGATCAAGGAGTTCGATCCGCGATTCAACGTCCGCTACCGCGACGACAAGTCCTACCCTGTGCTGGCGGTCACCCTGGGCGAGGAATTTCCCCGGTTGATGGTCTAT
CGCGGTCCGCGGCGCAAGGGTGTGCGCTATTTCGGGCCGTACTCGCACGCGTGGGCAATCCGGGAAACGCTGGATCTGCTCACCCGGGTGTTTCCGGCGCGAACTTGCTCGGCGGGGGTG
TTTAAGCGGCACAGGCAGATCGATCGTCCATGCCTGCTCGGCTACATCGACAAATGTTCCGCGCCGTGTATTGGCAGGGTCGATGCGGCCCAGCACCGCCAGATCGTGGCAGACTTCTGC
GACTTTCTGTCCGGCAAGACCGACCGGTTCGCCCGCGCCTTGGAACAGCAAATGAACGCCGCGGCCGAGCAACTGGACTTCGAACGAGCGGCGCGGCTTCGCGACGACCTGTCCGCACTG
AAGCGTGCCATGGAAAAGCAGGCCGTGGTGCTCGGGGACGGCACCGACGCCGACGTGGTGGCATTCGCCGACGACGAACTCGAGGCGGCGGTGCAAGTGTTCCACGTGCGCGGCGGACGG
GTCCGCGGCCAGCGTGGCTGGATTATCGAAAAGCCAGGAGAGCCAGGAGATTCCGGAATCCAGTTGGTCGAGCAATTCCTGACACAGTTCTACGGCGACCAGGCGGCGTTGGACGACGCC
GCCGACGAATCCGCCAACCCGGTTCCCCGCGAGGTGCTGGTGCCCTGTTTGCCGTCCAACGCCGAGGAGCTGGCCAGCTGGCTGTCCGGCCTGCGCGGCTCAAGGGTCGTGCTGCGGGTG
CCGCGCCGCGGGGACAAGCGGGCACTGGCCGAAACGGTGCACCGAAACGCAGAAGATGCACTGCAACAACACAAGCTGAAGCGGGCCAGCGATTTCAACGCCAGATCCGCTGCGCTGCAG
AGCATTCAGGACTCGTTGGGCCTGGCAGACGCACCCTTGCGGATCGAGTGTGTCGACGTCAGCCATGTGCAGGGCACCGACGTGGTCGGGTCACTGGTGGCGTTCGAAGACGGCCTGCCG
CGCAAGTCGGACTACCGCCACTTCGGGATCCGGGAAGCCGCAGGCCAGGGGCGCTCCGACGACGTGGCCTGTATTGCCGAGGTGACCCGGCGCCGCTTCCTGCGGCACCTGCGCGATCAG
AGCGATCCGGATCTTCTTTCTCCGGAAAGGAAGTCGCGTAGATTCGCCTATCCGCCCAATCTGTACGTCGTCGACGGCGGCGCGCCGCAAGTCAACGCGGCCAGTGCGGTAATCGACGAA
CTCGGTGTTACCGACGTCGCGGTGATCGGCCTGGCCAAGCGGCTGGAAGAGGTATGGGTGCCGTCGGAGCCGGACCCGATTATCATGCCGCGCAACAGTGAGGGACTCTATCTGCTGCAG
CGAGTGCGAGACGAGGCACACCGGTTCGCTATCACCTACCATCGCAGCAAGCGGTCGACGCGGATGACTGCCTCAGCGCTGGACTCGGTGCCGGGATTGGGGGAGCATCGCCGCAAAGCG
CTGGTCACCCATTTCGGATCGATCGCTCGCCTCAAGGAGGCCACCGTCGACGAAATCACCGCTGTTCCCGGTATCGGCGTGGCCACGGCCACGGCCGTCCACGACGCACTGCGACCTGAC
TCATCGGGGGCCGCGCGATGA
>Rv1551_N1 Seen in 1 sample(s).
GTGACTGCACGGGAGGTGGGCCGCATCGGACTGCGAAAGTTGCTGCAGCGCATCGGTATTGTTGCTGAATCAATGACGCCGCTAGCGACCGACCCCGTTGAGGTTACCCAACTGCTGGAT
GCCCGATGGTATGACGAGCGGCTGCGTGCGCTGGCCGACGAGCTCGGACGCGATCCGGACAGCGTGCGCGCCGAGGCGGCAGGCTATCTGCGGGAGATGGCCGCCTCGCTGGATGAGCGG
GCCGTGCAGGCATGGCGCGGCTTCAGTCGCTGGCTCATGCGCGCCTACGACGTACTGGTCGACGAGGACCAGATCACGCAGCTGCGCAAGCTTGATCGCAAAGCCACCCTGGCGTTCGCG
TTCTCGCATCGTTCGTACTTGGATGGGATGCTGCTGCCCGAGGCGATCCTGGCCAACCGGCTCTCGCCGGCGCTGACCTTCGGCGGGGCGAACCTGAACTTCTTTCCGATGGGCGCTTGG
GCCAAACGTACCGGGGCTATCTTCATTCGGCGTCAGACGAAAGATATTCCCGTCTACCGCTTCGTATTACGTGCTTACGCCGCGCAGCTGGTGCAAAACCATGTCAACCTCACCTGGTCG
ATCGAAGGGGGTCGGACCAGAACGGGCAAGCTACGGCCACCGGTGTTCGGGATCCTGCGTTACATCACCGATGCGGTCGACGAAATCGACGGTCCCGAAGTGTATTTGGTGCCGACCTCG
ATCGTGTACGACCAGCTGCACGAGGTGGAAGCCATGACCACCGAGGCCTATGGCGCGGTGAAACGACCCGAAGACCTGCGCTTTCTGGTCCGGTTGGCGCGACAGCAGGGCGAGCGACTG
GGCCGCGCCTATCTCGACTTCGGCGAACCGCTGCCGCTTCGCAAGCGCCTGCAGGAGATGCGCGCCGACAAGTCGGCACCGGCAGCGAGATCGAACGGATCGCGTTGGATGTCGAGCACC
GGATCAACCGCGCCACACCGGTTACCCCCACCGCGGTGGTGAGTCTGGCCCTGCTGGGCGCGGACCGCTCGTTGTCCATCAGCGAGGTGTTGGCGACGGTTCGCCCGTTGGCCAGCTACA
TAGCTGCCCGCAACTGGGCGGTGGCCGGCGCCGCCGATCTGACGAATCGCTCGACGATCCGGTGGACCTTGCATCAGATGGTTGCTTCCGGCGTGGTGAGTGTCTACGACGCGGGCACCG
AGGCGGTGTGGGGCATCGGCGAGGACCAGCACCTGGTGGCGGCGTTTTACCGCAACACCGCGATCCATATCCTGGTCGATCGGGCCGTCGCCGAGTTGGCGTTGCTGGCGGCCGCAGAGA
CCACAACAAACGGCTCGGTTTCCCCGGCGACCGTGCGTGATGAGGCGTTGAGCCTTCGCGACTTGCTGAAGTTCGAGTTCTTGTTTTCTGGCCGTGCCCAGTTTGAGAAAGACCTCGCAA
ACGAGGTACTGCTGATCGGGTCGGTGGTCGACACCTCCAAGCCCGCGGCCGCAGCCGATGTGTGGCGCCTGCTGGAATCGGCCGATGTGCTGCTGGCCCACCTGGTGCTGCGGCCGTTTC
TCGATGCCTACCACATTGTCGCCGATCGGCTGGCCGCCCATGAAGACGACTCTTTCGACGAGGAAGGGTTTCTGGCCGAGTGTCTACAGGTCGGCAAGCAGTGGGAGCTGCAGCGCAATA
TCGCCAGCGCCGAGTCCAGGTCGATGGAGCTGTTCAAGACCGCACTGCGCCTGGCTCGCCATCGCGAGCTGGTCGACGGTGCCGATGCGACGGACATCGCCAAACGCCGACAGCAGTTCG
CCGACGAGATAGCCACGGCAACCAGGCGGGTAAACACAATCGCAGAACTGGCCCGCAGGCAATGA
>Rv1564c_N1 Seen in 1 sample(s).
TCACAACGTCTTACGCAGGACCAGCAGCGAGCGCGCAGGTACCGAAAACGTGTCAGTGGCGGTTACCGTCAGGTCGATGTCACCGACGGGATCGTTGGTATCCAGCTCTCCGGTCCACTG
CTGCGCATAGCCGTCATGCGGCATCACGAACTCCACGTCGTGGTCATGGGCGTTGAAGCACAACAGGAATGAATCGTCGACTACTCGCTCACCACGGGCGTCCGGTGCGGTAATGGCTTC
ACCGTTGAGAAACACCGCAACACACCTGTCGAAGCCTCTGCCCCAATCCTCGTGCGTCATCTCCCGACCGCTCGGTGTCAACCAGGCGATATCGCGGACTTCGTCGCCACTGCGGATCGG
TTCACCCTCAAAGAACCGGCGTCGGCGAAACACCTTGTGGTTCTTGCGCAAGGTCGTCGCCTTGCGTGCGAAAGCTAGCAGATCGGCATTCTTGTCCACCAATGACCAATCCATCCAAGA
TAATTCGGAGTCCTGGCAGTAGACGTTGTTGTTGCCGTATTGGGTGCGCCCAATCTCGTCGCCGTGGGCGATCATCGGCGTGCCCTGGCTGACCATAAGCGTGGCCCACATGTTGCGCAT
CTGGCGGGCACGCAGCGCCAAGATGTCGGGGTCATCGGTGGGGCCCTCGACACCGCAGTTCCACGATCGGTTGTAGCTTTCCCCGTCGCGGTTGTTCTCGCCATTGGCCTCGTTGTGCTT
GTCGTTGTACGAGACCAGGTCGTTGAGTGTGAACCCGTCGTGGGCGGTGACGAAATTGATACTGGCACTGGGCCGGCGGCCGGTTGCTTCGTAGAGGTCCGACGACCCGGTCAGCCGGGA
GGCGAATTCGCCTAGGGTGGCCGGCTCGCCTCGCCAGTAGTCGCGCACGGTGTCGCGGTACTTGCCGTTCCATTCCGTCCACAGTCCTGGGAAGTTGCCAACCTGGTAGCCACCTTCGCC
GACATCCCATGGCTCGGCGATCAGCTTGACCTGACTGACCACCGGATCTTGTTGCACCAGATCGAAGAATGCCGACAGCCGGTCGACGTCGTGCAGCTCGCGGGCCAGCGTGGACGCCAG
GTCGAACCGGAACCCGTCGACGTGCATTTCGATCACCCAGTAGCGCAGCGAATCCATGATCAGCTGCAGGGTGTGTGGGTGGCGGGCATTGAGGCTGTTGCCGGTACCGGTGAAGTCCTT
GTAGAACCTCAAGTCGTGGTCCATCAGTCGGTAGTAGGCGGTGTTGTCGATTCCGCGAAAGTTGATCGTCGGACCCAAGTGGTTGCCTTCAGCGGTGTGGTTGTAGACGACGTCGAGGAT
GACCTCGATGCCGGCTTCGTGCAGGCTGCGCACCATGGTTTTGAACTCGGCTACCGCGCTGCCGGCTTGCCGGGTCGACGCGTATTGATGGTGCGGGGCGAAGAATCCGAAGGTGTTGTA
ACCCCAGTAGTTTCGCAAGCCGAGGTCCAGCAGCCGGGAGTCGTGTAGGAACTGGTGCACCGGCATCAACTCAACGGCGGTGACGTTGAGCTCGTTGAGGTGGTCGATGATCACCGGGTG
GGCAGGCCGGCGTAGGTGCCCCGGAGTTCGGGCGGGATACTGGGATGGGTCTGTGTCATGCCTTTGACATGCGCTTCGTAGATTACGGTCTCGTGGTACGGGGTGCGCGGCGACCGGTCG
TATGCCCAGTCGAAGAACGGATTGATCACGACGCTGGTCATAGTGTGGCCCAGCGAGTCGACCATCGGGGGAGTGCTGTCCGGGTCGACGGCGTTGACGTCATAGGAATACAGCGCCTGC
CCGAAGGTGAAATCGCCGTGGAACGACTTCCCATACGGGTCGAGCAGCAGCTTGCTGGGGTCACACCGATGGCCGGCCGCCGGGTCGAACGGCCCGTGCACACGAAACCCGTAGCGCTGG
CCGGGGGTGATGTTCGGCAGATAGGCATGCCAGACGTACCCGTCCACCTCGTCAAGCGGGATCCGCGACTCGACGCCGTCCTCGTCGATCAGACATAGCTCGACCTTCTCGGCGATCTCG
GAGAACAACGAAAAGTTGGTCCCGGCGCCGTCGTAGGTGGCTCCAAGCGGATAGGCGTTGCCCGGCCACACCGTGGGTAGAGCGGGCCCGGTCCCGTCGGACTCCCCGGCGTTGTTCGAC
GACAT
>Rv1615_N1 Seen in 1 sample(s).
ATGGGCCTACGGCCGGCACGGGTCGTGCGCCCGGCTCGATCTGGCATGCTGAAAGGCGTGACCGATCCCCTGCAGCACGGTGCCTTCGAGCCGGGCTGGCAATCCGCACCACCCGGATAT
CCACCGCCTTATCCGCAATATCCGGGGCCTGGCTCTTACTTTGACCCGTTCGCGCCATATGGTCGCCATCCGGTCACCGGCCAACCATTTTCCGACAAATCGAAGACTGTTGCCGGCCTG
TTGCAGTTGCTTGGACTGTTCGGCATCGCCGGGATCGGGCGAATCTATCTGGGCCATACCGGCCTGGGCATCGCGCAGCTGCTGGTGGGCTGGGTGACGTGCGGTTTGGGCGCCGTCATC
TGGGGCGTCATTGACGCCCTGCTGATATTGACCGACAAAGTCGGCGACCCTTGGGGTCGTCCCTTGCGCGATGGAAGCTAG
>Rv1775_N1 Seen in 1 sample(s).
ATGGCCAGCGATCTGTACCTGGGCTACCGCAACGACGACGCGGACACGCCGTTCGGCAAGTTCTTCAAACCCGAGATGGCCCCGCTGCCACAGCATGTCGTGGTGGCGTTGCAGCATGGC
CCCAGGCCGGGATGGCGTTGCTCGCCTTCGACGACGCCGCGAGCATCGTTGATGAGGGCTATCAGCAGACCGAGAACGGCTACGGGATTCTCGGCGACGGCAGCATGCAGGTATCCGTGC
GCACCGACATGCCCGGGGTCACTCCCGCGATGTGGGCATGGTGGTTCGGCTGGCACGGCAGCGACACCCGCCGCTACAAGCTGTGGCACCCGCGGGCCCATCTATCGGCGCGGTGGAAGG
ACGGCGACCAGGACAGCGGGGCCGGCCGTCGGGGCGCGCAGCGTTACGTCGGCCGCTGGTCGATGATCAGCGAGTACATCGGCTCGACGAAACTGGGTGCCGCAATACAATTCGTCGAGC
CGGCGGCCATGGGTCTGCCCGACGACAGCGACGATACGGTGTCGATCTGTGCGCGGTTGGGCTCTGCTGACGCCCCGGTGGATGCGGGCTGGTTCGTCCATCAGGTCCGATCGACGCCGG
GCGGGTCCGAGATGCGGTCACGGTTTTGGATGGGCGGACCGCACATCGCGGTGCGCAAGGCACCCGAGGTCGCGTCCAAGGCGGTGCGTCCCATCGCGTCGAAGCTAATCGGCGTCTCGG
AATCGACCGCGCGTAATCTGCTGGTGTACTGCGCGCAGGAGATGAACCACCTGGCGGGGTTCTTGGCGGACCTGTGGGAAAGCTTCGGTGACGAGTGA
>Rv1915_N1 Seen in 1 sample(s).
ATGGCCATCGCCGAAACGGACACCGAGGTCCACACACCGTTCGAGCAGGACTTTGAGAAAGACGTAGCCGCCACTCAGCGATACTTCGACAGCTCGCGCTTTGCTGGGATCATTCGGCTC
TACACCGCCCGCCAAGTCGTGGAACAGCGCGGCACGATCCCCGTCGACCACATCGTGGCGCGAGAGGCGGCGGGCGCCTTCTACGAGCGTCTGCGCGAACTCTTTGCAGCCCGCAAGAGC
ATCACGACGTTTGGCCCCTACTCGCCGGGGCAGGCGGTGAGCATGAAGCGGATGGGTATCGAGGCGATCTACCTCGGTGGTTGGGCTACCTCAGCTAAGGGCTCCAGCACCGAAGATCCG
GGGCCCGACCTCGCCAGCTACCCGCTGAGCCAGGTGCCTGACGATGCCGCGGTGCTGGTGCGCGCCTTGCTCACCGCGGACCGCAACCAACACTATCTACGCCTGCAGATGAGCGAGCGA
CAGCGTGCGGCGACACCGGCTTACGACTTCCGCCCGTTTATCATCGCCGACGCCGACACCGGCCACGGCGGCGATCCGCACGTACGCAACCTGATCCGCCGCTTCGTCGAGGTCGGTGTG
CCGGGCTACCACATCGAGGACCAACGACCCGGCACCAAGAAGTGCGGCCACCAGGGCGGCAAGGTCCTGGTGCCGTCCGACGAACAGATCAAGCGGCTCAACGCCGCCCGCTTCCAGCTC
GACATCATGCGGGTGCCCGGCATCATCGTCGCACGCACCGACGCGGAGGCGGCCAACCTGATCGACAGTCGCGCCGACGAGCGTGACCAGCCGTTCCTTCTCGGCGCGACCAAGCTCGAC
GTACCGTCCTACAAGTCCTGTTTCCTGGCAATGGTGCGGCGTTTTTACGAACTGGGCGTCAAGGAGCTCAATGGTCATCTTCTCTATGCGCTTGGCGACAGCGAGTACGCGGCGGCCGGC
GGTTGGCTTGAGCGCCAAGGCATTTTCGGCTTGGTCTCCGACGCGGTCAACGCGTGGCGGGAGGACGGCCAGCAGTCGATCGACGGCATTTTCGACCAGGTCGAGTCGCGGTTCGTGGCG
GCCTGGGAGGACGACGCGGGCCTGA
>Rv2027c_N1 Seen in 1 sample(s).
TCAGCGCAGCGGTGCAGACCACCGCAGCAAGGTGCCTCCGGTCGGCATGTTCTCGACTGTGAATTCGCCGCCCGCGTCGTCGGCACGCTGGCGGAGATTGCGCAGGCCGCTTTCGGTGAT
GTCGCCGGAGATGCCGACACCGTCGTCGACGACCTCGACCCGCACATCATCCTCGACGCTGACGTTGATGGCCAGGCTGGTCGCGTTCGCGTGCCGGACAGCGTTGCTAACCGCCTCCCG
CAGAACCGCTTCGGCGTGGTTGGCCAGGACGGTGTCGACAACGGACAGCGGGCCCGTGTACTGGACCGTGGTGTGCAGCGCGGGGATCGCGAGTTGGTCGATGACCTTGTCCAGTCGGTG
GCGCAGACCCGTCGCCCGGGAGGGCCCGGCGTGTAGGTCGAAGATCGCAGATCGAATCTCCTGAATGATTTCCTGGAGATCGTCGATGCTGCTGTAGATGGATTCCCGGACGGCGGGGAC
ACGTGCTCGCGGAGCGGCACCCTGCAGGGTGAGCCCGACTGCGAAGAGCCGCTGGATGACGTGGTCATGCAGATCACGTGCGATCCGGTCGCGATCGGTCAGGATCTCCACTTCTCGCAT
CTGTCGCTGCGCGGTCGCCAGCCGCCAGGCGAGCGCAGCCTGGTCAGCGAAGGCGGCCATCATATCGAGCTGTTTGTCGCTGAACGGCTGTTCATCGGCACTGCGAAGTGCGACCAGCAC
ACCGGCAACAGTGTCGGCGGCACGCAGCGGCAGCACCAGGGCGGGCCCGGGCTCCACCGGGCCGTCGACCGCGAGGTCAAGCCGGTCGAACCGGCGGGGCGTACGGTCGTGAAAGACTCC
CCCGATCGACGTTCCGCTGACGGCAACCGTCATTTGCTTGACCGCCGGGGAGATCTCTCCGGCCACCTCTACGATGACCAGGTCGTCGACCTCGCAAGCCGGCGCTTGTCGTCGAGCGGC
ACCGCCACCAAGGTGGCTGCCCCAGCCATCAACGTCAACGCTTCCTCGGCGATGAGCCGAAACACCATGGCCGGGTCCGCACCGGCCAGCATCTGCGTTCCGATGTCGCGGGTTGCCTCG
ATCCACGCTTCCCGGGTCCGTGATTCCTCGAAGAGACGGGCATTGTCAACGGCAATCCCGGCCGCGGCGGCCAGCGCCTGCACCAGCACCTCGTCGTCATCGCTGAACGGCTGGCCATCT
GCCTTCTCGGTCAAGTAAAGATTGCCGAACACCTCGTCGCGGATGCGCACTGGAACCCCGAGGAAGGTCCGCATCGGCGGATGGTGCAGCGGAAATCCAACCGATGCGGGATGCCGCGAG
ATATCGTCCAGCCGGATCGGCTTTGGCTCCTCGATCAGCGCGCCGAGAACACCTCGCCCCTCCGGCAATGAGCCGATGAGGTGCCGGGTCTCTTCGTCGATCCCCTCGTAGACGAATTCG
ACCAATCTATGGTCGTAACCGCGCACCCCGAGCGCCCCGTAGCGGGCATCCACCAACTCGGCGGCGGTATGCACAATGGCGCGCAGGGTGGCGTCGAGCTTGAGTCCCGATGTGATCGCC
AAGATGGCGTCGATCAGACCATCCAGCCGGTCGCGGCCTTCGACGATCTGTTCAATCCGGTCTTGGACTTCCAGCAGCAGCTCTCGCAACCGAAGCTGCGACAGTGTCTCGCGCAATGGC
GGGCTGCCAGGGTTAACGTTCGCCCTGTCAGGGTGTGTCAC
>Rv2084_N1 Seen in 1 sample(s).
GTGAGTGACGATTCGTCGTCGGCGTTCGATCTGATTTGCGCCGAGATCGAACGCCAGTTGCGCGGCGGCGAGCTGCTCATGGATGCCGCAGCAGCATCCGAATTACTACTCACCGTGCGG
TATCAGCTCGATACCCAGCCGCGGCCACTTGTCATCGTGCATGGACCGCTGTTTCAGGCCGTCAAAGCGGCCCGCGCACAGGTGTACGGACGCCTGATACAGCTGCGACACGCGCGCTGT
GAGGTGCTCGATGAGCGATGGCAGCTACGGCCGACGGGTCAGCGCGATGTGCGCGCACTGCTGATCGATGTGCTGAACGTGTTGTTGGCGGCCATTACCGCCGCAGGCGTGGAACGGGCA
TACGCGTGCGCGGAGCGGCGGGCGATGGCCGCCGCGGTTGTCGCCAAGAATTACCGGGACGCGTTGGGTGTCGAGCTGCAGTGCAATTCCGTATGCCGAGCCGCCGCCGAGGCGATCCAC
GCGCTGGCGCACCGCACAGGGGCTACCGAGGATGCCGACTGCCTCCCGCCGGTTGATGTGATACACGCCGACGTTACTCGCCGCATGCATGGCGAGGTGGCGACCGACGTTGTCGCGGCC
GGCGAACTGGTGATAGCGGCGCGACACTTGCTGGACCCCATGCCCAGGGGCGAGCTCAGTTACGGCCCACTCCACGAGGGGGGAAATGCGGCCCGTAAATCGGTCTATCGACGCCTGGTT
CAGCTATGGCAAGCGCGCCGGGCTGTTACCGACGGTGACGTCGACCTGCGCGACGCTCGCACGCTGCTGACCGATCTGGACAGCATTTTGCGTGAGATGCGCACGGCCGCAACCATTCAA
CAGGCGTACACACGAGCGGAACGGCGGGCGATGGCGGCGGCGGTCGTCGCCAAGATTCGCGGCGACGCAATGGGCCTCGACGCCCAGCGCGACGCGGTACATCGCGCGGCCGCCGATGCG
CTCCACGCGTTGCAATCGGTTGGCATACACCAATAGGCGACCCTTTGGCAGTTGAGGGTGTAGAGGAGATCGGCGCGTCGTTGCCGGGGCGGGAGTCGACGCCTTCCGATGATGGAGGTT
CCCTACACCCATCAGGAAGACCTCGACGCGTCCATCGCCGCCGGTGGTGCGGGCTTGGCCTGTGCTGA
>Rv2148c_N1 Seen in 1 sample(s).
TCAGGTGACCGTAACCGCCGCGGACCCAATAGCGCGGTACCGACACGCACACAGGTCGAACCATGTTTGACGGCGACTTCAAGGTCGTTGGACATGCCCGCCGACAGACCGATCGCGTGC
GGGAACATCGCACGCACCCGGTTGTGCTCCGATTGCAGCCGGTCAAAGGCCTCGTCCGGGTCCCAATCCAGCGGCGGAATGCCCATCAACCCGACCAGTTCGAGGCCCTCTGACTCCTGC
ACCTGCGCGCAAATCCGGTCTACGGCGCCGGGCGTCGTGCTGTCGACGCCGCCCCGGGATCCGTCACCGTCGAGGCTGACCTGGACGTAAACCCGCAGCCGCTCGCCACGACGGTGTTCG
GCCAGCGCCGCAACAACCGCCCGATCCAGCGCGGTCACCAACCGCGAGCTGTCCACCGAGTGAGCGGTGTGCGCCCAGCGAGCCAGCGACCCGGCTTTGTTGCGTTGAATCCGGCCCACC
ATGTGCCAGTGCACACCCCCCGAGTGACCCAACTCGGCAGCCGCCAACAACCGATTAAGTTCGGCCATCTTGGCTGAAGCTTCCTGTTCGCGCGATTCGCCAACGGACCGACAACCCAAT
CGAAACAAAATCGCAACATCGGTTGCTGGAAAGAATTTGGTAATCGGTAGAAGTTCAATTTCGCCGACATTGCGACCCGCCGCCTCCGCGGCCGCCGCAAGTCGCGATCGCATTGCCGCC
AACGCATGCGTCAATTCCGATTCGCGGTCTGGATACGCCGAAAGATCCGCCGCCAT
>Rv2176_N1 Seen in 1 sample(s).
GTGGTCGAAGCTGGCACGAGGGACCCGTTGGAGAGCGCGCTGCTGGACAGCCGCTATCTGGTCCAGGCCAAGATTGCCAGCGGCGGCACCTCGACGGTCTACCGGGGCCTGGATGTCCGA
CTCGACCGGCCCGTCGCGCTGAAAGTGATGGATTCTCGCTACGCGGGCGATGAACAGTTTCTGACCCGCTTTCGACTGGAGGCCCGTGCGGTTGCCCGGCTAAATAACCGCGCGCTGGTC
GCGGTCTACGACCAGGGCAAAGACGGCAGGCACCCGTTTCTGGTGATGGAGCTCATCGAGGGCGGTACCCTGCGCGAGCTGCTGATAGAACGTGGTCCCATGCCGCCACATGCCGTTGTG
GCGGTGCTGCGCCCAGTGCTTGGCGGGCTGGCTGCCGCCCATCGAGCCGGTCTGGTGCATCGCGATGTCAAGCCCGAGAACATCTTGATCTCCGACGACGGCGACGTCAAACTCGCCGAT
TTCGGGTTGGTCCGCGCGGTCGCCGCCGCTTCAATCACGTCTACCGGCGTCATCCTGGGTACCGCGGCCTACCTGTCCCCTGAGCAGGTCCGTGATGGAAACGCCGATCCTCGAAGCGAC
GTCTACTCTGTCGGCGTTCTGGTCTACGAGCTGCTAACGGGGCACACACCGTTCACCGGCGACTCGGCCTTGTCGATTGCCTACCAACGGCTTGATGCTGACGTGCCGCGTGCCAGTGCT
GTAATCGACGGTGTACCGCCACAATTCGATGAGTTGGTGGCATGTGCAACTGCCCGCAACCCTGCCGACCGATACGCCGATGCGATCGCGATGGGCGCCGATCTGGAGGCGATCGCCGAG
GAGCTGGCCCTGCCTGAATTCCGGGTACCGGCGCCGCGCAACTCCGCTCAACACCGGTCGGCCGCGTTGTACCGCAGCCGGATTACCCAGCAAGGGCAGCTGGGTGCCAAACCGGTTCAC
CACCCTACTCGCCAGCTGACTCGCCAACCCGGCGACTGCTCCGAGCCGGCTTCAGGGTCGGAGCCCGAACACGAGCCGATCACCGGCCAATTCGCCGGCATCGCAATCGAGGAATTCATC
TGGGCGCGACAGCACGCCCGTCGAATGGTGCTTGTCTGGGTGTCGGTGGTGCTGGCGATCACCGGGCTAGTGGCGTCCGCGGCATGGACGATCGGGAGCAACCTGAGCGGCCTGCTCTAA
>Rv2185c_N1 Seen in 1 sample(s).
TCAGCCCTCGACTCGTTTCTTCAGATCCTTCAACGCGCCGTCTATCAACCTGCGTTCCGCCTTACGCTTGAGCATCCCGATCATGGGGACAGCAAGGTCGACGGCAAGCTCGTAGGTGAC
CTCAGTGCCAGAACCCTTGGGCGCCAAGCGATACGTGCCTTCGAGGGACTTTAGCAGCGAGCTGGATTCGAGAGTCCAGCTAAGCGATTGGCGGTCTTCCGGCCACTCGTAGGACATGAT
CAAGGTGTCTTTGAAGATGGCTGCGTCCATCAACATTCGCGCTCGTTTCGGGTAGCCCTCGTCGTCGGCCTCTAGGATCTCGACTTCCTTATACTCCGAAATCCATTGCGGGTAGGCGTC
GATGTCGGCGATCGCCTTCATCACCTCGCCTGGATCCGCGTCGATGTAAATCGTCTGTGTCGTCTTGTCCGCCAC
>Rv2241_N1 Seen in 1 sample(s).
GTGGCGTCGTATTTGCCCGACATTGATCCCGAGGAGACCTCGGAGTGGCTGGAGTCCTTTGACACGCTGCTGCAACGCTGCGGCCCGTCGCGGGCCCGCTACCTGATGTTGCGGCTGCTA
GAGCGGGCCGGCGAGCAGCGGGTGGCCATCCCGGCATTGACGTCTACCGACTATGTCAACACCATCCCGACCGAGCTGGAGCCGTGGTTCCCCGGCGACGAAGACGTCGAACGTCGTTAT
CGAGCGTGGATCAGATGGAATGCGGCCATCATGGTGCACCGTGCGCAACGACCGGGTGTGGGCGTGGGTGGCCATATCTCGACCTACGCGTCGTCCGCGGCGCTCTATGAGGTCGGTTTC
AACCACTTCTTCCGCGGCAAGTCGCACCCGGGCGGCGGCGATCAGGTGTTCATCCAGGGCCACGCTTCCCCGGGAATCTACGCGCGCGCCTTCCTCGAAGGGCGGTTGACCGCCGAGCAA
CTCGACGGATTCCGCCAGGAACACAGCCATGTCGGCGGCGGGTTGCCGTCCTATCCGCACCCGCGGCTCATGCCCGACTTCTGGGAATTCCCCACCGTGTCGATGGGTTTGGGCCCGCTC
AACGCCATCTACCAGGCACGGTTCAACCACTATCTGCATGACCGCGGTATCAAAGACACCTCCGATCAACACGTGTGGTGTTTTTTGGGCGACGGCGAGATGGACGAACCCGAGAGCCGT
GGGCTGGCCCACGTCGGCGCGCTGGAAGGCTTGGACAACTTGACCTTCGTGATCAACTGCAATCTGCAGCGACTCGACGGCCCGGTGCGCGGCAACGGCAAGATCATCCAGGAGCTGGAG
TCGTTCTTCCGCGGTGCCGGCTGGAACGTCATCAAGGTGGTGTGGGGCCGCGAATGGGATGCCCTGCTGCACGCCGACCGCGACGGTGCGCTGGTGAATTTAATGAATACAACACCCGAT
GGCGATTACCAGACCTATAAGGCCAACGACGGCGGCTACGTGCGTGACCACTTCTTCGGCCGCGACCCACGCACCAAGGCGCTGGTGGAGAACATGAGCGACCAGGATATCTGGAACCTC
AAACGGGGCGGCCACGATTACCGCAAGGTTTACGCCGCCTACCGCGCCGCCGTCGACCACAAGGGACAGCCGACGGTGATCCTGGCCAAGACCATCAAAGGCTACGCGCTGGGCAAGCAT
TTCGAAGGACGCAATGCCACCCACCAGATGAAAAAACTGACCCTGGAAGACCTTAAGGAGTTTCGTGACACGCAGCGGATTCCGGTCAGCGACGCCCAGCTTGAAGAGAATCCGTACCTG
CCGCCCTACTACCACCCCGGCCTCAACGCCCCGGAGATTCGTTACATGCTCGACCGGCGCCGGGCCCTCGGGGGCTTTGTTCCCGAGCGCAGGACCAAGTCCAAAGCGCTGACCCTGCCG
GGTCGCGACATCTACGCGCCGCTGAAAAAGGGCTCTGGGCACCAGGAGGTGGCCACCACCATGGCGACGGTGCGCACGTTCAAAGAAGTGTTGCGCGACAAGCAGATCGGGCCGCGGATA
GTCCCGATCATTCCCGACGAGGCCCGCACCTTCGGGATGGACTCCTGGTTCCCGTCGCTAAAGATCTATAACCGCAATGGCCAGCTGTATACCGCGGTTGACGCCGACCTGATGCTGGCC
TACAAGGAGAGCGAAGTCGGGCAGATCCTGCACGAGGGCATCAACGAAGCCGGGTCGGTGGGCTCGTTCATCGCGGCCGGCACCTCGTATGCGACGCACAACGAACCGATGATCCCCATT
TACATCTTCTACTCGATGTTCGGCTTCCAGCGCACCGGCGATAGCTTCTGGGCCGCGGCCGACCAGATGGCTCGAGGGTTCGTGCTCGGGGCCACCGCCGGGCGCACCACCCTGACCGGT
GAGGGCCTGCAACACGCCGACGGTCACTCGTTGCTGCTGGCCGCCACCAACCCGGCGGTGGTTGCCTACGACCCGGCCTTCGCCTACGAAATCGCCTACATCGTGGAAAGCGGACTGGCC
AGGATGTGCGGGGAGAACCCGGAGAACATCTTCTTCTACATCACCGTCTACAACGAGCCGTACGTGCAGCCGCCGGAGCCGGAGAACTTCGATCCCGAGGGCGTGCTGCGGGGTATCTAC
CGCTATCACGCGGCCACCGAGCAACGCACCAACAAGGCGCAGATCCTGGCCTCCGGGGTAGCGATGCCCGCGGCGCTGCGGGCAGCACAGATGCTGGCCGCCGAGTGGGATGTCGCCGCC
GACGTGTGGTCGGTGACCAGTTGGGGCGAGCTAAACCGCGACGGGGTGGCCATCGAGACCGAGAAGCTCCGCCACCCCGATCGGCCGGCGGGCGTGCCCTACGTGACGAGAGCGCTGGAG
AATGCTCGGGGCCCGGTGATCGCGGTGTCGGACTGGATGCGCGCGGTCCCCGAGCAGATCCGACCGTGGGTGCCGGGCACATACCTCACGTTGGGCACCGACGGGTTCGGCTTTTCCGAC
ACTCGGCCCGCCGCTCGCCGCTACGTCAACACCGACGCCGAATCCCAGGTGGTCGCGGTTTTGGAGGCGTTGGCGGGCGACGGCGAGATCGACCCATCGGTGCCGGTCGCGGCCGCCCGC
CAGTACCGGATCGACGACGTGGCGGCTGCGCCCGAGCAGACCACGGATCCCGGTCCCGGGGCCTAA
>Rv2264c_N1 Seen in 1 sample(s).
TCAGCTACCCGAAATCGCGCTGAACTGTCCGATCCCGGGAACCTGAGGTATTCCGGGAGGGGAGCTGCGGAATCTCCGGAATCGGTGGGATCGGCGGGATCGGTGGAGGGCTGGGGGACG
TGGTCGCCGGCGGCTGCGTGGTCGCCGGCGGCTGCGTGGTCGCCGGCGGCGCGGAAGCGGGGGTCGTCGGTGCCGGAGTGATGACATCGGTGGTCACCGCCGGTTGCGTATTCGTCGTTG
TCGGCGGAGGCGGCAACGGCTGCTGCAGCGGCGGCGCGGGCCCACCGGTGGCTGGCGCCTGTACGGGAGGTGCGGGCTCGGCAGTGGGGCCATCGGATGCCGGCGCTGGGGACGGGGGCG
GTGCGGCGGTCGTGGTCACACCCGGCCTCTGCGGGGTCCCCGGCGCCGTCGGCTGGTCGCCGGTGGACAACCCGATCGCCACGGCGGCACCCACCAGCAACACCGCCACCGTCGTGCCGG
TGATGATCACGGCCGGCAGGCGATACCACGGGATTGGCGGGGACTTGGGCTCGGGCTCCGCATGGGCATCGTGGTCGAAGCTCAGCGACGGGCGGGCCGCTGTGTAGCCAGGGGCCGGCC
CGATGTGGGAGTCCTCGTCGGCCTCCGACCAGGCCAAAGCGGGCTGCAGGACCGACGCCGGCGCATCGGCCGGCGCCGTCGCCGTCGCCGAGGTGACCGCGGTCAGCACCGTTGCGCTGG
TGTCGCCGGGTCTGCGTGCCGCCCACAACGCGCCGCCGAAAGCGGCCGTCAATTGCGGACGAGGCGTCCTGACCACCGGCACGCAGAAACGTCCGGACAGCGTCGTGGTGACTGCCGGGA
TATTTGCACCACCACCCACCGAAACGATCGCTACCAGCTCGGCCGTGCGAATTCCGCTGCGGGCCAGGGTTTGTTCCAAGGCCCTGCCCACGCTGTCCAGCGAGTCACGGATTGTGTCCT
CGAGCTCGTTGCGGGTCAACCGGATATCCCCGCCCAACGCGTCGGTCAGCGTGGTCACCGTGCTTGACGAAAGCCGTTCCTTGGCTTTGCGACATTCGATCCGCAGCTTAGTCAGTGAGC
CGATCGCCGAGGTGCCGGCTGGATCGAACGCGCCCGTGCCCGGTAGTTCGGACATGACGTAGCTCAACAGCGACTGATCGATCAGATCGCCGGAGAAAGCCTGATGGCGCACCGTCGCGG
CCACCGGCCGATACTCGTCTGCGGCGTCGACGAGCGTGATGCCGGTCCCGCTGCCACCGAAGTCGCATACCGCGACGATCCCACGGGCCGGTATGCCCGGGTCGGCCCGTATCGCGTACA
GCGCTGCCGCGGCGTCAGGGAGCAGTGACAGTGGCTGGGCCGTACTCGAAGTCCCGTGCGACCATTCCGAGGCCCGACGCAGCGCGCTATCCAACGCTGCTACCGCAGCCGGCCCCCAGT
GGGCGGGATAGGTCACCGTGACACTTCCGGGAAGAGCACGACCGCCGGTAGCGGTGTAGGCCAGCGCCAGCAGTGCGTCAGCCACTAGCGCCTCGCTGCGGTACACCGAGCCGTCGGCAG
CCACGATGCCGACCGAATCTCCCACCCGGTCTACGAAGTCGGTGATCACCAGGCCTGGCTCGTCCAGCCTCGGGTTCTCCGATGGCACACCGACCTCGGGCGGGCGCTGTCGATACAGCG
TCAGCACGGGTTTACGTGTGATGGAGTGATCGGCAGCCACAGCCGCTAGGTTGGTGACACCGATCGACAAGCCTAATGCCGGTCTCGCCCCTGTTGCCAT
>Rv2293c_N1 Seen in 1 sample(s).
TCAGCCAGGGTCCCGCCGCCTGGAAAAAGTTACCGGTATAGCCAAGTGAGCGATCGGGTGCACTACAGGGTTGGCAGCCAAAAACGCTGCCGCCGTTCGGGATGCAAGGAAAAGCCTGGC
CGTTGTTCTTGTCGGAGCTAGACCCGTCACCGCCGACGAACAGTTGCGGCTGGCGCCCCAGGTGGTTCAACCGGACGACCGGAACGTTCCTGCACAGACAGACAGGATTGCCGAGCGTGT
TGATGTTGTCCAGTACAACAGAAAGCGTCTGGGCAGTAGCCAGCATGCCGGGATCGACCCCACGGAATGTTGCCCCGTTGTCCAGGGTCCACCGTGCTGGTATTGCCACGTCCCCAATGC
TGGTGCGGCCGGCACCACCGGCGACGCCCGAGAACATCACGGCGGCAATGGCAATGGAAGAAGCACAGGTAAAGCGTGCGAAGGCGGTCTCGGTGGTGTTGGTAGCGTTCACTAGGCCGA
TGCCGGTCATCGCCACAATCACCTTCTTGCCGCTGATCGAGCCCAGGTAGTAGCGACGACGGTCGGCGACCACCACCGGGTTGGCGTCCAGCGCGGTGTGCGCCAGCACCGCGTCGGCCT
CAGCCGGAAACGCCGACAAGACCAGCGTGCGCTGTTCGCACGGGATCACATTTGCCACGTATCCGGGATCGGCCGCCGCCACGCCACAGCCCAGCGACAACGCGGCCGCCACCAAAAGAC
AGTGCCGCAAAGGCGCGCCCAC
>Rv2339_N1 Seen in 1 sample(s).
ATGGTGCCGGGCGAAGTCCACATGAGTGATACGCCGTCAGGCCCGCACCCAATCATCCCGCGGACGATTCGCCTGGCCGCGATTCCCATCTTGCTGTGTTGGCTGGGATTTACCGTTTTC
GTCAGCGTCGCCGTTCCTCCGTTGGAGGCGATCGGTGAAACCCGGGCCGTGGCAGTTGCCCCCGACGATGCGCAATCGATGCGTGCGATGCGACGTGCGGAAAGGTGTTCAACGAATTCG
ATTCCAATAGCATCGCGATGGTCGTCCTGGAAAGCGATCAACCACTAGGCGAGAAGGCCCATAGGTATTACGACCACCTGGTCGATACGCTCGTACTGGACCAGAGCCATATCCAGCACA
TTCAAGACTTTTGGCGTGATCCCCTGACGGCGGCGGGTGCGGTCAGCGCAGATGGTAAGGCGGCGTACGTTCAACTTTACCTCGCCGGCAACATGGGTGAAGCACTCGCAAACGAATCCG
TTGAAGCCGTCCGGAAAATTGTGGCGAATAGTACACCGCCGGAAGGCATCAGAACCTATGTCACCGGACCGGCGGCCTTGTTTGCCGACCAAATCGCCGCCGGTGACCGAAGCATGAAGC
TGATCACCGGATTAACGTTCGCGGTAATCACCGTGTTGCTGCTGCTCGTCTATCGCTCGATCGCCACCACGCTGCTGATTCTTCCCATGGTGTTTATTGGACTCGGCGCGACGCGTGGCA
CCATTGCCTTTCTTGGATACCACGGAATGGTCGGCCTTTCGACTTTTGTGGTCAATATCCTCACGGCACTTGCCATTGCTGCCGGTACAGACTACGCGATCTTCCTGGTCGGCCGCTATC
AAGAAGCCCGCCATATCGGCCAGAATCGCGAAGCCTCTTTCTACACGATGTACAGGGGCACCGCTAACGTCATTCTCGGATCGGGACTGACCATCGCCGGCGCAACATATTGTCTGAGTT
TCGCCCGGCTGACGCTGTTTCACACCATGGGGCCTCCGTTGGCAATAGGCATGCTGGTTTCGGTCGCGGCCGCGCTGACCCTGGCGCCCGCCATCATTGCCATCGCCGGCCGCTTCGGCT
TGCTCGACCCCAAGCGAAGACTGAAGACCAGGGGCTGGCGTCGTGTGGGTACCGCAGTCGTGCGCTGGCCCGGGCCAATTCTGGCCACGTCGGTCGCGCTTGCCCTGGTGGGATTGCTCG
CACTACCGGGCTACCGGCCCGGCTATAACGATCGCTACTACCTGCGCGCTGGCACGCCTGTCAACCGCGGGTATGCGGCCGCCGACCGGCACTTTGGCCCAGCCCGGATGAACCCCGAGA
TGCTGCTGGTCGAGAGCGATCAAGACATGCGAAATCCGGCCGGGATGCTCGTCATCGACAAGATCGCCAAGGAGGTCCTGCACGTGTCCGGGGTCGAGCGGGTGCAAGCGATCACCCGGC
CGCAGGGGGTGCCCCTTGAGCATGCGTCGATTCCCTTTCAGATCAGCATGATGGGTGCCACCCAGACGATGAGCCTGCCCTACATGCGCGAACGCATGGCCGATATGTTGACCATGAGCG
ACGAAATGCTGGTTGCGATCAATTCCATGGAACAGATGCTCGACTTGGTGCAGCAGCTCAACGACGTTACCCATGAGATGGCAGCCACGACGCGCGAGATCAAAGCTACTACCAGCGAAC
TGCGAGATCACCTTGCGGACATCGACGATTTCGTCAGGCCGTTGCGTAGCTATTTCTACTGGGAGCACCATTGCTTCGACATTCCGTTGTGCTCGGCGACGCGATCACTGTTTGACACCC
TAGACGGCGTCGACACGCTGACTGACCAATTGCGGGCCCTTACCGACGACATGAATAAGATGGAGGCGCTCACACCGCAATTTCTCGCACTGCTGCCGCCAATGATCACGACCATGAAGA
CCATGCGGACCATGATGTTGACCATGCGATCAACAATAAGTGGCGTACAAGATCAAATGGCCGATATGCAAGACCATGCGACTGCGATGGGGCAGGCCTTCGACACCGCAAAAAGCGGCG
ATTCATTCTATCTTCCTCCGGAAGCCTTCGATAATGCAGAATTCCAGCAAGGCATGAAGTTGTTTTTGTCGCCGAATGGTAAGGCGGTGCGCTTCGTAATTTCCCACGAGAGCGATCCAG
CAAGTACTGAAGGTATCGATCGCATCGAAGCGATAAGGGCCGCGACCAAAGATGCCATCAAGGCGACACCATTGCAAGGCGCTAAAATCTATATCGGTGGCACGGCTGCGACCTACCAAG
ACATTCGAGACGGTACCAAGTACGATATCCTCATCGTTGGTATAGCCGCGGTATGCCTGGTATTTATTGTCATGCTCATGATTACCCAGAGCCTGATTGCGTCACTCGTCATTGTTGGCA
CGGTACTTCTGTCATTGGGTACTGCGTTCGGACTGTCCGTGCTCATCTGGCAGCACTTTGTCGGTCTCCAGGTGCATTGGACGATCGTCGCGATGTCTGTCATCGTCTTGCTGGCCGTCG
GTTCTGACTACAACCTCCTTTTGGTGTCCCGGTTCAAGGAGGAGGTCGGCGCTGGATTAAAGACCGGGATCATCCGGGCGATGGCCGGCACCGGCGCAGTTGTCACGTCGGCCGGTCTGG
TATTCGCGTTCACCATGGCGTCCATGGCCGTCAGCGAACTCCGCGTTATCGGACAGGTCGGCACCACCATCGGGCTCGGTCTACTTTTCGATACCCTGGTGGTCCGATCGTTCATGACGC
CATCCATCGCAGCGCTGCTAGGTCGCTGGTTCTGGTGGCCGAACATGATCCACTCGAGACCCACCGTCCCGGAGGCGCACACACGCCAGGGCGCTCGCCGAATTCAGCCGCATCTGCACC
GGGGTTGA
>Rv2437_N1 Seen in 1 sample(s).
GTGCTGCAGCGGACCAACGTTGTCCAACCGCTGAATACTCTGCGCATGGTCTGGATTCAGGTTGCCGGCATAATCCCGGCGACGGCCGGGATCGCGGCCACGGTTTCGCCCAGCTTGCGA
TGGGCGATTCGTGGCGGATCGGGGTGGACGAGCAGGAGAACACCACTCTGGTGCGCACCGGCCCGTTTAAATGGGTGCGTCACCCCATCTACACGGCCATGATGGCGTTTGGCCTCGGGC
TGTTGCTGGTGACTCCGAATCTCGTTGCCCTCGCCGGGTTTATCCTGCTCGTTGCCACGCTCGAGGTGCATGTCCGCCGCGTCGAAGAACCCTACCTGTTGCGGACGCACAGTGCCGTCT
ACCGCGGCTACACCGCCAGCGTCGGCCGGTTCGTCCCGGGTGTGGGGTTGATCCGCTAG
>Rv2526_N1 Seen in 1 sample(s).
ATGACCGTAAAGAGGACCACGATTGAGCTGGACGAAGATCTTGTGCGGGCAGCCCAGGCCGTCACCGGGGACGCGAGCGACGGTCGAGCGCGCGCTGCAGCAGCTGGTGGCCGCGGCTGC
CGAGCAGGCCGCCGCGCGCCGGCGGCGGATCGTCGACCATCTCGCGCACGCCGGCACTCACGTGGACGCAGACGTGCTGCTCTCCGAGCAGGCGTGGCGATGA
>Rv2545_N1 Seen in 1 sample(s).
GTGAGCACCACCATCGTTGCTGGCGTGATCCAGGGTCACCTGCCGGTGATCCTGCCCACGCGCAGGCGGGCTCGCGATCTCGGGCACACGACGGCGTTTTTCGGGCGCAAACGCTCCAAT
GCATATATCTCAGTATCGAATACCTATATGTTTGCTCCATGTCTCGGCGTACAACGATCGACATCGATGACATACTGCTGGCCCGCGCGCAAGCGGCGCTCGGTACCACCGGGCTGAAGG
ACAGGGTCGATGCCGCTTTGCGAGCCGCGGTGCGCTAG
>Rv2587c_N1 Seen in 1 sample(s).
CTATCCCCGTCCCGTCCGAGCCATGGCCCGGCGTTCGCGTGCGACCTGCTGCACCGCTCCCAGGCCGTTGTATGCCGGCTTGGCCAGCAGCGACGATTTGGACGCCAGATACACCAACGG
CCACGTCACCAAGAACACCACGACGAGGTCCAGGATCGTGGTGAGGCCCAGGGTGAACGCGAACCCCTTCACCTGACCGATCGCCAGAAAGTACAGCACGGCAGCGGCCAGGAAAGTGAC
GGCGTTGCCCGACACGATCGTCTTGCGGGCACGCGCCCAACCGCGCGGCACTGCCGACCGGAACGAACGGCCTTCGCGGATCTCGTCTTTGATGCGTTCGAAGAACACCACGAACGAGTC
GGCGGTGGTCCCGATACCGATGATCAGGCCCGCAATACCAGCCAGATCTAGGGTGTAGTTGATATATCGGCCCAAGAGCACCAGGATCGCAAAAACCATTGAGCCAGAAGCCACTAGCGA
CAAGGCCGTGAGCAGTCCCAGCACTCGGTAGTAGAGCAGCGAATACACCAGCACCAACAGCAGGCCGATCGCACCCGCGATCATGCCCGCGCGCAGCGATGACAACCCCAAGGTCGCCGA
AACCGTTTGGGCTTCCGACGGTTCGAAGGACAGCGGCAGCGACCCGTACTTGAGGACGTTGGCGAGCTGGCGTGCGGTCGCCGCGGTGAATGGCGGATCCCCACCGCTGATCTGGGTTCG
GCCGCCGGGGATCGCTTCCTGGATCTGCGGTGCACTGACAACCTGCGAGTCCAGGGTGAACGCCGTCTGGGTGCCGATATGGGCGGCGGTGTAGTCGGCCCAGATGTTGGCCGCCGGACC
CTTGAACTGCAGGTCGACGACGTAGCCGATGCCGCGCTGGTCCATACCCGAGGTGGCGTTTTGGATCTGGTCGCCGCTGATGACCGACGGCGCCAGCAGGTACGCGGTCTTGTGGTCGGT
CGAGCAGGTCACCAACGGCAGTTTCGGGTCGTCGTTGCCGGCCAAAATGTCGTCGCTCTCGCAGCGGGTCGCCTGGAATTGCAGTGCAACCATCTGCATGTATTGGTTGGTGCTCTGCCG
CAGCTTCTTCTCCTGGGCGATGCGCTCGGCGAGATCCTTGCGCGGATCCGTGGCCGGCGCCTCAGCGGGCGGCGCCGGCGGCGGGCTGGCCGGTGAGGTCGGGTTGGGCGATGGCGCCGG
GTCCTGCGGATAGGGCCGCGGTTGGGCCCCAGGTTGCGGTGAAGCCGGCGCCCCCGATTGGGCTGGCGGCGGTGCGGCGGGTTGACCGGGCGGCTGCGGTTCGGCGCTGGGTGCCGGCTG
CGGTTCTTCGGCTGCGGGCTGCGCCGGCATCGAGTTGAGCACCGGCCGGATGTACAGCCGAGCGGTCTGTCCGAGGTTGCGTGCCTCGCTGCCGTCGTTGCCGGGCACCGTGATGACCAG
GTTGTCACCGTCGACGACCACCTCCGACCCGGACACTCCCAGCCCGTTGACCCGCGCGCTGATGATTTGCTGCGCCTGTGCCAGCGCTTCCCGGCTCGGGGCCGAGCCGTCCGGTGTGCG
CGCGGTCAGCGTGACCCTGGTGCCGCCCTGCAGGTCAATGCCGAGTTTGGGGGCGGTGTGCTTGTCCCCGGTGAAAAACACCAGCAAATAGATGCCGATCAGCATCACCAGGAACACCGA
CAGGTAACGGGCAGGGTGCACCGGCGCCGAAGACGATGCCAC
>Rv2848c_N1 Seen in 1 sample(s).
TTACGCTCGTGGAGTATTGCAGGCCGCATGTGCGACGAAACGCGCCACCGCACCGGGGTGTTGCGGCCGGATGGGTATGCAGGTAGGACGCGTGCACGCCGCTGTGCACCGCGCCGTCTC
GCACGTCGTCCACGTCTTGGCCCTGGTACACCCACGCGGGCTGATAGCTATCGGCGAATGTGACTGCGGTTCGGTGGAATTCATGTCCAACCACGCGCTCGCCGACGGAGTACAGCGCCG
AATCAACAACCGCGACGGCGTCGCGATAACCCAGCTTGAGATGCTGGGTGAACCGCGCCGATCCGGCCACCACACCGCACATCGGGTGTCCGTCGAGTTCAGAAACCAGATAGAGCAGGC
CGGCACATTCGGCATGCACCGGGGCGCCGGCAGCGGCCAGTTCGTTGATCTGCCGCCGGACGGTGTCGTTGGCGGACAACTCGGCGGTGAACTGCTCGGGGAATCCGCCGGGCAACACCA
CCGCGTCCGTACCCTCGGGCAGAGTTTCGCTGAGCGGGTCGAACTCGACCACTTCAGCCCCGGCGGCGCGCAACATCTCGGCGTGTTCGGCGTAGCCGAAGGTAAACGCCCTTCCGGCCG
CGATGGCAACCGTGGCTGGCTGGCGGGCGGTGTTGCCGACGGCAATCACCGGGTCCCATGGCGGGTGGGCCGCCTGGCTCCCGGCGCAGGCGATCACCGCGGCCAGATCGACGTGGCGAG
CGACCACAGCAGTCATCGCCTGCACGGCGAGCCGTGCGCGACGGCCGTACTCGACGGCGGTAACCAGACCCAGATACCTTGTCGGCAGCTCTAGTTCAGCTGTGCGTGGAATGGCGCCCA
AGACCGCGACACCGGCCTGGTCACACGCCTGTCGCAGCACCTGTTCATGTCGGGCCGATCCGACCCGGTTGAGGATGACACCGGCGATCCGAGTTGCGGTGTCGAACGTGGAAAAGCCGT
GCAGCAGTGCGGCAACGCTGTGACTCTGGCCGCGGGCATCGACCACCAGGATCACCGGGGCGCCAAGCAGAGCAGCGACGTGCGCGGTGGACCCCGCTGCGGGCGCGCCCCCGGCAGGCC
CAATGCGCCCGTCGAACAGCCCCAGCACCCCTTCGATCACGGCGATGTCCGCGCCCGCAACTCCATGCGCGTACAGGGGGCCGATAAGCCGCTCCCCCACCAGTACCGGGTCGAGATTGC
GGCCGGGCCGTCCCGCGGCCAGGGCGTGATAGCCGGGGTCGATAAAATCCGGGCCTACCTTAAACGGCGCGACGGTGTGACCGGCCTGCCGCAGCGCTCCGATCAAGCCCGTCGCGATCG
TGGTCTTACCGCTGCCCGACGCAGGCGCGGCGACGGCCACCGCGGATACCCGCAT
>Rv2902c_N1 Seen in 1 sample(s).
CTAAGTGCCCTGAGCGCGACCCGTGGCCCGCATTGTCGCTGGGTGGGAACTCTTGCTCCATCTTCCCTCACCCGTCTGTGCCGTCCCGTCCCGAGGGTCGGGTTGGCCGTCGGCGACCTC
TGCGGTGTTCGACCCACTCGCCACCCGGCGAACATTGATGAACGAGTAACGGTGCTGCGGGCAGGGTCCCAATCGGGCCAGCGCCCGGCTGTGCGCCGGGGTGCTGTAACCCTTGTGCTC
CGCGAAACCGTACCCGGGGTAATCGGCGTCCAACGCAACCATCACGCGGTCCCGGCTGACCTTGGCGAGCACGCTAGCCGCGGCGATGCAGGCGGCTGCCGCGTCGCCACCGATCACCGG
CAACGACGGCATCGGCAGTCCTGGCACGCGAAAGCCGTCGCTGAGCACATAACCGGGCCGCACCGCCAGACCGGCCACCGCGCGCCGCATACCTTCGATATTGGCCACGTGCACGCCGCG
GCGGTCGACCTCGGCCGACGGGATGAACACCACGTGATAGGCCACCGCATACCGGCAGATCAGCGGGAACAGCTTCTCCCGCGCTTGCTCGCTGAGCTTCTTCGAATCATCAAGGGCGGC
AAGACTTGCTATCCGCCCGGGGCCAAGCACGCAGGCCGCGACCACCAACGGGCCAGCGCAGGCGCCGCGACCCACTTCGTCGACCCCGGCCACCGGCCCCAGACCACCACGATGCAGCGC
GGACTCCAGGGTGCGCATTCCCCGCAAACCCCCAGATTTACGGATCACCGTCCGCGGTGGCCAGGTCTTGGTCAT
>Rv2947c_N1 Seen in 1 sample(s).
TCACCCACGGCACCATCGACGGCCGCGGCCCGCGGCCCCCGGTGCTTTCGCTCGCCTCAACCGGCGCCTCTGCGGGGGCTGGTACGGGGGCCTCTTCCAAGATCAGATGCGCGTTGGTGC
CGCTGATCCCAAAGGAGGACACCGCCGCCCGGCGCGGACGCCCGTCAACCGACCACTCCCTGGCCTCGGTCAACACCGACACCGCGCCGCTGGTCCAATCCACCCGCGGGGAAGGCTCAT
CCACATGCAACGTCGCCGGCATCACCCCATGACGCATCGCCTGCACCATCTTGATCACCCCGGCGACCCCCGCGGCGGCCTGGGTGTGGCCCATGTTCGACTTGATTGAGCCCACCCACA
GCGGCTGCTCCGCTGGACCTCCCTGCCCGTAGGTGGACAGCAATGCCTGCGCTTCGATGGGATCACCCAACGTGGTGGCGGTCCCGTGTGCCTCCACCACGTCTACGTCTGCGGCGGACA
ACCCGGCGTTGGCCAACGCCGCCTGGATCACTCGCTGCTGGGCGAGCCCATTGGGCGCGGTCAGCCCATTGGACGCACCATCCTGGTTGACCGCGCTCCCCCGCACCACCGCCAGCACCG
AATGCCCCAACCGCCGGGCGTCCGATAGCCGCTCCAGCACAACCACCCCGGCGCCCTCGCCCCACCCGGTGCCGTCGGCCGCGGCCGCAAACGCCTTACATCGCCCATCGGCAGCCAACC
CCCGCTGCCGGGAAAACCCCACAAAAATCGACGGCAGCCCCATCACCGTCACCCCACCGGCCAACGCCAAATCACACTCCCCGGAGCGCAATGACGACATCGCCCAATGGATCGCCACCA
ACGACGACGAACAAGCGGTATCCACTGACACCGCCGGGCCCTGCAGCCCCAATACGTACGACACACGTCCCGAGGCCACGCTGATTGACGTGCCGGTCAACCCGTACCCTTGCAGCCCCC
CGGTATCCCTATTGCCGTAACTCGCCGCGAAAATGCCGGTGTACACCCCGGTCGCCGAACCACGCAACGACAACGGGTCAATCCCCGCGTGCTCCAACGCCTCCCACGAAACCTCCAGCA
TCAACCGCTGCTGAGGATCCATCGCCAACACTTCACTAGGAGCGATGCCGAAGAACCCGGCGTCAAAGCCGGTGGCGTCGTCTAGAAATGCCCCCCATCGCGTGTAGGTTTTGCCCTCAG
CGTCGGGATCCGGATCGTATAGCCCCTCAACATCCCAGCCCCGATCGGTCGGAAACTCCGACACCACGTCGCGCCCCGCCGAAACGACATCCCAGAGTCCGTCCGGGCCATCCACGCCGC
CCGGAAATCGGCAGCCGATTCCCACCACCGCCACCGGTTCTGTCGCGCGTTGCTCATATTCACGCAGCCGAGCGCGTGTCTCATCGAGCTCGACAGCAACCTTCTTTAGGTAGTGAAAAA
GCTTTTCGCTCTGCTGGTCGGCACCTTCAACGCTCATCGTCCGTTGCTCCTCTATCAC
>Rv2975c_N1 Seen in 1 sample(s).
TCAACGCGCCGGCCGCGAGAGCGGCCGCAACCCGCGCCACGTCTTCGGCGTCAGCCTGCGAATTCGCGTGCAAATCAGCTTCTACGACCGCGGCACGCATGGTGAACAGCATGTTGACGC
CGGTATCGGAGTCAGCGACCGGGAACACATTGAGCCGGTTGATCTCGTCGATGTGGAGGATCAGATCGCTGACGACGGCGTGTGCCCAGTCCCGCAAGGCCGAGGCGTCCAACGGCCGAT
CCGCCGTCCCCAC
>Rv3091_N1 Seen in 1 sample(s).
ATGCCGATCCCCTTTGCCGATGGGATGCTCAGCCGGCTGGGTCGCCGCGGGGCAGCGCTCGACCTGATCGAGGAGTTCGAGGACGAGTCCGGGGAGCCCCCCGCATCCCTGAGCCCCGCC
GACCTGCTGGCCGCCGAACCGGCCCTGCTGCTGCAGAAGATGGAGAACCGCCTCGTCCGGCACCACCTAGCCAATCCGGACGTGTTGAGCGGCGAACAGCTGCGCAAGCTGCGCTACATC
CTCAATTTCGCCAGGCTGGCCGACTTCGAACCGGGGGCCGCGGGGCCGGGCGGAAGCCGCGGTCGCGGGGACATCTCGGTGGGCGGCCAAGTCGCGCCTTGGCGGTCCCGGGTCGTCGAC
GCGTTGTACGCACCGCTGCGCGAGGAGCCCGATCCGGTCACGGCGCTGGAGGGCGCGAAAGACGTGCTGGCGACGCTGGTCGACGACCAGGACGATCAGCGTCGAGTGCTCATCGAGCGC
CACGGCAGCGACTTCTCCGCGACGGAACTCGACGCCGAGGTCGGCTACAAGAAGCTGGTGACCGTCCTCGGCGGCGGCGGGGGCGCGGGCTTCGTCTACATCGGCGGCATGCAACGGCTG
CTGGCGGCCGGCCAGGTGCCCGACTACATGATCGGCTCGTCGTTCGGGTCGATCATCGGCAGCCTGGTGGCCCGTGAACTGCCGGTGCCGATCGACGAGTACGCCGAGTGGGCCAAAACG
GTGTCCTACCGCGCCATCCTGGGCCCGGAGCGGCGGCGCAGCCGCCACGGGTTGGCCGGAATGTTCACCCTGCGCTTCGACCAGTTCGCCCATACCCTGCTCAGCCGTGCGGACGGCGAA
CGGATGCGCATGTCGGATCTGGCAATCCCGTTCGATGTCGTCGTCGCCGGTGTGCGCAGGCAGCCTTATGCGGCGCTGCCGTCCAGGTTCCGCCATCGCGAGCGGTCTACACTGACGTTG
CGGTCGCTGCCGTTTCTGCCGATCGGTATCGGCCCGTGGGTGGCGGCACGCATGTGGCAAGTCGCGGCCTTCATCGACTTGCGGGTGGTCAAGCCGATCGTCATCAGCGCCGACGGCGCG
ACACGCGACGTCAACGTCGTTGACGCGGCGTCTTTCTCGTCGGCCATCCCCGGTGTGCTGCACCACGAAACCAGCGACCCGCGGATGCTGCCAATCCTCGACGAGTTGTGCGCCGACCAG
GACGTCGCGGCGATGGTCGACGGCGGCGCGGCCAGCAACGTCCCGGTCGAATTGGCGTGGGAGCGGGTCCGCGACGGGCGGCTCGGCACCCGCAACGCATGTTATCTGGCGTTCGACTGC
TTCCATCCGCACTGGGACCCCCGACATCTGTGGCTGGTACCGATCACCCAGGCGGTCCAGCTGCAGATGGTGCGCAACCTGCCCTACGCCGACCACCTCGTCCGATTCGAGCCGACGCTG
TCGCCGGTGAACCTGGCGCCGTCCGCGGCGGCCATCGACCGGGCTTGCCGGTGGGGGCGCGACAGCGTCGAACCGGCGATTGCGGTGACATCGGCGCTGCTGGAGCCGACGTGGTGGGAA
GGCGACAGGCCCCCCGCCGCCGAACCCAAGGAACGCACAAAGTCGGCGGCCTCGTCGATGAGCGCCGTGATGGCCGCGATTCAGGCGCCGACGGGCCGGTTTCGGCGATGGCGAAGCCGC
CACCTGACCTAG
>Rv3197_N1 Seen in 1 sample(s).
ATGGATGATGGGAGTGTGTCAGATATCAAACGGGGCCGCGCCGCGCGCAATGCGAAGCTGGCCAGCATCCCGGTCGGCTTCGCCGGTCGGGCGGCGCTCGGGCTCGGCAAGCGACTGACC
GGTAAGTCAAAAGACGAGGTTACCGCCGAGCTGATGGAGAAGGCCGCCAATCAGTTGTTTACCGTCCTCGGCGAACTCAAGGGTGGCGCGATGAAGGTCGGCCAGGCGCTGTCGGTGATG
GAGGCCGCCATTCCCGACGAGTTCGGCGAACCCTACCGGGAAGCACTGACCAAGCTGCAGAAGGACGCCCCACCGCTGCCCGCCAGTAAGGTGCACCGGGTGCTCGACGGACAGCTGGGC
ACCAAATGGCGGGAGCGGTTCAGCTCGTTCAACGACACCCCAGTGGCATCTGCCAGCATCGGCCAGGTGCACAAAGCAATCTGGTCGGACGGCCGAGAAGTGGCCGTCAAGATCCAGTAT
CCCGGCGCCGACGAGGCGCTGCGCGCGGACCTCAAGACCATGCAGCGCATGGTCGGCGTGCTCAAACAGCTCTCACCCGGCGCCGACGTCCAAGGGGTGGTCGACGAACTGGTTGAACGC
ACCGAAATGGAACTCGACTACCGGCTGGAGGCCGCCAACCAGCGCGCCTTCGCCAAGGCGTACCACGACCACCCGCGCTTCCAGGTGCCTCACGTCGTGGCAAGCGCACCGAAGGTGGTG
ATCCAGGAGTGGATCGAAGGTGTGCCGATGGCAGAGATCATCCGTCACGGGACCACCGAGCAGCGTGATCTGATCGGTACGCTGCTCGCCGAGCTCACCTTCGACGCACCACGGCGGCTG
GGGTTGATGCACGGCGACGCCCACCCCGGTAATTTCATGCTGCTGCCCGACGGCCGGATGGGCATCATCGACTTCGGTGCCGTGGCACCGATGCCCGGCGGCTTCCCGATAGAGCTCGGG
ATGACGATTCGACTGGCCCGCGAGAAGAACTACGACCTCCTGTTGCCGACGATGGAGAAGGCCGGGTTGATCCAGCGAGGACGACAGGTGTCGGTTCGCGAGATCGACGAGATGCTGCGC
CAATACGTCGAGCCCATCCAGGTCGAGGTCTTCCACTACACCCGCAAGTGGTTACAGAAAATGACCGTCAGTCAGATCGACCGCTCGGTTGCGCAGATCAGAACGGCGCGCCAGATGGAC
CTGCCGGCCAAGCTCGCGATTCCGATGCGGGTTATCGCATCGGTGGGCGCGATCCTATGCCAGCTGGACGCGCATGTGCCGATCAAGGCCCTGTCGGAGGAGCTGATCCCGGGTTTCGCC
GAGCCCGACGCGATCGTCGTCTGA
>Rv3234c_N1 Seen in 1 sample(s).
TCAGCACCACGTCGTGGACGTCACAGTCGTAGCGAGCCCGCACCGTGCGATAGTCATCAAGACTTGCACGGGCAACCGTAAATCGCCGATTACGCGACACGGTGGCATTGAGCGGGCTAC
TGGGCGCGGTGCCCCGTGCCACCGTGCGGGCGATATCGAGAACCTTGCGGCCCGTCTCGACGAGTTGGCCGGAATTCGTTACCAACCCGGCGACCGCGGATCCGACGGCCTGTAGTTGTG
CGCCCGGCCGCACCAGCCAGTCCCCGACCGCGCGCAGCAGCAACCGCGTGGTGCCGGGGTCCCGTTCCGGGACCCAGATGTCTTCCGGAAACGCCGGTGGACGCCGCGTCCGGTCGGCGA
TCACGTGGCCTATCGCCAGCGCGGTCACCCCGTTGATCAGGGCTTGGTGCGACTTGGTGTAGAGGGCAATGCGATTCTTTTCCAGACCCTCGACGAGATACATCTCCCACAATGGCCGCG
ATTTGTCCAGCGGCCGAGCGGCCAGCCGTGCGATCAGCTCGTGCAGTTGCTCGTCACTACCCGGCGACGGCAGGGCCGACCGCCGGACGTGGTAGGTGATGTCGAAGTCGCGATCGTCGA
TCCACACCGGCCTGGCCAGGCCCAATTTCACTTCCTGGACTTTCTGACGATAGCGCGGTATCTGCGGCAGCCGCTGTTCGACGGTTTCCAGCAGTGCCTCGTAGCTCAATCCGGCACGCG
GACGGCGCAGGATCAACAGCAACCCGACATACATTGGGGTGGCTGTGTTCTCCAGCTGATAGAAGGAGGCGTCCGATGCAGACAACCGGGTGACCAC
>Rv3253c_N1 Seen in 1 sample(s).
TCAACACCTCCGGGTCGCGCTCTCTCGAGCTTGCCGAAGGCCCTGCGCCGAGTGCCGGCGCCCGTAGCCGACATAAATCGCGGTTCCGGCCACCAGCCAGATCCCGAACCGGATCCAAGT
CAACGCGGTGAGGTTCAGCATCAGCCACAGGCACGCGCACACTGCGGCGATCGGAAGTAACGGCACCCACGGAGCTGTGAACCCCCGCTGAAGGTCGGGTCGGGTCCGGCGCAGCACGAC
CACTCCGGCCGAGACGAGGATGAACGCGAACAGTGTCCCGACGTTGACCATCTCCTCAAGCTTGGTGATCGGAAACACCGACGCCGTCGTGGCCACCAACACCGCGACCAGCACCGTGAC
CCGGACCGGGGTGCCGCGCGAACCGGTCTTGGCCAATTGCCGCGGCACCAAGCCGTCGCGCGCCATGGCGAACAGCACGCGGCATTACCCGAGCATCAACACCATCACCACCGTGGTAAG
CCCGGCCAGCGCGCCGACGGAGATGATGCCGCTGGCCCAGTACACCCCGTTGGCCTGGAACGCGGTGGCCAGATTTGCCGGCCCGCGGCCCGGTACGGTCCGCAGTTGGGTGTATGGAAC
CATGCCCGACAGCACCACCGATACCGCGACGTAGAGAAGGGTCACGACCCCCAGCGACGCGAGAATCCCTCGAGGGACGTCTCGTTGAGGACGCTTGGTCTCCTCGGCCATGGTGGCCAC
GATGTCAAACCCGATAAACGCGAAGAACACGATCGATGCCCCGGCCAGCACGCCGTACCATCCGTAGTGGCTGCCTTGGGCTCCGGTCAGCAACGAGAAGACGGATTGATCGAGCCCGCC
GCCGTGGTGCTGGACTTCGGGCTCGGGAATGAACGGCGAGTAGTTGGCGGCCCTGATGTAGAAGGCACCGACGACCACCACCAAGACGACCACCGACACCTTGATTGCGGTGACCACCGC
GGAAAATCTCGACGACAATTTGGTGCCCAACGCGATCAGGGTCGCCACCAACGTGACGATCACGAGCGCACCCCAGTCGAGCTGCAGCGATCCGAGATGGCCTGTGCCATTACCGAATCC
GAACACGGTGCCCAAGTAGCTGGACCAGCCTTTGGCGACCACGGCCGCACCCATCGCCAGTTCCAGCACCAGATTCCAGCCGATCACCCAGGCCAAGAACTCCCCGAAGGTGGCATAAGA
GAAGGTATAGGCGCTGCCGGCCACCGGCAGCGTCGAGGCGAACTCGGCGTAGCACAGCGCGGCCAGCGCACAGGTCGCCGCCGCGATCAGAAACGATATCCAGATGGCCGGGCCGGTGAT
ATCGCCAGCGGTCGACGCGGTAACCGTGAATATTCCGGCGCCAATCACCACCGAGACGCCGAAAACAACCAGGTCCCACCAGGTGAGGTCCTTGCGCAGCCGAGTGGTGGGCTCGTCGGT
GTCGGCGATTGACTGTTCTACCGACTTCATGCGCCGTCGACCGGCCAT
>Rv3725_N1 Seen in 1 sample(s).
ATGCAGAATGCCACCATGCGCGTTCTGGTCACCGGCGGTACGGGATTTGTGGGCGGGTGGACTGCCAAAGCCATCGCTGACGCGGGCCACTCCGTCCGGTTCCTGGTGCGAAATCCCGCA
CGGCTGAAGACGTCTGTCGCGAAACTGGGCGTCGACGTGTCGGACTTTGCGGTTGCAGACATATCCGACCGCGATTCGGTACGGGAGGCGTTGAACGGATGCGACGCCGTCGTGCACAGC
GCCGCGCTGGTGGCAACCGACCCGCGTGAGACTTCGCGGATGCTGAGTACGAACATGGCGGGCGCCCAAAATGTTCTCGGTCAAGCCGTCGAGCTCGGAATGGATCCGATCGTGCATGTG
TCGAGCTTCACGGCGCTGTTTCGTCCCAACTTGGCGACGCTGAGCGCTGATCTGCCGGTTGCCGGTGGGACGGATGGATACGGACAATCCAAAGCGCAGATCGAAATCTATGCGCGCGGT
CTTCAGGACGCCGGCGCACCGGTGAACATCACTTATCCTGGCATGCTCCTCGGCCCGCCGGTGGGCGATCAATTCGGTGAAGCCGGGGAGGGTGTCCGGTCCGCATTGTGGATGCATGTC
ATTCCCGGGCGCGGCGCGGCGTGGTTGATCGTCGACGTCCGAGATGTGGCGGCACTGCACGCGGCGTTGTTGGAATCCGGGCGTGGGCCGCGCCGCTACACTGCGGGAGGTCATCGGATT
CCGGTGCCCGAGCTCGCGAAAATTCTGGGCGAGGTCGCCGGCACCACGATGCTGGCCGTCCCGGTGCCCGATTCCGCGCTGCGTGTCGCGGGATCGGTGCTGGATCAAGCCGGGCCCTAT
CTGCCTTTCAATACTCCGTTCACCGCGGCAGGTATGCAGTACTACACACAGATGCCGGAGTCCGACGATTCGCCGAGCGAAAAAGAACTAG
>Rv3736_N1 Seen in 1 sample(s).
ATGTCCGTGGTGCGCGGGACCGCTCTGGCTAACTACCCGAGCCTAGTTGCCGGGTTGGGCGGTGACCCGGCCACTCTGCTACGGGCCGCGGGTGTTCGGGATCAGGATGTCGGCAACTAT
GACGCGTTCATTTCGATCCGGGCAGCGATTCGGGCAATCGAATCGGCCGCAGCGGTCACCGCCACAATGGATTTCGGGAGACGATTGGCACAGCGGCAAGGGATTGAGATCCTGGGACCG
GTCGGTGTGGCGGCCCGCACGGCCGCCACGGTCGGTGACGCTCTGGCGATCTTCAACACCTTCATGGCGGCCTACAGCCCAGTTATCGCCATCCGGATCACGCCGCTGGCCGGACAGCGG
TCATTTATTGCACTCGAGTTCCTGCTCGACGAGCCGGCGTCGTATCCGCAGACCATGGAGCTGGCGCTCGGGGTGGCGCTCGGGGTGATCCGGTTGTTGTTGGGCGCTGACTACGCCCCA
CTGGCCGTGCACTTACCCCACGACCCACTCACACCCGAAGCCTTCTACCTGCAGTACTTCGGCTGCCGGCCTTACTTCGCCGAACGTGTTGGTGGTTTCACCATGCGCACCGCGGACCTG
AGCCGTCCCCTCAACCGCGACGATGTCGCCCACCGGGTGGTCGTCGACTACCTGAGCAGCATCACGCCGCTGGGCGAGGGGATCGTGGAATCGGTGCGCACCATCGTGCGCCAGCTGCTG
CCCACCGGAGCGGCGACGCTCAACGTGGTCGCCGGGCAGTTCCACCTGCACCCGAAAACGCTGCAACGTCGACTTGCGGAGGAGAACACCACATTCGTTATTCTGGTCGATCGGGTCCGC
AAGGATGTCGCCGATCGCTACCTAAGGACCACCGGGATCGGCCTTACCCATTTGGCACGTGAACTGGGCTACGCCGAACAAAGCGTGTTGACCCGCTCGTGCAAACGCTGGTTCGGAACC
GGACCGGCCGCCTACCGCAACCAGGCCAGGTTACAGACAACCGTGAGCGCACCTGGCAGCGGGCGTGGTCCGAATCCAGGTAACGTCTCAGTATCCTGCTGA
>Rv3825c_N1 Seen in 1 sample(s).
TTACTGTGACGACAGCGCGGCAGCAGGAGCGTCGTCGGGCGCCAGCTGTTCATACAAGTGATCCGCTAAGCCCCGCACCGTGGCGCTGACGTTCTTGGGTGCCAACCGGATTCCGGTCTC
GGTCTCGATCCGAGTGCGCAGCTCTAGTGCGCCCAACGAATCAAGTCCATACTCGGGTAGCGGGCGGTCAGGGTCGACGGTGCGCCGCAGAATCAGGCTGACCTGCTCGGCGACCAGCTG
CCGAAGCCGCGCCGGCCACTCGTCGCGTGGCAGCTCGTTCAGCTCGACGCGGAATTTGCTTGTGCCCGAACCGTTGCTGCTGGAGAACACTTCGAAAAACCGGCTGCGCTCTGCGAAGGC
GACCAGCCACGGGGCTCCGATGACCGGGGCATAGCCGGTATAGACGCGGTTGTGGCGCAATAGCGCCTCGAACGCGTAAGCACCTTCGTCGGGAGTGATCGCCGTGTAGTTGCTTTCCTC
CAATGCCGAAGCCCGCGCGGGCGATGCCGACCACCACCCCAACTGGCCGATATCCGACCAGGCTCCCCACGCGATCGCGGTAGCCGGCAGGCCCTGAGCTTGCCGCCAATGCGCGAAGGC
GTCCAGCCAGCTGTTGGCCGCTGAGTAGGCACTCTGTCCCGGCGAGCCGGTGAGAGCTGCCGCCGACGAAAACAAGCAGAACCAGTCAAGCGGCTGTCCGCTGGTTGCTTCATGCAACTC
CCAGGCACCGTGAACCTTTGGCGCCCAGTCGCGCGCCAGCAACTCGTCGGTGATATTGGCCAAGGTGGCGTCCTCGACCACCGCGGCCGCGTGTAGCACGCCTCGTACCGGAAGCCCGGT
GGCCACAGCGGTCGCCACCAACCGCTCCGCGGTACCCGGTTGGGCGATGTCACCGCATTCCACCACGACTTCAGAGCCCATCGCCGCGATGGCCTCGATCGTTTCCCTCATCTTTTGCGT
CGGCTGGGTGCGGGAATTCAGCACGATCCGGCCGCAACCGGCCGCGGCCATCTTCTCGGCCAGGAACAGCCCTAGCCCACCGAGGCCGCCGGTGATGATGTAGGAGCCGTCGGGACGGAA
CACCTGAGCTTGTTCCGGAGGCAGGGTAACGAGGCTTTTTCCGGTCTGTGGGATGTGGAGGACGAGTTTGCCGGTGTGCTCGGCGTTGCCCATCACACGGATGGCGGTGGCCGCCTCGAC
GAGGGGGTAATGGGTGCTCTGCGGCATCGGCAACTCGCCGGCTGCGGTCAAGCGATAGACCGTGCCGAGCAGGTCGCGCAGCTCTTCTGGGTGTGTCGCAGACAGCAACCCCAGGTCTAC
GGCGTAGAAGGACAGGTTGCGCCGGAAGGGAAAGAGCCCCAGCTTGGTGTCACCATAGATGTCGCGCTTGCCAATCTCGACGAACCGTCCCCGGAAGGCGAGCAGTTTCAGCCCGGCAAG
TTGCGCGGCGCCGGTCACCGAGTTGAGCACGACATCGACACCCCGGCCGTTAGTGTCCCGCCGAATCTGCTCGGCGAACTCGATGCTGCGCGAGTCATAGACATGCTCAATACCCATGTT
GCGCAATAGCTCTCGACGCTGTGGGGTACCGGCGGTGGCGAAGATCTCAGCGCCCGCCGCGCGGGCTATAGCGATCGCCGCTTGTCCGACCCCGCCGGTGCCGGAGTGAATTAGCACCGT
GTCACCCGCCCTAATCCGGGCGAGCTCATGCAGTCCGTACCAGGCGGTGGCGTGCGCGGTGGTCACCGCAGCGGCCTGTGCGTCACCCAGGCCCGGTGGCAGCGTCGCGGCCAGCCGAGC
GTCACACGTGACGAATGTGCCCCAGCAGCCGTTAGGCGACATGCCACCAACATGGTCACCAACCTTGTGGTCAGTGACGCCTGGTCCGACCGCGGTCACCACGCCGGCGAAATCCGTGCC
CAGCTGGGGCAGGTGTCCCTCGAAGCTGGGGTAGCGACCGAAAGCGATGAGTACATCGGCAAAGTTGACGCTGGACGCACGGACCGCAACCTCGATCTGTCCTGGTCCTGGTGGAACGCG
GTGAAACGCGGCCAGCTCTATCGTTTGCATATCGCCGGGGGTACGGATCTGCAGGCGCATGCCGCTCTGCTGATGATCCGCGACGATGGTGCGCCGCTCCTGAGGACGCAACGGGGTCGG
ACACAAGCGCGCCACGTACCACTCGTTGTCTCGCCAGGCGGTCTCGTCTTCTTCCGACGTGGCCAGCAATTGGCGTGCCAGCTGCTCGACACCGGTCTGTTCGTCCACGTCGATCTGGGT
GGCACGCAGGTGAGGGTGCTCGGCGCCGATCGTCCGCAGTAGACCACGCAGCCCGCCCTGCTCAAGATTGACGCAGTCGTCGGCCAGCACCCGCTGGGCACCCCGCGTCACGACGTACAT
GCGCGGCACCGCCCCGGGAAGGTCTGACAATTCGCGAGCGATACCCACCAGCCGGCGAACGTACTCAGCGCCGCGATCCGCGCTCCCCTGATGCGGCGTACCGGTGTTCGACCCGGTGAG
CACGACCACGCCGCTAAACTCGTCGCTACCAACTTGATCGCGTAGCTGGTCGGCGGCGGCCAACTGGTCGTCGTGCAGTGGCCACCGCATCGTCGTGCACGCCGCGCTGTGTTCCCTAAA
CGCGTCCGCTAGCCGGGTAGCGGTCACATCAGAGGCAGCGCAGTCACTGATCAGCAGCCATTTTCCAGCGCCAGAGGGGTCCATCTCGGGCAGCTCACGCTGGTGCCATTCGATGGTGAG
TAAGCGCTCATTCAGCACCCGATTGTGTTTATCGCGCTCGGACACTCCCGTACCGATTCGCAGTCCGCACACGGCCAGCAACACCGTGCCGTGCGCGTCCAGCACGTCGATATCGGCCTC
GACGCCGACCAACTCGACTTTGGTCACCCGCGTGTAGCAATAGCGAGCGGTACGCACCGGAGCATAGGCACGGACTCGGCGCACCCCCAACGGCACCAATAGGCCGCTACCTACCGACTG
GCTATCGGGATGCGCGCCGACCGACTGGAAACAGGCATCCAGGAGGGCCGGGTGGATTGCGTACAGGCCCTGCTGCGAACGAATCGAGCCGGGCAGCGCGACTTCGGCCAGCATTGTGGC
GGTCGCATCCTCCGCGACATAGGCCACGGCCAGGCCGGTGAAGGCCGGACCATATTGCACACCGTGCTTGTCGAATTGCCGGCGCAGATCCTCACCGTCCACGCGGCAAGGGTGGGCTTC
CAATAAGGAGGCCATGTCGTACGCCGGCGGCTCGCATTCGCCGGATACCTGCTGCAGCACCGCCGACGCACGCCGCAAGTGATGCCCAACGCCTTCCTGCAAGGCCTCGACGGCGAAGTC
GACGACACCGGGCGAGGTCACCGTTGCCACGGTGGACACCGGGGTCTGGTCATCCAGCAGCAGCATCGCCTCAAAGCGCATGTCGCGTACTTCGGACTGCTCGCCGAGGACGGCACGGGC
CGCAGACAACGCCATCTCGCAGTAGGCGGCCCCTGGAAGAGCAGCCACGTTGTGTATCCGGTGATCGCCCAACCAGGGCAAGGTTGCGGTACCAACATCGGCCTGCCAGGCGTGGCGTTC
CGGCTCTTCGGGCAATCGCACGTGTGCGCCCAACAACGGGTGCACGGCTACCGTGGAGCCACCCGGCGACCGATTGTCAACGCCTTCGCGGTCATAGAACAGGAACCGGTGCGACCACGC
CGGCAGCGGAGCATCGACCAAGCGGCCTTGGGGACAGAGCACCGAGAAGTCCACTGCCGCACCAGCGTTGTGCAGATCCGTCAGCAGGCGACGGAGCCCCAGCGGCAATGGCTGCTCCCG
CCGCATACCGGCCAGCGCGGCAACCGGCATGCCTACACTGCCGGCAATCTGATCGACCGCGTGGGTCAGCAGCGGGTGCGGCGAAAGCTCGGCGAAGACTCGGTACCCGTCGTCGAGCGC
CGAGCGCACCGCAGCGGAGAACCGCACGGTGTGGCGCAAATTGTCGGCCCAGTAACGCGCGTCGCACGCCGGCGCTTCGCGCGGGTCGAAAAGCGTCGCCGAATAGTAGGGAATCTCAGG
AGCTTTCGGATTCAGGTCGGCCAGCGCAGCTATCAACTCGTCGAGGATCGGATCCACCTGCGGCGAATGCGAAGCCACGTCGACGGCCACCGCCCGCGCCAGCACGTCTCGCCGCTCCCA
TATGTCGACCAGCTTGCGCACCGACTCGGTGCCTCCGGCGATCACGGTGGACTGCGGCGCGGTCACCACGGCGACCACCACATCGTCGATGCCTAGAGCGGTCAATTCCGACTGCACAGC
TAAGGCAGGCAACTCCACCGACGCCATCGCCGCGGAACCGGCGATCGTCGCCATCAGTTTTGATCGTCGGCAGATGACGCGTACCCCATCTTCGGCTGACAGCACCCCTGCGACCACAGC
CGCGGCCGACTCACCCATTGAGTGGCCGATCACGGCGCCCGGGCGCACTCCGTATGCCGCCATCGTGGCTGCCAACGCGACCTGCATCGCGAAGATGGTCGGCTGAACTCTGTCGATGCC
AGTCACGGTCTCGGGCGCCGTCATCGCCTCGGTGACCGAGAACCCGGACTCCGCGGCGATCAATGGCTCTAGCTCCGCAACGGTCGCGGCGAACACCGATTCGTTCGTCAGCAGATCGGC
GCCCATCGCTGCCCACTGCGACCCTTGCCCGGAGAATAACCAGACCGGCCCGCGGTCATCCTGCCCCACCGCGGGCTGGTAAACGGTGTCACCGTCGGCGACCTCGCCCAAGCCGGCAAT
CAGCTCGTCGACGCTGCTCGCGATGACCGCCGTGCGCACCGACCGGTGCGTACGCCGCCGCGCCAGCGTGTACGCAAGATCCGAGAGCACCAGGGAGTCGGCGTGCTGCTGTATCCAGTC
GGTCAACCGCTGAGCAGTCTGCCGCAGCGCGTCGGCCGAGGAAGCGGACAGCGTGAACAAGGCAGGGGTGCCGGTCGGGGGGGTGCTCGCCGCGTGGGGCTGGGCTTCGGTTTGCGGAGC
TTGCTCCACAACAGCGTGCACGTTCGTTCCCGAGAACCCATAAGACGACACTGCCGCCCGCCGGGGCACCTGACGACCGTTGGTGGGCCACGGTGTGGTCACCTCGGGCACGAAGAGGTT
GGTGGTGATGCCAGCAATCTCATCGGGCAGCCGAGTGAAGTGCAGATTACGTGGAACCACACCATGTTTCAGAGCGAGAACCACCTTGATTAGCCCTAGCACCCCGGCGGTCGACTGGGT
GTGTCCGAAGTTGGTCTTCACCGATGCGAGTGCGCACGGGCCGTCGACCCCATACACCTCGGAGACACTTGCATATTCAATGGGGTCACCGATCGGGGTGCCGGGGCCGTGCGCTTCGAC
CATGCCGACCGTCGCGGCGTCCACGCCACCGGCAGCCAACGCCGCTCGATAAGCCGCAACCTGTGCGGGCTGCGAAGGCGTCGCGATATTGACCGTGTGGCCATCCTGATTTGCGGACGT
GCCACGAATTACCGCCAGGATCCGGTCACCGTCGGCCAATGCATCCGGCAACCGCTTGAGCACCACCACGGCACAACCCTCGCCTGACACGAACCCGTCAGCCGCGACATCGAACGCGCG
ACAACGTCCGGTCGGGGACAACATGCCCAAAGCGGATCCAGCAGCGGCCTTGCGTGGCTCCAGCATCAAGGCGACACCCCCCGCCAAGGCAACGTCGCTTTCACCCTCGTGCAGGCTGCG
ACACGCCATGTGCACGGCCGTCAGGCCGGACGAGCATGCGGTATCAACGGTTATTGCCGGACCGTGCAGTCGCATCGCGTAGGCGACCCGGCCCGACGCCATGCTGAAGCTGTTGCCCAG
ATATCCGTACGGCTCCTCCAATTGTTTGGCGTCGGCCGCCACCATCGTGTAGTCACCATGGGTGACACCCGCGAACACGCCGGTCGCCGAGCCTGCCAGCGTTTGCTGAGTAAGACCGGC
GTGCTCCATGGCCTCCCAGGACGTCTCCAGCAACAGACGTTGCTGCGGATCGATCGCAATCGCCTCCCGCTCGCCGATGCCAAAGAACTCGCAATCGAAATCCGCGGGGTTATCCAGGAA
ACCGCCCCACTTGCACACCGTCCGACCGGGCACGCCCGGCTGCGGGTCGTAGAACTCGTCGCAATCCCACCGGTCCGGCGGCACCTCGGTGATCAGGTCGTCGCCTCGTAACAACGCCTT
CCACAACAACTCGGGGGAATCGATCCCGCCGGGCAGCCGGCAAGCCATGCCGATAACAGCAACCGGAGTCACACGTGGTTCAGCCAACGTCCATGCACCCCTATCTGCACCAGTGCCTGA
CGCCGCCGACCCCAAGCCCAA
>Rv3829c_N1 Seen in 1 sample(s).
CTACCGACCACTCAAAACGCAACAGTTGGCAGCCCTACGATCGGCCAGCGCCTGACGGGCGGCGTTATATCCAGGGATGAACGTGATTCCCGGCCCACCGTGGACAACCGGCACTGCCCA
GGTACAACCCGGCTATCGGGATCGGCTGGCCGATAAAGCCTTTCGGGCCAGGCCTGTTGGGGCCGATCTGGTCCGAGTGCAGCAGGGCATGGCAGTAGTCCCCACCCGGGGCACCGAACA
TCACACCCATGTGTTTGGGGGTAAAGGTGGTGTACCGGAGAATGCTGCCTTTGAAGTTCGGTGCCAACCTAGTGATCTTGTCGATCACGTTCTGCCCCATTTCGACCTTTGCCCGGCCGT
ACCCTCCGTATTTTGAGCCACCCTCGATCGGGAACCACATTGCGAACGCCGACGCGGCCTGCTTACCCGCCGGGGCCAGGCTGGGATCATGCAGCGACGGGATCTGCAACACCACGGTCG
GATCGGCCGGGACGATCCCACGCCGGCAATCCTCCCACTGCTGCTGAACCTGCTCCGGTGTACAGAAAATGCCCATCGATGCCTGCATGCTCGGATCGTTGAGTGCCTGGTAGGGCGCCG
CGAAGGCCGGTGGCTGCGCGAGCGCAAAATGCATCTGCAGATAGCTGCCGCGGTGGTCGATGCGCAAATAGCGATCGCGGATTTCCGACGGCAACACTGCCGGATCGATCAGCTCGTTGA
TGGTGACGTCGGGTGCTATGGCGGAGACCACGATCGGGGAGGTCAAGGTGTCCCCCGCCGCGGTGCGCACGCCCCGCACGCGGGCTGACGACCGACTATTGTCAACCACGATCTCGGTCA
CCTTGGAACGTAACCGGACCTCGCCGCCGGTGCGTTCCAGCAATTGCGACAGATGGGTGGTAAGCGCGCCGATGCCACCGCGCAATTTCTTCCACCGCACGAAGTCGCCCTCCGGGACAC
CCAATCCGAAGGCGAGCGCGGCAGCGCTGCCCGGTGTGGCCGGCCCGCGATAGAGCGTGTTCACGGCCAGCACGGTCATCGACCCGCGCAGGGCGCCGTGCTTCTCGCGGTCCGGGAAAT
GGCGGTCCAACACGTCGGTGACCGATCCGAACAGCATGTCATCGATCGCTGACCGTTCGAATTCATTTGTGGCACAGGCATACATCTCGTCGAAGCTCTTGGGCAGAGTTCCGGCTTCGA
AACGCCCCAGCGCCCGGGTCGGCGCCTGGCTCCACGCCAGCAGGCCCGCCATCCCGGTGACGGCGTCTGCCCCGTGCACCCGATGGAGGTGGGTAAGCATCTTCGTCGGGTCGGTGAATT
GGACCACCGGATCGTCCCCGACACCGCGCAACGCTACCGACATCACCTCCAGATCGACCGTCGGCAAGCTGTCCAGGCCTAACTCGCTGCTGACCGCCGAGGAGGTCGGGAACTGCACCG
ATCCGGCGATCTCGAACCGGTACCCGTCGAACAGCTCCACCGTGGAGGCCATCCCGCCGGCGTAGCGCTTAGCGTCCAGACACGCGGTCCGCAGTCCGGCTCGCTGCAGCAGCACTGCCG
CGGTCAGCCCGTTGTGCCCGGCGCCGATAACTATCGCGTCATAACCAGTCAT
>Rv3830c_N1 Seen in 1 sample(s).
TTATGACGAAAACTGTGAGGGTGGTCCAGGTGTCGGAGATGCCGACGCGCAGCGACTCCAGTGCGACGTGGCAGACCCGCGCCAGCTCCCCGAGCGACCGGTCACTCCCAAGCATCCAGG
CTTCCATCGCGCCGAACACCGCCGCGGCGACGCATCGTGCGGTGACGGCGATGTGCAATCGGGCATCGGGTGCACCCGCGATATCGCAGTTACGTCGCCGCAATTGGGCCTGGATGGCAT
CGGCGAAGTCGGCTTCCACCTCGCGCATATGGCGGACGATCCGGCTCGGCTCCAACTCGCCGCGCCGCAACGACGCAATCTTCGTCACTGCGTCAACGTCATAAGGAAACGAGAAGATAG
CCGCTTGCACGGAATCGATGATCGATTCGTCGGCCGGTCTAGCATCCAGCGCCGCGCGAAACCAGTGCAGTCCGGCGTCGTAGTCGGCAAACAGCAAATCGTGCTTGGATCTGAAGTGGC
GATAGAAAGTACGCAGCGACACCCCGGCGTCCTCCGCAATCTGCTCGGCTGAGGTAGCCTCGACGCCCTGGGCCAGAAATCGCACCAGGGCGGCCTGGCGCAGTGCCTCGCGAGTGCGTT
CGCTGCGCGCCGTCTGCGGGGGCCGGACCAT
>Rv3847_N1 Seen in 1 sample(s).
ATGGGTACCGGGTCAGGTGGGCCTATTGGGGTTTCTCCCTTCCATTCGCGTGGTGCCCTGAAAGGGTTCGTGATCTCTGGACGTTGGCCTGATTCGACCAAAGAGTGGGCCCAGCTGCTG
ATGGTCGCAGTTCGGGTCGCGTCGTTGCCCGGCTTGCTCTCCACCACAACGGTGTTTGGTGCCCGCGAAGAGTTGCCCGACGAACCCGAGCCGGGGACCGTCGGTCTGGTGCTGGCCGAG
GGCACCGTCTTCGGTGAATCAGCAATTCAGCCAGGATATTTCGCTGATCATCAACCCCCTGCATTGCTGATGCTGCATCCACCCTCGGAGACCACGCCGTCGCTGCCGGAATGCACCGGG
GCGGCGTCAGGGTGCGTGCTGCTGCCGGGATTACCGTATCTGGGATTGGAACATCGTGCGGCTTGGGTGGAGGCTGAAGCCGACGGCACCATCACATCTATGGTGAGCCGGGTGGGCGTC
GACCCGATAAGCCATCCCGACACGCAATTCTGGCAATGCTGCTTGCAGCATAA
>Rv3848_N1 Seen in 1 sample(s).
ATGCTCGCCGCCACACTGCTAAGTCTGGGAGCCGTTTTCCTTGCTGAGCTCGGCGACAGATCCCAGCTCATCACGATGACCTACACACTTCGCTACCGCTGGTGGGTGGTGCTGACCGGG
GTGGCGATCGCAGCGTTCACGGTGCACGGGGTAGCGGTGGCGATCGGCCACTTTTTGGGCTCGACCGTGCCGGCCCGGCCGGCCGCCTGCGTATCGGCGATCGCATTCCTGATCTTTGCC
GTGTGGGTCTGGCGGGAGGACACGGCCAGCGACAGCGAAACCTCGCCAACCGCTGCCGAACCCCGACTCGCGCTGTTCACCGTGGTCTCGTCGTTCGCACTGGCTGAGCTGGGTGACAAG
ACAACGTTGGCGACGGTGACCTTGGCCAGCGATCACCACTGGGCCGGCGTATGGATCGGCACCACCCTGGGCATGATCCTGGCCGACGGCCTGGCGATCGGCGCAGGGCTGCTGCTGCAC
CGGCGCCTTCCGGAGCGGTTGCTGCAGGTCCTGACTGGCCTGCTGTTCCTGCTGTTCGGACTGTGGTTGCTGTTCGACGACGCGTTGGGCTTCAGATCGGTTGCCATCGCCGTGACAGCG
GCGGTGGTGCTGGCCGCGGCAACTACGGCGGTATCGGTGCGGGTGGCGCAAACTCGTCGGCGGCGGCCAACCGCTGCTGCGACACCAGAAGATGACTCGACACGCCCCGAGCGGTCGTCG
GTCGCGCCGGGCCATCCCGGGAGCATCTTGCTACCGCTTCCGGAAGTGTCTTTGCGGGGGCGCCGACCGCCCTCAGGGTCGCCTGACGAGCGCTGTGCGGACCCAGGCAGCAAAGGAGGC
TCTCGGCGAATCTCCGTTGGCTGCTAGTTGCCCGGAGTCGGCCGCATCCGCCCGACACGGTCATCCTGA
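The FASTA output above is long, so a compact per-allele overview can be easier to scan. The following is a minimal sketch, not part of MentaLiST: `fasta_lengths` is a hypothetical helper name. It assumes each record starts with a `>` header whose first word is the allele identifier (as in `>Rv2241_N1 Seen in 1 sample(s).`) and that the sequence may wrap over several lines.

```shell
# Hypothetical helper (not a MentaLiST command): print "<allele_id><TAB><length>"
# for every record of a FASTA file whose sequences may span several lines.
fasta_lengths() {
    awk '
        /^>/ {                                   # header line starts a new record
            if (id != "") print id "\t" len      # flush the previous record
            id = substr($1, 2)                   # first word, without the ">"
            len = 0
            next
        }
        { len += length($0) }                    # sequence line: accumulate length
        END { if (id != "") print id "\t" len }  # flush the last record
    ' "$1"
}

# Only run on the tutorial file when it is present in the current folder.
if [ -f sample.call.novel.fa ]; then
    fasta_lengths sample.call.novel.fa
fi
```

Piping the result through `sort -k2,2n` lists the shortest novel alleles first, which can help when checking the reconstructed sequences by eye.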
In [12]:
# Description of the novel alleles: the number of mutations and a description of each mutation.
cat sample.call.novel.txt
Sample Locus Novel_id MinKmerDepth Nmut Desc
SRR6152708 Rv0024 N1 42 1 From allele 98, Del of len 1 at pos 719.
SRR6152708 Rv0035 N1 46 2 From allele 227, Subst C->G at pos 47, Subst A->T at pos 76.
SRR6152708 Rv0045c N1 61 4 From allele 62, Subst C->T at pos 318, Del of len 2 at pos 650, Subst A->G at pos 652.
SRR6152708 Rv0063 N1 55 1 From allele 140, Ins of base G at pos 334.
SRR6152708 Rv0101 N1 50 2 From allele 1541, Subst A->G at pos 5360, Subst A->G at pos 6088.
SRR6152708 Rv0134 N1 66 2 From allele 25, Subst G->T at pos 374, Del of len 1 at pos 386.
SRR6152708 Rv0165c N1 58 3 From allele 16, Subst T->C at pos 147, Ins of base G at pos 163, Ins of base G at pos 164.
SRR6152708 Rv0195 N1 63 2 From allele 88, Subst G->A at pos 185, Subst T->C at pos 191.
SRR6152708 Rv0226c N1 66 2 From allele 59, Subst A->C at pos 36, Subst A->G at pos 1229.
SRR6152708 Rv0276 N1 42 2 From allele 108, Subst C->G at pos 50, Subst T->C at pos 57.
SRR6152708 Rv0290 N1 51 2 From allele 253, Subst G->A at pos 691, Subst G->A at pos 1033.
SRR6152708 Rv0551c N1 47 2 From allele 68, Subst T->C at pos 472, Del of len 1 at pos 483.
SRR6152708 Rv0654 N1 49 2 From allele 187, Subst C->T at pos 1303, Subst G->A at pos 1315.
SRR6152708 Rv0739 N1 63 3 From allele 87, Ins of base C at pos 14, Ins of base G at pos 13, Subst A->G at pos 62.
SRR6152708 Rv0757 N1 49 2 From allele 141, Subst T->C at pos 364, Subst T->C at pos 373.
SRR6152708 Rv0818 N1 56 3 From allele 89, Subst G->A at pos 275, Subst C->T at pos 290, Subst G->A at pos 293.
SRR6152708 Rv0826 N1 49 2 From allele 84, Subst C->G at pos 901, Subst G->A at pos 920.
SRR6152708 Rv0888 N1 50 1 From allele 187, Ins of base G at pos 1145.
SRR6152708 Rv0908 N1 55 2 From allele 271, Subst T->C at pos 2197, Subst C->T at pos 2207.
SRR6152708 Rv1001 N1 45 2 From allele 68, Subst T->C at pos 982, Del of len 1 at pos 999.
SRR6152708 Rv1097c N1 62 3 From allele 135, Subst A->G at pos 302, Del of len 2 at pos 311.
SRR6152708 Rv1128c N1 38 2 From allele 223, Subst T->C at pos 421, Del of len 1 at pos 439.
SRR6152708 Rv1145 N1 48 2 From allele 2, Subst A->G at pos 818, Ins of base A at pos 829.
SRR6152708 Rv1225c N1 55 2 From allele 110, Ins of base G at pos 435, Subst T->C at pos 451.
SRR6152708 Rv1258c N1 47 2 From allele 82, Ins of base G at pos 681, Subst T->C at pos 697.
SRR6152708 Rv1269c N1 55 2 From allele 63, Ins of base G at pos 110, Ins of base C at pos 123.
SRR6152708 Rv1326c N1 54 2 From allele 76, Subst A->G at pos 2007, Subst T->C at pos 2034.
SRR6152708 Rv1330c N1 58 2 From allele 169, Subst G->T at pos 436, Subst T->C at pos 447.
SRR6152708 Rv1363c N1 43 2 From allele 61, Subst C->A at pos 59, Subst C->G at pos 75.
SRR6152708 Rv1413 N1 60 2 From allele 139, Subst C->A at pos 65, Subst G->A at pos 80.
SRR6152708 Rv1420 N1 47 2 From allele 42, Subst C->G at pos 56, Subst T->C at pos 212.
SRR6152708 Rv1551 N1 48 2 From allele 279, Subst A->G at pos 907, Del of len 1 at pos 917.
SRR6152708 Rv1564c N1 45 2 From allele 39, Del of len 1 at pos 1564, Subst G->C at pos 1581.
SRR6152708 Rv1615 N1 47 2 From allele 97, Subst C->T at pos 275, Subst C->T at pos 288.
SRR6152708 Rv1775 N1 56 2 From allele 167, Del of len 1 at pos 124, Subst A->C at pos 128.
SRR6152708 Rv1915 N1 58 2 From allele 1, Subst G->A at pos 536, Ins of base T at pos 886.
SRR6152708 Rv2027c N1 55 2 From allele 5, Del of len 1 at pos 948, Subst G->C at pos 969.
SRR6152708 Rv2084 N1 61 2 From allele 138, Ins of base T at pos 23, Ins of base G at pos 20.
SRR6152708 Rv2148c N1 75 2 From allele 87, Subst G->C at pos 19, Del of len 1 at pos 4.
SRR6152708 Rv2176 N1 53 2 From allele 53, Subst C->T at pos 75, Subst T->C at pos 95.
SRR6152708 Rv2185c N1 41 2 From allele 41, Subst G->T at pos 342, Subst T->G at pos 358.
SRR6152708 Rv2241 N1 37 2 From allele 296, Subst T->C at pos 2532, Subst T->G at pos 2545.
SRR6152708 Rv2264c N1 48 3 From allele 1, Ins of base A at pos 57, Subst C->T at pos 28, Subst T->C at pos 322.
SRR6152708 Rv2293c N1 60 1 From allele 66, Ins of base A at pos 602.
SRR6152708 Rv2339 N1 57 2 From allele 524, Del of len 1 at pos 219, Subst A->G at pos 224.
SRR6152708 Rv2437 N1 64 2 From allele 78, Del of len 1 at pos 107, Subst C->G at pos 113.
SRR6152708 Rv2526 N1 53 6 From allele 4, Subst A->G at pos 45, Del of len 2 at pos 72, Del of len 3 at pos 73.
SRR6152708 Rv2545 N1 46 2 From allele 48, Subst G->A at pos 91, Del of len 1 at pos 99.
SRR6152708 Rv2587c N1 51 2 From allele 344, Subst C->T at pos 906, Subst T->C at pos 924.
SRR6152708 Rv2848c N1 60 2 From allele 238, Ins of base G at pos 58, Subst A->G at pos 66.
SRR6152708 Rv2902c N1 46 2 From allele 2, Subst T->C at pos 253, Subst G->A at pos 261.
SRR6152708 Rv2947c N1 37 1 From allele 48, Ins of base G at pos 780.
SRR6152708 Rv2975c N1 51 3 From allele 42, Subst G->A at pos 32, Del of len 2 at pos 6.
SRR6152708 Rv3091 N1 59 2 From allele 166, Subst T->C at pos 1289, Subst G->A at pos 1299.
SRR6152708 Rv3197 N1 66 2 From allele 10, Subst A->G at pos 342, Subst C->A at pos 351.
SRR6152708 Rv3234c N1 60 2 From allele 19, Ins of base C at pos 18, Subst G->T at pos 57.
SRR6152708 Rv3253c N1 36 2 From allele 278, Subst G->A at pos 447, Subst T->C at pos 462.
SRR6152708 Rv3725 N1 44 3 From allele 79, Subst G->C at pos 526, Subst A->G at pos 747, Ins of base A at pos 752.
SRR6152708 Rv3736 N1 75 2 From allele 83, Subst T->C at pos 510, Subst A->G at pos 755.
SRR6152708 Rv3825c N1 40 2 From allele 480, Subst T->C at pos 4426, Subst G->A at pos 4453.
SRR6152708 Rv3829c N1 61 2 From allele 328, Subst T->A at pos 86, Ins of base G at pos 103.
SRR6152708 Rv3830c N1 61 1 From allele 105, Ins of base G at pos 147.
SRR6152708 Rv3847 N1 57 2 From allele 64, Del of len 1 at pos 504, Subst G->A at pos 532.
SRR6152708 Rv3848 N1 53 2 From allele 271, Subst G->A at pos 866, Subst G->T at pos 878.
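A report like this is easy to summarize with standard tools. As a rough sketch, the snippet below tallies how many novel alleles have each mutation count (the Nmut column). It runs on a four-row excerpt copied from the output above and written to a temporary file, so it does not need the real data. It assumes that the first five columns contain no embedded whitespace, as in the listing shown here.

```shell
# Toy excerpt of a .novel.txt report (rows copied from the output above).
toy_report=$(mktemp)
cat > "$toy_report" <<'EOF'
Sample Locus Novel_id MinKmerDepth Nmut Desc
SRR6152708 Rv0024 N1 42 1 From allele 98, Del of len 1 at pos 719.
SRR6152708 Rv0035 N1 46 2 From allele 227, Subst C->G at pos 47, Subst A->T at pos 76.
SRR6152708 Rv0045c N1 61 4 From allele 62, Subst C->T at pos 318, Del of len 2 at pos 650, Subst A->G at pos 652.
SRR6152708 Rv0063 N1 55 1 From allele 140, Ins of base G at pos 334.
EOF

# Skip the header, count rows per value of column 5 (Nmut), sort numerically.
nmut_summary=$(awk 'NR > 1 { count[$5]++ } END { for (n in count) print n, count[n] }' \
  "$toy_report" | sort -n)
printf 'Nmut Alleles\n%s\n' "$nmut_summary"
```

To try it on real output, replace the temporary file with sample.call.novel.txt.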
You can update your MLST scheme with the novel alleles detected by MentaLiST, especially after running it on many different samples. The scripts folder contains Python scripts that help select alleles and build an updated scheme. To do that, you will perform the following steps; each one is described in detail below.
In [ ]:
# First pass call:
mentalist call -o my_dataset_calls1.txt --db mtb_cgmlst.db -1 SRR6*_1.fastq.gz -2 SRR6*_2.fastq.gz
# Parse the novel alleles output, possibly filtering some alleles.
parse_novel_alleles.py -f my_dataset_calls1.txt.novel.fa -o all_novel_alleles
# Add the novel alleles to the scheme FASTA files:
create_new_scheme_with_novel.py mtb_cgmlst_fasta/*fasta -o MTB_novel_scheme -n all_novel_alleles.fa
# Build a new MentaLiST db for this scheme:
mentalist build_db --db mtb_novel_cgMLST.db -k 31 -f MTB_novel_scheme/*.fasta
# Second pass mentalist call:
mentalist call -o my_dataset_novel_calls1.txt --db mtb_novel_cgMLST.db -1 SRR6*_1.fastq.gz -2 SRR6*_2.fastq.gz
In [30]:
# Download a 4 sample tuberculosis dataset:
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR639/002/SRR6397472/SRR6397472_{1,2}.fastq.gz --no-clobber
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR639/006/SRR6398036/SRR6398036_{1,2}.fastq.gz --no-clobber
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR615/008/SRR6152708/SRR6152708_{1,2}.fastq.gz --no-clobber
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR639/003/SRR6398023/SRR6398023_{1,2}.fastq.gz --no-clobber
File ‘SRR6397472_1.fastq.gz’ already there; not retrieving.
File ‘SRR6397472_2.fastq.gz’ already there; not retrieving.
File ‘SRR6398036_1.fastq.gz’ already there; not retrieving.
File ‘SRR6398036_2.fastq.gz’ already there; not retrieving.
File ‘SRR6152708_1.fastq.gz’ already there; not retrieving.
File ‘SRR6152708_2.fastq.gz’ already there; not retrieving.
File ‘SRR6398023_1.fastq.gz’ already there; not retrieving.
File ‘SRR6398023_2.fastq.gz’ already there; not retrieving.
You can run MentaLiST on many samples at once by passing all files to the -1 and -2 parameters:
In [34]:
mentalist call -o my_dataset_calls1.txt --db mtb_cgmlst.db -1 SRR6*_1.fastq.gz -2 SRR6*_2.fastq.gz
[ Info: Opening kmer database ...
[ Info: Finished the JLD load, building alleles list...
[ Info: Decompressing weight list...
[ Info: Building kmer index ...
[ Info: Sample: SRR6152708. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6397472. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398023. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398036. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Writing output ...
[ Info: Done.
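The shell expands the -1 and -2 glob patterns independently, each in sorted order, so the two file lists only line up if every sample has both read files. The check below is a hypothetical helper, not part of MentaLiST: it reports any `_1` file without a matching `_2` mate. It assumes the `_1.fastq.gz` / `_2.fastq.gz` naming used in this tutorial and is demonstrated on empty placeholder files in a temporary folder.

```shell
# Hypothetical sanity check: every *_1.fastq.gz should have a matching *_2.fastq.gz.
check_pairs() {
  dir=$1
  for r1 in "$dir"/*_1.fastq.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    r2="${r1%_1.fastq.gz}_2.fastq.gz"   # expected mate file
    if [ ! -e "$r2" ]; then
      echo "missing mate for $(basename "$r1")"
    fi
  done
}

# Demo with empty placeholder files; the second sample lacks its _2 file.
demo=$(mktemp -d)
touch "$demo/SRR6152708_1.fastq.gz" "$demo/SRR6152708_2.fastq.gz" "$demo/SRR6397472_1.fastq.gz"
pair_report=$(check_pairs "$demo")
echo "$pair_report"
```

Running `check_pairs .` in the folder with the FASTQ files before calling MentaLiST should print nothing when all pairs are complete.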
In [94]:
# optional: select the python environment and/or PATH to run the scripts;
conda config --set changeps1 False # just avoid the PS1 change here on Jupyter, not needed in your console;
conda activate mentalist1
PATH=$PATH:/rhome/pfeijao/sfu/MentaLiST/scripts
The 'parse_novel_alleles.py' script collects all novel alleles, creates a report and also outputs a FASTA file with selected alleles, to include in an updated MLST scheme.
In [36]:
parse_novel_alleles.py -h
usage: parse_novel_alleles.py [-h] [-f F [F ...]] [-o O] [-t THRESHOLD]
[-m MUTATION]
[-ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
Given a list of FASTA files with novel alleles found with MentaLiST, output a
FASTA with a unique list of novel alleles.
optional arguments:
-h, --help show this help message and exit
-f F [F ...] Fasta files with novel alleles.
-o O Output Fasta file with alleles above the threshold
requirement(s).
-t THRESHOLD, --threshold THRESHOLD
Minimum number of different samples to appear, to
include a novel allele in the output fasta.
-m MUTATION, --mutation MUTATION
Also include if novel allel has equal or less than
this number of mutations, regardless of times seen.
Disabled by default.
-ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set the logging level
You must give the novel allele FASTA file(s) found by MentaLiST as parameter -f. If you ran all your samples at once (like we did in this example), there is a single FASTA file, but you can also combine different FASTA files for different runs of MentaLiST on the same MLST scheme.
A novel allele is included in the output file (parameter -o) if this exact allele is present in at least -t samples. In addition, if the parameter -m is given, any novel allele with -m or fewer mutations is also included, regardless of how many samples it was seen in; this is useful if you want to keep novel alleles that are very close to existing alleles even when they are rare.
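To make the selection rule concrete, here is a minimal sketch of the logic as the help text describes it: keep an allele if it was seen in at least -t samples, or if -m is given and it has at most -m mutations. This is an illustration on invented toy rows (locus, samples seen, mutations), not the script's actual implementation.

```shell
threshold=2   # plays the role of -t: minimum number of samples
max_mut=1     # plays the role of -m: also keep alleles with at most this many mutations

# Toy candidates: locus, number of samples the allele was seen in, number of mutations.
candidates=$(mktemp)
cat > "$candidates" <<'EOF'
Rv0024 1 1
Rv0045c 1 4
Rv0049 2 2
Rv0551c 2 4
Rv0551c 1 2
EOF

# Keep a row if it passes the sample threshold OR is within the mutation limit.
selected=$(awk -v t="$threshold" -v m="$max_mut" \
  '$2 >= t || $3 <= m { print $1, $2 "x", "(" $3 ")" }' "$candidates")
echo "$selected"
```

Here the single-sample allele of Rv0024 is kept only because of the mutation limit, while the single-sample alleles with 2 and 4 mutations are dropped.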
Let's run the script, choosing alleles present in at least 2 of the 4 samples:
In [38]:
parse_novel_alleles.py -f my_dataset_calls1.txt.novel.fa -o novel_alleles -t 2
02:10:09 PM (217 ms) -> INFO:Reading the new alleles ...
02:10:09 PM (245 ms) -> INFO:Writing output ...
02:10:09 PM (248 ms) -> INFO:Done.
There are three files created as output:
In [49]:
ls novel_alleles*
novel_alleles.fa novel_alleles.samples.txt novel_alleles.txt
Both .txt files report on all novel alleles found in the dataset, including how many times each allele was seen, its number of mutations, and the samples in which it was found.
In [62]:
head -n 30 novel_alleles.txt
Locus Alleles found Samples x (mutations)
Rv0024 1 1x (1)
Rv0025 1 1x (1)
Rv0035 1 1x (2)
Rv0045c 1 1x (4)
Rv0049 1 2x (2)
Rv0051 1 2x (3)
Rv0058 1 1x (2)
Rv0063 1 1x (1)
Rv0101 1 1x (2)
Rv0104 1 1x (2)
Rv0134 1 1x (2)
Rv0139 1 2x (2)
Rv0165c 1 1x (3)
Rv0187 1 1x (2)
Rv0195 1 1x (2)
Rv0226c 1 1x (2)
Rv0236c 1 1x (2)
Rv0276 1 1x (2)
Rv0289 1 1x (2)
Rv0290 1 1x (2)
Rv0311 1 2x (3)
Rv0325 1 2x (2)
Rv0347 1 1x (2)
Rv0551c 2 2x (4), 1x (2)
Rv0574c 1 2x (2)
Rv0585c 1 2x (3)
Rv0592 1 1x (3)
Rv0634c 1 1x (2)
Rv0654 1 1x (2)
For instance, locus Rv0311 has one novel allele, which was seen in 2 samples and is 3 mutations away from an existing allele in the scheme. On the other hand, Rv0551c has 2 novel alleles: one was seen in two samples, and the other in just one.
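Loci with more than one novel allele, like Rv0551c, can be picked out of the report automatically. The sketch below does this on a few rows copied from the report above, written to a temporary file. It assumes the second whitespace-separated field of each data row is the number of novel alleles, as in the listing shown here.

```shell
# Toy excerpt of novel_alleles.txt (rows copied from the report above).
report=$(mktemp)
cat > "$report" <<'EOF'
Locus Alleles found Samples x (mutations)
Rv0311 1 2x (3)
Rv0347 1 1x (2)
Rv0551c 2 2x (4), 1x (2)
Rv0574c 1 2x (2)
EOF

# Skip the header and print loci whose allele count (field 2) is greater than 1.
multi_allele_loci=$(awk 'NR > 1 && $2 > 1 { print $1 }' "$report")
echo "$multi_allele_loci"
```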
In the .samples.txt file, we can see exactly which samples have each novel allele.
In [63]:
head -n30 novel_alleles.samples.txt
Locus Count Samples
Rv0024 1x SRR6152708
Rv0025 1x SRR6397472
Rv0035 1x SRR6152708
Rv0045c 1x SRR6152708
Rv0049 2x SRR6398023,SRR6398036
Rv0051 2x SRR6398023,SRR6398036
Rv0058 1x SRR6398023
Rv0063 1x SRR6152708
Rv0101 1x SRR6152708
Rv0104 1x SRR6397472
Rv0134 1x SRR6152708
Rv0139 2x SRR6398023,SRR6398036
Rv0165c 1x SRR6152708
Rv0187 1x SRR6397472
Rv0195 1x SRR6152708
Rv0226c 1x SRR6152708
Rv0236c 1x SRR6397472
Rv0276 1x SRR6152708
Rv0289 1x SRR6397472
Rv0290 1x SRR6152708
Rv0311 2x SRR6398023,SRR6398036
Rv0325 2x SRR6398023,SRR6398036
Rv0347 1x SRR6397472
Rv0551c 2x SRR6398023,SRR6398036
Rv0551c 1x SRR6152708
Rv0574c 2x SRR6398023,SRR6398036
Rv0585c 2x SRR6398023,SRR6398036
Rv0592 1x SRR6397472
Rv0634c 1x SRR6397472
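Since the third column is a comma-separated list of samples, the same file can be turned into a per-sample count of novel alleles. The snippet below is a small sketch of that idea on five rows copied from the listing above; it assumes the sample list contains no spaces.

```shell
# Toy excerpt of novel_alleles.samples.txt (rows copied from the listing above).
samples_report=$(mktemp)
cat > "$samples_report" <<'EOF'
Locus Count Samples
Rv0024 1x SRR6152708
Rv0025 1x SRR6397472
Rv0049 2x SRR6398023,SRR6398036
Rv0551c 2x SRR6398023,SRR6398036
Rv0551c 1x SRR6152708
EOF

# Split the sample list of each row on commas and count one novel allele per sample.
per_sample=$(awk 'NR > 1 { n = split($3, ids, ","); for (i = 1; i <= n; i++) count[ids[i]]++ }
  END { for (s in count) print s, count[s] }' "$samples_report" | sort)
echo "$per_sample"
```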
The third file is the .fa FASTA file, which has all the filtered novel allele sequences. Comparing with the original FASTA file, we can see that of the 172 unique alleles found by MentaLiST, we are keeping 56.
In [50]:
grep -c ">" my_dataset_calls1.txt.novel.fa
172
In [51]:
grep -c ">" novel_alleles.fa
56
We can check that the locus Rv0551c has only one allele in this FASTA file, even though two new alleles were found. This is because one of the alleles was seen in only one sample, and was therefore filtered out.
In [65]:
grep "Rv0551c" -A1 novel_alleles.fa
>Rv0551c
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCACCGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTTCTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTCGACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGGTCCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGATCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGGCGCCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATTCTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGGTTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACTTGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGCTGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAGCCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGCGCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGATGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGTGGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT
We can run the script again, this time without the parameter -t 2, so that no allele is filtered out and we get all novel alleles:
In [95]:
parse_novel_alleles.py -f my_dataset_calls1.txt.novel.fa -o all_novel_alleles
03:03:56 PM (935 ms) -> INFO:Reading the new alleles ...
03:03:56 PM (1062 ms) -> INFO:Writing output ...
03:03:56 PM (1065 ms) -> INFO:Done.
In [67]:
grep -c ">" all_novel_alleles.fa
172
In [69]:
grep "Rv0551c" -A1 all_novel_alleles.fa
>Rv0551c
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCACCGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTTCTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTCGACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGGTCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGATCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGAGGCGCCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATTCTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGGTTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACTTGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGCTGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAGCCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGCGCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGATGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGTGGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT
>Rv0551c
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCACCGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTTCTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTCGACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGGTCCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGATCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGGCGCCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATTCTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGGTTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACTTGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGCTGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAGCCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGCGCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGATGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGTGGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT
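Both records share the same header and the sequences are too long to compare by eye. One quick way to tell such records apart is to print the length of each one. The sketch below does this on two short, invented sequences in a temporary FASTA file; it also handles sequences wrapped over several lines.

```shell
# Toy FASTA with two records under the same header (sequences are invented).
toy_fasta=$(mktemp)
cat > "$toy_fasta" <<'EOF'
>Rv0551c
CTAGCCGACC
GCGCG
>Rv0551c
CTAGCCGACCGCGCGC
EOF

# Accumulate the sequence length per record and print it when the next header (or EOF) is reached.
record_lengths=$(awk '
  /^>/ { if (name != "") print name, len; name = substr($0, 2); len = 0; next }
       { len += length($0) }
  END  { if (name != "") print name, len }' "$toy_fasta")
echo "$record_lengths"
```

The same awk program can be pointed at all_novel_alleles.fa instead of the temporary file.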
Even if you don't want to filter any alleles, you have to run the parse_novel_alleles.py script, as its output is used in the next step.
In [70]:
create_new_scheme_with_novel.py -h
usage: create_new_scheme_with_novel.py [-h] [-n NOVEL] [-o OUTPUT] [-i ID]
[-ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
files [files ...]
Adds novel alleles to an existing MLST scheme.
positional arguments:
files MLST Fasta files
optional arguments:
-h, --help show this help message and exit
-n NOVEL, --novel NOVEL
FASTA with novel alleles.
-o OUTPUT, --output OUTPUT
Output folder for new scheme.
-i ID, --id ID Start numbering new alleles on this value, later will
implement from last allele id +1.
-ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set the logging level
So, to add the novel alleles from the previous step to the small MLST scheme from the initial example, we run:
In [71]:
create_new_scheme_with_novel.py mtb_cgmlst_fasta/*fasta -o MTB_novel_scheme -n all_novel_alleles.fa
02:33:14 PM (133 ms) -> INFO:Opening the novel alleles file ...
02:33:14 PM (158 ms) -> INFO:Opening the MLST schema and adding novel alleles ...
02:33:36 PM (21784 ms) -> INFO:Done.
We can see that the new scheme has more alleles, for instance, on locus Rv0551c:
In [80]:
grep ">" mtb_cgmlst_fasta/Rv0551c.fasta | tail -n5
>Rv0551c_364
>Rv0551c_365
>Rv0551c_366
>Rv0551c_367
>Rv0551c_368
In [81]:
grep ">" MTB_novel_scheme/Rv0551c.fasta | tail -n5
>Rv0551c_366
>Rv0551c_367
>Rv0551c_368
>Rv0551c_369
>Rv0551c_370
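Rather than checking one locus at a time, the header counts of every locus can be compared between the two scheme folders. The following is a sketch under the assumption that both folders hold one `<locus>.fasta` file per locus with matching file names; it runs on two tiny made-up schemes (LocusA and LocusB are placeholder names) in temporary folders.

```shell
# Two toy schemes: the new one has one extra allele for LocusA.
old_scheme=$(mktemp -d)
new_scheme=$(mktemp -d)
printf '>LocusA_1\nACGT\n>LocusA_2\nACGA\n' > "$old_scheme/LocusA.fasta"
printf '>LocusA_1\nACGT\n>LocusA_2\nACGA\n>LocusA_3\nACGG\n' > "$new_scheme/LocusA.fasta"
printf '>LocusB_1\nTTGA\n' > "$old_scheme/LocusB.fasta"
printf '>LocusB_1\nTTGA\n' > "$new_scheme/LocusB.fasta"

# For each locus, count FASTA headers in both schemes and report loci that gained alleles.
added_report=$(
  for old in "$old_scheme"/*.fasta; do
    locus=$(basename "$old" .fasta)
    before=$(grep -c '>' "$old")
    after=$(grep -c '>' "$new_scheme/$locus.fasta")
    if [ "$after" -gt "$before" ]; then
      echo "$locus $before -> $after"
    fi
  done
)
echo "$added_report"
```

For the real data, the two folders would be mtb_cgmlst_fasta and MTB_novel_scheme.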
In [87]:
mentalist build_db --db mtb_novel_cgMLST.db -k 31 -f MTB_novel_scheme/*.fasta --threads 4
[ Info: Opening FASTA files ...
[ Info: Combining results for each locus ...
[ Info: Saving DB ...
[ Info: Done!
In [90]:
mentalist call -o my_dataset_novel_calls1.txt --db mtb_novel_cgMLST.db -1 SRR6*_1.fastq.gz -2 SRR6*_2.fastq.gz
[ Info: Opening kmer database ...
[ Info: Finished the JLD load, building alleles list...
[ Info: Decompressing weight list...
[ Info: Building kmer index ...
[ Info: Sample: SRR6152708. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6397472. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398023. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398036. Opening fastq file(s) and counting kmers ...
[ Info: Voting for alleles ...
[ Info: Calling alleles and novel alleles ...
[ Info: Writing output ...
[ Info: Done.
Comparing this call with the previous one, we can see that the novel alleles (marked as "N" in the first pass) are now called with their new allele IDs in the new output:
In [91]:
# Old:
cut -f10-20 my_dataset_calls1.txt | column -ts $'\t'
Rv0023 Rv0024 Rv0025 Rv0033 Rv0034 Rv0035 Rv0036c Rv0037c Rv0038 Rv0039c Rv0040c
1 N 1 1 2 N 4 1 1 2 2
1 1 N 1 2 1 1 1 1 1 2
1 1 1 1 2 1- 1 1 1 1 2
1 1 1 1 2 1- 1 1 1 1 2
In [92]:
# New:
cut -f10-20 my_dataset_novel_calls1.txt | column -ts $'\t'
Rv0023 Rv0024 Rv0025 Rv0033 Rv0034 Rv0035 Rv0036c Rv0037c Rv0038 Rv0039c Rv0040c
1 314 1 1 2 562 4 1 1 2 2
1 1 123 1 2 1 1 1 1 1 2
1 1 1 1 2 1- 1 1 1 1 2
1 1 1 1 2 1- 1 1 1 1 2
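For a full scheme, the changed calls can be listed automatically instead of comparing the tables by eye. The sketch below assumes that both call files are tab-separated, that the first column holds the sample name, and that the rows are in the same order in both files; these are assumptions about the output layout, not documented guarantees. It runs on two toy tables built from the values shown above.

```shell
# Toy first-pass and second-pass call tables (values taken from the output above).
old_calls=$(mktemp)
new_calls=$(mktemp)
printf 'Sample\tRv0023\tRv0024\tRv0025\tRv0035\n' > "$old_calls"
printf 'SRR6152708\t1\tN\t1\tN\n' >> "$old_calls"
printf 'SRR6397472\t1\t1\tN\t1\n' >> "$old_calls"
printf 'Sample\tRv0023\tRv0024\tRv0025\tRv0035\n' > "$new_calls"
printf 'SRR6152708\t1\t314\t1\t562\n' >> "$new_calls"
printf 'SRR6397472\t1\t1\t123\t1\n' >> "$new_calls"

# Store every cell of the old table, then print each cell of the new table that differs.
changed_calls=$(awk -F '\t' '
  NR == FNR { for (i = 1; i <= NF; i++) old[FNR, i] = $i; next }
  FNR == 1  { for (i = 1; i <= NF; i++) locus[i] = $i; next }
  { for (i = 2; i <= NF; i++) if ($i != old[FNR, i]) print $1, locus[i], old[FNR, i], "->", $i }
' "$old_calls" "$new_calls")
echo "$changed_calls"
```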
Content source: WGS-TB/MentaLiST