Novel allele detection with MentaLiST 1.0

MentaLiST detects and reconstructs putative novel alleles, while also calling non-present loci, allowing its use on wgMLST schemes. Below we will show how to include novel alleles in the analysis.

First, a recap of how MentaLiST is run. If you are already familiar with MentaLiST, you can skip ahead to the section 'Updating an MLST scheme with the detected novel alleles'.

Running MentaLiST 1.0

Because of the new calling algorithm, additional information has to be stored in the MentaLiST database, so databases created with the earlier 0.1.x or 0.2.x versions are not compatible with version 1.0.


In [1]:
# depending on how you installed mentalist, you might have to add it and julia to the PATH:
PATH=$PATH:/rhome/pfeijao/sfu/MentaLiST/src:/rhome/pfeijao/bin
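A quick way to confirm that the PATH change worked is to ask the shell where it finds each tool. The helper below is only a sketch: the function name check_tools is made up for this tutorial, and it relies on nothing but the standard `command -v` builtin.

```shell
# Report whether each named tool can be found on the current PATH.
# Prints one line per tool and returns non-zero if any tool is missing.
# (check_tools is a hypothetical helper, not part of MentaLiST.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage for this tutorial; '|| true' keeps the cell going if a tool is missing.
check_tools mentalist julia || true
```

If either tool is reported as missing, adjust the directories added to PATH above to match your own installation.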

In this example we will call MLST alleles on an M. tuberculosis WGS sample. Follow the 'Basic Usage' tutorial to download and create the MLST database and to obtain the FASTQ files for this sample. To run the MLST caller:


In [2]:
# Go to the tmp folder:
mkdir -p /tmp/mentalist_quick_start
cd /tmp/mentalist_quick_start

Then, we run MentaLiST on the M. tuberculosis MLST scheme, setting the --kt parameter to 10 to search more aggressively for novel alleles (the default is 6).


In [3]:
mentalist call -o sample.call --db mtb_cgmlst.db --kt 10 --output_votes --output_special -1 SRR6152708_1.fastq.gz -2 SRR6152708_2.fastq.gz


[ Info: Opening kmer database ... 
[ Info: Finished the JLD load, building alleles list...
[ Info: Decompressing weight list...
[ Info: Building kmer index ...
[ Info: Sample: SRR6152708. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Writing output ...
[ Info: Done.
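To repeat the same call for several paired-end samples, the command line can be generated per sample. The sketch below only prints the commands (a dry run) and reuses exactly the options shown in the cell above. The function name print_call_cmd and the <sample>_1.fastq.gz / <sample>_2.fastq.gz naming pattern are assumptions made for this illustration.

```shell
# Print (do not run) the 'mentalist call' command for one paired-end sample,
# with the same options as the call above.
# Assumption: reads are named <sample>_1.fastq.gz and <sample>_2.fastq.gz.
# (print_call_cmd is a hypothetical helper, not part of MentaLiST.)
print_call_cmd() {
    sample=$1
    db=$2
    echo "mentalist call -o ${sample}.call --db ${db} --kt 10 --output_votes --output_special -1 ${sample}_1.fastq.gz -2 ${sample}_2.fastq.gz"
}

# Dry run: list the accessions to process, one command is printed for each.
for s in SRR6152708; do
    print_call_cmd "$s" mtb_cgmlst.db
done
```

Once the printed commands look right, piping the loop output to `sh` would execute them one after another.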

Description of output files

Here is a brief description of each output file. All output files share the same prefix, given by the -o option of mentalist call ('sample.call' in this example).


In [4]:
ls -l sample.call*


-rw-rw-r--. 1 pfeijao pfeijao   27602 Feb 14 10:48 sample.call
-rw-rw-r--. 1 pfeijao pfeijao   27643 Feb 14 10:48 sample.call.byvote
-rw-rw-r--. 1 pfeijao pfeijao  128763 Feb 14 10:48 sample.call.coverage.txt
-rw-rw-r--. 1 pfeijao pfeijao   89361 Feb 14 10:48 sample.call.novel.fa
-rw-rw-r--. 1 pfeijao pfeijao    5838 Feb 14 10:48 sample.call.novel.txt
-rw-rw-r--. 1 pfeijao pfeijao  205768 Feb 14 10:48 sample.call.special_cases.fa
-rw-rw-r--. 1 pfeijao pfeijao     719 Feb 14 10:48 sample.call.ties.txt
-rw-rw-r--. 1 pfeijao pfeijao 1209964 Feb 14 10:48 sample.call.votes.txt

In [5]:
# The 'main' file is sample.call, with the allele calls. Let's show the first 12 columns (the sample name plus the first 11 loci):
cut -f1-12 sample.call | column -ts $'\t'


Sample      Rv0014c  Rv0015c  Rv0016c  Rv0017c  Rv0018c  Rv0019c  Rv0021c  Rv0022c  Rv0023  Rv0024  Rv0025
SRR6152708  5        2        1        1        2        1        1        1        1       N       1
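With thousands of loci the wide table is hard to scan, so it can help to reshape it to one locus per line. The following is a minimal sketch, assuming the call file is tab-separated with a header row, as the `cut -f` command above suggests. The function name calls_to_long and the three-locus demo table are made up for illustration; the "N" marker is taken from the output above, where it appears for Rv0024, a locus reported as novel in the coverage file.

```shell
# Turn a wide call table (header row + one row per sample) into long
# format: sample <TAB> locus <TAB> call.
# (calls_to_long is a hypothetical helper, not part of MentaLiST.)
calls_to_long() {
    awk -F'\t' '
        NR == 1 { for (i = 2; i <= NF; i++) locus[i] = $i; next }
        { for (i = 2; i <= NF; i++) print $1 "\t" locus[i] "\t" $i }
    ' "$1"
}

# Tiny demo table holding three of the calls shown above.
demo_calls=$(mktemp)
printf 'Sample\tRv0023\tRv0024\tRv0025\n' > "$demo_calls"
printf 'SRR6152708\t1\tN\t1\n' >> "$demo_calls"

# List only the loci whose call is the novel-allele marker "N".
calls_to_long "$demo_calls" | awk -F'\t' '$3 == "N"'
```

On the real data, replace the demo file with sample.call.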

The file sample.call.coverage.txt describes the call made for each locus. Several types of call are possible:

  • 'Regular' called alleles - the most voted allele that has 100% coverage; this is the most common case.

In [6]:
head -n12 sample.call.coverage.txt | grep Called


SRR6152708	Rv0014c	1.0	43	Called allele 5.
SRR6152708	Rv0015c	1.0	18	Called allele 2.
SRR6152708	Rv0016c	1.0	45	Called allele 1.
SRR6152708	Rv0017c	1.0	45	Called allele 1.
SRR6152708	Rv0018c	1.0	33	Called allele 2.
SRR6152708	Rv0019c	1.0	39	Called allele 1.
SRR6152708	Rv0021c	1.0	35	Called allele 1.
SRR6152708	Rv0022c	1.0	32	Called allele 1.
SRR6152708	Rv0023	1.0	57	Called allele 1.
SRR6152708	Rv0025	1.0	55	Called allele 1.
  • Missing loci - either no $k$-mers of the locus were detected, or some were detected but the coverage of the best allele is below the 50% threshold, so the locus is declared missing. This should not normally happen in a cgMLST scheme, so it may indicate poor sample coverage.

In [7]:
grep "Not present" sample.call.coverage.txt


SRR6152708	Rv1417	0.0	0	Not present; allele 58 is the best covered but below threshold with 188/435 missing kmers.
  • Novel alleles - when none of the top-voted alleles in the scheme for this locus has 100% coverage, MentaLiST looks for a novel allele that does, using the existing alleles as "templates" for building it. Some novel alleles among the first 100 lines:

In [8]:
head -n 100 sample.call.coverage.txt | grep Novel


SRR6152708	Rv0024	1.0	42	Novel, 1 mutation from allele 98: Del of len 1 at pos 719
SRR6152708	Rv0035	1.0	46	Novel, 2 mutations from allele 227: Subst C->G at pos 47, Subst A->T at pos 76
SRR6152708	Rv0045c	1.0	61	Novel, 4 mutations from allele 62: Subst C->T at pos 318, Del of len 2 at pos 650, Subst A->G at pos 652
SRR6152708	Rv0063	1.0	55	Novel, 1 mutation from allele 140: Ins of base G at pos 334
SRR6152708	Rv0101	1.0	50	Novel, 2 mutations from allele 1541: Subst A->G at pos 5360, Subst A->G at pos 6088
  • Multiple possible alleles - when more than one allele has 100% coverage. The coverage file shows the depth of coverage and the number of votes of each candidate; the most voted allele is chosen in the call file, where a "+" is appended to the allele number.

In [9]:
grep Multiple sample.call.coverage.txt


SRR6152708	Rv0471c	1.0	43	Multiple possible alleles:1, 118 with depth 43, 39 and votes 0, -724. Most voted (1) is chosen on call file.
SRR6152708	Rv1318c	1.0	36	Multiple possible alleles:302, 1 with depth 36, 36 and votes 4427, 0. Most voted (302) is chosen on call file.
SRR6152708	Rv1319c	1.0	35	Multiple possible alleles:8, 3 with depth 35, 35 and votes 3930, 3402. Most voted (8) is chosen on call file.
SRR6152708	Rv1911c	1.0	26	Multiple possible alleles:1, 118 with depth 26, 26 and votes 0, -244. Most voted (1) is chosen on call file.
SRR6152708	Rv2319c	1.0	28	Multiple possible alleles:7, 1 with depth 28, 30 and votes 101, 0. Most voted (7) is chosen on call file.
  • Partially covered alleles - coverage for the best allele is above 50% but below 100%, which triggers a search for a novel allele, but none was found. This is most likely either an undetected novel allele or an existing allele that was not fully covered in the WGS sample. The most covered allele is chosen in the call file, where a "-" is appended to the allele number to indicate partial coverage.

In [10]:
grep Partially sample.call.coverage.txt


SRR6152708	Rv0275c	0.986	0	Partially covered allele or novel allele; Best allele 26 has 10/696 missing kmers, and no novel was found. Gaps on positions: (635, 665)
SRR6152708	Rv0581	0.973	0	Partially covered allele or novel allele; Best allele 1 has 5/186 missing kmers, and no novel was found. Gaps on positions: (115, 119)
SRR6152708	Rv0860	0.987	0	Partially covered allele or novel allele; Best allele 331 has 28/2133 missing kmers, and no novel was found. Gaps on positions: (2096, 2133)
SRR6152708	Rv1860	0.9526	0	Partially covered allele or novel allele; Best allele 152 has 45/948 missing kmers, and no novel was found. Gaps on positions: (855, 901)
SRR6152708	Rv1999c	0.9575	0	Partially covered allele or novel allele; Best allele 5 has 55/1293 missing kmers, and no novel was found. Gaps on positions: (1182, 1236)
SRR6152708	Rv2249c	0.998	0	Partially covered allele or novel allele; Best allele 1 has 3/1521 missing kmers, and no novel was found. Gaps on positions: (1, 3)
SRR6152708	Rv3017c	0.775	0	Partially covered allele or novel allele; Best allele 1 has 75/333 missing kmers, and no novel was found. Gaps on positions: (1, 75)
SRR6152708	Rv3394c	0.9614	0	Partially covered allele or novel allele; Best allele 85 has 60/1545 missing kmers, and no novel was found. Gaps on positions: (1462, 1541)
SRR6152708	Rv3795	0.9995	0	Partially covered allele or novel allele; Best allele 1 has 2/3267 missing kmers, and no novel was found. Gaps on positions: (28, 29)
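The five call types above can be tallied in one pass instead of running one grep per type. This is a sketch that assumes the coverage file has five tab-separated columns with the description in the fifth, as in the excerpts shown; the function name summarize_calls is made up here. Any line that matches none of the five prefixes, such as a possible header line, is counted as "other".

```shell
# Count loci per call type, keyed on how the description column
# (5th, tab-separated) of <prefix>.coverage.txt begins.
# (summarize_calls is a hypothetical helper, not part of MentaLiST.)
summarize_calls() {
    awk -F'\t' '
        $5 ~ /^Called/      { n["called"]++;      next }
        $5 ~ /^Not present/ { n["not_present"]++; next }
        $5 ~ /^Novel/       { n["novel"]++;       next }
        $5 ~ /^Multiple/    { n["multiple"]++;    next }
        $5 ~ /^Partially/   { n["partial"]++;     next }
                            { n["other"]++ }
        END { for (k in n) print k "\t" n[k] }
    ' "$1" | sort
}

# Demo on three lines copied from the output above.
demo_cov=$(mktemp)
printf 'SRR6152708\tRv0023\t1.0\t57\tCalled allele 1.\n' > "$demo_cov"
printf 'SRR6152708\tRv0024\t1.0\t42\tNovel, 1 mutation from allele 98: Del of len 1 at pos 719\n' >> "$demo_cov"
printf 'SRR6152708\tRv0025\t1.0\t55\tCalled allele 1.\n' >> "$demo_cov"
summarize_calls "$demo_cov"
```

On the real data, run summarize_calls sample.call.coverage.txt. A large share of "partial" or "not_present" loci would be worth checking against the sequencing depth of the sample.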

Novel alleles output

MentaLiST outputs two files with novel alleles information: one FASTA file called <SAMPLE>.novel.fa with the sequences, and one text file called <SAMPLE>.novel.txt with a description of each novel allele found.
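Because the FASTA file can be long, a compact listing of each novel allele and its sequence length is often easier to read than the full file. The sketch below assumes standard FASTA layout (a header line starting with ">" followed by one or more sequence lines); the function name fasta_lengths is made up, and the demo sequences are short toy strings, not real alleles.

```shell
# Print "<record id> <TAB> <sequence length>" for every FASTA record.
# The id is the first word of the header line, without the leading ">".
# (fasta_lengths is a hypothetical helper, not part of MentaLiST.)
fasta_lengths() {
    awk '
        /^>/ { if (id != "") print id "\t" len; id = substr($1, 2); len = 0; next }
             { len += length($0) }
        END  { if (id != "") print id "\t" len }
    ' "$1"
}

# Demo with toy sequences that mimic the header style of the novel.fa file.
demo_fa=$(mktemp)
printf '>Rv0024_N1 Seen in 1 sample(s).\nACGT\nAC\n' > "$demo_fa"
printf '>Rv0035_N1 Seen in 1 sample(s).\nGGG\n' >> "$demo_fa"
fasta_lengths "$demo_fa"
```

On the real data, run fasta_lengths sample.call.novel.fa.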


In [11]:
# novel alleles found:
cat sample.call.novel.fa


>Rv0024_N1 Seen in 1 sample(s).
GTGAATACAGCGAGGTCGAGCTGTTGAGTCGCGCTCATCAACTGTTCGCCGGAGACAGTCGGCGACCGGGGTTGGATGCGGGCACCACACCCTACGGGGATCTGCTGTCTCGGGCTGCCG
ACCTGAATGTGGGTGCGGGCCAGCGCCGGTATCAACTCGCCGTGGACCACAGCCGGGCGGCCTTGCTGTCTGCTGCGCGAACCGATGCCGCGGCCGGGGCCGTCATCACCGGCGCTCAAC
GGGATCGGGCATGGGCCCGGCGGTCGACCGGAACCGTTCTCGACGAGGCTCGCTCGGATACCACCGTTACTGCGGTTATGCCGATAGCCCAGCGCGAAGCCATACGCCGTCGTGTGGCGC
GGCTGCGCGCGCAACGAGCCCATGTGCTGACGGCGCGACGACGGGCACGACGGCACCTGGCGGCGCTGCGTGCGCTGCGGTACCGGGTGGCGCACGGCCCGGGGGTCGCGCTGGCCAAAC
TTCGGCTGCCGTCGCCGAGCGGTCGCGCCGGCATCGCGGTCCACGCCGCGCTGTCGCGACTTGGCCGTCCCTATGTCTGGGGCGCAACGGGGCCCAACCAGTTCGACTGTTCCGGTTTGG
TCCAGTGGGCCTACGCCCAGGCGGGTGTTCACCTGGATCGCACCACCTATCAACAGATCAACGAGGGGATCCCGGTGCCGCGCTCACAGGTCCGGCCGGGCGATCTGGTCTTCCCGCACC
CCGGGCACGTGCAGCTGGCGATCGGCAACAATCTGGTCGTCGAGGCGCCCCATGCGGGCGCGTCGGTTCGGGTCAGCTCGCTGGGCAACAACGTGCAGATTCGGCGACCGCTGAGTGGCA
GATAA
>Rv0035_N1 Seen in 1 sample(s).
ATGACGGCGGCCTTGCTTTCACCAGCCATCGCCTGGCAGCAGATCTGGGCTTGCACGGACCGCACGCTGACGATCTCTTGCGAGGATTCCGAGGTAATCAGCTATCAGGACCTCATCGCG
CGCGCGGCGGCATGCATCCCCCCGCTACGGCGTCTTGACCTCAAACGCGGTGAACCCGTGCTGATCACCGCCCACACCAACCTGGAATTCCTGTCCTGCTTTTTGGGCCTCATGCTCCAT
GGCGCTGTGCCGGTACCCATCCCGCCGCGGGAGGCACTGAAGACCACCGAGCGTTTCATGACTCGGCTCGGCCCACTGCTGCGCCATCACCGCGTGCTGATCTGCACACCGGCCGAACAC
GACGAGATACGCGCTGCCGCCAGCACCGACTGCCAGATCAGCAGATTTACTGCCCTAGCCGAGGCTGGCGACGAGCAGTTCGGCCGCGCCACGGCCCAGCAACTCGCCGACACCGCCACC
GCCGACTGGCCGCTATGCACCCTCGACGACGACGCCTACGTCCAATACACCTCTGGCAGCACCGCAGCACCACGCGGAGTGGTCATCACCTACCGCAACCTGCTGTCCAACATGCGCGCA
ATGGCCGTGGGCTCACAATTCCAGCACGGCGATGTCATGGGCAGCTGGCTGCCCTTGCACCATGACATGGGGCTGGTGGGCAGCCTATTCGCCGCACTCTTCAACAGTGTCAGCGCGGTA
TTCACCACGCCACACCGGTTTCTGTATGACCCGTTGGGATTCCTCAGACTGCTCACCAGCTCCGGGGCTACCCACACGTTCATGCCTAACTTCGCTCTGGAGTGGCTGATCAACGCCTAC
CACAGGCGCGGCGCCGACATCGAAGGCATCGACCTACACAAAATGCGCCGCTTGATCATCGCCTCCGAACCCGTCCATGCCGAGGGCATGCGGAGATTCGCCGCCACCTTCGCCGGCGTC
GGACTTGCCCCCACGGCCCTGGGTTCGGGCTATGGCCTGGCCGAAGCGACCGTCGCCGTGTCAATGTCAGCGCCCAACACGGGATTCCGCACCGAAACCCACGCCGCCGCGGAGGTCGTC
ACCGGCGGCCGAGTGCTGCCTGGCTACGAGGTGCGCATTGACGCCGCACCAGGTGCCCGGGCCGGAACGATCAAACTGCGCGGCGACAGCGTGGCCGCCAAAGCCTATGTGGGCGGGAAG
AAGCTGGACGCGCTCGACGAGGAAGGCTTCTGCGACACCCACGACTTGGGTTTTCTTGTAGACGACGAAATCGTCATCCTTGGCCGGCAGGACGAGGTGTTCATTGTCCACGGAGAAAAC
AGATTCCCCTACGACATCGAGTTCATCATTCGCGGGGAATCCGAGCAGCACCGGACCAAAGTCGCATGTTTCGGGGTCAACGAACGCGTCGTGGTTGTGTTGGAAAGCCCATTGGACAGC
ATCATCGACAAGGCCGAAGCCGACCGACTGAGATGTCAAGTCGTTGCCGCGACTGGGCTGCAGTTGGATGAACTGATCACGGTTCGGCGCGGCGCGATTCCCACCACCACCAGCGGCAAG
CTCAAACGACGCGCCGTCGCGCAGGCTTATCGAGACGGCACACTGCCCCGTCTTGCCACCCACGCGTGGACGGCGGATCCCGATAGCGCTCCCAAAACGACCCGGTCCAGCCTGGAAGGC
GCCCACTGA
>Rv0045c_N1 Seen in 1 sample(s).
TCAGCGTGTGTCGAGCACCCCGCGCACGATCTCGATCAGGGCGCGCGGTTGGTCACTTTGCACCGAGTGGCCTGACTTCTCGACGATGTGAACGCCACGGAAATGCGTTGCACGCCTGTG
GAGTTCGGCGGTGTCCTGGTCGGTGACGAAGCCCGACGAGCCGCCGCGCACGAGTGTGATCGGCGCGGACAGGGCGTCGACGTCGTCCCAGAGCCCTGCGAAATCTCCGAACGTGCGGAT
CGCGTCATAGCGCCACACCCAGTTGCCGTTGTCCAGCCGGCGGGAGTTGTGGAACACGCCGCGGCGCAACGACTTGATATCGCGGTGCGGGGCCGCGGCGATCGTTAGGTCCAGCATGGC
CTGAAAGCTGGGGAATTCCCGCTCGCCGTGCATCAGCGCCACCGTGCCGCGCTGCTCGGCGGTCAGCTCGGCGTGCCGTTGCAATGCCGACGGGGTGACGTCGACGAGAACGAGTTCGCC
GACCAGGTCGGGTGCCATCGCGGCCAGCCGTATCGCAGTCAACCCGCCCAGCGACATGCCGACCACGAATTCGGCACCCGGCGCAAGCTCGCGTAGCACCGGCGCCAAGGTCTCGGAGTT
GAGCTGCGGCGAGTAATTGCCGTCCTCCCGCCAAGCGGAATGGCCGTGCTGGAAGGTCCACCGCCAGCGCCGGCTCACCCAGGCCGACGATCACGGTGTCCCAGGTATGGGCGTTCTGTC
CGCCGCCGTGCAGAAAGATCACCCGCGGCGCAGAGCCGCCCCAGCGCAGCGCGCTGATGGCTCCCGCTTGGACCCGCTCGACTTCAGGCAGTGGACCATTGACACCGGCCTGCTCAGCGT
TCTCAGCCAGCAGGGCAAACTCGTCCAGTCCGGTCAGTTCGTCGTCAGATAGCAC
>Rv0063_N1 Seen in 1 sample(s).
TTGGCGCGTGAGATCTCACGCCAGACGTTTCTGCGGGGTGCCGCCGGAGCGTTGGCCGCCGGCGCGGTCTTCGGCTCGGTCCGGGCTACCGCGGATCCGGCTGCCTCTGGCTGGGAGGCT
CTTTCTTCCGCCCTCGGAGGGAAAGTGCTACAACCGGACGACGGTCCCCAATTCGCAACGGCCAAGCAGGTTTTCAACACCAACTACAACGGCTATACGCCGGCGGTGATCGTTACCCCG
ACATCGCAGCTGGACGTGCAGAAGGCGATGGCGTTCGCTGCCGCGAACAACCTCAAGGTGGCCCCACGCGGTGGCGGGCACTCCTACGTGGGGGCGTCCACGGCCAACGGCGCCATGGTG
CTCGACCTACGTCAGCTACCTGGGGACATCAACTACGACGCCACCACCGGGCGGGTCACGGTGACGCCCGCCACCGGTTTGTACGCCATGCACCAGGTGTTGGCCGCGGCCGGCCGGGGC
ATCCCGACCGGCACCTGCCCGACGGTCGGTGTCGCGGGACACGCGCTGGGCGGCGGGCTGGGCGCCAATTCCCGGCACGCCGGCCTGCTCTGTGACCAATTGACGTCGGCGTCGGTGGTG
CTGCCCAGCGGCCAGGCGGTCACCGCGTCCGCCACCGACCACCCCGACCTGTTCTGGGCGTTGCGCGGTGGCGGTGGCGGCAACTTCGGCGTGACAACCTCGCTGACCTTCGCGACGTTC
CCCAGCGGGGACCTCGACGTCGTGAACCTCAATTTCCCACCGCAGTCGTTCGCGCAGGTTCTGGTCGGTTGGCAGAATTGGCTGCGAACCGCCGACCGAGGCAGCTGGGCACTGGCCGAT
GCCACCGTCGACCCGCTGGGCACGCATTGCCGCATCCTTGCGACCTGCCCGGCCGGGTCGGGCGGCAGCGTGGCGGCCGCCATCGTTTCGGCCGTCGGAACGCAACCGACCGGCACCGAA
AACCACACGTTCAACTATCTGGACCTGGTCAGATATCTGGCCGTCGGGAACCTCAACCCGTCGCCGCTGGGATATGTCGGCGGATCCGATGTCTTCACGACGATCACTCCGGCGACCGCC
CAGGGAATCGCCTCGGCGGTCGACGCCTTTCCGCGTGGAGCGGGCCGCATGTTGGCGATCATGCACGCCCTCGACGGCGCGCTCGCCACTGTGTCACCGGGGGCCACGGCCTTCCCGTGG
CGTCGGCAGTCGGCGCTGGTGCAGTGGTACGTCGAAACATCCGGCTCCCCGTCGGAAGCGACTAGCTGGCTCAACACCGCACATCAAGCGGTGCGAGCGTATTCGGTTGGCGGCTATGTG
AACTATCTCGAGGTAAACCAACCGCCGGCACGTTACTTTGGCCCGAATCTGTCCCGGCTGAGCGCAGTACGTCAGAAGTATGACCCCAGCCGGGTCCATGTTCTCCGGGCTGAACTTCTA
G
>Rv0101_N1 Seen in 1 sample(s).
GTGCACCGAGTTCGGTTAAGCCGCTCGCAGCGCAACCTCTACAACGGCGTGCGCCAGGATAACAATCCCGCGTTATATCTGATCGGCAAGAGCTATCGGTTCCGCCGGTTGGAGCTGGCG
AGATTCCTGGCCGCTCTGCACGCAACGGTACTGGACAACCCCGTGCAACTTTGCGTCCTGGAGAATTCGGGGGCAGACTATCCGGATCTGGTGCCGCGGCTACGGTTCGGCGACATCGTG
CGGGTGGGGTCAGCCGATGAGCACCTGCAGAGCACATGGTGTTCGGGCATCCTGGGCAAGCCACTGGTGCGGCATACGGTGCACACCGACCCGAACGGGTATGTGACCGGTCTGGACGTT
CACACCCACCACATCCTGCTGGACGGCGGCGCGACCGGGACGATCGAAGCTGACCTGGCGCGTTACCTGACCACCGACCCGGCGGGCGAAACCCCCAGTGTCGGTGCGGGTCTAGCCAAG
CTCAGGGAGGCGCACCGTCGTGAGACGGCCAAGGTGGAAGAATCGCGGGGGCGCCTGTCGGCTGTCGTGCAGCGTGAACTCGCCGACGAAGCATACCACGGCGGGCACGGGCACAGCGTT
AGCGACGCTCCCGGGACCGCGGCCAAGGGCGTCCTGCACGAATCGGCAACGATCTGCGGCAACGCGTTTGATGCCATCCTGACCCTTTCGGAAGCGCAGCGGGTCCCGCTTAATGTGCTG
GTGGCTGCGGCGGCCGTCGCGGTGGACGCGAGCCTTCGGCAGAACACCGAAACCCTCTTGGTGCACACGGTGGACAACCGGTTCGGAGATTCTGATCTGAATGTCGCGACCTGTTTGGTC
AATTCGGTTGCCCAGACCGTCCGGTTTCCCCCATTTGCGTCGGTGTCCGATGTCGTTCGAACGCTTGACCGCGGCTATGTCAAGGCGGTAAGACGCCGGTGGCTTCGTGAGGAGCATTAC
CGCCGAATGTATTTGGCGATCAACCGGACATCTCACGTGGAGGCGTTGACGCTAAATTTCATTCGCGAGCCATGCGCACCTGGCCTGCGCCCGTTCTTGTCGGAGGTCCCGATTGCCACG
GATATCGGTCCGGTCGAGGGCATGACGGTGGCGTCTGTTCTGGACGAAGAACAGCGCACACTGAACCTAGCCATCTGGAACCGAGCCGATCTGCCCGCGTGCAAGACACACCCCAAGGTC
GCGGAACGGATAGCGGCAGCGTTGGAATCGATGGCGGCGATGTGGGATCGGCCGATCGCCATGATCGTCAACGACTGGTTCGGGATCGGCCCGGACGGGACTCGCTGCCAAGGCGATTGG
CCAGCCCGTCAGCCGTCGACGCCCGCGTGGTTTCTCGATTCCGCAAGGGGCGTCCACCAATTTCTCGGCAGGCGCCGCTTCGTCTACCCGTGGGTCGCGTGGTTGGTGCAACGCGGCGCC
GCACCGGGTGATGTTCTGGTGTTCACCGACGACGACACCGACAAGACCATTGACCTGCTCATCGCGTGTCACCTTGCGGGTTGCGGGTACAGCGTCTGCGACACCGCTGACGAAATTTCC
GTGCGGACCAATGCGATTACCGAGCACGGCGATGGCATCTTGGTGACAGTGGTCGACGTGGCCGCCACCCAGCTGGCGGTTGTCGGCCATGACGAGCTGCGGAAGGTCGTTGACGAGCGC
GTCACACAGGTGACACACGACGCACTGCTGGCCACCAAGACCGCCTACATCATGCCGACCTCGGGAACTACCGGACAACCCAAGCTGGTGCGAATCTCACACGGCTCGCTCGCGGTTTTC
TGTGATGCGATCAGCCGCGCCTACGGTTGGGGAGCCCACGACACCGTTCTGCAGTGCGCTCCGTTGACATCGGACATCAGCGTCGAGGAGATTTTCGGTGGCGCGGCCTGTGGCGCGCGA
CTGGTGCGATCCGCGGCTATGAAAACCGGCGACCTGGCGGCGCTGGTTGACGATCTCGTCGCCCGCGAGACGACAATCGTCGACCTGCCGACCGCCGTCTGGCAGCTGTTGTGCGCCGAC
GGCGACGCCATTGACGCGATCGGCCGCTCGCGCCTGCGGCAGATCGTAATCGGCGGTGAAGCCATCCGCTGTAGCGCCGTGGACAAGTGGCTTGAATCGGCTGCTTCACAAGGGATCTCG
CTGCTCTCGAGCTATGGTCCAACAGAAGCCACGGTCGTCGCCACCTTCTTGCCGATCGTTTGCGACCAGACCACCATGGACGGCGCACTGCTCAGGCTCGGCCGGCCGATCCTACCGAAC
ACGGTGTTCCTCGCGTTCGGTGAAGTCGTCATTGTCGGGGATTTAGTCGCCGACGGCTACCTCGGGATCGACGGCGACGGCTTCGGCACCGTGACGGCCGCAGACGGTTCCCGACGCCGT
GCCTTTGCCACTGGCGACCGGGTGACCGTCGACGCCGAAGGATTTCCGGTCTTCTCCGGACGCAAAGACGCCGTCGTCAAGATCTCCGGCAAGCGTGTCGATATCGCTGAGGTAACCAGG
CGCATCGCCGAAGACCCCGCGGTGTCAGATGTCGCCGTCGAGTTGCACAGCGGAAGCCTCGGAGTGTGGTTCAAGAGCCAACGGACCCGCGAGGGCGAACAAGACGCTGCCGCGGCGACC
CGGATCAGGCTCGTCCTCGTGAGTCTGGGAGTGTCGTCGTTTTTCGTTGTCGGCGTGCCGAATATCCCGAGGAAGCCCAACGGGAAGATCGACAGCGACAACCTGCCGAGGCTGCCTCAG
TGGTCAGCTGCTGGGCTAAACACCGCCGAGACGGGTCAGCGAGCGGCCGGCCTCTCGCAGATCTGGAGCCGGCAGCTCGGCCGGGCAATCGGGCCGGACTCGTCGCTGCTTGGTGAGGGC
ATCGGCTCGTTGGATCTCATCAGAATACTGCCCGAGACGCGTAGGTATCTGGGGTGGCGCCTCTCGCTGCTGGATCTGATCGGTGCCGATACCGCCGCCAATCTGGCCGATTACGCGCCA
ACGCCCGACGCGCCGACGGGCGAAGATCGGTTTAGGCCGCTGGTGGCCGCGCAACGGCCCGCGGCGATTCCGTTGTCGTTTGCCCAGCGGCGACTATGGTTTCTCGACCAGTTACAGCGA
CCCGCTCCGGTCTACAACATGGCGGTGGCGTTGCGGCTGCGCGGGTATCTCGATACCGAGGCGTTGGGCGCGGCGGTCGCCGATGTCGTGGGCCGCCACGAAAGCCTACGGACGGTGTTT
CCGGCGGTCGACGGGGTCCCTCGGCAGCTGGTCATCGAAGCGCGGCGGGCAGATCTTGGCTGCGACATCGTCGATGCCACCGCATGGCCGGCTGACCGGCTGCAACGGGCCATCGAGGAG
GCGGCGCGCCACAGCTTCGATTTGGCAACCGAGATACCTTTGCGGACGTGGCTTTTCCGGATCGCCGACGACGAACATGTGCTGGTGGCGGTTGCACACCATATCGCCGCCGACGGCTGG
TCGGTGGCTCCGCTGACGGCCGATCTGAGTGCGGCATATGCCAGCCGTTGTGCGGGTCGGGCACCGGACTGGGCGCCATTGCCAGTGCAGTATGTCGATTACACGCTGTGGCAGCGGGAA
ATCCTCGGTGATCTCGACGACAGCGACAGCCCGATCGCCGCGCAGCTGGCCTACTGGGAAAATGCGTTGGCCGGTATGCCGGAACGGCTGCGGCTGCCCACCGCTCGGCCCTATCCACCG
GTTGCCGATCAGCGCGGCGCCAGTTTGGTGGTGGATTGGCCGGCGTCGGTGCAACAGCAGGTGCGTCGGATCGCCCGCCAGCACAACGCGACCAGCTTCATGGTGGTAGCTGCCGGGCTT
GCCGTGCTGCTGTCGAAACTCAGCGGAAGCCCCGATGTGGCGGTCGGATTTCCCATCGCCGGCCGCAGCGATCCTGCGCTGGATAACTTGGTGGGCTTTTTTGTCAACACCTTGGTGTTG
CGGGTCAACCTGGCCGGTGATCCCAGCTTCGCCGAACTGCTGGGGCAGGTGCGAGCGCGCAGCCTGGCCGCCTACGAAAATCAAGACGTACCTTTCGAGGTGCTCGTTGATCGCCTCAAA
CCCACTCGAGCCCTGACCCATCACCCGCTGATCCAGGTGATGTTGGCCTGGCAGGACAATCCGGTTGGACAGCTGAATTTGGGTGATCTGCAGGCCACCCCGATGCCGATCGACACCCGC
ACCGCCCGCATGGACTTGGTGTTTTCGTTAGCGGAACGCTTCAGCGAGGGTAGCGAACCTGCCGGGATCGGCGGAGCGGTGGAATACCGCACCGATGTGTTTGAAGCCCAAGCAATCGAC
GTGCTTATCGAGCGGTTGCGGAAGGTGTTGGTGGCGGTGGCCGCTGCTCCGGAACGGACGGTGTCGTCGATCGATGCGCTGGATGGGACCGAGCGTGCCCGGTTGGATGAGTGGGGTAAC
CGCGCTGTGCTGACTGCGCCCGCGCCCACGCCGGTGTCGATCCCGCAGATGTTGGCCGCCCAGGTGGCACGTATCCCCGAAGCGGAGGCGGTGTGTTGCGGGGACGCGTCGATGACGTAT
CGGGAACTCGACGAGGCGTCCAACCGGTTAGCGCATCGGCTGGCAGGTTGTGGGGCCGGCCCGGGCGAGTGTGTGGCGCTGCTGTTCGAGCGGTGCGCGCCGGCGGTCGTGGCGATGGTG
GCAGTGCTCAAAACCGGGGCGGCGTATCTGCCGATCGATCCGGCGAATCCTCCGCCGCGGGTGGCGTTCATGCTCGGCGACGCGGTGCCCGTGGCCGCGGTCACCACGGCTGGGCTGCGC
TCCCGGTTGGCGGGACACGACTTGCCGATCATCGATGTCGTCGATGCTTTAGCGGCATATCCGGGCACGCCCCCACCCATGCCGGCCGCAGTGAACCTCGCCTACATCCTGTACACCTCG
GGCACTACCGGCGAGCCCAAAGGCGTGGGGATCACCCATCGCAACGTCACCAGGCTGTTCGCATCACTGCCGGCACGCTTGTCGGCGGCGCAGGTGTGGTCGCAGTGTCATTCCTATGGC
TTCGACGCCTCGGCGTGGGAGATCTGGGGCGCGTTGCTAGGTGGTGGGCGACTGGTGATCGTGCCCGAGTCGGTGGCGGCCTCGCCGAACGACTTTCATGGGCTGCTCGTGGCCGAACAC
GTCAGCGTGCTGACTCAGACTCCGGCTGCGGTGGCAATGTTGCCGACGCAGGGTTTGGAGTCGGTGGCGTTGGTGGTGGCCGGTGAGGCATGTCCGGCAGCGCTGGTGGATCGGTGGGCG
CCCGGGCGGGTGATGCTAAATGCTTATGGCCCAACCGAGACCACGATCTGTGCGGCGATAAGTGCGCCGTTGCGACCGGGTTCGGGGATGCCGCCGATTGGTGTTCCGGTGTCGGGGGCG
GCGTTGTTTGTGCTGGATAGCTGGTTGCGCCCGGTACCGGCCGGGGTGGCCGGAGAGTTGTACATTGCCGGTGCGGGCGTCGGTGTTGGGTATTGGCGTCGGGCGGGGCTGACCGCGTCA
CGGTTTGTGGCCTGCCCATTCGGCGGTTCCGGGGCACGCATGTATCGCACCGGGGATCTGGTGTGTTGGCGCGCCGATGGCCAGTTGGAGTTCCTGGGGCGCACCGACGATCAGGTCAAG
ATCCGCGGGTATCGCATCGAGCTCGGCGAGGTTGCGACCGCGCTGGCCGAGCTGGCTGGGGTAGGTCAAGCGGTTGTAATCGCCCGTGAAGACCGCCCTGGGGACAAGCGCCTAGTCGGG
TATGCCACCGAAATTGCCCCCGGGGCAGTGGACCCGGCCGGGCTGCGGGCGCAACTAGCCCAGCGATTGCCCGGTTACCTGGTGCCAGCCGCGGTGGTAGTGATCGATGCGCTTCCGTTG
ACGGTCAACGGCAAACTTGATCATCGTGCGTTGCCGGCACCGGAATACGGTGATACCAACGGATATCGCGCTCCGGCCGGGCCGGTTGAGAAGACCGTGGCCGGCATCTTTGCCCGGGTG
CTTGGGCTTGAGCGGGTCGGCGTCGACGACTCGTTCTTCGAGCTCGGCGGCGATTCGCTGGCGGCAATGCGGGTTATCGCCGCGATCGACACCACCCTAAACGCCGATCTGCCGGTGCGC
GCGTTGCTGCACGCGTCGTCGACGAGAGGTTTAAGCCAGCTGTTGGGGCGAGATGCCCGACCGACCAGCGATCCGCGCTTGGTGTCTGTGCACGGCGACAACCCCACCGAGGTGCATGCC
AGCGACCTCACGCTGGACCGGTTCATCGACGCCGACACGCTGGCCACCGCCGTCAACCTGCCGGGCCCGAGCCCCGAGCTACGGACGGTCCTGCTGACGGGCGCGACGGGTTTCCTCGGA
CGGTATCTGGTCCTTGAATTGCTGCGGCGGCTGGACGTCGACGGCAGGCTGATCTGTTTGGTGCGGGCGGAGTCCGACGAGGATGCGCGGCGTCGTCTGGAGAAGACCTTCGATAGCGGT
GACCCGGAATTGCTGCGGCACTTCAAGGAGCTTGCCGCCGACCGGCTGGAGGTCGTCGCAGGCGACAAGAGCGAACCCGACCTGGGCCTGGACCAACCGATGTGGCGGCGGCTGGCCGAA
ACCGTGGATTTGATTGTCGATTCCGCGGCGATGGTCAACGCGTTTCCCTACCACGAATTGTTCGGGCCCAACGTCGCGGGCACCGCCGAGCTGATCCGAATCGCGCTTACCACCAAGCTC
AAACCCTTCACCTACGTGTCAACCGCCGACGTGGGTGCTGCGATCGAGCCGTCGGCGTTCACCGAGGACGCCGACATCCGGGTAATCAGCCCCACCCGCACCGTCGACGGCGGCTGGGCT
GGCGGCTACGGCACCAGCAAGTGGGCCGGTGAGGTGCTGCTGCGCGAGGCCAACGACCTGTGCGCGCTGCCGGTCGCGGTGTTTCGCTGCGGGATGATCCTGGCCGACACCAGCTATGCC
GGACAGCTCAACATGTCGGACTGGGTCACCCGGATGGTGTTGAGCTTGATGGCTACCGGCATCGCGCCTCGTTCGTTCTACGAACCGGACTCCGAGGGCAATCGGCAACGCGCGCACTTC
GACGGGCTGCCAGTCACCTTCGTTGCCGAGGCGATCGCGGTGCTGGGCGCGCGGGTGGCCGGCTCATCGTTGGCGGGATTTGCGACCTATCACGTGATGAACCCGCACGACGACGGTATC
GGGCTCGATGAGTATGTGGACTGGCTGATTGAGGCCGGCTACCCGATACGCCGCATCGATGACTTTGCGGAGTGGTTGCAGCGGTTTGAGGCCAGCCTGGGCGCTCTGCCGGATCGGCAA
CGCCGGCACTCGGTGCTGCCGATGCTGCTGGCGAGCAATTCCCAGCGATTGCAGCCGCTTAAGCCGACCAGGGGGTGCTCCGCGCCGACCGACCGATTCCGTGCCGCGGTGCGAGCGGCG
AAAGTCGGCTCCGACAAGGACAATCCAGACATCCCGCACGTGTCGGCGCCGACCATCATCAACTACGTCACCAACCTACAACTGCTCGGACTGCTGTAG
>Rv0134_N1 Seen in 1 sample(s).
ATGATCGCTCTGCCCGCCTTGGAAGGTGTCGAACATCGGCACGTGGATGTGGCGGAAGGCGTCAGGATCCACGTTGCGGACGCCGGGCCGGCCGATGGTCCGGCGGTAATGCTGGTGCAC
GGCTTCCCGCAGAACTGGTGGGAGTGGCGCGACCTCATCGGCCCGCTGGCCGCCGACGGCAACCGGGTGCTGTGTCCCGACCTGCGCGGCGCGGGCTGGAGTTCGGCGCCCCGCTCGCGG
TATACCAAGACCGAGATGGCTGACGATCTGGCTGCGGTTTTGGACGGCCTGGGTGTGGCCAAGGTCAAGCTGGTGGCCCACGATTGGGGTGGGCCGGTCGCGTTCATCATGATGTTGCGC
CATCCCGAGAAGGTGACCGGGTTTTCGGCGTGAACACCGTGGCACCCTGGGTGAAGCGCGATCTTGGCATGCTCCGCAATATGTGGCGGTTCTGGTATCAGATCCCCATGTCGCTGCCGG
TGATCGGCCCGCGGGTGATCAGCGATCCTAAGGGCCGCTACTTCCGGCTGTTGACCGGGTGGGTCGGGGGCGGATTTCGGGTTCCCGATGACGACGTGCGCCTGTACTTGGACTGCATGC
GCGAGCCGGGGCACGCCGAGGCCGGATCGCGGTGGTATCGCACCTTTCAGACCAGGGAAATGCTGCGCTGGCTGCGCGGCGAGTACAACGACGCTCGGGTCGATGTCCCGGTCCGATGGC
TGCACGGCACCGGAGATCCGGTGATCACGCCCGACCTGCTGGACGGCTATGCCGAGCGGGCCAGCGATTTCGAGGTGGAGCTGGTCGACGGCGTGGGCCATTGGATCGTCGAGCAGCGAC
CCGAGCTGGTGCTCGACCGGGTGCGTGCGTTCCTAGCTGCGGGGACCGAGCAGCGCGATTGA
>Rv0165c_N1 Seen in 1 sample(s).
TCAGCCAGGGCCTCCGTCAGCCTGCGTGCCCCATCGGTGAACTGCCAGACGGTGTGCTCGATTACGGCGGCTGTGTCGCGGCGGCGCAGCGCGGCGATCAGCTGCCGATGACTGTTCACC
GCGTCCGCGCCCCATCGCGGGTCGGCCGCGAACACCTGCGCCGGCATATAGCGCGCGGCATTAAGCAGGAACCAGGCCAACTTGATCCGGCGGCTCGCTTTGTTGAAGACGCGGTGGAAC
GCGAACTCGATCGACGCGATGGTTTTGGCATCACCGGACCCGATAGCACCGGCCAGCGCATTGTTGATGCGGTCCAGCTCGTCGATCTCAACGTCGGTGATGTGAGCGGTGGCCGATGTG
GCAAGTTCTTGGGCAATGGTGGCCTGCAGCCAGAAAATGTCGTCGATGTCTTGGCGGGTCAACGGCAGCACCACGTGGCCGCGATGTGGCTCCAGCCCGACCATCCCCTCACCGCGCAGT
TTCAGCAGCGCCTCCCGCACCGGCGTGACGCTGACTCCGAGCTCGGCTGCCGTCTCGTCCAGACGGATGAACGTTCCAGAGCGCAGGGCGCCCGACATGATGGCGGCCCGCAGGTGGCCC
GCGACCTCGTCGGACAACTGTGCCCGGCGCAGGGGAAGCTGGCTCCGCGGCTTCGCCGATAGAGGTGCGTTCAC
>Rv0195_N1 Seen in 1 sample(s).
ATGGCACCGGTGAATGTCATTTCGGTCGCGGTGGTGGCGAGCGACCCGTTGACCCGCGATGGAGCTTTGGCCCGACTCTCGTCTCACCGGGAGCTCGACGTGCGCGCTTGGCAGGCTGGA
TGCGAAACCTCGGTCCTGCTCGTGCTGGCCACCACGATCACCGCGCCTCTTCTATGCCAGATCGAGGACGCGCAGAAGGATGGCCCCAGTCACGCGCCGAAACTGGTCGTCGTCGCCGAC
GAATTCTCCGCTGAACAAGTTTTCCGGATGATCAAGCTGGGGTTGACCGGGTTGTTGTATCGCAGCCAGAGCACGTTCGACTGCATCGTCGAGACAATCCGGTTGTCCGCCGAAGGCCGC
CTGCGACTCCCCGAACGTGTCCAGCGTTACCTGGTCGGCCGCATCAAGTCCACCCCGACCGCCGAACCTGACACACCGTGCGCCGCCGCTCTTGCCGAGCGTGAGGTGGCGGTGCTGCGT
CTGCTAGCGGACGGCTTGAGCACGCACCAAGTGGCGGTGCAGCTCAACTATTGCGAGCGCACGATCAAGAACATCGTTCATGACATAGTGACGCGGCTGAAGCTCCGCAACCGCACGCAT
GCCGTCGCACATGCGCTGCGCGCGGGCCTCATTTGA
>Rv0226c_N1 Seen in 1 sample(s).
TTAATCCTGGGCGCGGCTCGCCGGCGTGTCTTCGCCGTGGTGTAAGTGTCGGCGCACCCAATAGCCGGCCGCGCCAGCGCCGCCGACCAGCAGCATCGAAAGCCACGCCCAATGCGCGAG
CATTGTCGCTTTGAGGCGGGCCGACGATGCACCGGAGGTTTGGCCGCCGACCCGATAAAGAGCCAATTCGTCGTCGCGGTGCGCCGCTGCTAGCCGGCCGAGGGTGCGTGCGGCCGCGCC
CATGTCGCCGGCGCTGTCGGATTCGACGACCAGCCACCCGACGCCGGCCGCGGCCAAGGTTGACGGATGGGGCCCGGTGAGCAGCAGCTCCTGGACCGCCCGGGCGTGCGCGTCTTCGCC
GGGAACGGTCACCCCGGAAATGACCAGATCACCTGTGGTCAGCACATCGGCGCGAACCCAACGGGGGAGCGGATCGAGTACCGGTGCCGAACCGGACCACGAGAAGCGCCGCATGGTGCC
CGCGGGCAAGACCGCAACCGTCCGGGGATCGGCATTGATCGCCGCTGCCACCGCCGCCCAACCGGACGGGTAGTGCACAGGCGCAACCTTGCCCCACACCCCCCACGCCAAGTCAGCCAG
CGTTAGGACCAGCGCCAGACAGCAGACCACCGCCGCCGTTGCCGGTCGCAGCCAGCGTCGCAGCGTTAGCACCGTGCCCGCACCGGAGAGTGTGTATCCGGGTACCGCCAGCGCGACCCA
CTTCTGTCCGTCGCGCAGCACGCCCAGGCCGGGTGCGGCATCGACCACCACCCGTAGCGCGTGCAGACCTGGGCCGGTCGCAAGGACAGCCGGGACCATCACGGACACCGCCGCTAGTGT
CAGCAGCGGCACTGCCACGGGCCGGCGCGCCACAGTCGGTAGTCCGATCGCCACCATGGCGAGTAGTACGACGGCGGATGCCACTGCGAAAAGCGTTGTCCGCGAGCTAGGTACGGCCTC
GCCGTTCCAGATCCCACCGAGACTGGCCAAGCTGCCAAGCGTGCCCAGCCCCGGTTCGGCGCGTGGCGCGAACGCGGTAACCCCAAGCTGATTGGCTGCCGTGTGGCTGGTCAACGACGA
GCCCAGCGCCGACGCCGTCAGCCAGGGCAGCGCACCCACCAGCGCGGAGCCCAACGCCGCGACCCCACATTGCCAGCGCGGGCGGCCCGCGCCGGGCATCGCCACGCACACCACCGCAAC
TGTCGCGGCGAGTAGCAGCCCGGACGGGGTCAGGCCGGCCAGCGCAACCCAGAACGCCAGCCCAAAAAGCCCGAACCAACCCGCGCCAACCGTTGTTCGCATCGTTAACATCGCGGTCGC
AACCCAGGGCAGACACCCATAGCCGACCAGCAGGCTCCAATGGCCCTGCAAAAGTCGTTCGGCCACATAGGGATTCCAGATCGCCAGCGTGATCGCGACAAACTGGCCGGCTGCCCCCGC
TGCGGGTAGTGCCGTTGCGACCAGTCGGGCCGCGCCCCAGCCCGCCAGCCAAAGCCCCAGCAGCAGCAGCGCTTTCACCACGACGCCGCCGTCGACGAGGTGTGACGCCAAAGCGACCGC
GAAGTCCTGCGGAGTCGCCCGGGGCGCCGATGTCAGCCCTAGGGCGTTGGCCGACACATACGACCGTGGTGTGGACACTGCATCGCGCAGCAGTAGGTATCCGGGCCGCAGTAGCGGCGC
GGCCAACAGCAGCACCAAGACCAGCGCGTACCCCGGTCGGAACCAGCGCAC
>Rv0276_N1 Seen in 1 sample(s).
ATGGCGATTTCGCTGGTGGCTCACCAGCCCATCCCCCACGTCGAGCGTCGCATGGCCGACCCACCCCGTCTCCAGCTGGCCAGGCGCCGGCGATCGGCGGCCGGCCCCGGCGGTAACGAG
GACAGCTTGATGGGAGTGGCGCTGCTAGCCGGCCCGGCCAACGTGATCATGGAGTTGGCGATGCCGGGTGTCGGCTACGGCGTGTTGGAGAGCCGTGTCGAAAGCGGCCGGCTGGACCGC
CATCCGATCAAGCGGGCGCGCACCACCTTTACCTACGTTGCGGTGGCCGTTGCCGGCAGCGACGACCAGAAGGCGGCCTTTCGTCGCGCGGTGAATAAGGTTCACGCGCAGGTGTATTCG
ACTCCGGAGAGCCCGGTGTCCTACCACGCGTTCGATCCCGAACTACAGCTGTGGGTGGCGGCATGCCTCTATAAGGGCGGCGTCGACGTCTACCGCACCTTCGTCGGCGAGATGGACGAC
GAAGAGGCCGACCATCATTACCGCGCGGGCATGGCGATGGGCACCACGTTGCAGGTGCCGCCGCAGATGTGGCCACCGGATCGGGCGGCCTTCGACCGCTACTGGCGGCAATCACTGGAC
AGGGTGCACATCGATGACGTCGTTCGCGACTACCTGTATCCGATCGTGGCGCTCCGAATTCGCGGGATCGCACTGCCGGGTCCGCTGCGGCGGCTGTCGGAGGGTATCGCGCTGCTGATC
ACCACCGGTTTCCTGCCGCAGCGGTTTCGCGACGAGATGCGGTTGCCGTGGGACGCGACCAAGCAGCGGCGCTTTGACGCGCTCATGGCCGTGCTGCGCACGGTGAATCGCCTGATGCCG
CGGTTTGTCCGGGAGTTCCCGTTCAACCTGATGCTCTGGGACCTGGACCGGCGGATGAGGCGCGGGCGCCCGCTGGTGTAA
>Rv0290_N1 Seen in 1 sample(s).
ATGTCCGGCACCGTCATGCAGATCGTCCGCGTCGCCATTCTTGCGGACAGCAGGTTGACCGAGATGGCCCTGCCCGCGGAGTTGCCACTGCGCGAAATCCTGCCCGCGGTACAACGCTTG
GTGGTTCCCTCGGCGCAAAACGGCGATGGTGGCCAAGCCGACTCCGGCGCTGCCGTGCAACTGAGTTTGGCGCCCGTCGGCGGGCAGCCGTTTAGCTTGGATGCCAACCTGGACACCGTC
GGTGTCGTCGACGGTGATCTGTTGGTGTTGCAGCCGGTGCCCACCGGTCCGGCCGCGCCGGGCATCGTCGAAGACATCGCCGACGCCGCGATGATCTTTTCGACGTCGCGGTTAAAGCCC
TGGGGCATAGCGCATATCCAACGAGGAGCGCTGGCCGCGGTGATTGCCGTGGCTCTGCTGGCTACCGGTTTGACGGTGACCTATCGGGTTGCCACCGGTGTGCTGGCCGGGCTGCTGGCG
GTGGCCGGGATCGCGGTGGCTAGCGCGCTGGCCGGATTGTTGATCACCATCCGTTCGCCACGTTCGGGTATCGCGCTGTCGATCGCCGCGCTGGTCCCCATCGGCGCGGCCCTGGCGTTG
GCGGTGCCAGGAAAGTTCGGGCCGGCGCAGGTATTGCTGGGTGCAGCTGGGGTAGCCGCATGGTCGCTGATCGCGCTGATGATTCCCAGCACCGAACGGGAACGCGTCGTCGCCTTCTTC
ACCGCAGCGGCGGTGGTCGGGGCGTCGGTGGCGCTGGCGGCCGGTGCGCAATTGCTGTGGCAGCTGCCGTTGTTGAGCATCGGCTGCGGGCTGATTGTGGCGGCGCTGTTGGTCACCATC
CAGGCGGCTCAGCTTTCCGCACTGTGGGCGCGGTTCCCGTTGCCGGTGATCCCGGCGCCGGGGGATCCCACCCCGTCGGCCCCGCCGTTGCGCCTGCTGGAGGATTTGCCTCGGCGGGTG
CGGGTCAGTGACGCCCATCAAAGCGGCTTCATCGCCGCGGCCGTGCTGCTCAGCGTGTTGGGGTCGGTGGCCATCGCGGTGCGCCCAGAGGCGCTCAGCGTTGTGGGCTGGTATCTGGTG
GCGGCGACTGCGGCCGCGGCCACCCTGCGCGCGCGGGTGTGGGATTCGGCCGCATGCAAGGCGTGGCTGCTGGCTCAGCCCTATCTGGTAGCCGGGGTCCTGTTGGTGTTCTACACCGCG
ACCGGACGCTATGTCGCCGCGTTCGGCGCGGTGCTGGTGCTAGCCGTGCTCATGCTGGCCTGGGTTGTGGTGGCACTGAACCCGGGCATCGCTTCGCCGGAGAGCTACTCGCTGCCGCTG
CGCCGGCTGCTGGGTTTGGTCGCCGCCGGGCTGGATGTTTCGCTGATCCCCGTCATGGCCTACCTGGTCGGATTGTTCGCTTGGGTGCTCAACAGATGA
>Rv0551c_N1 Seen in 1 sample(s).
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCAC
CGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTT
CTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTC
GACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGG
TCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGA
TCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGAGGCG
CCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATT
CTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGG
TTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACT
TGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGC
TGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAG
CCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGC
GCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGA
TGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGT
GGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT
>Rv0654_N1 Seen in 1 sample(s).
ATGACCACCGCACAAGCCGCCGAATCCCAAAACCCATATCTCGAGGGCTTCCTGGCGCCGGTGAGCACCGAGGTAACTGCCACCGACCTGCCGGTCACCGGCCGCATTCCGGAACACCTC
GACGGGCGTTATCTGCGTAACGGCCCCAACCCGGTCGCGGAGGTCGACCCGGCCACCTACCACTGGTTCACCGGCGACGCCATGGTGCACGGAGTCGCGCTGCGCGACGGGAAGGCCCGC
TGGTATCGCAATCGCTGGGTCCGCACACCCGCGGTGTGCGCCGCCCTGGGCGAGCCCATTTCGGCCCGGCCTCACCCGCGCACCGGGATTATCGAGGGCGGTCCCAACACCAACGTGCTG
ACCCACGCCGGACGCACCCTGGCCTTGGTTGAGGCCGGCGTGGTCAACTACGAACTCACCGATGAGCTGGACACCGTGGGACCCTGTGACTTCGACGGCACCCTGCACGGCGGTTACACC
GCCCATCCGCAGCGTGATCCGCACACGGGTGAACTGCACGCGGTGTCCTACTCGTTCGCCCGCGGACACAGAGTGCAGTACTCGGTGATCGGCACCGACGGACACGCTCGTCGGACGGTT
GATATCGAGGTGGCGGGATCGCCGATGATGCACAGCTTCTCCCTGACCGACAACTACGTGGTGATCTACGACCTGCCGGTGACCTTCGACCCAATGCAGGTGGTGCCGGCGTCCGTGCCA
CGCTGGCTGCAACGGCCCGCCAGGTTGGTGATCCAGTCGGTCCTGGGCCGTGTCCGCATCCCCGACCCGATAGCGGCGTTGGGCAACCGGATGCAGGGTCACTCCGATCGCCTCCCGTAC
GCCTGGAACCCCAGCTACCCGGCGCGCGTCGGTGTCATGCCGCGCGAGGGTGGCAACGAGGACGTGCGGTGGTTCGACATCGAACCCTGCTACGTATACCACCCACTTAACGCCTACTCG
GAGTGCCGGAACGGCGCTGAGGTGCTGGTGTTGGACGTGGTGCGCTACTCACGGATGTTTGATCGCGACCGGCGGGGTCCCGGCGGTGACAGCCGGCCCTCGCTGGATCGCTGGACCATC
AACCTGGCGACCGGTGCGGTGACCGCCGAATGCCGCGACGATCGGGCGCAGGAGTTTCCCCGCATCAACGAGACTCTGGTGGGTGGGCCGCATCGCTTCGCCTACACCGTCGGCATCGAG
GGTGGGTTTCTCGTCGGCGCCGGCGCTGCGTTGTCGACTCCGCTGTATAAACAGGACTGCGTGACCGGGTCCAGCACGGTCGCCTCGCTCGATCCCGACCTGTTGATCGGCGAGATGGTG
TTCGTGCCGAACCCGTCGGCGCGTGCAGAAGATGACGGGATTCTCATGGGCTACGGCTGGCACCGCGGCCGCGACGAAGGCCAGCTGCTCTTGCTGGATGCCCAGACTCTCGAGTCGATC
GCCACCGTGCACCTGCCACAGCGTGTGCCGATGGGCTTCCACGGCAACTGGGCGCCGACCACCTGA
>Rv0739_N1 Seen in 1 sample(s).
TTGGTGCTGACGCGGCGCGCGCGCGTGAAGTGGCGCTGACACAGCACATTGGGGTATCCGCGGAGACCGATCGGGCCGTCGTCCCCAAGCTGCGCCAGGCCTATGACAGCCTGGTGTGCG
GTCGCCGCCGGCTTGGCGCCATTGGAGCCGAGATCGAGAACGCGGTGGCCCATCAGCGCGCGCTGGGCCTTGACACCCCGGCCGGTGCCCGTAACTTCTCCCGGTTTCTCGCCACCAAAG
CACACGACATCACGCGAGTGCTGGCAGCAACCGCCGCGGAATCCCAGGCCGGCGCGGCGCGGTTGCGATCCCTGGCTTCGTCCTATCAGGCTGTGGGATTTGGCCCCAAACCCCAGGAGC
CGCCTCCGGATCCAGTGCCATTTCCGCCCTACCAGCCGAAGGTGTGGGCGGCGTGCCGGGCGCGTGGCCAAGACCCGGACAAGGTCGTCAGGACGTTCCATCACGCGCCGATGAGCGCGA
GATTCCGCTCGCTACCGGCCGGAGACTCCGTGTTGTACTGCGGCAATGACAAGTACGGGCTGCTGCACATTCAGGCCAAGCATGGACGCCAATGGCACGATATTGCGGATGCACGATGGC
CGAGTGCAGGCAATTGGCGCTATCTCGCCGATTACGCAATCGGTGCCACACTGGCCTACCCGGAGCGAGTGGAGTACAACCAAGACAACGACACGTTCGCCGTATACCGGAGAATGTCGT
TGCCAGACGGCAGATACGTTTTCACAACCCGCGTCATTATTTCGGCACGCGACGGGAAGATCATTACGGCCTTCCCGCAGACGACGTGA
>Rv0757_N1 Seen in 1 sample(s).
ATGCGGAAAGGGGTTGATCTCGTGACGGCGGGAACCCCAGGCGAAAACACCACACCGGAGGCTCGTGTCCTCGTGGTCGATGATGAGGCCAACATCGTTGAACTGCTGTCGGTGAGCCTC
AAGTTCCAGGGCTTTGAAGTCTACACCGCGACCAACGGGGCACAGGCGCTGGATCGGGCCCGGGAAACCCGGCCGGACGCGGTGATCCTCGATGTGATGATGCCCGGGATGGACGGCTTT
GGGGTGCTGCGCCGGCTGCGCGCCGACGGCATCGATGCCCCGGCGTTGTTCCTGACGGCCCGTGACTCGCTACAGGACAAGATCGCGGGTCTGACCCTGGGTGGTGACGACTATGTGACA
AAGCCCTTCAGTCTGGAGGAGGTCGTGGCCAGGCTGCGGGTCATCCTGCGACGCGCGGGCAAGGGCAACAAGGAACCACGTAATGTTCGACTGACGTTCGCCGATATCGAGCTCGACGAG
GAGACCCACGAAGTGTGGAAGGCGGGCCAACCGGTGTCGCTGTCGCCCACCGAATTCACCCTGCTGCGCTATTTCGTGATCAACGCGGGCACCGTGCTGAGCAAGCCTAAGATTCTCGAC
CACGTTTGGCGCTACGACTTCGGTGGTGATGTCAACGTCGTCGAGTCCTACGTGTCGTATCTGCGCCGCAAGATCGACACTGGGGAGAAGCGGCTGCTGCACACGCTGCGCGGGGTGGGC
TACGTACTGCGGGAGCCTCGATGA
>Rv0818_N1 Seen in 1 sample(s).
TTGTTGGAGTTATTACTGCTGACCTCGGAGCTGTATCCGGATCCGGTCCTGCCGGCGCTGTCGCTGCTGCCCCACACCGTGCGGACGGCGCCGGCCGAGGCGTCTTCGTTGCTGGAGGCG
GGAAACGCAGACGCTGTGCTCGTCGACGCGCGCAACGACCTGTCGTCCGGGCGAGGCCTGTGCCGCCTGTTGAGCTCGACCGGCCGGTCGATCCCGGTACTGGCGGTGGTGAGCGAAGGC
GGGCTGGTGGCGGTCAGCGCTGACTGGGGGCTGGACGAGATCCTGCTGCTCAACACCGGGCCCGCTGAGATCGACGCCAGACTGCGGCTGGTGGTTGGCCGGCGCGGAGATCTGGCTGAC
CAGGAGAGTCTGGGCAAGGTGAGCCTGGGCGAGCTGGTGATCGACGAAGGCACCTACACCGCCCGGCTGCGTGGCCGCCCGCTGGATCTCACCTACAAAGAGTTCGAGCTGCTGAAATAC
CTGGCGCAGCATGCCGGCCGGGTGTTCACTCGGGCGCAGCTGCTGCACGAAGTATGGGGGTATGACTTCTTCGGGGGCACCCGGACTGTTGATGTGCACGTGCGGCGGTTGCGGGCCAAA
CTCGGCCCCGAGCATGAAGCGCTGATCGGCACGGTGCGCAACGTCGGATACAAAGCTGTTCGGCCGGCGCGCGGCCGACCGCCGGCCGCGGACCCCGACGACGAAGACGCCGATCCCGGC
CGGGATGGTATGCAAGAACCACTGGTCGACCCGTTGCGCAGTCAGTGA
>Rv0826_N1 Seen in 1 sample(s).
GTGACCCAAGATACGTCTGCTACCTGTCCGCTGACCAGCACCGTGCAGGATTCCTCGCCGGTTGCGGGCCAGCTTGGCAGGCCTATAGGGTTCCGCGGACTGGCCGGCGGTTGCCCCGTG
TCACCGCTGGGTTACGAATCGCCGCCGCTGCCGCTGGGGCCGGATTCGCTGACGTGGCGATACTTCGGTGACTGGCGCGGGATGCTGCAGGGACCGTGGGCGGGATCCATGCAGAATATG
CATCCGCAGCTGGGCGCGGCGGTCGAAGATCATTCGACGTTCTTCCGGGAACGCTGGCCACGGCTGCTGCGGTCGTTGTACCCGATCGGCGGAGTTGTCTTCGACGGCGATCGAGCCCCA
GTCACCGGTGTGCAGGTGCGTGACTACCACATCACCATCAAGGGTGTCGACGGTGCGGGCCGTCGCTACCACGCGTTGAATCCCGACGTCTTCTACTGGGCGCACGCCACCTTCTTTGTC
GGCACGTTGCATGTGGCCGAGCGGTTCTGCGGTGGCCTGACCGAGGCGCAGCGGCGCCAGCTATTTGACGAGCACGTCCAGTGGTACCGCATGTACGGCATGAGCATGCGGCCGGTGCCG
GCGACCTGGGAGGAGTTTCAGGACTACTGGGACCACATGTGCCGCAACGTGCTGGAGAACAACTTCGCGGCGCGTGCCGTGCTCGACCTGACCGAACTACCCAAACCGCCATTCGCCCAA
CGAGTTCCGGATTGGCTGTGGGCCGCGCCGCGCAAGTTGCTGGCCCGGTTCTTCGTCTGGCTGACCGTCGGACTCTACGATCCGCCCGTGCGCGAGCTGATGGGCTACCGGTGGTTGCGC
CGCGACGAATGGTTGCACCGCCGCTTTGGCGACATCGTCCGGCTCGTCTTTGCCTTGGTGGCATTCCGGTTTCGCAAGCACCCGCGGGCTCGCGCCGGCTGGGACCGTGCCACCGGCCGC
ATCCCCGCCGATGCGCCGCTAGTACAGACGCCCGCGCGCAACCTGCCGCCGCCCGACGAGCGTGACAACCCGACGCACTACTGCCCTAAGGTCTGA
>Rv0888_N1 Seen in 1 sample(s).
ATGGATTACGCCAAACGCATCGGCCAGGTTGGGGCGTTAGCCGTTGTCCTGGGGGTGGGGGCGGCGGTGACTACCCACGCGATCGGCTCTGCCGCGCCGACGGATCCGAGCTCCTCGAGC
ACCGATTCGCCGGTCGACGCGTGCTCGCCGTTGGGTGGGTCCGCCAGTTCGTTGGCTGCGATACCGGGCGCCAGTGTGCCACAGGTCGGCGTGCGACAGGTAGACCCCGGAAGCATCCCC
GATGACTTGCTCAATGCCCTGATCGACTTTCTGGCCGCGGTACGCAACGGGTTGGTGCCCATCATCGAAAACCGCACTCCGGTAGCGAATCCGCAACAAGTCAGCGTCCCTGAGGGGGGG
CACCGTCGGCCCGGTCCGGTTTGACGCCTGCGACCCCGATGGCAACCGGATGACCTTCGCGGTGCGCGAGCGCGGTGCACCCGGTGGACCCCAGCATGGCATCGTGACCGTCGACCAACG
AACGGCCAGCTTCATCTACACAGCCGATCCGGGTTTCGTTGGCACCGATACCTTCAGTGTGAACGTCAGCGATGACACCAGCCTGCACGTGCACGGTCTGGCGGGATACCTGGGTCCGTT
CCATGGGCACGACGACGTCGCCACCGTGACCGTGTTCGTCGGCAACACCCCGACCGACACCATCAGCGGCGACTTCAGCATGCTCACCTACAACATCGCGGGGCTGCCCTTCCCGCTATC
CAGCGCAATTCTGCCCCGGTTCTTCTACACCAAAGAGATTGGGAAGCGGCTCAACGCCTACTACGTCGCGAACGTCCAGGAGGATTTCGCCTACCACCAATTCCTCATCAAGAAATCCAA
GATGCCCAGCCAGACCCCGCCGGAGCCGCCTACCTTGCTGTGGCCTATCGGTGTGCCCTTCTCCGACGGGCTCAATACCCTCTCGGAGTTCAAGGTGCAGCGGCTGGACCGGCAGACATG
GTATGAGTGCACATCCGACAACTGCCTCACCTTGAAGGGCTTCACCTACAGCCAGATGCGGCTTCCCGGCGGTGACACGGTCGACGTCTACAACTTACATACCAACACCGGTGGAGGGCC
GACCACCAACGCCAACCTCGCGCAGGTCGCCAACTACATCCAGCAGAACTCGGCGGGCCGCGCGGTCATCGTCACCGGCGACTTCAACGCGCGGTACTCCGACGACCAAAGCGCTCTGTT
GCAATTTGCGCAGGTCAACGGGCTCACCGATGCCTGGGTGCAGGTAGAACACGGCCCCACCACACCGCCGTTCGCGCCCACTTGCATGGTCGGCAACGAGTGCGAGCTGCTCGACAAGAT
CTTCTATCGAAGCGGCCAGGGAGTGACGTTGCAGGCCGTCAGCTACGGCAACGAGGCGCCGAAATTCTTCAATTCCAAGGGTGAGCCACTGTCGGATCACAGCCCGGCGGTGGTCGGCTT
CCACTACGTCGCGGACAACGTGGCCGTACGGTGA
>Rv0908_N1 Seen in 1 sample(s).
ATGACCCGTTCGGCTTCGGCGACAGCCGGTTTGACCGATGCCGAAGTGGCGCAACGGGTCGCCGAAGGCAAGAGCAACGATATCCCGGAACGGGTCACCCGCACCGTCGGGCAGATCGTC
CGGGCCAACGTATTCACGCGGATCAACGCGATTCTGGGCGTTTTGCTGCTCATCGTCTTGGCGACGGGCTCGTTGATCAACGGGATGTTCGGCCTGCTCATCATCGCCAACAGCGTCATC
GGCATGGTCCAGGAGATCCGTGCCAAGCAGACGCTGGACAAACTCGCGATCATCGGACAGGCGAAACCGTTGGTGCGCAGGCAATCCGGAACGCGCACGCGGTCGACCAACGAGGTGGTG
CTGGACGACATCATCGAACTTGGGCCCGGGGACCAGGTTGTCGTCGACGGCGAGGTCGTCGAGGAGGAAAACTTGGAGATCGACGAATCATTGCTGACCGGCGAGGCCGACCCGATTGCC
AAAGACGCTGGCGATACCGTGATGTCGGGCAGTTTCGTCGTCTCCGGTGCCGGCGCCTACCGCGCCACCAAGGTCGGCAGCGAAGCATATGCAGCCAAACTGGCCGCCGAGGCCAGCAAG
TTCACCCTGGTGAAATCCGAATTGCGCAACGGCATCAACAGGATTCTGCAGTTCATCACTTACTTGTTGGTGCCGGCCGGCCTGCTGACCATCTACACCCAGTTGTTCACCACACACGTG
GGATGGCGGGAATCCGTGTTGCGGATGGTGGGCGCGCTGGTGCCGATGGTTCCCGAAGGCCTGGTGCTGATGACCTCGATCGCCTTCGCCGTCGGGGTGGTCAGGCTCGGCCAGCGTCAA
TGCCTGGTGCAAGAGTTGCCCGCCATCGAGGGGTTGGCGCGGGTGGACGTGGTCTGCGCCGACAAGACCGGCACACTGACCGAAAGTGGCATGCGGGTCTGCGAGGTCGAAGAGCTCGAC
GGGGCTGGTCGACAGGAAAGTGTCGCCGATGTGCTGGCCGCCCTGGCCGCCGCCGACGCCCGTCCCAACGCGAGCATGCAGGCAATCGCCGAGGCCTTTCACTCGCCGCCGGGCTGGGTC
GTGGCCGCGAACGCGCCTTTCAAGTCGGCCACCAAGTGGAGCGGCGTCTCCTTTCGCGATCACGGTAACTGGGTGATCGGCGCGCCCGACGTGCTGCTCGATCCGGCTTCGGTGGCGGCC
AGACAGGCCGAGCGGATCGGAGCGCAGGGATTGCGGGTGCTGCTGCTGGCTGCTGGCAGTGTGGCCGTCGACCATGCCCAAGCGCCGGGTCAGGTCACCCCGGTAGCGCTGGTTGTGCTG
GAGCAGAAGGTGCGGCCCGACGCCCGTGAAACGCTGGATTATTTTGCTGTTCAGAATGTTTCGGTCAAGGTGATCTCCGGTGACAACGCGGTGTCGGTTGGTGCGGTCGCCGACCGGCTC
GGGCTGCATGGCGAGGCGATGGATGCGCGTGCGCTGCCGACGGGCCGCGAAGAACTGGCCGACACACTGGACTCTTACACCAGTTTTGGCCGTGTGCGGCCGGACCAGAAGCGTGCGATC
GTGCATGCTCTGCAATCACACGGGCATACCGTGGCGATGACCGGCGACGGCGTCAACGACGTGCTTGCCCTCAAGGACGCTGATATCGGTGTGGCGATGGGCTCGGGCAGCCCGGCCTCG
CGTGCGGTGGCACAGATCGTGTTGCTGAACAACCGGTTTGCCACGCTGCCCCATGTGGTCGGCGAGGGGCGTCGGGTCATCGGCAATATCGAACGGGTCGCCAATCTATTCCTGACTAAG
ACGGTGTATTCCGTGTTGCTGGCGCTGCTGGTGGGTATTGAGTGCTTAATTGCCATACCGCTGCGGCGTGATCCGCTGTTGTTCCCGTTCCAGCCGATCCACGTCACCATCGCGGCCTGG
TTCACTATCGGGATCCCAGCGTTCATCCTGTCCTTGGCGCCCAACAACGAGCGGGCCTATCCGGGCTTCGTTCGGCGAGTTATGACGTCTGCGGTGCCGTTCGGACTAGTCATCGGTGTC
GCGACTTTCGTCACCTATCTGGCCGCTTACCAGGGTCGCTACGCCTCGTGGCAGGAGCAGGAACAGGCGTCGACCGCTGCGCTGATCACGTTGTTGATGACCGCGTTATGGGTGCTGGCG
GTGATCGCACGCCCCTATCAGTGGTGGCGACTGGCGCTGGTGCTTGTCTCCGGACTGGCCTATGTGGTGATCTTCAGCCTTCCGCTGGCGCGGGAGAAGTTCCTGCTGGATGCCTCGAAC
CTGGCGACGACGTCAATCGCGCTGGCGGTTGGCGTGGTGGGTGCGGCGACCATTGAGGCGATGTGGTGGATCCGAAGCAGGATGCTCGGTGTGAAACCGAGAGTGTGGCGATAA
>Rv1001_N1 Seen in 1 sample(s).
GTGGGTGTCGAATTGGGGTCAAATTCCGAGGTCGGCGCGCTAAGAGTGGTCATCCTGCACCGCCCGGGGGCCGAACTGCGCCGGCTCACACCGCGCAACACCGACCAGCTGCTGTTCGAC
GGCCTGCCCTGGGTATCCCGCGCGCAGGACGAGCACGACGAATTCGCCGAGCTGCTGGCTTCCCGCGGTGCGGAAGTGCTGTTGCTGTCGGACCTGTTGACTGAGGCACTACATCACAGC
GGGGCCGCCCGCATGCAGGGGATCGCCGCTGCCGTCGACGCACCGCGGCTGGGACTGCCGCTGGCGCAAGAGCTTTCGGCCTACCTGCGTAGTCTCGACCCAGGCAGGTTGGCGCATGTG
CTGACGGCCGGCATGACCTTCAACGAGCTCCCGTCGGACACGCGGACCGACGTGTCGTTGGTGTTGCGTATGCACCATGGCGGAGACTTCGTCATTGAGCCGTTGCCGAACCTGGTGTTC
ACCCGCGACTCGTCGATATGGATCGGGCCGCGGGTGGTGATCCCGTCGCTGGCATTACGGGCACGGGTGCGCGAAGCGTCGCTGACCGACCTCATCTATGCTCATCACCCGCGGTTCACC
GGTGTGCGGCGTGCCTATGAATCGCGCACCGCTCCGGTCGAGGGTGGCGACGTGTTGTTGCTCGCCCCGGGTGTGGTCGCTGTCGGAGTGGGCGAGCGGACTACACCAGCAGGCGCGGAA
GCATTGGCGCGCAGCCTTTTTGACGATGATCTTGCGCATACCGTGCTCGCCGTGCCGATCGCTCAGCAGCGCGCGCAAATGCATCTGGACACGGTGTGCACGATGGTCGACACCGATACG
ATGGTGATGTACGCCAACGTTGTCGACACGCTCGAGGCGTTCACGATCCAGCGCACACCCGACGGCGTGACCATCGGCGATGCGGCCCCGTTCGCGGAGGCGGCTGCCAAGGCGATGGGA
ATCGACAAGCTGCGGGTAATTCATACCGGAATGGACCCGTCGTCGCTGAACGCGAACAGTGGGACGACGGCAACAACACGTTGGCGTTGGCGCCCGGTGTCGTTGTCGCCTACGAGCGCA
ACGTACAGACCAACGCCCGCCTGCAGGACGCGGGCATCGAAGTGCTTACCATCGCCGGCTCCGAATTGGGTACCGGCCGTGGCGGGCCCCGCTGCATGTCCTGTCCGGCCGCCCGCGATC
CGCTTTAG
>Rv1097c_N1 Seen in 1 sample(s).
TCAGGACTTGTTGACCTTCAGCGCCTCAATGACCCTCTCGACGGTGGCGCGCGAGGTTGCATCACCGATGGGGGTGGCGCCCAGGAAGACGGTGACCGGCTTGGTGTCGACCGCGATGAT
CGTGACCGAATCACCTTTGACGTTGCGTGAACTGTCGGCGATTGTGATATCGGCGTCTACCCGGGCGGCCCTGACCCCGTCGACGGTGATCGACGACGTCTTGGTCGGGCCCAGGGTGGG
CGACGAGCCTGCGTAGCCGGGGCCGTCGGCCACGCATTGCATCAACTTCGATGCTTGCGCGGCGACGTCCGGTGGTGACGAAGTTGGTTATCGCAACCTCGGCTTGCATCATCCACTGGT
CGGCACCGGCCACCTCGTGGCCGACGCCCACCGCGTCGATGAGGTTCGGGTTCTGGTCGTCGGAGAACGCCGACCACCCGGGTGCCGCGCTGGTCGGGAACGACAGCTTACCCGCACTGA
TCGAATCGCCGATGGGCTGCACACCGCCGGACACATTTGGGGTACAACCGGTTGCGGTTTGCTGGGAAAACGGTTGCGACGTGGGAGCACTCGTCGCCGGAGAGGTTGCCGTGGTCGACT
TGTTGTCGCCGCGGAGGCCGATCACCAGGATCACCACCAGTAGGATGACACCCAGCACCGCGAGGCCGGCGAGGATCAGCCACGGTGTCTTCGATCCTGGCCCGGGCGGAGGTGGTCCTG
GCGGATAGGGCCCCGCCGGCCAGCCGGGCGGATACTGCTGGGGTGGGTAGGCCGGCGGATAGGAGCCGCCCTGCGGTTGGCCTCCCCAATACGGGTCCTGCCCATACGTATTCGGGCCGT
AGGGGTAGTTGCCGTAGGGGCCAGCGGGAGGAACCGTCAT
>Rv1128c_N1 Seen in 1 sample(s).
CTACGGGTCGCCTTCGTCGTCTGCCAGGAGCTTTTCCGGGTGATGGAACGTATTGACTCGAGGTTGGCCGTGGTCGAGATGTGGCGGCGGTAGCCACTCGGTGTCGCCGTGGGCGTTCTT
GCGGGTCGTCCAGCCACGTTCGGCTAACGGATGATGGCCACCGCAGCCGAGTGTCAGGTCATTGACGTCGGTGTTGCGGCACTGGGCGTACGGCGTGACATGATGGACTTCACAGTAATA
GCCGGGCACGTCGCAACCAGGTGCGCTGCAGCCACTGTCCTTGGCGTACAACATAATTCGCTGCGCCGGGGAGGCCAGGCGCTTGGTGTGGTAGAGCGCCAGGGCCTTGCCTCGATCGAA
TATCGCGAGGTAGTGGTTTGCGTGGCGGGCCAGCCGGATCACATCCGATATGGGCAAGATCGTACCCCCGCCGGTGAGCCCGCGCCGGCCGCGGCCTCCAAGTCCTTCAGCGTGGTGGTC
ACGATGATGCTGGCCGGTAATCCGTTGTGCTGGCCCAGATTGCCACTTGTCAACAAACTACGTAATCCGGCGTTGAGCGCGTCGTGGTTCCGCTGTGGGCAGCTGCGGGTGTCTCGCCGC
GCCTGCTCCTTCGAGGGCGCGCCGTTCACACACGGTGCCTTCTGCTCGGGGTTGCACATACCCGGGGCGGCCAGCTTGGCCCACACCGCCTCGATAGTGGCGCGCAGCTCGGGGGTCACA
TATCCGCTGAGCCGCGACATCCCATCGACATCTTGCTTTCCTAACGTCAAGCCGCGGCGGCGGGCGCGGTCCTCGTCGGTGTAGTCGCCATCGGGGTTGAGGCAGTCCATGATCCGCGCG
GCCAATTTGGCCAGCTGGTCGGGACGGTACTGGGTGGCCTGCTTAGCCAAGTCCCGTTCGGCCTTCTCCAGGGTCTTGAGGTCTACCCAGGATGGTAGGCGGTGCACGAAAGCACGGATT
ACTTCAACATGGCCGTCACCAATTAACCCGTGGCGCTGTGCCTTTGCGGTGGCGGTGAGTAGCGGTGGCAGCGGCTCGCCGGTCAGCGCACGGCGCTGGCCAAGGTCGGCGGCCTCGGCC
ACTCGCCGCTTGGCCTCGCTGCGGGTGATGCGCAACCGGTCGGCCAGCGTCAATCCCAGCTTGCCGCCCAGCTCCTCCTCGGTGGATTGTTCGCCGATCTGATTGATCAACGTGTGTTCG
ACGCTGGGCAGCTGGCGTCGCGCGGTCTCGCAGTGCTCCAGCAGCGCCAGGCGCTCCGGGGTGGTCAATGCGTCAAAGGTCAGCCCCAGCACGCGGGACAGCGCGGTAGCCAATGACGCG
AAGGCCTCCGTGATCTCCTCCCGAGTGGAACACAT
>Rv1145_N1 Seen in 1 sample(s).
ATGCTGCAGAGGATCGCTCGGCTCGCCATCGCTGCGCCGCGCCGAATCATCGGGTTTGCGGTCTTCGTCTTCATCGCCGCAGCGGTCTTCGGTGTTCCGGTGGCTGACAGCCTGTCGCCC
GGGGGTTTCCAAGATCCGCGATCGGAGTCGGCACGGGCAATCGAGGTGTTGACCGACAAGTTCGGCCAGAGCGGTCAGAAAATGCTGATCGTGGTTACGGCAGCCGCGGGCGCCGACAGC
CCACCTGCCCGCGAGGTCGGGACTGACATCGTCGAGGTGCTGCGGCGGTCGCCGTTGGTTTACAACGTGACCTCGCCGTGGACTGTGCCACCGACTGCCGCCGCCGACCTGCTCAGCACC
GACGGAAAATCGGGGTTGATCGTCGTCAACGTCAAAGGCGGCGAAAACGACGCGCAGAACCACGCCCAAACCCTGTCAGACGAAGTCGCCCATGACCGCGACGGCGTCACCGTCCGTGCC
GGCGGCTCGGCGATGGAGTACGCCCAGATCAATCGGCAGAACAAAGACGACCTGCTGGTGATGGAGTTGATCGCGATTCCGCTGAGCTTCCTGGTGCTGATCTGGGTGTTCGGTGGGCTG
TTGGCCGCCGGGCTGCCGATGGCCCAGGCCGTACTGGCCGTTGTGGGATCGATGGCCGTATTGCGACTCGTTACGTTTGCCACCGAGGTGTCGACCTTCGCGCTCAACCTGAGTACAGCG
TTGGGCCTCGCGTTGGCTATCGACTACACGCTGCTCATCGTCAGTCGCTATCGCGACGAGCTCGCCGAGGGCAGTGATCGAGACGAAGCACTGATCCGGACCATGGCGACTTCGGGGCGC
ACGGTGTTGTTTTCGGCGGTCACCGTGGCGCTGTCGATGTCGGCGACTGCGCTGTTCCCGATGTACTTTCTGA
>Rv1225c_N1 Seen in 1 sample(s).
CTAACACCCGAGCAACGGCGGCAGGCCGGCCACCGAGTCGATCACGTGGTGCGGCCGGGTCGCGCTGGCGCCGGCCAGCCAGCGATCCAGCGTTTGCTGGCGGAACTTGCCGGTGCGCAC
CAGCACACCCGTCATGCCCACCGCCTGGGCGGCCAGCACGTCGTTGTGCAGATCGTCGCCGATCATGACCATCTGCTGTGGATCGACACCGACGCGGTCGGCGGCCGCCAGGAATCCCTC
GGCCGCAGGCTTGCCGATGGCGGTGGCGGTCTTGCCGCAGGCCTGTTCCATTCCGGTCAGGTACATCCCGGTGTCGATGCGCAGCCCGTCGGTGGTGTTCCAGGTCATATTGCGGTGCAT
CGCCACCACCGGAACGCCGTCGAGCATCCACCCATAGACCCGGCTGAGCGTGCGGTGATCGAACTGGGGGCCGGGCACTGCCGAGCACGACGACGTCGGGGGCTTCGGGGCAATCCTCGG
GACCGATCTCGGTCGACAAGACGACGTCGATGCCGGGCAAGTCCTCGGTGATGTCGCCGTTGTTCACCAGGAAGCACCGCGCGCCGGGATAGGCGCCGTGCAGGTACTCGGCCGTCAGCA
CCCCGGCCGTGATCACGTCGTCGGCGGCGACGGGGATCCCCGCGGCACCCAGCGCCTCGGCGATCTGCCGGCGGGTGCGCGTCGTGGTGTTGGTCAGATACGCGCAGGCGATTCCCCGAT
GGGTCAGTTGCCGCACGGTCTCGGCGGCCCCGGGAATCGCGCGCCACGACAGCACCAGCACGCCGTCGATGTCGAACAGCACCGCCGCGGCCATCAGATGCGCCACGTCCAC
>Rv1258c_N1 Seen in 1 sample(s).
TCACTGAGCCGATCCTACGGGCCGATCGATGTCCGCTTGGGGCGCCAGATCCAGTTCGCGCAGCGCGGGCAGCCGGATCGCGACCAGCCCGGTGCACACGATGGGCAGTGCCAACGCGAG
AAACGTGGCATGCAGTCCAGCGGCGTCGGTCAGTGGACCGGCCAGCAACAGACCCAACGGGCCGGCGGCGTAGGCCAGCGACGTCATCACCCCGACTACCCGGCCGCGCAGATGCTGTGC
TGCCCGCGTCTGTATCACGTAGTTATAGATCGGCTGGATGGGTCCGTACACCAGGCCGACCACCGCGCACAACACCATGATGACCGGCAGTGGCGGCAGGAACGCGATGACCATCGATGC
CAAACCCAGGGTAAGAACCGCGGTCGACATGGTCACGCGACGGGGAACGCGGATAGCCAACACGGCATACCCCAGCGCTCCCACCAGGCCGCCGCCGGCGATCGCCATCAACGCCCAACC
CAGCTGCACCGGTTGCTGGTGGTCGGTGAAGTATTTCGGGAACAGCACGCTCTCCATCGGCAGATACAGCGCGGTGACGGTCAGGTCAATCATCCCGAGGGTGCGCAATACCCGCAGGTT
CCAGACGAAGCGCAGCCCCTCGGCGATCCCGGATACCAACCCTTGGGGCCGCGAGGTGTGGTGCGGCTTGCCGGCACCCTGCGAGTTGCAGGGCGGCAATCGCGAGGATGGACAACCCGA
ATGCCGTCGCGGTAATCCACATTGTGGTGATGCCGCCAACCGTCGCGATCATCAAGCCACCGATGGCCGGGCCGACAATAAAGGCCAGGTTGAGGATCGCCTCGTAGGCGCCGTTGATGC
GGTCCAACGACCAGCCTGCCCGAGCGGCGGCCTCGGGCAGCATCGAGTCACGAGCCGTCATGCCTGCCGGGCCGAAGGCGGCCGCCAGGGCGGCCAATACGGCCAGCACCAGCACGTTGA
CCGCGTCGCCGCCGTACCCCCACGCCACCAGGGGGACGCCGGCCACCGCCGCACCCGACAGCGCATCGGCCACCATCGACACCCGGCGACGCCCGAAGTAGTCGACCGCGGTGCCGGCGA
CCAGCGTGGCGAACAACAGCGGCAGCATGGTCGCACTGGCCACGATCGAGGCCTGCCCAGCGCTGCCCTCGCGCTGCAACACCAGCCACGGAAACGCGACTATCGAGACGCCATCACCCG
CGGCCGCCATCAGCGTTGCGAACAGGATCAGGAATGCCGGGCCGCGGTTGCTGTTTCTCAT
>Rv1269c_N1 Seen in 1 sample(s).
TTAGTTGCAGGCCCAGGTGTCGATGTAGCCGCCGCCGAGCTTGGTCAGGGCGTCCTTCATGGCGGCGGCCAAGGTGGGTCCAACTCCTCCCTGGTATGCCCTATCGTTGGCGGCGACGGC
GCCGCAGGCGGTGAAACTGGTGAGCACCTTGCAGTCGGAGTAGCCACACGACTTGACGGCGGTGGCTTCGGCAGCCGCCCGGGTTGGGTAGTCCCACGATCGGCCCCACGAGCCGTTGCC
GGAGTAGGCAATTGCGCCATAGACATCGGCGGCATTTGCTGGTGCGGGAGCCAGGGAGCCAGGGTGACGGTCGTCGCGGCGGCAGTGGCGACGCCGGCGACGGCCACCGCGAACCGTCGC
CGAAGAGTAATCATCGTCGTCAT
>Rv1326c_N1 Seen in 1 sample(s).
CTAGGCGGGCGTCAGCCACAGCGCCGAAGTGGGCGGCAGCACCAGCACCGCGGACGCCGGGCGGCCATGCCAGGGGTCGTCGGTGGCGTCCACGCCGCCGAGGTTGCCGATCCCTGAGCC
GTGGTAGATCGTCGCGTCGGTATTGAGCACCTCGCGCCAGCGGCCCGCGCGCGGCAGCCCGAGTCGATAGTCACGGTGTTCGGCACCTGCGAAATTGAACACGCAGGCCAGCACCGAGCC
GTCGCTGCCGTAGCGCATAAAGCTCAACACATTGTTGGCGGAGTCGTTGGCGTCGATCCAAGAATAGCCTTCGGGGGTGGTGTCTAAGCTCCACAGCGCCGGGTGGCATCGGTAGATGTC
GTTGATGTCGCGCACCAGCCGCTGAATCCCGTTGGAGAAGCCGTTTTCGTCGAGTTGGAACCAGTCCAGGCCGCGCTGCTCGGACCATTCGGCGCGTTGGCCGAATTCCTGACCCATGAA
CAGCAATTGCTTGCCGGGGTGTGCCCATTGGTAGGCAAGCAGGCTACGCAGGCCGGCGGCCTTGACGTGATTGTTGCCCGGCATCCGCCCCCACAGCGTGCCTTTGCCGTGCACCACCTC
GTCATGACTGAGCGGCAACACGTAATTTTCGCTGAACGCATACAGCATCGAGAACGTCATCTCGTGGTGGTGGTAGCTGCGGTACACCGGATCTCGGCTGACGTAGTCGAGCGTGTCGTG
CATCCAGCCCATGTTCCACTTCATCGAAAAGCCCAGGCCGCCAATGTTGGTCGGGCGGGTCACCCCAGGCCACGGCGTGGACTCCTCGGCGATGGTGACGATTCCCGGCGCGACCTTGTG
CGCCGTGGCGTTCATCTCCTGCAGGAACTGCACTGCTTCCAGGTTCTCCCGGCCGCCGTGGACGTTGGGGGTCCAGCCGCCCTCGGGTCGCGAGTAGTCTAGATAGAGCATTGAGGCCAC
CGCGTCCACCCGCAGGCCGTCGATGTGGAACTCCTGTAGCCAGTACAACGCATTGGCTACCAGAAAGTTGCGCACTTCCGGGCGGCCGAAGTCGAACACGTATGTGCCCCAATCCAGTTG
CTCGCCGCGTTTGGGATCGGAATGTTCGTAGAGCGGAGTGCCGTCGAACCGTCCCAGGGCCCACGCGTCCTTCGGGAAGTGCGCTGGGACCCAATCCACGATGACGCCGATGCCGGCCTG
GTGCAGGGCGTCGACCAGCGCCCGGAAGTCGTCGGGTGTGCCGAATCGTGATGTCGGCGCATAGTAGGACGTGACCTGATACCCCCATGATCCGGCGAATGGATGCTCGGCGACGGGCAA
CAGCTCCACATGGGTAAACCCTTGATCCACAATGTAATCCGTCAACTCACGAGCAAGCTGGCGGTAGCTGAGTCCAGGCCGCCACGAACCGAGATGGACTTCGTAGGTGCTCATCGCCTC
GTTCACCGGGTTGCGCAGCGCACGCCCAGCCATCCAGTCGTCGTCACCCCAGGTGTAGTCACTCGACGTCACCCGCGATGCGGTCTGCGGCGGCACCTCGGTGCCGAACGCGAACGGGTC
GGCCCGATCGGTAACCACGCCGTCGGCGCCGTGCACGCGGAACTTGTACAGACCGTCGCAAGGGAAGTCGGGCCAGAACAATTCCCATACCCCTGATGGGCCGAGCACCCGCATGGGGGC
TTCGTGGCCATTCCAACCGTTGAACTCGCCGATCAAGCTGACGCCCTTGGCGTTGGGCGCCCACACGGCGAACGACACGCCACTCACCACACCGTCGGCCGTGGTAAACGAGCGGGGGTG
GGCACCCAGGACTTCCCAAAGCCGTTCGTGGCGGCCCTCGGCGAACAGGTGCAGGTCGACCTCGCCCAGGGTGGGCAGGAATCGGTACGCATCGGCCACGGTGTGTGGCTCGCAACCTTC
ATAGGTCACCTGCAGGCGGTAGTCGATGAGGTCGACGAACGGCAATGCGACGGCAAACAGGCCAGAATCGAGGTGCTGCAACGAGAGCCGGTCCTTACCAACGAGCGCGACGACCTCGAC
GGCATGCGGACGGAACGCTCGGATGACGGTATGGTCGTCGTATTCGTGGGCGCCCAGGATGCCGTGCGGGTTGTGATGTGTACCCGCCACCAAGCGCGCCATTTCGGCCGGCTCGGGTGC
AAGGTGCTCCCCGGTGAGTTTCTCGGATCGACTCAT
>Rv1330c_N1 Seen in 1 sample(s).
TCAGGCCGGGATCGTGCGTGTCGGAATCGCCGGCTCGCCGGGTGCCAATTTCAGCCCGTCGGCTGGCAGGCTGCGCAACCCGGATGCCACCAGCTGCCGTGCCGCGGCCAGGCTGGTATC
GGCTACCGGTTGCCCGGCGCGGACCAGTGGCAGCGTCAAAACCCGGTGCGGCTCGACAATGACCGGCGGACGGCCCGCCGGATGCACGAGCTCCTCGGTGATGGTGCCCGTCGCACGGGA
GCGCCGCAGTGCCTCTTTGCGGCCGCCGGGGGATTCTTTGTAGCTGCTGCGCTTTTGCACCGGTACACCGTCTACCTCGACCAGTTTGTAGACCATGTTGGCGGTCGGCGCGCCCGACCC
GGTGACCAGCGACGTGCCCACGCCGTAGCTGTCGACGGGTTCACCGCGCAACGCGGCGATGCTGAACTCGTCAAGTTCGCCGGACACCACGATGCGCGTCCGGGTGGCTCCTAGCCGGTC
GAGCTGCTCCCGCGCTTGGCGGGCCAGTACCCCAAGCTCACCGGAATCGATGCGGATCGCGCCGAGCTCAGCGCCGGCGGCGGCAACGGCATTGGCCACACCGGTCGTGACGTCATAGGT
ATCCACCAGCAGCGTGGTACCGGGTCCCAGCGCTTCGACCTGGGCGCGGAATGCGGCTCGCTCGGCTAGTTCGGTGGGGCCGCCATGCTGGGCGTGCAACATGGTGAATGCGTGTGCCGC
GGTGCCGTGCGCGGGCACTCCGTAGCGTCGCTGCGCCGCCAAGTTGGATGACGCGGCGAAACCGGCGATATACGCCGCCCGGGCCGCTGCCACCGCGGCGCGTTCGTGGGTGCGCCGCGA
GCCCATCTCGATCAGTGGGCGCCCCCCGGCGGCGCTGACCATGCGCGCCGCTGCCGAGGCGATCGCTGTGTCGTGGTTGAAGATTGACAGCACCAGCGTTTCGAGCAGGACGCATTCGGC
GAAGCTGCCGCGTACCGAGAGCACCGGTGACCCGGGAAAATACAGCTCCCCCTCGGCATAGCCGTCGATATCGCCGCGGAACCGGAATTCGCGAAGATACCGCACCGTGGCCGGGTCGAG
GAATTGGGCCAGCAACTCGCACGCGTCAGCGTCGAACCTGAACTGCGGCAACGCTTCCAGCAACCGGCCGGTTCCGGCGACAACTCCGTAGCGACGGCCGGTGGGGAGTCGGCGAGCGAA
CACCTCGAATGTGGTGGGGCGATTGGCGCTGCCGTCGCGCAGGGCAGCCGCCAGCATGGTCAACTCGTACTTGTCGGTCAACAGCCCGGCTGGGTCTTGATTGTCGGGCTCTCCCTCTCG
CCGCCTGGCGGCTGGGGGTGGCCCCAC
>Rv1363c_N1 Seen in 1 sample(s).
TCACGGTACGAACTCAACTTTCGACATCTTGTACTGTCCCCCCTCTTCGGTCACGGTCACTTTGAGCCGCCACGGACGTGGTTCGTCTTTCGCCCCAGCGGAATTGGTGACCCGTGAAGT
CGCCGCGACGAGCACCACGGCGGAATGCTCGTTCATGGATTCGACGGCTGTCGCGTTCACCGTGCCTTCGGTGACCACTTTGGACTGTTCGACAACCTTGGTGAAATCGGCTGCCCGCTG
CTGGAAGTCATCCCTGAATTCGCCGGTGGAGCTGTCGATCACACGCGCGACGTCTTCTTTGGCCTTGTTGAAGTCCAGCGAGGTCATGTTGATGACACCTTGCTTGGCTCCGGCGGCGAA
CGCCGCGGCGCGCTGCTGGCGTTCGGTGGCCTCATGGTGTTGCCACACAATGTATCCGCTGAGCCCGGTGAAGCCGCAGATGATGACGACTGCGGCCGCCATGGCAATCGTGGACAGTCT
TGGTAACCGCACCCGCAACCGCCGTCGCCAGGATGCCGACCGTGCGGCCTCCTGGTCTGCGGCCTCATAGTCGTCATAGTCGTCATAGTCTTCGGCGTCTTCCCAGTCTGCATACTCCTC
GGGGACGTTCTCGTCCTCGGCTGGGGCCATCGCCAGCGCCTCACGCTTCAACCGGGCGGCACGGGCACGGGCCCGCGCCGCGGCGGCCAGCGCTTCGGCTTCGGCGGCTTCGGCTTCGGC
GGCCAACGCCATCGCGTCGGCTTGCGATGTCCCCGCGTCCGACGGTGGTTCGGTTGTCTCAGCCAT
>Rv1413_N1 Seen in 1 sample(s).
GTGGCCACCATCGGGGAAGTCGAGGTATTCGTCGACCACGGCGCCGACGACGTATTCATCACCTACCCATTGTGGATCGACACACGCCAAGCCGACCGGCTCCGTCAGCTGGCTGACCGC
GCTCGCATCGCTGTCGGTGCGGGCACCGCCGAGGGCGCTTCGAACACCGGCGCACGGCTCGCAGACGCCGCTGGCGCGATCGATGTTCTCATCGAAATCGACAGTGGCCATCACCGCAGC
GGCGTCCGTGCCGAACAAGTGTTGGAGGTCGCCCACGCCGTCGGTGAGGCTGGGCTTCACCTGGTGGGGGTGTTCACCTTCCCCGGTCACAGTTATGCGCCAGGTAAACCCGGCGAAGCC
GGCGAGCAAGAGCGGCGCGCTCTCAACGACGCGGCGAACGCGCTGGTCGCGGTGGGCTTCCCGATCAGCTGCCGCAGCGGTGGGTCCACTCCCACCGCATTGCTCACCGCCGCGGACGGG
GCCTCCGAGACGTCCCGGCGTCTATGTGCTCGGTGA
>Rv1420_N1 Seen in 1 sample(s).
GTGCCAGATCCCGCAACGTATCGCCCCGCGCCCGGGTCCATCCCGGTCGAGCCGGGCGTGTACCGATTCCGGGACCAGCATGGGCGAGTCATCTACGTCGGCAAGGCCAAGAGCCTGCGT
AGCCGGCTGACGTCCTATTTTGCCGACGTGGCCAGCCTAGCGCCGCGGACCCGGCAGCTGGTGACCACCGCGGCCAAGGTCGAATGGACGGCCGTGGGGACCGAGGTTGAGGCACTGCAG
CTGGAATACACCTGGATCAAGGAGTTCGATCCGCGATTCAACGTCCGCTACCGCGACGACAAGTCCTACCCTGTGCTGGCGGTCACCCTGGGCGAGGAATTTCCCCGGTTGATGGTCTAT
CGCGGTCCGCGGCGCAAGGGTGTGCGCTATTTCGGGCCGTACTCGCACGCGTGGGCAATCCGGGAAACGCTGGATCTGCTCACCCGGGTGTTTCCGGCGCGAACTTGCTCGGCGGGGGTG
TTTAAGCGGCACAGGCAGATCGATCGTCCATGCCTGCTCGGCTACATCGACAAATGTTCCGCGCCGTGTATTGGCAGGGTCGATGCGGCCCAGCACCGCCAGATCGTGGCAGACTTCTGC
GACTTTCTGTCCGGCAAGACCGACCGGTTCGCCCGCGCCTTGGAACAGCAAATGAACGCCGCGGCCGAGCAACTGGACTTCGAACGAGCGGCGCGGCTTCGCGACGACCTGTCCGCACTG
AAGCGTGCCATGGAAAAGCAGGCCGTGGTGCTCGGGGACGGCACCGACGCCGACGTGGTGGCATTCGCCGACGACGAACTCGAGGCGGCGGTGCAAGTGTTCCACGTGCGCGGCGGACGG
GTCCGCGGCCAGCGTGGCTGGATTATCGAAAAGCCAGGAGAGCCAGGAGATTCCGGAATCCAGTTGGTCGAGCAATTCCTGACACAGTTCTACGGCGACCAGGCGGCGTTGGACGACGCC
GCCGACGAATCCGCCAACCCGGTTCCCCGCGAGGTGCTGGTGCCCTGTTTGCCGTCCAACGCCGAGGAGCTGGCCAGCTGGCTGTCCGGCCTGCGCGGCTCAAGGGTCGTGCTGCGGGTG
CCGCGCCGCGGGGACAAGCGGGCACTGGCCGAAACGGTGCACCGAAACGCAGAAGATGCACTGCAACAACACAAGCTGAAGCGGGCCAGCGATTTCAACGCCAGATCCGCTGCGCTGCAG
AGCATTCAGGACTCGTTGGGCCTGGCAGACGCACCCTTGCGGATCGAGTGTGTCGACGTCAGCCATGTGCAGGGCACCGACGTGGTCGGGTCACTGGTGGCGTTCGAAGACGGCCTGCCG
CGCAAGTCGGACTACCGCCACTTCGGGATCCGGGAAGCCGCAGGCCAGGGGCGCTCCGACGACGTGGCCTGTATTGCCGAGGTGACCCGGCGCCGCTTCCTGCGGCACCTGCGCGATCAG
AGCGATCCGGATCTTCTTTCTCCGGAAAGGAAGTCGCGTAGATTCGCCTATCCGCCCAATCTGTACGTCGTCGACGGCGGCGCGCCGCAAGTCAACGCGGCCAGTGCGGTAATCGACGAA
CTCGGTGTTACCGACGTCGCGGTGATCGGCCTGGCCAAGCGGCTGGAAGAGGTATGGGTGCCGTCGGAGCCGGACCCGATTATCATGCCGCGCAACAGTGAGGGACTCTATCTGCTGCAG
CGAGTGCGAGACGAGGCACACCGGTTCGCTATCACCTACCATCGCAGCAAGCGGTCGACGCGGATGACTGCCTCAGCGCTGGACTCGGTGCCGGGATTGGGGGAGCATCGCCGCAAAGCG
CTGGTCACCCATTTCGGATCGATCGCTCGCCTCAAGGAGGCCACCGTCGACGAAATCACCGCTGTTCCCGGTATCGGCGTGGCCACGGCCACGGCCGTCCACGACGCACTGCGACCTGAC
TCATCGGGGGCCGCGCGATGA
>Rv1551_N1 Seen in 1 sample(s).
GTGACTGCACGGGAGGTGGGCCGCATCGGACTGCGAAAGTTGCTGCAGCGCATCGGTATTGTTGCTGAATCAATGACGCCGCTAGCGACCGACCCCGTTGAGGTTACCCAACTGCTGGAT
GCCCGATGGTATGACGAGCGGCTGCGTGCGCTGGCCGACGAGCTCGGACGCGATCCGGACAGCGTGCGCGCCGAGGCGGCAGGCTATCTGCGGGAGATGGCCGCCTCGCTGGATGAGCGG
GCCGTGCAGGCATGGCGCGGCTTCAGTCGCTGGCTCATGCGCGCCTACGACGTACTGGTCGACGAGGACCAGATCACGCAGCTGCGCAAGCTTGATCGCAAAGCCACCCTGGCGTTCGCG
TTCTCGCATCGTTCGTACTTGGATGGGATGCTGCTGCCCGAGGCGATCCTGGCCAACCGGCTCTCGCCGGCGCTGACCTTCGGCGGGGCGAACCTGAACTTCTTTCCGATGGGCGCTTGG
GCCAAACGTACCGGGGCTATCTTCATTCGGCGTCAGACGAAAGATATTCCCGTCTACCGCTTCGTATTACGTGCTTACGCCGCGCAGCTGGTGCAAAACCATGTCAACCTCACCTGGTCG
ATCGAAGGGGGTCGGACCAGAACGGGCAAGCTACGGCCACCGGTGTTCGGGATCCTGCGTTACATCACCGATGCGGTCGACGAAATCGACGGTCCCGAAGTGTATTTGGTGCCGACCTCG
ATCGTGTACGACCAGCTGCACGAGGTGGAAGCCATGACCACCGAGGCCTATGGCGCGGTGAAACGACCCGAAGACCTGCGCTTTCTGGTCCGGTTGGCGCGACAGCAGGGCGAGCGACTG
GGCCGCGCCTATCTCGACTTCGGCGAACCGCTGCCGCTTCGCAAGCGCCTGCAGGAGATGCGCGCCGACAAGTCGGCACCGGCAGCGAGATCGAACGGATCGCGTTGGATGTCGAGCACC
GGATCAACCGCGCCACACCGGTTACCCCCACCGCGGTGGTGAGTCTGGCCCTGCTGGGCGCGGACCGCTCGTTGTCCATCAGCGAGGTGTTGGCGACGGTTCGCCCGTTGGCCAGCTACA
TAGCTGCCCGCAACTGGGCGGTGGCCGGCGCCGCCGATCTGACGAATCGCTCGACGATCCGGTGGACCTTGCATCAGATGGTTGCTTCCGGCGTGGTGAGTGTCTACGACGCGGGCACCG
AGGCGGTGTGGGGCATCGGCGAGGACCAGCACCTGGTGGCGGCGTTTTACCGCAACACCGCGATCCATATCCTGGTCGATCGGGCCGTCGCCGAGTTGGCGTTGCTGGCGGCCGCAGAGA
CCACAACAAACGGCTCGGTTTCCCCGGCGACCGTGCGTGATGAGGCGTTGAGCCTTCGCGACTTGCTGAAGTTCGAGTTCTTGTTTTCTGGCCGTGCCCAGTTTGAGAAAGACCTCGCAA
ACGAGGTACTGCTGATCGGGTCGGTGGTCGACACCTCCAAGCCCGCGGCCGCAGCCGATGTGTGGCGCCTGCTGGAATCGGCCGATGTGCTGCTGGCCCACCTGGTGCTGCGGCCGTTTC
TCGATGCCTACCACATTGTCGCCGATCGGCTGGCCGCCCATGAAGACGACTCTTTCGACGAGGAAGGGTTTCTGGCCGAGTGTCTACAGGTCGGCAAGCAGTGGGAGCTGCAGCGCAATA
TCGCCAGCGCCGAGTCCAGGTCGATGGAGCTGTTCAAGACCGCACTGCGCCTGGCTCGCCATCGCGAGCTGGTCGACGGTGCCGATGCGACGGACATCGCCAAACGCCGACAGCAGTTCG
CCGACGAGATAGCCACGGCAACCAGGCGGGTAAACACAATCGCAGAACTGGCCCGCAGGCAATGA
>Rv1564c_N1 Seen in 1 sample(s).
TCACAACGTCTTACGCAGGACCAGCAGCGAGCGCGCAGGTACCGAAAACGTGTCAGTGGCGGTTACCGTCAGGTCGATGTCACCGACGGGATCGTTGGTATCCAGCTCTCCGGTCCACTG
CTGCGCATAGCCGTCATGCGGCATCACGAACTCCACGTCGTGGTCATGGGCGTTGAAGCACAACAGGAATGAATCGTCGACTACTCGCTCACCACGGGCGTCCGGTGCGGTAATGGCTTC
ACCGTTGAGAAACACCGCAACACACCTGTCGAAGCCTCTGCCCCAATCCTCGTGCGTCATCTCCCGACCGCTCGGTGTCAACCAGGCGATATCGCGGACTTCGTCGCCACTGCGGATCGG
TTCACCCTCAAAGAACCGGCGTCGGCGAAACACCTTGTGGTTCTTGCGCAAGGTCGTCGCCTTGCGTGCGAAAGCTAGCAGATCGGCATTCTTGTCCACCAATGACCAATCCATCCAAGA
TAATTCGGAGTCCTGGCAGTAGACGTTGTTGTTGCCGTATTGGGTGCGCCCAATCTCGTCGCCGTGGGCGATCATCGGCGTGCCCTGGCTGACCATAAGCGTGGCCCACATGTTGCGCAT
CTGGCGGGCACGCAGCGCCAAGATGTCGGGGTCATCGGTGGGGCCCTCGACACCGCAGTTCCACGATCGGTTGTAGCTTTCCCCGTCGCGGTTGTTCTCGCCATTGGCCTCGTTGTGCTT
GTCGTTGTACGAGACCAGGTCGTTGAGTGTGAACCCGTCGTGGGCGGTGACGAAATTGATACTGGCACTGGGCCGGCGGCCGGTTGCTTCGTAGAGGTCCGACGACCCGGTCAGCCGGGA
GGCGAATTCGCCTAGGGTGGCCGGCTCGCCTCGCCAGTAGTCGCGCACGGTGTCGCGGTACTTGCCGTTCCATTCCGTCCACAGTCCTGGGAAGTTGCCAACCTGGTAGCCACCTTCGCC
GACATCCCATGGCTCGGCGATCAGCTTGACCTGACTGACCACCGGATCTTGTTGCACCAGATCGAAGAATGCCGACAGCCGGTCGACGTCGTGCAGCTCGCGGGCCAGCGTGGACGCCAG
GTCGAACCGGAACCCGTCGACGTGCATTTCGATCACCCAGTAGCGCAGCGAATCCATGATCAGCTGCAGGGTGTGTGGGTGGCGGGCATTGAGGCTGTTGCCGGTACCGGTGAAGTCCTT
GTAGAACCTCAAGTCGTGGTCCATCAGTCGGTAGTAGGCGGTGTTGTCGATTCCGCGAAAGTTGATCGTCGGACCCAAGTGGTTGCCTTCAGCGGTGTGGTTGTAGACGACGTCGAGGAT
GACCTCGATGCCGGCTTCGTGCAGGCTGCGCACCATGGTTTTGAACTCGGCTACCGCGCTGCCGGCTTGCCGGGTCGACGCGTATTGATGGTGCGGGGCGAAGAATCCGAAGGTGTTGTA
ACCCCAGTAGTTTCGCAAGCCGAGGTCCAGCAGCCGGGAGTCGTGTAGGAACTGGTGCACCGGCATCAACTCAACGGCGGTGACGTTGAGCTCGTTGAGGTGGTCGATGATCACCGGGTG
GGCAGGCCGGCGTAGGTGCCCCGGAGTTCGGGCGGGATACTGGGATGGGTCTGTGTCATGCCTTTGACATGCGCTTCGTAGATTACGGTCTCGTGGTACGGGGTGCGCGGCGACCGGTCG
TATGCCCAGTCGAAGAACGGATTGATCACGACGCTGGTCATAGTGTGGCCCAGCGAGTCGACCATCGGGGGAGTGCTGTCCGGGTCGACGGCGTTGACGTCATAGGAATACAGCGCCTGC
CCGAAGGTGAAATCGCCGTGGAACGACTTCCCATACGGGTCGAGCAGCAGCTTGCTGGGGTCACACCGATGGCCGGCCGCCGGGTCGAACGGCCCGTGCACACGAAACCCGTAGCGCTGG
CCGGGGGTGATGTTCGGCAGATAGGCATGCCAGACGTACCCGTCCACCTCGTCAAGCGGGATCCGCGACTCGACGCCGTCCTCGTCGATCAGACATAGCTCGACCTTCTCGGCGATCTCG
GAGAACAACGAAAAGTTGGTCCCGGCGCCGTCGTAGGTGGCTCCAAGCGGATAGGCGTTGCCCGGCCACACCGTGGGTAGAGCGGGCCCGGTCCCGTCGGACTCCCCGGCGTTGTTCGAC
GACAT
>Rv1615_N1 Seen in 1 sample(s).
ATGGGCCTACGGCCGGCACGGGTCGTGCGCCCGGCTCGATCTGGCATGCTGAAAGGCGTGACCGATCCCCTGCAGCACGGTGCCTTCGAGCCGGGCTGGCAATCCGCACCACCCGGATAT
CCACCGCCTTATCCGCAATATCCGGGGCCTGGCTCTTACTTTGACCCGTTCGCGCCATATGGTCGCCATCCGGTCACCGGCCAACCATTTTCCGACAAATCGAAGACTGTTGCCGGCCTG
TTGCAGTTGCTTGGACTGTTCGGCATCGCCGGGATCGGGCGAATCTATCTGGGCCATACCGGCCTGGGCATCGCGCAGCTGCTGGTGGGCTGGGTGACGTGCGGTTTGGGCGCCGTCATC
TGGGGCGTCATTGACGCCCTGCTGATATTGACCGACAAAGTCGGCGACCCTTGGGGTCGTCCCTTGCGCGATGGAAGCTAG
>Rv1775_N1 Seen in 1 sample(s).
ATGGCCAGCGATCTGTACCTGGGCTACCGCAACGACGACGCGGACACGCCGTTCGGCAAGTTCTTCAAACCCGAGATGGCCCCGCTGCCACAGCATGTCGTGGTGGCGTTGCAGCATGGC
CCCAGGCCGGGATGGCGTTGCTCGCCTTCGACGACGCCGCGAGCATCGTTGATGAGGGCTATCAGCAGACCGAGAACGGCTACGGGATTCTCGGCGACGGCAGCATGCAGGTATCCGTGC
GCACCGACATGCCCGGGGTCACTCCCGCGATGTGGGCATGGTGGTTCGGCTGGCACGGCAGCGACACCCGCCGCTACAAGCTGTGGCACCCGCGGGCCCATCTATCGGCGCGGTGGAAGG
ACGGCGACCAGGACAGCGGGGCCGGCCGTCGGGGCGCGCAGCGTTACGTCGGCCGCTGGTCGATGATCAGCGAGTACATCGGCTCGACGAAACTGGGTGCCGCAATACAATTCGTCGAGC
CGGCGGCCATGGGTCTGCCCGACGACAGCGACGATACGGTGTCGATCTGTGCGCGGTTGGGCTCTGCTGACGCCCCGGTGGATGCGGGCTGGTTCGTCCATCAGGTCCGATCGACGCCGG
GCGGGTCCGAGATGCGGTCACGGTTTTGGATGGGCGGACCGCACATCGCGGTGCGCAAGGCACCCGAGGTCGCGTCCAAGGCGGTGCGTCCCATCGCGTCGAAGCTAATCGGCGTCTCGG
AATCGACCGCGCGTAATCTGCTGGTGTACTGCGCGCAGGAGATGAACCACCTGGCGGGGTTCTTGGCGGACCTGTGGGAAAGCTTCGGTGACGAGTGA
>Rv1915_N1 Seen in 1 sample(s).
ATGGCCATCGCCGAAACGGACACCGAGGTCCACACACCGTTCGAGCAGGACTTTGAGAAAGACGTAGCCGCCACTCAGCGATACTTCGACAGCTCGCGCTTTGCTGGGATCATTCGGCTC
TACACCGCCCGCCAAGTCGTGGAACAGCGCGGCACGATCCCCGTCGACCACATCGTGGCGCGAGAGGCGGCGGGCGCCTTCTACGAGCGTCTGCGCGAACTCTTTGCAGCCCGCAAGAGC
ATCACGACGTTTGGCCCCTACTCGCCGGGGCAGGCGGTGAGCATGAAGCGGATGGGTATCGAGGCGATCTACCTCGGTGGTTGGGCTACCTCAGCTAAGGGCTCCAGCACCGAAGATCCG
GGGCCCGACCTCGCCAGCTACCCGCTGAGCCAGGTGCCTGACGATGCCGCGGTGCTGGTGCGCGCCTTGCTCACCGCGGACCGCAACCAACACTATCTACGCCTGCAGATGAGCGAGCGA
CAGCGTGCGGCGACACCGGCTTACGACTTCCGCCCGTTTATCATCGCCGACGCCGACACCGGCCACGGCGGCGATCCGCACGTACGCAACCTGATCCGCCGCTTCGTCGAGGTCGGTGTG
CCGGGCTACCACATCGAGGACCAACGACCCGGCACCAAGAAGTGCGGCCACCAGGGCGGCAAGGTCCTGGTGCCGTCCGACGAACAGATCAAGCGGCTCAACGCCGCCCGCTTCCAGCTC
GACATCATGCGGGTGCCCGGCATCATCGTCGCACGCACCGACGCGGAGGCGGCCAACCTGATCGACAGTCGCGCCGACGAGCGTGACCAGCCGTTCCTTCTCGGCGCGACCAAGCTCGAC
GTACCGTCCTACAAGTCCTGTTTCCTGGCAATGGTGCGGCGTTTTTACGAACTGGGCGTCAAGGAGCTCAATGGTCATCTTCTCTATGCGCTTGGCGACAGCGAGTACGCGGCGGCCGGC
GGTTGGCTTGAGCGCCAAGGCATTTTCGGCTTGGTCTCCGACGCGGTCAACGCGTGGCGGGAGGACGGCCAGCAGTCGATCGACGGCATTTTCGACCAGGTCGAGTCGCGGTTCGTGGCG
GCCTGGGAGGACGACGCGGGCCTGA
>Rv2027c_N1 Seen in 1 sample(s).
TCAGCGCAGCGGTGCAGACCACCGCAGCAAGGTGCCTCCGGTCGGCATGTTCTCGACTGTGAATTCGCCGCCCGCGTCGTCGGCACGCTGGCGGAGATTGCGCAGGCCGCTTTCGGTGAT
GTCGCCGGAGATGCCGACACCGTCGTCGACGACCTCGACCCGCACATCATCCTCGACGCTGACGTTGATGGCCAGGCTGGTCGCGTTCGCGTGCCGGACAGCGTTGCTAACCGCCTCCCG
CAGAACCGCTTCGGCGTGGTTGGCCAGGACGGTGTCGACAACGGACAGCGGGCCCGTGTACTGGACCGTGGTGTGCAGCGCGGGGATCGCGAGTTGGTCGATGACCTTGTCCAGTCGGTG
GCGCAGACCCGTCGCCCGGGAGGGCCCGGCGTGTAGGTCGAAGATCGCAGATCGAATCTCCTGAATGATTTCCTGGAGATCGTCGATGCTGCTGTAGATGGATTCCCGGACGGCGGGGAC
ACGTGCTCGCGGAGCGGCACCCTGCAGGGTGAGCCCGACTGCGAAGAGCCGCTGGATGACGTGGTCATGCAGATCACGTGCGATCCGGTCGCGATCGGTCAGGATCTCCACTTCTCGCAT
CTGTCGCTGCGCGGTCGCCAGCCGCCAGGCGAGCGCAGCCTGGTCAGCGAAGGCGGCCATCATATCGAGCTGTTTGTCGCTGAACGGCTGTTCATCGGCACTGCGAAGTGCGACCAGCAC
ACCGGCAACAGTGTCGGCGGCACGCAGCGGCAGCACCAGGGCGGGCCCGGGCTCCACCGGGCCGTCGACCGCGAGGTCAAGCCGGTCGAACCGGCGGGGCGTACGGTCGTGAAAGACTCC
CCCGATCGACGTTCCGCTGACGGCAACCGTCATTTGCTTGACCGCCGGGGAGATCTCTCCGGCCACCTCTACGATGACCAGGTCGTCGACCTCGCAAGCCGGCGCTTGTCGTCGAGCGGC
ACCGCCACCAAGGTGGCTGCCCCAGCCATCAACGTCAACGCTTCCTCGGCGATGAGCCGAAACACCATGGCCGGGTCCGCACCGGCCAGCATCTGCGTTCCGATGTCGCGGGTTGCCTCG
ATCCACGCTTCCCGGGTCCGTGATTCCTCGAAGAGACGGGCATTGTCAACGGCAATCCCGGCCGCGGCGGCCAGCGCCTGCACCAGCACCTCGTCGTCATCGCTGAACGGCTGGCCATCT
GCCTTCTCGGTCAAGTAAAGATTGCCGAACACCTCGTCGCGGATGCGCACTGGAACCCCGAGGAAGGTCCGCATCGGCGGATGGTGCAGCGGAAATCCAACCGATGCGGGATGCCGCGAG
ATATCGTCCAGCCGGATCGGCTTTGGCTCCTCGATCAGCGCGCCGAGAACACCTCGCCCCTCCGGCAATGAGCCGATGAGGTGCCGGGTCTCTTCGTCGATCCCCTCGTAGACGAATTCG
ACCAATCTATGGTCGTAACCGCGCACCCCGAGCGCCCCGTAGCGGGCATCCACCAACTCGGCGGCGGTATGCACAATGGCGCGCAGGGTGGCGTCGAGCTTGAGTCCCGATGTGATCGCC
AAGATGGCGTCGATCAGACCATCCAGCCGGTCGCGGCCTTCGACGATCTGTTCAATCCGGTCTTGGACTTCCAGCAGCAGCTCTCGCAACCGAAGCTGCGACAGTGTCTCGCGCAATGGC
GGGCTGCCAGGGTTAACGTTCGCCCTGTCAGGGTGTGTCAC
>Rv2084_N1 Seen in 1 sample(s).
GTGAGTGACGATTCGTCGTCGGCGTTCGATCTGATTTGCGCCGAGATCGAACGCCAGTTGCGCGGCGGCGAGCTGCTCATGGATGCCGCAGCAGCATCCGAATTACTACTCACCGTGCGG
TATCAGCTCGATACCCAGCCGCGGCCACTTGTCATCGTGCATGGACCGCTGTTTCAGGCCGTCAAAGCGGCCCGCGCACAGGTGTACGGACGCCTGATACAGCTGCGACACGCGCGCTGT
GAGGTGCTCGATGAGCGATGGCAGCTACGGCCGACGGGTCAGCGCGATGTGCGCGCACTGCTGATCGATGTGCTGAACGTGTTGTTGGCGGCCATTACCGCCGCAGGCGTGGAACGGGCA
TACGCGTGCGCGGAGCGGCGGGCGATGGCCGCCGCGGTTGTCGCCAAGAATTACCGGGACGCGTTGGGTGTCGAGCTGCAGTGCAATTCCGTATGCCGAGCCGCCGCCGAGGCGATCCAC
GCGCTGGCGCACCGCACAGGGGCTACCGAGGATGCCGACTGCCTCCCGCCGGTTGATGTGATACACGCCGACGTTACTCGCCGCATGCATGGCGAGGTGGCGACCGACGTTGTCGCGGCC
GGCGAACTGGTGATAGCGGCGCGACACTTGCTGGACCCCATGCCCAGGGGCGAGCTCAGTTACGGCCCACTCCACGAGGGGGGAAATGCGGCCCGTAAATCGGTCTATCGACGCCTGGTT
CAGCTATGGCAAGCGCGCCGGGCTGTTACCGACGGTGACGTCGACCTGCGCGACGCTCGCACGCTGCTGACCGATCTGGACAGCATTTTGCGTGAGATGCGCACGGCCGCAACCATTCAA
CAGGCGTACACACGAGCGGAACGGCGGGCGATGGCGGCGGCGGTCGTCGCCAAGATTCGCGGCGACGCAATGGGCCTCGACGCCCAGCGCGACGCGGTACATCGCGCGGCCGCCGATGCG
CTCCACGCGTTGCAATCGGTTGGCATACACCAATAGGCGACCCTTTGGCAGTTGAGGGTGTAGAGGAGATCGGCGCGTCGTTGCCGGGGCGGGAGTCGACGCCTTCCGATGATGGAGGTT
CCCTACACCCATCAGGAAGACCTCGACGCGTCCATCGCCGCCGGTGGTGCGGGCTTGGCCTGTGCTGA
>Rv2148c_N1 Seen in 1 sample(s).
TCAGGTGACCGTAACCGCCGCGGACCCAATAGCGCGGTACCGACACGCACACAGGTCGAACCATGTTTGACGGCGACTTCAAGGTCGTTGGACATGCCCGCCGACAGACCGATCGCGTGC
GGGAACATCGCACGCACCCGGTTGTGCTCCGATTGCAGCCGGTCAAAGGCCTCGTCCGGGTCCCAATCCAGCGGCGGAATGCCCATCAACCCGACCAGTTCGAGGCCCTCTGACTCCTGC
ACCTGCGCGCAAATCCGGTCTACGGCGCCGGGCGTCGTGCTGTCGACGCCGCCCCGGGATCCGTCACCGTCGAGGCTGACCTGGACGTAAACCCGCAGCCGCTCGCCACGACGGTGTTCG
GCCAGCGCCGCAACAACCGCCCGATCCAGCGCGGTCACCAACCGCGAGCTGTCCACCGAGTGAGCGGTGTGCGCCCAGCGAGCCAGCGACCCGGCTTTGTTGCGTTGAATCCGGCCCACC
ATGTGCCAGTGCACACCCCCCGAGTGACCCAACTCGGCAGCCGCCAACAACCGATTAAGTTCGGCCATCTTGGCTGAAGCTTCCTGTTCGCGCGATTCGCCAACGGACCGACAACCCAAT
CGAAACAAAATCGCAACATCGGTTGCTGGAAAGAATTTGGTAATCGGTAGAAGTTCAATTTCGCCGACATTGCGACCCGCCGCCTCCGCGGCCGCCGCAAGTCGCGATCGCATTGCCGCC
AACGCATGCGTCAATTCCGATTCGCGGTCTGGATACGCCGAAAGATCCGCCGCCAT
>Rv2176_N1 Seen in 1 sample(s).
GTGGTCGAAGCTGGCACGAGGGACCCGTTGGAGAGCGCGCTGCTGGACAGCCGCTATCTGGTCCAGGCCAAGATTGCCAGCGGCGGCACCTCGACGGTCTACCGGGGCCTGGATGTCCGA
CTCGACCGGCCCGTCGCGCTGAAAGTGATGGATTCTCGCTACGCGGGCGATGAACAGTTTCTGACCCGCTTTCGACTGGAGGCCCGTGCGGTTGCCCGGCTAAATAACCGCGCGCTGGTC
GCGGTCTACGACCAGGGCAAAGACGGCAGGCACCCGTTTCTGGTGATGGAGCTCATCGAGGGCGGTACCCTGCGCGAGCTGCTGATAGAACGTGGTCCCATGCCGCCACATGCCGTTGTG
GCGGTGCTGCGCCCAGTGCTTGGCGGGCTGGCTGCCGCCCATCGAGCCGGTCTGGTGCATCGCGATGTCAAGCCCGAGAACATCTTGATCTCCGACGACGGCGACGTCAAACTCGCCGAT
TTCGGGTTGGTCCGCGCGGTCGCCGCCGCTTCAATCACGTCTACCGGCGTCATCCTGGGTACCGCGGCCTACCTGTCCCCTGAGCAGGTCCGTGATGGAAACGCCGATCCTCGAAGCGAC
GTCTACTCTGTCGGCGTTCTGGTCTACGAGCTGCTAACGGGGCACACACCGTTCACCGGCGACTCGGCCTTGTCGATTGCCTACCAACGGCTTGATGCTGACGTGCCGCGTGCCAGTGCT
GTAATCGACGGTGTACCGCCACAATTCGATGAGTTGGTGGCATGTGCAACTGCCCGCAACCCTGCCGACCGATACGCCGATGCGATCGCGATGGGCGCCGATCTGGAGGCGATCGCCGAG
GAGCTGGCCCTGCCTGAATTCCGGGTACCGGCGCCGCGCAACTCCGCTCAACACCGGTCGGCCGCGTTGTACCGCAGCCGGATTACCCAGCAAGGGCAGCTGGGTGCCAAACCGGTTCAC
CACCCTACTCGCCAGCTGACTCGCCAACCCGGCGACTGCTCCGAGCCGGCTTCAGGGTCGGAGCCCGAACACGAGCCGATCACCGGCCAATTCGCCGGCATCGCAATCGAGGAATTCATC
TGGGCGCGACAGCACGCCCGTCGAATGGTGCTTGTCTGGGTGTCGGTGGTGCTGGCGATCACCGGGCTAGTGGCGTCCGCGGCATGGACGATCGGGAGCAACCTGAGCGGCCTGCTCTAA
>Rv2185c_N1 Seen in 1 sample(s).
TCAGCCCTCGACTCGTTTCTTCAGATCCTTCAACGCGCCGTCTATCAACCTGCGTTCCGCCTTACGCTTGAGCATCCCGATCATGGGGACAGCAAGGTCGACGGCAAGCTCGTAGGTGAC
CTCAGTGCCAGAACCCTTGGGCGCCAAGCGATACGTGCCTTCGAGGGACTTTAGCAGCGAGCTGGATTCGAGAGTCCAGCTAAGCGATTGGCGGTCTTCCGGCCACTCGTAGGACATGAT
CAAGGTGTCTTTGAAGATGGCTGCGTCCATCAACATTCGCGCTCGTTTCGGGTAGCCCTCGTCGTCGGCCTCTAGGATCTCGACTTCCTTATACTCCGAAATCCATTGCGGGTAGGCGTC
GATGTCGGCGATCGCCTTCATCACCTCGCCTGGATCCGCGTCGATGTAAATCGTCTGTGTCGTCTTGTCCGCCAC
>Rv2241_N1 Seen in 1 sample(s).
GTGGCGTCGTATTTGCCCGACATTGATCCCGAGGAGACCTCGGAGTGGCTGGAGTCCTTTGACACGCTGCTGCAACGCTGCGGCCCGTCGCGGGCCCGCTACCTGATGTTGCGGCTGCTA
GAGCGGGCCGGCGAGCAGCGGGTGGCCATCCCGGCATTGACGTCTACCGACTATGTCAACACCATCCCGACCGAGCTGGAGCCGTGGTTCCCCGGCGACGAAGACGTCGAACGTCGTTAT
CGAGCGTGGATCAGATGGAATGCGGCCATCATGGTGCACCGTGCGCAACGACCGGGTGTGGGCGTGGGTGGCCATATCTCGACCTACGCGTCGTCCGCGGCGCTCTATGAGGTCGGTTTC
AACCACTTCTTCCGCGGCAAGTCGCACCCGGGCGGCGGCGATCAGGTGTTCATCCAGGGCCACGCTTCCCCGGGAATCTACGCGCGCGCCTTCCTCGAAGGGCGGTTGACCGCCGAGCAA
CTCGACGGATTCCGCCAGGAACACAGCCATGTCGGCGGCGGGTTGCCGTCCTATCCGCACCCGCGGCTCATGCCCGACTTCTGGGAATTCCCCACCGTGTCGATGGGTTTGGGCCCGCTC
AACGCCATCTACCAGGCACGGTTCAACCACTATCTGCATGACCGCGGTATCAAAGACACCTCCGATCAACACGTGTGGTGTTTTTTGGGCGACGGCGAGATGGACGAACCCGAGAGCCGT
GGGCTGGCCCACGTCGGCGCGCTGGAAGGCTTGGACAACTTGACCTTCGTGATCAACTGCAATCTGCAGCGACTCGACGGCCCGGTGCGCGGCAACGGCAAGATCATCCAGGAGCTGGAG
TCGTTCTTCCGCGGTGCCGGCTGGAACGTCATCAAGGTGGTGTGGGGCCGCGAATGGGATGCCCTGCTGCACGCCGACCGCGACGGTGCGCTGGTGAATTTAATGAATACAACACCCGAT
GGCGATTACCAGACCTATAAGGCCAACGACGGCGGCTACGTGCGTGACCACTTCTTCGGCCGCGACCCACGCACCAAGGCGCTGGTGGAGAACATGAGCGACCAGGATATCTGGAACCTC
AAACGGGGCGGCCACGATTACCGCAAGGTTTACGCCGCCTACCGCGCCGCCGTCGACCACAAGGGACAGCCGACGGTGATCCTGGCCAAGACCATCAAAGGCTACGCGCTGGGCAAGCAT
TTCGAAGGACGCAATGCCACCCACCAGATGAAAAAACTGACCCTGGAAGACCTTAAGGAGTTTCGTGACACGCAGCGGATTCCGGTCAGCGACGCCCAGCTTGAAGAGAATCCGTACCTG
CCGCCCTACTACCACCCCGGCCTCAACGCCCCGGAGATTCGTTACATGCTCGACCGGCGCCGGGCCCTCGGGGGCTTTGTTCCCGAGCGCAGGACCAAGTCCAAAGCGCTGACCCTGCCG
GGTCGCGACATCTACGCGCCGCTGAAAAAGGGCTCTGGGCACCAGGAGGTGGCCACCACCATGGCGACGGTGCGCACGTTCAAAGAAGTGTTGCGCGACAAGCAGATCGGGCCGCGGATA
GTCCCGATCATTCCCGACGAGGCCCGCACCTTCGGGATGGACTCCTGGTTCCCGTCGCTAAAGATCTATAACCGCAATGGCCAGCTGTATACCGCGGTTGACGCCGACCTGATGCTGGCC
TACAAGGAGAGCGAAGTCGGGCAGATCCTGCACGAGGGCATCAACGAAGCCGGGTCGGTGGGCTCGTTCATCGCGGCCGGCACCTCGTATGCGACGCACAACGAACCGATGATCCCCATT
TACATCTTCTACTCGATGTTCGGCTTCCAGCGCACCGGCGATAGCTTCTGGGCCGCGGCCGACCAGATGGCTCGAGGGTTCGTGCTCGGGGCCACCGCCGGGCGCACCACCCTGACCGGT
GAGGGCCTGCAACACGCCGACGGTCACTCGTTGCTGCTGGCCGCCACCAACCCGGCGGTGGTTGCCTACGACCCGGCCTTCGCCTACGAAATCGCCTACATCGTGGAAAGCGGACTGGCC
AGGATGTGCGGGGAGAACCCGGAGAACATCTTCTTCTACATCACCGTCTACAACGAGCCGTACGTGCAGCCGCCGGAGCCGGAGAACTTCGATCCCGAGGGCGTGCTGCGGGGTATCTAC
CGCTATCACGCGGCCACCGAGCAACGCACCAACAAGGCGCAGATCCTGGCCTCCGGGGTAGCGATGCCCGCGGCGCTGCGGGCAGCACAGATGCTGGCCGCCGAGTGGGATGTCGCCGCC
GACGTGTGGTCGGTGACCAGTTGGGGCGAGCTAAACCGCGACGGGGTGGCCATCGAGACCGAGAAGCTCCGCCACCCCGATCGGCCGGCGGGCGTGCCCTACGTGACGAGAGCGCTGGAG
AATGCTCGGGGCCCGGTGATCGCGGTGTCGGACTGGATGCGCGCGGTCCCCGAGCAGATCCGACCGTGGGTGCCGGGCACATACCTCACGTTGGGCACCGACGGGTTCGGCTTTTCCGAC
ACTCGGCCCGCCGCTCGCCGCTACGTCAACACCGACGCCGAATCCCAGGTGGTCGCGGTTTTGGAGGCGTTGGCGGGCGACGGCGAGATCGACCCATCGGTGCCGGTCGCGGCCGCCCGC
CAGTACCGGATCGACGACGTGGCGGCTGCGCCCGAGCAGACCACGGATCCCGGTCCCGGGGCCTAA
>Rv2264c_N1 Seen in 1 sample(s).
TCAGCTACCCGAAATCGCGCTGAACTGTCCGATCCCGGGAACCTGAGGTATTCCGGGAGGGGAGCTGCGGAATCTCCGGAATCGGTGGGATCGGCGGGATCGGTGGAGGGCTGGGGGACG
TGGTCGCCGGCGGCTGCGTGGTCGCCGGCGGCTGCGTGGTCGCCGGCGGCGCGGAAGCGGGGGTCGTCGGTGCCGGAGTGATGACATCGGTGGTCACCGCCGGTTGCGTATTCGTCGTTG
TCGGCGGAGGCGGCAACGGCTGCTGCAGCGGCGGCGCGGGCCCACCGGTGGCTGGCGCCTGTACGGGAGGTGCGGGCTCGGCAGTGGGGCCATCGGATGCCGGCGCTGGGGACGGGGGCG
GTGCGGCGGTCGTGGTCACACCCGGCCTCTGCGGGGTCCCCGGCGCCGTCGGCTGGTCGCCGGTGGACAACCCGATCGCCACGGCGGCACCCACCAGCAACACCGCCACCGTCGTGCCGG
TGATGATCACGGCCGGCAGGCGATACCACGGGATTGGCGGGGACTTGGGCTCGGGCTCCGCATGGGCATCGTGGTCGAAGCTCAGCGACGGGCGGGCCGCTGTGTAGCCAGGGGCCGGCC
CGATGTGGGAGTCCTCGTCGGCCTCCGACCAGGCCAAAGCGGGCTGCAGGACCGACGCCGGCGCATCGGCCGGCGCCGTCGCCGTCGCCGAGGTGACCGCGGTCAGCACCGTTGCGCTGG
TGTCGCCGGGTCTGCGTGCCGCCCACAACGCGCCGCCGAAAGCGGCCGTCAATTGCGGACGAGGCGTCCTGACCACCGGCACGCAGAAACGTCCGGACAGCGTCGTGGTGACTGCCGGGA
TATTTGCACCACCACCCACCGAAACGATCGCTACCAGCTCGGCCGTGCGAATTCCGCTGCGGGCCAGGGTTTGTTCCAAGGCCCTGCCCACGCTGTCCAGCGAGTCACGGATTGTGTCCT
CGAGCTCGTTGCGGGTCAACCGGATATCCCCGCCCAACGCGTCGGTCAGCGTGGTCACCGTGCTTGACGAAAGCCGTTCCTTGGCTTTGCGACATTCGATCCGCAGCTTAGTCAGTGAGC
CGATCGCCGAGGTGCCGGCTGGATCGAACGCGCCCGTGCCCGGTAGTTCGGACATGACGTAGCTCAACAGCGACTGATCGATCAGATCGCCGGAGAAAGCCTGATGGCGCACCGTCGCGG
CCACCGGCCGATACTCGTCTGCGGCGTCGACGAGCGTGATGCCGGTCCCGCTGCCACCGAAGTCGCATACCGCGACGATCCCACGGGCCGGTATGCCCGGGTCGGCCCGTATCGCGTACA
GCGCTGCCGCGGCGTCAGGGAGCAGTGACAGTGGCTGGGCCGTACTCGAAGTCCCGTGCGACCATTCCGAGGCCCGACGCAGCGCGCTATCCAACGCTGCTACCGCAGCCGGCCCCCAGT
GGGCGGGATAGGTCACCGTGACACTTCCGGGAAGAGCACGACCGCCGGTAGCGGTGTAGGCCAGCGCCAGCAGTGCGTCAGCCACTAGCGCCTCGCTGCGGTACACCGAGCCGTCGGCAG
CCACGATGCCGACCGAATCTCCCACCCGGTCTACGAAGTCGGTGATCACCAGGCCTGGCTCGTCCAGCCTCGGGTTCTCCGATGGCACACCGACCTCGGGCGGGCGCTGTCGATACAGCG
TCAGCACGGGTTTACGTGTGATGGAGTGATCGGCAGCCACAGCCGCTAGGTTGGTGACACCGATCGACAAGCCTAATGCCGGTCTCGCCCCTGTTGCCAT
>Rv2293c_N1 Seen in 1 sample(s).
TCAGCCAGGGTCCCGCCGCCTGGAAAAAGTTACCGGTATAGCCAAGTGAGCGATCGGGTGCACTACAGGGTTGGCAGCCAAAAACGCTGCCGCCGTTCGGGATGCAAGGAAAAGCCTGGC
CGTTGTTCTTGTCGGAGCTAGACCCGTCACCGCCGACGAACAGTTGCGGCTGGCGCCCCAGGTGGTTCAACCGGACGACCGGAACGTTCCTGCACAGACAGACAGGATTGCCGAGCGTGT
TGATGTTGTCCAGTACAACAGAAAGCGTCTGGGCAGTAGCCAGCATGCCGGGATCGACCCCACGGAATGTTGCCCCGTTGTCCAGGGTCCACCGTGCTGGTATTGCCACGTCCCCAATGC
TGGTGCGGCCGGCACCACCGGCGACGCCCGAGAACATCACGGCGGCAATGGCAATGGAAGAAGCACAGGTAAAGCGTGCGAAGGCGGTCTCGGTGGTGTTGGTAGCGTTCACTAGGCCGA
TGCCGGTCATCGCCACAATCACCTTCTTGCCGCTGATCGAGCCCAGGTAGTAGCGACGACGGTCGGCGACCACCACCGGGTTGGCGTCCAGCGCGGTGTGCGCCAGCACCGCGTCGGCCT
CAGCCGGAAACGCCGACAAGACCAGCGTGCGCTGTTCGCACGGGATCACATTTGCCACGTATCCGGGATCGGCCGCCGCCACGCCACAGCCCAGCGACAACGCGGCCGCCACCAAAAGAC
AGTGCCGCAAAGGCGCGCCCAC
>Rv2339_N1 Seen in 1 sample(s).
ATGGTGCCGGGCGAAGTCCACATGAGTGATACGCCGTCAGGCCCGCACCCAATCATCCCGCGGACGATTCGCCTGGCCGCGATTCCCATCTTGCTGTGTTGGCTGGGATTTACCGTTTTC
GTCAGCGTCGCCGTTCCTCCGTTGGAGGCGATCGGTGAAACCCGGGCCGTGGCAGTTGCCCCCGACGATGCGCAATCGATGCGTGCGATGCGACGTGCGGAAAGGTGTTCAACGAATTCG
ATTCCAATAGCATCGCGATGGTCGTCCTGGAAAGCGATCAACCACTAGGCGAGAAGGCCCATAGGTATTACGACCACCTGGTCGATACGCTCGTACTGGACCAGAGCCATATCCAGCACA
TTCAAGACTTTTGGCGTGATCCCCTGACGGCGGCGGGTGCGGTCAGCGCAGATGGTAAGGCGGCGTACGTTCAACTTTACCTCGCCGGCAACATGGGTGAAGCACTCGCAAACGAATCCG
TTGAAGCCGTCCGGAAAATTGTGGCGAATAGTACACCGCCGGAAGGCATCAGAACCTATGTCACCGGACCGGCGGCCTTGTTTGCCGACCAAATCGCCGCCGGTGACCGAAGCATGAAGC
TGATCACCGGATTAACGTTCGCGGTAATCACCGTGTTGCTGCTGCTCGTCTATCGCTCGATCGCCACCACGCTGCTGATTCTTCCCATGGTGTTTATTGGACTCGGCGCGACGCGTGGCA
CCATTGCCTTTCTTGGATACCACGGAATGGTCGGCCTTTCGACTTTTGTGGTCAATATCCTCACGGCACTTGCCATTGCTGCCGGTACAGACTACGCGATCTTCCTGGTCGGCCGCTATC
AAGAAGCCCGCCATATCGGCCAGAATCGCGAAGCCTCTTTCTACACGATGTACAGGGGCACCGCTAACGTCATTCTCGGATCGGGACTGACCATCGCCGGCGCAACATATTGTCTGAGTT
TCGCCCGGCTGACGCTGTTTCACACCATGGGGCCTCCGTTGGCAATAGGCATGCTGGTTTCGGTCGCGGCCGCGCTGACCCTGGCGCCCGCCATCATTGCCATCGCCGGCCGCTTCGGCT
TGCTCGACCCCAAGCGAAGACTGAAGACCAGGGGCTGGCGTCGTGTGGGTACCGCAGTCGTGCGCTGGCCCGGGCCAATTCTGGCCACGTCGGTCGCGCTTGCCCTGGTGGGATTGCTCG
CACTACCGGGCTACCGGCCCGGCTATAACGATCGCTACTACCTGCGCGCTGGCACGCCTGTCAACCGCGGGTATGCGGCCGCCGACCGGCACTTTGGCCCAGCCCGGATGAACCCCGAGA
TGCTGCTGGTCGAGAGCGATCAAGACATGCGAAATCCGGCCGGGATGCTCGTCATCGACAAGATCGCCAAGGAGGTCCTGCACGTGTCCGGGGTCGAGCGGGTGCAAGCGATCACCCGGC
CGCAGGGGGTGCCCCTTGAGCATGCGTCGATTCCCTTTCAGATCAGCATGATGGGTGCCACCCAGACGATGAGCCTGCCCTACATGCGCGAACGCATGGCCGATATGTTGACCATGAGCG
ACGAAATGCTGGTTGCGATCAATTCCATGGAACAGATGCTCGACTTGGTGCAGCAGCTCAACGACGTTACCCATGAGATGGCAGCCACGACGCGCGAGATCAAAGCTACTACCAGCGAAC
TGCGAGATCACCTTGCGGACATCGACGATTTCGTCAGGCCGTTGCGTAGCTATTTCTACTGGGAGCACCATTGCTTCGACATTCCGTTGTGCTCGGCGACGCGATCACTGTTTGACACCC
TAGACGGCGTCGACACGCTGACTGACCAATTGCGGGCCCTTACCGACGACATGAATAAGATGGAGGCGCTCACACCGCAATTTCTCGCACTGCTGCCGCCAATGATCACGACCATGAAGA
CCATGCGGACCATGATGTTGACCATGCGATCAACAATAAGTGGCGTACAAGATCAAATGGCCGATATGCAAGACCATGCGACTGCGATGGGGCAGGCCTTCGACACCGCAAAAAGCGGCG
ATTCATTCTATCTTCCTCCGGAAGCCTTCGATAATGCAGAATTCCAGCAAGGCATGAAGTTGTTTTTGTCGCCGAATGGTAAGGCGGTGCGCTTCGTAATTTCCCACGAGAGCGATCCAG
CAAGTACTGAAGGTATCGATCGCATCGAAGCGATAAGGGCCGCGACCAAAGATGCCATCAAGGCGACACCATTGCAAGGCGCTAAAATCTATATCGGTGGCACGGCTGCGACCTACCAAG
ACATTCGAGACGGTACCAAGTACGATATCCTCATCGTTGGTATAGCCGCGGTATGCCTGGTATTTATTGTCATGCTCATGATTACCCAGAGCCTGATTGCGTCACTCGTCATTGTTGGCA
CGGTACTTCTGTCATTGGGTACTGCGTTCGGACTGTCCGTGCTCATCTGGCAGCACTTTGTCGGTCTCCAGGTGCATTGGACGATCGTCGCGATGTCTGTCATCGTCTTGCTGGCCGTCG
GTTCTGACTACAACCTCCTTTTGGTGTCCCGGTTCAAGGAGGAGGTCGGCGCTGGATTAAAGACCGGGATCATCCGGGCGATGGCCGGCACCGGCGCAGTTGTCACGTCGGCCGGTCTGG
TATTCGCGTTCACCATGGCGTCCATGGCCGTCAGCGAACTCCGCGTTATCGGACAGGTCGGCACCACCATCGGGCTCGGTCTACTTTTCGATACCCTGGTGGTCCGATCGTTCATGACGC
CATCCATCGCAGCGCTGCTAGGTCGCTGGTTCTGGTGGCCGAACATGATCCACTCGAGACCCACCGTCCCGGAGGCGCACACACGCCAGGGCGCTCGCCGAATTCAGCCGCATCTGCACC
GGGGTTGA
>Rv2437_N1 Seen in 1 sample(s).
GTGCTGCAGCGGACCAACGTTGTCCAACCGCTGAATACTCTGCGCATGGTCTGGATTCAGGTTGCCGGCATAATCCCGGCGACGGCCGGGATCGCGGCCACGGTTTCGCCCAGCTTGCGA
TGGGCGATTCGTGGCGGATCGGGGTGGACGAGCAGGAGAACACCACTCTGGTGCGCACCGGCCCGTTTAAATGGGTGCGTCACCCCATCTACACGGCCATGATGGCGTTTGGCCTCGGGC
TGTTGCTGGTGACTCCGAATCTCGTTGCCCTCGCCGGGTTTATCCTGCTCGTTGCCACGCTCGAGGTGCATGTCCGCCGCGTCGAAGAACCCTACCTGTTGCGGACGCACAGTGCCGTCT
ACCGCGGCTACACCGCCAGCGTCGGCCGGTTCGTCCCGGGTGTGGGGTTGATCCGCTAG
>Rv2526_N1 Seen in 1 sample(s).
ATGACCGTAAAGAGGACCACGATTGAGCTGGACGAAGATCTTGTGCGGGCAGCCCAGGCCGTCACCGGGGACGCGAGCGACGGTCGAGCGCGCGCTGCAGCAGCTGGTGGCCGCGGCTGC
CGAGCAGGCCGCCGCGCGCCGGCGGCGGATCGTCGACCATCTCGCGCACGCCGGCACTCACGTGGACGCAGACGTGCTGCTCTCCGAGCAGGCGTGGCGATGA
>Rv2545_N1 Seen in 1 sample(s).
GTGAGCACCACCATCGTTGCTGGCGTGATCCAGGGTCACCTGCCGGTGATCCTGCCCACGCGCAGGCGGGCTCGCGATCTCGGGCACACGACGGCGTTTTTCGGGCGCAAACGCTCCAAT
GCATATATCTCAGTATCGAATACCTATATGTTTGCTCCATGTCTCGGCGTACAACGATCGACATCGATGACATACTGCTGGCCCGCGCGCAAGCGGCGCTCGGTACCACCGGGCTGAAGG
ACAGGGTCGATGCCGCTTTGCGAGCCGCGGTGCGCTAG
>Rv2587c_N1 Seen in 1 sample(s).
CTATCCCCGTCCCGTCCGAGCCATGGCCCGGCGTTCGCGTGCGACCTGCTGCACCGCTCCCAGGCCGTTGTATGCCGGCTTGGCCAGCAGCGACGATTTGGACGCCAGATACACCAACGG
CCACGTCACCAAGAACACCACGACGAGGTCCAGGATCGTGGTGAGGCCCAGGGTGAACGCGAACCCCTTCACCTGACCGATCGCCAGAAAGTACAGCACGGCAGCGGCCAGGAAAGTGAC
GGCGTTGCCCGACACGATCGTCTTGCGGGCACGCGCCCAACCGCGCGGCACTGCCGACCGGAACGAACGGCCTTCGCGGATCTCGTCTTTGATGCGTTCGAAGAACACCACGAACGAGTC
GGCGGTGGTCCCGATACCGATGATCAGGCCCGCAATACCAGCCAGATCTAGGGTGTAGTTGATATATCGGCCCAAGAGCACCAGGATCGCAAAAACCATTGAGCCAGAAGCCACTAGCGA
CAAGGCCGTGAGCAGTCCCAGCACTCGGTAGTAGAGCAGCGAATACACCAGCACCAACAGCAGGCCGATCGCACCCGCGATCATGCCCGCGCGCAGCGATGACAACCCCAAGGTCGCCGA
AACCGTTTGGGCTTCCGACGGTTCGAAGGACAGCGGCAGCGACCCGTACTTGAGGACGTTGGCGAGCTGGCGTGCGGTCGCCGCGGTGAATGGCGGATCCCCACCGCTGATCTGGGTTCG
GCCGCCGGGGATCGCTTCCTGGATCTGCGGTGCACTGACAACCTGCGAGTCCAGGGTGAACGCCGTCTGGGTGCCGATATGGGCGGCGGTGTAGTCGGCCCAGATGTTGGCCGCCGGACC
CTTGAACTGCAGGTCGACGACGTAGCCGATGCCGCGCTGGTCCATACCCGAGGTGGCGTTTTGGATCTGGTCGCCGCTGATGACCGACGGCGCCAGCAGGTACGCGGTCTTGTGGTCGGT
CGAGCAGGTCACCAACGGCAGTTTCGGGTCGTCGTTGCCGGCCAAAATGTCGTCGCTCTCGCAGCGGGTCGCCTGGAATTGCAGTGCAACCATCTGCATGTATTGGTTGGTGCTCTGCCG
CAGCTTCTTCTCCTGGGCGATGCGCTCGGCGAGATCCTTGCGCGGATCCGTGGCCGGCGCCTCAGCGGGCGGCGCCGGCGGCGGGCTGGCCGGTGAGGTCGGGTTGGGCGATGGCGCCGG
GTCCTGCGGATAGGGCCGCGGTTGGGCCCCAGGTTGCGGTGAAGCCGGCGCCCCCGATTGGGCTGGCGGCGGTGCGGCGGGTTGACCGGGCGGCTGCGGTTCGGCGCTGGGTGCCGGCTG
CGGTTCTTCGGCTGCGGGCTGCGCCGGCATCGAGTTGAGCACCGGCCGGATGTACAGCCGAGCGGTCTGTCCGAGGTTGCGTGCCTCGCTGCCGTCGTTGCCGGGCACCGTGATGACCAG
GTTGTCACCGTCGACGACCACCTCCGACCCGGACACTCCCAGCCCGTTGACCCGCGCGCTGATGATTTGCTGCGCCTGTGCCAGCGCTTCCCGGCTCGGGGCCGAGCCGTCCGGTGTGCG
CGCGGTCAGCGTGACCCTGGTGCCGCCCTGCAGGTCAATGCCGAGTTTGGGGGCGGTGTGCTTGTCCCCGGTGAAAAACACCAGCAAATAGATGCCGATCAGCATCACCAGGAACACCGA
CAGGTAACGGGCAGGGTGCACCGGCGCCGAAGACGATGCCAC
>Rv2848c_N1 Seen in 1 sample(s).
TTACGCTCGTGGAGTATTGCAGGCCGCATGTGCGACGAAACGCGCCACCGCACCGGGGTGTTGCGGCCGGATGGGTATGCAGGTAGGACGCGTGCACGCCGCTGTGCACCGCGCCGTCTC
GCACGTCGTCCACGTCTTGGCCCTGGTACACCCACGCGGGCTGATAGCTATCGGCGAATGTGACTGCGGTTCGGTGGAATTCATGTCCAACCACGCGCTCGCCGACGGAGTACAGCGCCG
AATCAACAACCGCGACGGCGTCGCGATAACCCAGCTTGAGATGCTGGGTGAACCGCGCCGATCCGGCCACCACACCGCACATCGGGTGTCCGTCGAGTTCAGAAACCAGATAGAGCAGGC
CGGCACATTCGGCATGCACCGGGGCGCCGGCAGCGGCCAGTTCGTTGATCTGCCGCCGGACGGTGTCGTTGGCGGACAACTCGGCGGTGAACTGCTCGGGGAATCCGCCGGGCAACACCA
CCGCGTCCGTACCCTCGGGCAGAGTTTCGCTGAGCGGGTCGAACTCGACCACTTCAGCCCCGGCGGCGCGCAACATCTCGGCGTGTTCGGCGTAGCCGAAGGTAAACGCCCTTCCGGCCG
CGATGGCAACCGTGGCTGGCTGGCGGGCGGTGTTGCCGACGGCAATCACCGGGTCCCATGGCGGGTGGGCCGCCTGGCTCCCGGCGCAGGCGATCACCGCGGCCAGATCGACGTGGCGAG
CGACCACAGCAGTCATCGCCTGCACGGCGAGCCGTGCGCGACGGCCGTACTCGACGGCGGTAACCAGACCCAGATACCTTGTCGGCAGCTCTAGTTCAGCTGTGCGTGGAATGGCGCCCA
AGACCGCGACACCGGCCTGGTCACACGCCTGTCGCAGCACCTGTTCATGTCGGGCCGATCCGACCCGGTTGAGGATGACACCGGCGATCCGAGTTGCGGTGTCGAACGTGGAAAAGCCGT
GCAGCAGTGCGGCAACGCTGTGACTCTGGCCGCGGGCATCGACCACCAGGATCACCGGGGCGCCAAGCAGAGCAGCGACGTGCGCGGTGGACCCCGCTGCGGGCGCGCCCCCGGCAGGCC
CAATGCGCCCGTCGAACAGCCCCAGCACCCCTTCGATCACGGCGATGTCCGCGCCCGCAACTCCATGCGCGTACAGGGGGCCGATAAGCCGCTCCCCCACCAGTACCGGGTCGAGATTGC
GGCCGGGCCGTCCCGCGGCCAGGGCGTGATAGCCGGGGTCGATAAAATCCGGGCCTACCTTAAACGGCGCGACGGTGTGACCGGCCTGCCGCAGCGCTCCGATCAAGCCCGTCGCGATCG
TGGTCTTACCGCTGCCCGACGCAGGCGCGGCGACGGCCACCGCGGATACCCGCAT
>Rv2902c_N1 Seen in 1 sample(s).
CTAAGTGCCCTGAGCGCGACCCGTGGCCCGCATTGTCGCTGGGTGGGAACTCTTGCTCCATCTTCCCTCACCCGTCTGTGCCGTCCCGTCCCGAGGGTCGGGTTGGCCGTCGGCGACCTC
TGCGGTGTTCGACCCACTCGCCACCCGGCGAACATTGATGAACGAGTAACGGTGCTGCGGGCAGGGTCCCAATCGGGCCAGCGCCCGGCTGTGCGCCGGGGTGCTGTAACCCTTGTGCTC
CGCGAAACCGTACCCGGGGTAATCGGCGTCCAACGCAACCATCACGCGGTCCCGGCTGACCTTGGCGAGCACGCTAGCCGCGGCGATGCAGGCGGCTGCCGCGTCGCCACCGATCACCGG
CAACGACGGCATCGGCAGTCCTGGCACGCGAAAGCCGTCGCTGAGCACATAACCGGGCCGCACCGCCAGACCGGCCACCGCGCGCCGCATACCTTCGATATTGGCCACGTGCACGCCGCG
GCGGTCGACCTCGGCCGACGGGATGAACACCACGTGATAGGCCACCGCATACCGGCAGATCAGCGGGAACAGCTTCTCCCGCGCTTGCTCGCTGAGCTTCTTCGAATCATCAAGGGCGGC
AAGACTTGCTATCCGCCCGGGGCCAAGCACGCAGGCCGCGACCACCAACGGGCCAGCGCAGGCGCCGCGACCCACTTCGTCGACCCCGGCCACCGGCCCCAGACCACCACGATGCAGCGC
GGACTCCAGGGTGCGCATTCCCCGCAAACCCCCAGATTTACGGATCACCGTCCGCGGTGGCCAGGTCTTGGTCAT
>Rv2947c_N1 Seen in 1 sample(s).
TCACCCACGGCACCATCGACGGCCGCGGCCCGCGGCCCCCGGTGCTTTCGCTCGCCTCAACCGGCGCCTCTGCGGGGGCTGGTACGGGGGCCTCTTCCAAGATCAGATGCGCGTTGGTGC
CGCTGATCCCAAAGGAGGACACCGCCGCCCGGCGCGGACGCCCGTCAACCGACCACTCCCTGGCCTCGGTCAACACCGACACCGCGCCGCTGGTCCAATCCACCCGCGGGGAAGGCTCAT
CCACATGCAACGTCGCCGGCATCACCCCATGACGCATCGCCTGCACCATCTTGATCACCCCGGCGACCCCCGCGGCGGCCTGGGTGTGGCCCATGTTCGACTTGATTGAGCCCACCCACA
GCGGCTGCTCCGCTGGACCTCCCTGCCCGTAGGTGGACAGCAATGCCTGCGCTTCGATGGGATCACCCAACGTGGTGGCGGTCCCGTGTGCCTCCACCACGTCTACGTCTGCGGCGGACA
ACCCGGCGTTGGCCAACGCCGCCTGGATCACTCGCTGCTGGGCGAGCCCATTGGGCGCGGTCAGCCCATTGGACGCACCATCCTGGTTGACCGCGCTCCCCCGCACCACCGCCAGCACCG
AATGCCCCAACCGCCGGGCGTCCGATAGCCGCTCCAGCACAACCACCCCGGCGCCCTCGCCCCACCCGGTGCCGTCGGCCGCGGCCGCAAACGCCTTACATCGCCCATCGGCAGCCAACC
CCCGCTGCCGGGAAAACCCCACAAAAATCGACGGCAGCCCCATCACCGTCACCCCACCGGCCAACGCCAAATCACACTCCCCGGAGCGCAATGACGACATCGCCCAATGGATCGCCACCA
ACGACGACGAACAAGCGGTATCCACTGACACCGCCGGGCCCTGCAGCCCCAATACGTACGACACACGTCCCGAGGCCACGCTGATTGACGTGCCGGTCAACCCGTACCCTTGCAGCCCCC
CGGTATCCCTATTGCCGTAACTCGCCGCGAAAATGCCGGTGTACACCCCGGTCGCCGAACCACGCAACGACAACGGGTCAATCCCCGCGTGCTCCAACGCCTCCCACGAAACCTCCAGCA
TCAACCGCTGCTGAGGATCCATCGCCAACACTTCACTAGGAGCGATGCCGAAGAACCCGGCGTCAAAGCCGGTGGCGTCGTCTAGAAATGCCCCCCATCGCGTGTAGGTTTTGCCCTCAG
CGTCGGGATCCGGATCGTATAGCCCCTCAACATCCCAGCCCCGATCGGTCGGAAACTCCGACACCACGTCGCGCCCCGCCGAAACGACATCCCAGAGTCCGTCCGGGCCATCCACGCCGC
CCGGAAATCGGCAGCCGATTCCCACCACCGCCACCGGTTCTGTCGCGCGTTGCTCATATTCACGCAGCCGAGCGCGTGTCTCATCGAGCTCGACAGCAACCTTCTTTAGGTAGTGAAAAA
GCTTTTCGCTCTGCTGGTCGGCACCTTCAACGCTCATCGTCCGTTGCTCCTCTATCAC
>Rv2975c_N1 Seen in 1 sample(s).
TCAACGCGCCGGCCGCGAGAGCGGCCGCAACCCGCGCCACGTCTTCGGCGTCAGCCTGCGAATTCGCGTGCAAATCAGCTTCTACGACCGCGGCACGCATGGTGAACAGCATGTTGACGC
CGGTATCGGAGTCAGCGACCGGGAACACATTGAGCCGGTTGATCTCGTCGATGTGGAGGATCAGATCGCTGACGACGGCGTGTGCCCAGTCCCGCAAGGCCGAGGCGTCCAACGGCCGAT
CCGCCGTCCCCAC
>Rv3091_N1 Seen in 1 sample(s).
ATGCCGATCCCCTTTGCCGATGGGATGCTCAGCCGGCTGGGTCGCCGCGGGGCAGCGCTCGACCTGATCGAGGAGTTCGAGGACGAGTCCGGGGAGCCCCCCGCATCCCTGAGCCCCGCC
GACCTGCTGGCCGCCGAACCGGCCCTGCTGCTGCAGAAGATGGAGAACCGCCTCGTCCGGCACCACCTAGCCAATCCGGACGTGTTGAGCGGCGAACAGCTGCGCAAGCTGCGCTACATC
CTCAATTTCGCCAGGCTGGCCGACTTCGAACCGGGGGCCGCGGGGCCGGGCGGAAGCCGCGGTCGCGGGGACATCTCGGTGGGCGGCCAAGTCGCGCCTTGGCGGTCCCGGGTCGTCGAC
GCGTTGTACGCACCGCTGCGCGAGGAGCCCGATCCGGTCACGGCGCTGGAGGGCGCGAAAGACGTGCTGGCGACGCTGGTCGACGACCAGGACGATCAGCGTCGAGTGCTCATCGAGCGC
CACGGCAGCGACTTCTCCGCGACGGAACTCGACGCCGAGGTCGGCTACAAGAAGCTGGTGACCGTCCTCGGCGGCGGCGGGGGCGCGGGCTTCGTCTACATCGGCGGCATGCAACGGCTG
CTGGCGGCCGGCCAGGTGCCCGACTACATGATCGGCTCGTCGTTCGGGTCGATCATCGGCAGCCTGGTGGCCCGTGAACTGCCGGTGCCGATCGACGAGTACGCCGAGTGGGCCAAAACG
GTGTCCTACCGCGCCATCCTGGGCCCGGAGCGGCGGCGCAGCCGCCACGGGTTGGCCGGAATGTTCACCCTGCGCTTCGACCAGTTCGCCCATACCCTGCTCAGCCGTGCGGACGGCGAA
CGGATGCGCATGTCGGATCTGGCAATCCCGTTCGATGTCGTCGTCGCCGGTGTGCGCAGGCAGCCTTATGCGGCGCTGCCGTCCAGGTTCCGCCATCGCGAGCGGTCTACACTGACGTTG
CGGTCGCTGCCGTTTCTGCCGATCGGTATCGGCCCGTGGGTGGCGGCACGCATGTGGCAAGTCGCGGCCTTCATCGACTTGCGGGTGGTCAAGCCGATCGTCATCAGCGCCGACGGCGCG
ACACGCGACGTCAACGTCGTTGACGCGGCGTCTTTCTCGTCGGCCATCCCCGGTGTGCTGCACCACGAAACCAGCGACCCGCGGATGCTGCCAATCCTCGACGAGTTGTGCGCCGACCAG
GACGTCGCGGCGATGGTCGACGGCGGCGCGGCCAGCAACGTCCCGGTCGAATTGGCGTGGGAGCGGGTCCGCGACGGGCGGCTCGGCACCCGCAACGCATGTTATCTGGCGTTCGACTGC
TTCCATCCGCACTGGGACCCCCGACATCTGTGGCTGGTACCGATCACCCAGGCGGTCCAGCTGCAGATGGTGCGCAACCTGCCCTACGCCGACCACCTCGTCCGATTCGAGCCGACGCTG
TCGCCGGTGAACCTGGCGCCGTCCGCGGCGGCCATCGACCGGGCTTGCCGGTGGGGGCGCGACAGCGTCGAACCGGCGATTGCGGTGACATCGGCGCTGCTGGAGCCGACGTGGTGGGAA
GGCGACAGGCCCCCCGCCGCCGAACCCAAGGAACGCACAAAGTCGGCGGCCTCGTCGATGAGCGCCGTGATGGCCGCGATTCAGGCGCCGACGGGCCGGTTTCGGCGATGGCGAAGCCGC
CACCTGACCTAG
>Rv3197_N1 Seen in 1 sample(s).
ATGGATGATGGGAGTGTGTCAGATATCAAACGGGGCCGCGCCGCGCGCAATGCGAAGCTGGCCAGCATCCCGGTCGGCTTCGCCGGTCGGGCGGCGCTCGGGCTCGGCAAGCGACTGACC
GGTAAGTCAAAAGACGAGGTTACCGCCGAGCTGATGGAGAAGGCCGCCAATCAGTTGTTTACCGTCCTCGGCGAACTCAAGGGTGGCGCGATGAAGGTCGGCCAGGCGCTGTCGGTGATG
GAGGCCGCCATTCCCGACGAGTTCGGCGAACCCTACCGGGAAGCACTGACCAAGCTGCAGAAGGACGCCCCACCGCTGCCCGCCAGTAAGGTGCACCGGGTGCTCGACGGACAGCTGGGC
ACCAAATGGCGGGAGCGGTTCAGCTCGTTCAACGACACCCCAGTGGCATCTGCCAGCATCGGCCAGGTGCACAAAGCAATCTGGTCGGACGGCCGAGAAGTGGCCGTCAAGATCCAGTAT
CCCGGCGCCGACGAGGCGCTGCGCGCGGACCTCAAGACCATGCAGCGCATGGTCGGCGTGCTCAAACAGCTCTCACCCGGCGCCGACGTCCAAGGGGTGGTCGACGAACTGGTTGAACGC
ACCGAAATGGAACTCGACTACCGGCTGGAGGCCGCCAACCAGCGCGCCTTCGCCAAGGCGTACCACGACCACCCGCGCTTCCAGGTGCCTCACGTCGTGGCAAGCGCACCGAAGGTGGTG
ATCCAGGAGTGGATCGAAGGTGTGCCGATGGCAGAGATCATCCGTCACGGGACCACCGAGCAGCGTGATCTGATCGGTACGCTGCTCGCCGAGCTCACCTTCGACGCACCACGGCGGCTG
GGGTTGATGCACGGCGACGCCCACCCCGGTAATTTCATGCTGCTGCCCGACGGCCGGATGGGCATCATCGACTTCGGTGCCGTGGCACCGATGCCCGGCGGCTTCCCGATAGAGCTCGGG
ATGACGATTCGACTGGCCCGCGAGAAGAACTACGACCTCCTGTTGCCGACGATGGAGAAGGCCGGGTTGATCCAGCGAGGACGACAGGTGTCGGTTCGCGAGATCGACGAGATGCTGCGC
CAATACGTCGAGCCCATCCAGGTCGAGGTCTTCCACTACACCCGCAAGTGGTTACAGAAAATGACCGTCAGTCAGATCGACCGCTCGGTTGCGCAGATCAGAACGGCGCGCCAGATGGAC
CTGCCGGCCAAGCTCGCGATTCCGATGCGGGTTATCGCATCGGTGGGCGCGATCCTATGCCAGCTGGACGCGCATGTGCCGATCAAGGCCCTGTCGGAGGAGCTGATCCCGGGTTTCGCC
GAGCCCGACGCGATCGTCGTCTGA
>Rv3234c_N1 Seen in 1 sample(s).
TCAGCACCACGTCGTGGACGTCACAGTCGTAGCGAGCCCGCACCGTGCGATAGTCATCAAGACTTGCACGGGCAACCGTAAATCGCCGATTACGCGACACGGTGGCATTGAGCGGGCTAC
TGGGCGCGGTGCCCCGTGCCACCGTGCGGGCGATATCGAGAACCTTGCGGCCCGTCTCGACGAGTTGGCCGGAATTCGTTACCAACCCGGCGACCGCGGATCCGACGGCCTGTAGTTGTG
CGCCCGGCCGCACCAGCCAGTCCCCGACCGCGCGCAGCAGCAACCGCGTGGTGCCGGGGTCCCGTTCCGGGACCCAGATGTCTTCCGGAAACGCCGGTGGACGCCGCGTCCGGTCGGCGA
TCACGTGGCCTATCGCCAGCGCGGTCACCCCGTTGATCAGGGCTTGGTGCGACTTGGTGTAGAGGGCAATGCGATTCTTTTCCAGACCCTCGACGAGATACATCTCCCACAATGGCCGCG
ATTTGTCCAGCGGCCGAGCGGCCAGCCGTGCGATCAGCTCGTGCAGTTGCTCGTCACTACCCGGCGACGGCAGGGCCGACCGCCGGACGTGGTAGGTGATGTCGAAGTCGCGATCGTCGA
TCCACACCGGCCTGGCCAGGCCCAATTTCACTTCCTGGACTTTCTGACGATAGCGCGGTATCTGCGGCAGCCGCTGTTCGACGGTTTCCAGCAGTGCCTCGTAGCTCAATCCGGCACGCG
GACGGCGCAGGATCAACAGCAACCCGACATACATTGGGGTGGCTGTGTTCTCCAGCTGATAGAAGGAGGCGTCCGATGCAGACAACCGGGTGACCAC
>Rv3253c_N1 Seen in 1 sample(s).
TCAACACCTCCGGGTCGCGCTCTCTCGAGCTTGCCGAAGGCCCTGCGCCGAGTGCCGGCGCCCGTAGCCGACATAAATCGCGGTTCCGGCCACCAGCCAGATCCCGAACCGGATCCAAGT
CAACGCGGTGAGGTTCAGCATCAGCCACAGGCACGCGCACACTGCGGCGATCGGAAGTAACGGCACCCACGGAGCTGTGAACCCCCGCTGAAGGTCGGGTCGGGTCCGGCGCAGCACGAC
CACTCCGGCCGAGACGAGGATGAACGCGAACAGTGTCCCGACGTTGACCATCTCCTCAAGCTTGGTGATCGGAAACACCGACGCCGTCGTGGCCACCAACACCGCGACCAGCACCGTGAC
CCGGACCGGGGTGCCGCGCGAACCGGTCTTGGCCAATTGCCGCGGCACCAAGCCGTCGCGCGCCATGGCGAACAGCACGCGGCATTACCCGAGCATCAACACCATCACCACCGTGGTAAG
CCCGGCCAGCGCGCCGACGGAGATGATGCCGCTGGCCCAGTACACCCCGTTGGCCTGGAACGCGGTGGCCAGATTTGCCGGCCCGCGGCCCGGTACGGTCCGCAGTTGGGTGTATGGAAC
CATGCCCGACAGCACCACCGATACCGCGACGTAGAGAAGGGTCACGACCCCCAGCGACGCGAGAATCCCTCGAGGGACGTCTCGTTGAGGACGCTTGGTCTCCTCGGCCATGGTGGCCAC
GATGTCAAACCCGATAAACGCGAAGAACACGATCGATGCCCCGGCCAGCACGCCGTACCATCCGTAGTGGCTGCCTTGGGCTCCGGTCAGCAACGAGAAGACGGATTGATCGAGCCCGCC
GCCGTGGTGCTGGACTTCGGGCTCGGGAATGAACGGCGAGTAGTTGGCGGCCCTGATGTAGAAGGCACCGACGACCACCACCAAGACGACCACCGACACCTTGATTGCGGTGACCACCGC
GGAAAATCTCGACGACAATTTGGTGCCCAACGCGATCAGGGTCGCCACCAACGTGACGATCACGAGCGCACCCCAGTCGAGCTGCAGCGATCCGAGATGGCCTGTGCCATTACCGAATCC
GAACACGGTGCCCAAGTAGCTGGACCAGCCTTTGGCGACCACGGCCGCACCCATCGCCAGTTCCAGCACCAGATTCCAGCCGATCACCCAGGCCAAGAACTCCCCGAAGGTGGCATAAGA
GAAGGTATAGGCGCTGCCGGCCACCGGCAGCGTCGAGGCGAACTCGGCGTAGCACAGCGCGGCCAGCGCACAGGTCGCCGCCGCGATCAGAAACGATATCCAGATGGCCGGGCCGGTGAT
ATCGCCAGCGGTCGACGCGGTAACCGTGAATATTCCGGCGCCAATCACCACCGAGACGCCGAAAACAACCAGGTCCCACCAGGTGAGGTCCTTGCGCAGCCGAGTGGTGGGCTCGTCGGT
GTCGGCGATTGACTGTTCTACCGACTTCATGCGCCGTCGACCGGCCAT
>Rv3725_N1 Seen in 1 sample(s).
ATGCAGAATGCCACCATGCGCGTTCTGGTCACCGGCGGTACGGGATTTGTGGGCGGGTGGACTGCCAAAGCCATCGCTGACGCGGGCCACTCCGTCCGGTTCCTGGTGCGAAATCCCGCA
CGGCTGAAGACGTCTGTCGCGAAACTGGGCGTCGACGTGTCGGACTTTGCGGTTGCAGACATATCCGACCGCGATTCGGTACGGGAGGCGTTGAACGGATGCGACGCCGTCGTGCACAGC
GCCGCGCTGGTGGCAACCGACCCGCGTGAGACTTCGCGGATGCTGAGTACGAACATGGCGGGCGCCCAAAATGTTCTCGGTCAAGCCGTCGAGCTCGGAATGGATCCGATCGTGCATGTG
TCGAGCTTCACGGCGCTGTTTCGTCCCAACTTGGCGACGCTGAGCGCTGATCTGCCGGTTGCCGGTGGGACGGATGGATACGGACAATCCAAAGCGCAGATCGAAATCTATGCGCGCGGT
CTTCAGGACGCCGGCGCACCGGTGAACATCACTTATCCTGGCATGCTCCTCGGCCCGCCGGTGGGCGATCAATTCGGTGAAGCCGGGGAGGGTGTCCGGTCCGCATTGTGGATGCATGTC
ATTCCCGGGCGCGGCGCGGCGTGGTTGATCGTCGACGTCCGAGATGTGGCGGCACTGCACGCGGCGTTGTTGGAATCCGGGCGTGGGCCGCGCCGCTACACTGCGGGAGGTCATCGGATT
CCGGTGCCCGAGCTCGCGAAAATTCTGGGCGAGGTCGCCGGCACCACGATGCTGGCCGTCCCGGTGCCCGATTCCGCGCTGCGTGTCGCGGGATCGGTGCTGGATCAAGCCGGGCCCTAT
CTGCCTTTCAATACTCCGTTCACCGCGGCAGGTATGCAGTACTACACACAGATGCCGGAGTCCGACGATTCGCCGAGCGAAAAAGAACTAG
>Rv3736_N1 Seen in 1 sample(s).
ATGTCCGTGGTGCGCGGGACCGCTCTGGCTAACTACCCGAGCCTAGTTGCCGGGTTGGGCGGTGACCCGGCCACTCTGCTACGGGCCGCGGGTGTTCGGGATCAGGATGTCGGCAACTAT
GACGCGTTCATTTCGATCCGGGCAGCGATTCGGGCAATCGAATCGGCCGCAGCGGTCACCGCCACAATGGATTTCGGGAGACGATTGGCACAGCGGCAAGGGATTGAGATCCTGGGACCG
GTCGGTGTGGCGGCCCGCACGGCCGCCACGGTCGGTGACGCTCTGGCGATCTTCAACACCTTCATGGCGGCCTACAGCCCAGTTATCGCCATCCGGATCACGCCGCTGGCCGGACAGCGG
TCATTTATTGCACTCGAGTTCCTGCTCGACGAGCCGGCGTCGTATCCGCAGACCATGGAGCTGGCGCTCGGGGTGGCGCTCGGGGTGATCCGGTTGTTGTTGGGCGCTGACTACGCCCCA
CTGGCCGTGCACTTACCCCACGACCCACTCACACCCGAAGCCTTCTACCTGCAGTACTTCGGCTGCCGGCCTTACTTCGCCGAACGTGTTGGTGGTTTCACCATGCGCACCGCGGACCTG
AGCCGTCCCCTCAACCGCGACGATGTCGCCCACCGGGTGGTCGTCGACTACCTGAGCAGCATCACGCCGCTGGGCGAGGGGATCGTGGAATCGGTGCGCACCATCGTGCGCCAGCTGCTG
CCCACCGGAGCGGCGACGCTCAACGTGGTCGCCGGGCAGTTCCACCTGCACCCGAAAACGCTGCAACGTCGACTTGCGGAGGAGAACACCACATTCGTTATTCTGGTCGATCGGGTCCGC
AAGGATGTCGCCGATCGCTACCTAAGGACCACCGGGATCGGCCTTACCCATTTGGCACGTGAACTGGGCTACGCCGAACAAAGCGTGTTGACCCGCTCGTGCAAACGCTGGTTCGGAACC
GGACCGGCCGCCTACCGCAACCAGGCCAGGTTACAGACAACCGTGAGCGCACCTGGCAGCGGGCGTGGTCCGAATCCAGGTAACGTCTCAGTATCCTGCTGA
>Rv3825c_N1 Seen in 1 sample(s).
TTACTGTGACGACAGCGCGGCAGCAGGAGCGTCGTCGGGCGCCAGCTGTTCATACAAGTGATCCGCTAAGCCCCGCACCGTGGCGCTGACGTTCTTGGGTGCCAACCGGATTCCGGTCTC
GGTCTCGATCCGAGTGCGCAGCTCTAGTGCGCCCAACGAATCAAGTCCATACTCGGGTAGCGGGCGGTCAGGGTCGACGGTGCGCCGCAGAATCAGGCTGACCTGCTCGGCGACCAGCTG
CCGAAGCCGCGCCGGCCACTCGTCGCGTGGCAGCTCGTTCAGCTCGACGCGGAATTTGCTTGTGCCCGAACCGTTGCTGCTGGAGAACACTTCGAAAAACCGGCTGCGCTCTGCGAAGGC
GACCAGCCACGGGGCTCCGATGACCGGGGCATAGCCGGTATAGACGCGGTTGTGGCGCAATAGCGCCTCGAACGCGTAAGCACCTTCGTCGGGAGTGATCGCCGTGTAGTTGCTTTCCTC
CAATGCCGAAGCCCGCGCGGGCGATGCCGACCACCACCCCAACTGGCCGATATCCGACCAGGCTCCCCACGCGATCGCGGTAGCCGGCAGGCCCTGAGCTTGCCGCCAATGCGCGAAGGC
GTCCAGCCAGCTGTTGGCCGCTGAGTAGGCACTCTGTCCCGGCGAGCCGGTGAGAGCTGCCGCCGACGAAAACAAGCAGAACCAGTCAAGCGGCTGTCCGCTGGTTGCTTCATGCAACTC
CCAGGCACCGTGAACCTTTGGCGCCCAGTCGCGCGCCAGCAACTCGTCGGTGATATTGGCCAAGGTGGCGTCCTCGACCACCGCGGCCGCGTGTAGCACGCCTCGTACCGGAAGCCCGGT
GGCCACAGCGGTCGCCACCAACCGCTCCGCGGTACCCGGTTGGGCGATGTCACCGCATTCCACCACGACTTCAGAGCCCATCGCCGCGATGGCCTCGATCGTTTCCCTCATCTTTTGCGT
CGGCTGGGTGCGGGAATTCAGCACGATCCGGCCGCAACCGGCCGCGGCCATCTTCTCGGCCAGGAACAGCCCTAGCCCACCGAGGCCGCCGGTGATGATGTAGGAGCCGTCGGGACGGAA
CACCTGAGCTTGTTCCGGAGGCAGGGTAACGAGGCTTTTTCCGGTCTGTGGGATGTGGAGGACGAGTTTGCCGGTGTGCTCGGCGTTGCCCATCACACGGATGGCGGTGGCCGCCTCGAC
GAGGGGGTAATGGGTGCTCTGCGGCATCGGCAACTCGCCGGCTGCGGTCAAGCGATAGACCGTGCCGAGCAGGTCGCGCAGCTCTTCTGGGTGTGTCGCAGACAGCAACCCCAGGTCTAC
GGCGTAGAAGGACAGGTTGCGCCGGAAGGGAAAGAGCCCCAGCTTGGTGTCACCATAGATGTCGCGCTTGCCAATCTCGACGAACCGTCCCCGGAAGGCGAGCAGTTTCAGCCCGGCAAG
TTGCGCGGCGCCGGTCACCGAGTTGAGCACGACATCGACACCCCGGCCGTTAGTGTCCCGCCGAATCTGCTCGGCGAACTCGATGCTGCGCGAGTCATAGACATGCTCAATACCCATGTT
GCGCAATAGCTCTCGACGCTGTGGGGTACCGGCGGTGGCGAAGATCTCAGCGCCCGCCGCGCGGGCTATAGCGATCGCCGCTTGTCCGACCCCGCCGGTGCCGGAGTGAATTAGCACCGT
GTCACCCGCCCTAATCCGGGCGAGCTCATGCAGTCCGTACCAGGCGGTGGCGTGCGCGGTGGTCACCGCAGCGGCCTGTGCGTCACCCAGGCCCGGTGGCAGCGTCGCGGCCAGCCGAGC
GTCACACGTGACGAATGTGCCCCAGCAGCCGTTAGGCGACATGCCACCAACATGGTCACCAACCTTGTGGTCAGTGACGCCTGGTCCGACCGCGGTCACCACGCCGGCGAAATCCGTGCC
CAGCTGGGGCAGGTGTCCCTCGAAGCTGGGGTAGCGACCGAAAGCGATGAGTACATCGGCAAAGTTGACGCTGGACGCACGGACCGCAACCTCGATCTGTCCTGGTCCTGGTGGAACGCG
GTGAAACGCGGCCAGCTCTATCGTTTGCATATCGCCGGGGGTACGGATCTGCAGGCGCATGCCGCTCTGCTGATGATCCGCGACGATGGTGCGCCGCTCCTGAGGACGCAACGGGGTCGG
ACACAAGCGCGCCACGTACCACTCGTTGTCTCGCCAGGCGGTCTCGTCTTCTTCCGACGTGGCCAGCAATTGGCGTGCCAGCTGCTCGACACCGGTCTGTTCGTCCACGTCGATCTGGGT
GGCACGCAGGTGAGGGTGCTCGGCGCCGATCGTCCGCAGTAGACCACGCAGCCCGCCCTGCTCAAGATTGACGCAGTCGTCGGCCAGCACCCGCTGGGCACCCCGCGTCACGACGTACAT
GCGCGGCACCGCCCCGGGAAGGTCTGACAATTCGCGAGCGATACCCACCAGCCGGCGAACGTACTCAGCGCCGCGATCCGCGCTCCCCTGATGCGGCGTACCGGTGTTCGACCCGGTGAG
CACGACCACGCCGCTAAACTCGTCGCTACCAACTTGATCGCGTAGCTGGTCGGCGGCGGCCAACTGGTCGTCGTGCAGTGGCCACCGCATCGTCGTGCACGCCGCGCTGTGTTCCCTAAA
CGCGTCCGCTAGCCGGGTAGCGGTCACATCAGAGGCAGCGCAGTCACTGATCAGCAGCCATTTTCCAGCGCCAGAGGGGTCCATCTCGGGCAGCTCACGCTGGTGCCATTCGATGGTGAG
TAAGCGCTCATTCAGCACCCGATTGTGTTTATCGCGCTCGGACACTCCCGTACCGATTCGCAGTCCGCACACGGCCAGCAACACCGTGCCGTGCGCGTCCAGCACGTCGATATCGGCCTC
GACGCCGACCAACTCGACTTTGGTCACCCGCGTGTAGCAATAGCGAGCGGTACGCACCGGAGCATAGGCACGGACTCGGCGCACCCCCAACGGCACCAATAGGCCGCTACCTACCGACTG
GCTATCGGGATGCGCGCCGACCGACTGGAAACAGGCATCCAGGAGGGCCGGGTGGATTGCGTACAGGCCCTGCTGCGAACGAATCGAGCCGGGCAGCGCGACTTCGGCCAGCATTGTGGC
GGTCGCATCCTCCGCGACATAGGCCACGGCCAGGCCGGTGAAGGCCGGACCATATTGCACACCGTGCTTGTCGAATTGCCGGCGCAGATCCTCACCGTCCACGCGGCAAGGGTGGGCTTC
CAATAAGGAGGCCATGTCGTACGCCGGCGGCTCGCATTCGCCGGATACCTGCTGCAGCACCGCCGACGCACGCCGCAAGTGATGCCCAACGCCTTCCTGCAAGGCCTCGACGGCGAAGTC
GACGACACCGGGCGAGGTCACCGTTGCCACGGTGGACACCGGGGTCTGGTCATCCAGCAGCAGCATCGCCTCAAAGCGCATGTCGCGTACTTCGGACTGCTCGCCGAGGACGGCACGGGC
CGCAGACAACGCCATCTCGCAGTAGGCGGCCCCTGGAAGAGCAGCCACGTTGTGTATCCGGTGATCGCCCAACCAGGGCAAGGTTGCGGTACCAACATCGGCCTGCCAGGCGTGGCGTTC
CGGCTCTTCGGGCAATCGCACGTGTGCGCCCAACAACGGGTGCACGGCTACCGTGGAGCCACCCGGCGACCGATTGTCAACGCCTTCGCGGTCATAGAACAGGAACCGGTGCGACCACGC
CGGCAGCGGAGCATCGACCAAGCGGCCTTGGGGACAGAGCACCGAGAAGTCCACTGCCGCACCAGCGTTGTGCAGATCCGTCAGCAGGCGACGGAGCCCCAGCGGCAATGGCTGCTCCCG
CCGCATACCGGCCAGCGCGGCAACCGGCATGCCTACACTGCCGGCAATCTGATCGACCGCGTGGGTCAGCAGCGGGTGCGGCGAAAGCTCGGCGAAGACTCGGTACCCGTCGTCGAGCGC
CGAGCGCACCGCAGCGGAGAACCGCACGGTGTGGCGCAAATTGTCGGCCCAGTAACGCGCGTCGCACGCCGGCGCTTCGCGCGGGTCGAAAAGCGTCGCCGAATAGTAGGGAATCTCAGG
AGCTTTCGGATTCAGGTCGGCCAGCGCAGCTATCAACTCGTCGAGGATCGGATCCACCTGCGGCGAATGCGAAGCCACGTCGACGGCCACCGCCCGCGCCAGCACGTCTCGCCGCTCCCA
TATGTCGACCAGCTTGCGCACCGACTCGGTGCCTCCGGCGATCACGGTGGACTGCGGCGCGGTCACCACGGCGACCACCACATCGTCGATGCCTAGAGCGGTCAATTCCGACTGCACAGC
TAAGGCAGGCAACTCCACCGACGCCATCGCCGCGGAACCGGCGATCGTCGCCATCAGTTTTGATCGTCGGCAGATGACGCGTACCCCATCTTCGGCTGACAGCACCCCTGCGACCACAGC
CGCGGCCGACTCACCCATTGAGTGGCCGATCACGGCGCCCGGGCGCACTCCGTATGCCGCCATCGTGGCTGCCAACGCGACCTGCATCGCGAAGATGGTCGGCTGAACTCTGTCGATGCC
AGTCACGGTCTCGGGCGCCGTCATCGCCTCGGTGACCGAGAACCCGGACTCCGCGGCGATCAATGGCTCTAGCTCCGCAACGGTCGCGGCGAACACCGATTCGTTCGTCAGCAGATCGGC
GCCCATCGCTGCCCACTGCGACCCTTGCCCGGAGAATAACCAGACCGGCCCGCGGTCATCCTGCCCCACCGCGGGCTGGTAAACGGTGTCACCGTCGGCGACCTCGCCCAAGCCGGCAAT
CAGCTCGTCGACGCTGCTCGCGATGACCGCCGTGCGCACCGACCGGTGCGTACGCCGCCGCGCCAGCGTGTACGCAAGATCCGAGAGCACCAGGGAGTCGGCGTGCTGCTGTATCCAGTC
GGTCAACCGCTGAGCAGTCTGCCGCAGCGCGTCGGCCGAGGAAGCGGACAGCGTGAACAAGGCAGGGGTGCCGGTCGGGGGGGTGCTCGCCGCGTGGGGCTGGGCTTCGGTTTGCGGAGC
TTGCTCCACAACAGCGTGCACGTTCGTTCCCGAGAACCCATAAGACGACACTGCCGCCCGCCGGGGCACCTGACGACCGTTGGTGGGCCACGGTGTGGTCACCTCGGGCACGAAGAGGTT
GGTGGTGATGCCAGCAATCTCATCGGGCAGCCGAGTGAAGTGCAGATTACGTGGAACCACACCATGTTTCAGAGCGAGAACCACCTTGATTAGCCCTAGCACCCCGGCGGTCGACTGGGT
GTGTCCGAAGTTGGTCTTCACCGATGCGAGTGCGCACGGGCCGTCGACCCCATACACCTCGGAGACACTTGCATATTCAATGGGGTCACCGATCGGGGTGCCGGGGCCGTGCGCTTCGAC
CATGCCGACCGTCGCGGCGTCCACGCCACCGGCAGCCAACGCCGCTCGATAAGCCGCAACCTGTGCGGGCTGCGAAGGCGTCGCGATATTGACCGTGTGGCCATCCTGATTTGCGGACGT
GCCACGAATTACCGCCAGGATCCGGTCACCGTCGGCCAATGCATCCGGCAACCGCTTGAGCACCACCACGGCACAACCCTCGCCTGACACGAACCCGTCAGCCGCGACATCGAACGCGCG
ACAACGTCCGGTCGGGGACAACATGCCCAAAGCGGATCCAGCAGCGGCCTTGCGTGGCTCCAGCATCAAGGCGACACCCCCCGCCAAGGCAACGTCGCTTTCACCCTCGTGCAGGCTGCG
ACACGCCATGTGCACGGCCGTCAGGCCGGACGAGCATGCGGTATCAACGGTTATTGCCGGACCGTGCAGTCGCATCGCGTAGGCGACCCGGCCCGACGCCATGCTGAAGCTGTTGCCCAG
ATATCCGTACGGCTCCTCCAATTGTTTGGCGTCGGCCGCCACCATCGTGTAGTCACCATGGGTGACACCCGCGAACACGCCGGTCGCCGAGCCTGCCAGCGTTTGCTGAGTAAGACCGGC
GTGCTCCATGGCCTCCCAGGACGTCTCCAGCAACAGACGTTGCTGCGGATCGATCGCAATCGCCTCCCGCTCGCCGATGCCAAAGAACTCGCAATCGAAATCCGCGGGGTTATCCAGGAA
ACCGCCCCACTTGCACACCGTCCGACCGGGCACGCCCGGCTGCGGGTCGTAGAACTCGTCGCAATCCCACCGGTCCGGCGGCACCTCGGTGATCAGGTCGTCGCCTCGTAACAACGCCTT
CCACAACAACTCGGGGGAATCGATCCCGCCGGGCAGCCGGCAAGCCATGCCGATAACAGCAACCGGAGTCACACGTGGTTCAGCCAACGTCCATGCACCCCTATCTGCACCAGTGCCTGA
CGCCGCCGACCCCAAGCCCAA
>Rv3829c_N1 Seen in 1 sample(s).
CTACCGACCACTCAAAACGCAACAGTTGGCAGCCCTACGATCGGCCAGCGCCTGACGGGCGGCGTTATATCCAGGGATGAACGTGATTCCCGGCCCACCGTGGACAACCGGCACTGCCCA
GGTACAACCCGGCTATCGGGATCGGCTGGCCGATAAAGCCTTTCGGGCCAGGCCTGTTGGGGCCGATCTGGTCCGAGTGCAGCAGGGCATGGCAGTAGTCCCCACCCGGGGCACCGAACA
TCACACCCATGTGTTTGGGGGTAAAGGTGGTGTACCGGAGAATGCTGCCTTTGAAGTTCGGTGCCAACCTAGTGATCTTGTCGATCACGTTCTGCCCCATTTCGACCTTTGCCCGGCCGT
ACCCTCCGTATTTTGAGCCACCCTCGATCGGGAACCACATTGCGAACGCCGACGCGGCCTGCTTACCCGCCGGGGCCAGGCTGGGATCATGCAGCGACGGGATCTGCAACACCACGGTCG
GATCGGCCGGGACGATCCCACGCCGGCAATCCTCCCACTGCTGCTGAACCTGCTCCGGTGTACAGAAAATGCCCATCGATGCCTGCATGCTCGGATCGTTGAGTGCCTGGTAGGGCGCCG
CGAAGGCCGGTGGCTGCGCGAGCGCAAAATGCATCTGCAGATAGCTGCCGCGGTGGTCGATGCGCAAATAGCGATCGCGGATTTCCGACGGCAACACTGCCGGATCGATCAGCTCGTTGA
TGGTGACGTCGGGTGCTATGGCGGAGACCACGATCGGGGAGGTCAAGGTGTCCCCCGCCGCGGTGCGCACGCCCCGCACGCGGGCTGACGACCGACTATTGTCAACCACGATCTCGGTCA
CCTTGGAACGTAACCGGACCTCGCCGCCGGTGCGTTCCAGCAATTGCGACAGATGGGTGGTAAGCGCGCCGATGCCACCGCGCAATTTCTTCCACCGCACGAAGTCGCCCTCCGGGACAC
CCAATCCGAAGGCGAGCGCGGCAGCGCTGCCCGGTGTGGCCGGCCCGCGATAGAGCGTGTTCACGGCCAGCACGGTCATCGACCCGCGCAGGGCGCCGTGCTTCTCGCGGTCCGGGAAAT
GGCGGTCCAACACGTCGGTGACCGATCCGAACAGCATGTCATCGATCGCTGACCGTTCGAATTCATTTGTGGCACAGGCATACATCTCGTCGAAGCTCTTGGGCAGAGTTCCGGCTTCGA
AACGCCCCAGCGCCCGGGTCGGCGCCTGGCTCCACGCCAGCAGGCCCGCCATCCCGGTGACGGCGTCTGCCCCGTGCACCCGATGGAGGTGGGTAAGCATCTTCGTCGGGTCGGTGAATT
GGACCACCGGATCGTCCCCGACACCGCGCAACGCTACCGACATCACCTCCAGATCGACCGTCGGCAAGCTGTCCAGGCCTAACTCGCTGCTGACCGCCGAGGAGGTCGGGAACTGCACCG
ATCCGGCGATCTCGAACCGGTACCCGTCGAACAGCTCCACCGTGGAGGCCATCCCGCCGGCGTAGCGCTTAGCGTCCAGACACGCGGTCCGCAGTCCGGCTCGCTGCAGCAGCACTGCCG
CGGTCAGCCCGTTGTGCCCGGCGCCGATAACTATCGCGTCATAACCAGTCAT
>Rv3830c_N1 Seen in 1 sample(s).
TTATGACGAAAACTGTGAGGGTGGTCCAGGTGTCGGAGATGCCGACGCGCAGCGACTCCAGTGCGACGTGGCAGACCCGCGCCAGCTCCCCGAGCGACCGGTCACTCCCAAGCATCCAGG
CTTCCATCGCGCCGAACACCGCCGCGGCGACGCATCGTGCGGTGACGGCGATGTGCAATCGGGCATCGGGTGCACCCGCGATATCGCAGTTACGTCGCCGCAATTGGGCCTGGATGGCAT
CGGCGAAGTCGGCTTCCACCTCGCGCATATGGCGGACGATCCGGCTCGGCTCCAACTCGCCGCGCCGCAACGACGCAATCTTCGTCACTGCGTCAACGTCATAAGGAAACGAGAAGATAG
CCGCTTGCACGGAATCGATGATCGATTCGTCGGCCGGTCTAGCATCCAGCGCCGCGCGAAACCAGTGCAGTCCGGCGTCGTAGTCGGCAAACAGCAAATCGTGCTTGGATCTGAAGTGGC
GATAGAAAGTACGCAGCGACACCCCGGCGTCCTCCGCAATCTGCTCGGCTGAGGTAGCCTCGACGCCCTGGGCCAGAAATCGCACCAGGGCGGCCTGGCGCAGTGCCTCGCGAGTGCGTT
CGCTGCGCGCCGTCTGCGGGGGCCGGACCAT
>Rv3847_N1 Seen in 1 sample(s).
ATGGGTACCGGGTCAGGTGGGCCTATTGGGGTTTCTCCCTTCCATTCGCGTGGTGCCCTGAAAGGGTTCGTGATCTCTGGACGTTGGCCTGATTCGACCAAAGAGTGGGCCCAGCTGCTG
ATGGTCGCAGTTCGGGTCGCGTCGTTGCCCGGCTTGCTCTCCACCACAACGGTGTTTGGTGCCCGCGAAGAGTTGCCCGACGAACCCGAGCCGGGGACCGTCGGTCTGGTGCTGGCCGAG
GGCACCGTCTTCGGTGAATCAGCAATTCAGCCAGGATATTTCGCTGATCATCAACCCCCTGCATTGCTGATGCTGCATCCACCCTCGGAGACCACGCCGTCGCTGCCGGAATGCACCGGG
GCGGCGTCAGGGTGCGTGCTGCTGCCGGGATTACCGTATCTGGGATTGGAACATCGTGCGGCTTGGGTGGAGGCTGAAGCCGACGGCACCATCACATCTATGGTGAGCCGGGTGGGCGTC
GACCCGATAAGCCATCCCGACACGCAATTCTGGCAATGCTGCTTGCAGCATAA
>Rv3848_N1 Seen in 1 sample(s).
ATGCTCGCCGCCACACTGCTAAGTCTGGGAGCCGTTTTCCTTGCTGAGCTCGGCGACAGATCCCAGCTCATCACGATGACCTACACACTTCGCTACCGCTGGTGGGTGGTGCTGACCGGG
GTGGCGATCGCAGCGTTCACGGTGCACGGGGTAGCGGTGGCGATCGGCCACTTTTTGGGCTCGACCGTGCCGGCCCGGCCGGCCGCCTGCGTATCGGCGATCGCATTCCTGATCTTTGCC
GTGTGGGTCTGGCGGGAGGACACGGCCAGCGACAGCGAAACCTCGCCAACCGCTGCCGAACCCCGACTCGCGCTGTTCACCGTGGTCTCGTCGTTCGCACTGGCTGAGCTGGGTGACAAG
ACAACGTTGGCGACGGTGACCTTGGCCAGCGATCACCACTGGGCCGGCGTATGGATCGGCACCACCCTGGGCATGATCCTGGCCGACGGCCTGGCGATCGGCGCAGGGCTGCTGCTGCAC
CGGCGCCTTCCGGAGCGGTTGCTGCAGGTCCTGACTGGCCTGCTGTTCCTGCTGTTCGGACTGTGGTTGCTGTTCGACGACGCGTTGGGCTTCAGATCGGTTGCCATCGCCGTGACAGCG
GCGGTGGTGCTGGCCGCGGCAACTACGGCGGTATCGGTGCGGGTGGCGCAAACTCGTCGGCGGCGGCCAACCGCTGCTGCGACACCAGAAGATGACTCGACACGCCCCGAGCGGTCGTCG
GTCGCGCCGGGCCATCCCGGGAGCATCTTGCTACCGCTTCCGGAAGTGTCTTTGCGGGGGCGCCGACCGCCCTCAGGGTCGCCTGACGAGCGCTGTGCGGACCCAGGCAGCAAAGGAGGC
TCTCGGCGAATCTCCGTTGGCTGCTAGTTGCCCGGAGTCGGCCGCATCCGCCCGACACGGTCATCCTGA

In [12]:
# Description of novel alleles; number of mutations, and description of the mutation;
cat sample.call.novel.txt


Sample	Locus	Novel_id	MinKmerDepth	Nmut	Desc
SRR6152708	Rv0024	N1	42	1	From allele 98, Del of len 1 at pos 719.
SRR6152708	Rv0035	N1	46	2	From allele 227, Subst C->G at pos 47, Subst A->T at pos 76.
SRR6152708	Rv0045c	N1	61	4	From allele 62, Subst C->T at pos 318, Del of len 2 at pos 650, Subst A->G at pos 652.
SRR6152708	Rv0063	N1	55	1	From allele 140, Ins of base G at pos 334.
SRR6152708	Rv0101	N1	50	2	From allele 1541, Subst A->G at pos 5360, Subst A->G at pos 6088.
SRR6152708	Rv0134	N1	66	2	From allele 25, Subst G->T at pos 374, Del of len 1 at pos 386.
SRR6152708	Rv0165c	N1	58	3	From allele 16, Subst T->C at pos 147, Ins of base G at pos 163, Ins of base G at pos 164.
SRR6152708	Rv0195	N1	63	2	From allele 88, Subst G->A at pos 185, Subst T->C at pos 191.
SRR6152708	Rv0226c	N1	66	2	From allele 59, Subst A->C at pos 36, Subst A->G at pos 1229.
SRR6152708	Rv0276	N1	42	2	From allele 108, Subst C->G at pos 50, Subst T->C at pos 57.
SRR6152708	Rv0290	N1	51	2	From allele 253, Subst G->A at pos 691, Subst G->A at pos 1033.
SRR6152708	Rv0551c	N1	47	2	From allele 68, Subst T->C at pos 472, Del of len 1 at pos 483.
SRR6152708	Rv0654	N1	49	2	From allele 187, Subst C->T at pos 1303, Subst G->A at pos 1315.
SRR6152708	Rv0739	N1	63	3	From allele 87, Ins of base C at pos 14, Ins of base G at pos 13, Subst A->G at pos 62.
SRR6152708	Rv0757	N1	49	2	From allele 141, Subst T->C at pos 364, Subst T->C at pos 373.
SRR6152708	Rv0818	N1	56	3	From allele 89, Subst G->A at pos 275, Subst C->T at pos 290, Subst G->A at pos 293.
SRR6152708	Rv0826	N1	49	2	From allele 84, Subst C->G at pos 901, Subst G->A at pos 920.
SRR6152708	Rv0888	N1	50	1	From allele 187, Ins of base G at pos 1145.
SRR6152708	Rv0908	N1	55	2	From allele 271, Subst T->C at pos 2197, Subst C->T at pos 2207.
SRR6152708	Rv1001	N1	45	2	From allele 68, Subst T->C at pos 982, Del of len 1 at pos 999.
SRR6152708	Rv1097c	N1	62	3	From allele 135, Subst A->G at pos 302, Del of len 2 at pos 311.
SRR6152708	Rv1128c	N1	38	2	From allele 223, Subst T->C at pos 421, Del of len 1 at pos 439.
SRR6152708	Rv1145	N1	48	2	From allele 2, Subst A->G at pos 818, Ins of base A at pos 829.
SRR6152708	Rv1225c	N1	55	2	From allele 110, Ins of base G at pos 435, Subst T->C at pos 451.
SRR6152708	Rv1258c	N1	47	2	From allele 82, Ins of base G at pos 681, Subst T->C at pos 697.
SRR6152708	Rv1269c	N1	55	2	From allele 63, Ins of base G at pos 110, Ins of base C at pos 123.
SRR6152708	Rv1326c	N1	54	2	From allele 76, Subst A->G at pos 2007, Subst T->C at pos 2034.
SRR6152708	Rv1330c	N1	58	2	From allele 169, Subst G->T at pos 436, Subst T->C at pos 447.
SRR6152708	Rv1363c	N1	43	2	From allele 61, Subst C->A at pos 59, Subst C->G at pos 75.
SRR6152708	Rv1413	N1	60	2	From allele 139, Subst C->A at pos 65, Subst G->A at pos 80.
SRR6152708	Rv1420	N1	47	2	From allele 42, Subst C->G at pos 56, Subst T->C at pos 212.
SRR6152708	Rv1551	N1	48	2	From allele 279, Subst A->G at pos 907, Del of len 1 at pos 917.
SRR6152708	Rv1564c	N1	45	2	From allele 39, Del of len 1 at pos 1564, Subst G->C at pos 1581.
SRR6152708	Rv1615	N1	47	2	From allele 97, Subst C->T at pos 275, Subst C->T at pos 288.
SRR6152708	Rv1775	N1	56	2	From allele 167, Del of len 1 at pos 124, Subst A->C at pos 128.
SRR6152708	Rv1915	N1	58	2	From allele 1, Subst G->A at pos 536, Ins of base T at pos 886.
SRR6152708	Rv2027c	N1	55	2	From allele 5, Del of len 1 at pos 948, Subst G->C at pos 969.
SRR6152708	Rv2084	N1	61	2	From allele 138, Ins of base T at pos 23, Ins of base G at pos 20.
SRR6152708	Rv2148c	N1	75	2	From allele 87, Subst G->C at pos 19, Del of len 1 at pos 4.
SRR6152708	Rv2176	N1	53	2	From allele 53, Subst C->T at pos 75, Subst T->C at pos 95.
SRR6152708	Rv2185c	N1	41	2	From allele 41, Subst G->T at pos 342, Subst T->G at pos 358.
SRR6152708	Rv2241	N1	37	2	From allele 296, Subst T->C at pos 2532, Subst T->G at pos 2545.
SRR6152708	Rv2264c	N1	48	3	From allele 1, Ins of base A at pos 57, Subst C->T at pos 28, Subst T->C at pos 322.
SRR6152708	Rv2293c	N1	60	1	From allele 66, Ins of base A at pos 602.
SRR6152708	Rv2339	N1	57	2	From allele 524, Del of len 1 at pos 219, Subst A->G at pos 224.
SRR6152708	Rv2437	N1	64	2	From allele 78, Del of len 1 at pos 107, Subst C->G at pos 113.
SRR6152708	Rv2526	N1	53	6	From allele 4, Subst A->G at pos 45, Del of len 2 at pos 72, Del of len 3 at pos 73.
SRR6152708	Rv2545	N1	46	2	From allele 48, Subst G->A at pos 91, Del of len 1 at pos 99.
SRR6152708	Rv2587c	N1	51	2	From allele 344, Subst C->T at pos 906, Subst T->C at pos 924.
SRR6152708	Rv2848c	N1	60	2	From allele 238, Ins of base G at pos 58, Subst A->G at pos 66.
SRR6152708	Rv2902c	N1	46	2	From allele 2, Subst T->C at pos 253, Subst G->A at pos 261.
SRR6152708	Rv2947c	N1	37	1	From allele 48, Ins of base G at pos 780.
SRR6152708	Rv2975c	N1	51	3	From allele 42, Subst G->A at pos 32, Del of len 2 at pos 6.
SRR6152708	Rv3091	N1	59	2	From allele 166, Subst T->C at pos 1289, Subst G->A at pos 1299.
SRR6152708	Rv3197	N1	66	2	From allele 10, Subst A->G at pos 342, Subst C->A at pos 351.
SRR6152708	Rv3234c	N1	60	2	From allele 19, Ins of base C at pos 18, Subst G->T at pos 57.
SRR6152708	Rv3253c	N1	36	2	From allele 278, Subst G->A at pos 447, Subst T->C at pos 462.
SRR6152708	Rv3725	N1	44	3	From allele 79, Subst G->C at pos 526, Subst A->G at pos 747, Ins of base A at pos 752.
SRR6152708	Rv3736	N1	75	2	From allele 83, Subst T->C at pos 510, Subst A->G at pos 755.
SRR6152708	Rv3825c	N1	40	2	From allele 480, Subst T->C at pos 4426, Subst G->A at pos 4453.
SRR6152708	Rv3829c	N1	61	2	From allele 328, Subst T->A at pos 86, Ins of base G at pos 103.
SRR6152708	Rv3830c	N1	61	1	From allele 105, Ins of base G at pos 147.
SRR6152708	Rv3847	N1	57	2	From allele 64, Del of len 1 at pos 504, Subst G->A at pos 532.
SRR6152708	Rv3848	N1	53	2	From allele 271, Subst G->A at pos 866, Subst G->T at pos 878.
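
Because this report is tab-separated, standard tools can summarize it. The following is a minimal sketch, assuming the column layout shown above (Nmut in the fifth column); the helper name count_by_nmut is hypothetical and not part of MentaLiST.

```shell
# Hypothetical helper, not part of MentaLiST.
# Tally novel alleles by the Nmut column (5th tab-separated field),
# skipping the header line. Prints "Nmut<TAB>count", sorted by Nmut.
count_by_nmut() {
    awk -F '\t' 'NR > 1 { n[$5]++ } END { for (k in n) print k "\t" n[k] }' "$1" | sort -n
}

# Example: count_by_nmut sample.call.novel.txt
```

A tally like this gives a quick view of how many novel alleles are a single mutation away from a known allele, which may help when choosing filtering thresholds later.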

Updating an MLST scheme with the detected novel alleles

You can update your MLST scheme with the novel alleles detected by MentaLiST, especially after running it on many different samples. The scripts folder contains Python scripts that help select alleles and build an updated scheme. The workflow has the following steps:

  1. Run MentaLiST on all files in your dataset (first-pass run).
  2. From the novel alleles found, select a subset that satisfies some restrictions on the number of mutations or on the minimum number of samples in which an allele is seen.
  3. Create a new MLST scheme by copying the existing one and adding the novel alleles.
  4. Run MentaLiST to create a k-mer database for this new MLST scheme.
  5. Re-run MentaLiST on all files, so that each novel allele gets a proper allele id (second-pass run).

Each step will be described below.

For the impatient:

Here are the commands for each of the main steps. Below, I will describe each command and also look at the intermediate files being created.


In [ ]:
# First pass call:
mentalist call -o my_dataset_calls1.txt --db mtb_cgmlst.db -1 SRR6*_1.fastq.gz  -2 SRR6*_2.fastq.gz
# Parse the novel alleles output, possibly filtering some alleles.
parse_novel_alleles.py -f my_dataset_calls1.txt.novel.fa -o all_novel_alleles 
# Add the novel alleles to the scheme FASTA files:
create_new_scheme_with_novel.py mtb_cgmlst_fasta/*fasta -o MTB_novel_scheme -n all_novel_alleles.fa
# Build a new MentaLiST db for this scheme:
mentalist build_db --db mtb_novel_cgMLST.db -k 31 -f MTB_novel_scheme/*.fasta 
# Second pass mentalist call:
mentalist call -o my_dataset_novel_calls1.txt --db mtb_novel_cgMLST.db -1 SRR6*_1.fastq.gz  -2 SRR6*_2.fastq.gz

1. Run MentaLiST on all files in your dataset

Let's download a 4-sample tuberculosis dataset:


In [30]:
# Download a 4 sample tuberculosis dataset:
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR639/002/SRR6397472/SRR6397472_{1,2}.fastq.gz --no-clobber
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR639/006/SRR6398036/SRR6398036_{1,2}.fastq.gz --no-clobber
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR615/008/SRR6152708/SRR6152708_{1,2}.fastq.gz --no-clobber
wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR639/003/SRR6398023/SRR6398023_{1,2}.fastq.gz --no-clobber


File ‘SRR6397472_1.fastq.gz’ already there; not retrieving.

File ‘SRR6397472_2.fastq.gz’ already there; not retrieving.

File ‘SRR6398036_1.fastq.gz’ already there; not retrieving.

File ‘SRR6398036_2.fastq.gz’ already there; not retrieving.

File ‘SRR6152708_1.fastq.gz’ already there; not retrieving.

File ‘SRR6152708_2.fastq.gz’ already there; not retrieving.

File ‘SRR6398023_1.fastq.gz’ already there; not retrieving.

File ‘SRR6398023_2.fastq.gz’ already there; not retrieving.

You can run MentaLiST on many samples at once by passing all files with the -1 and -2 parameters:


In [34]:
mentalist call -o my_dataset_calls1.txt --db mtb_cgmlst.db -1 SRR6*_1.fastq.gz  -2 SRR6*_2.fastq.gz


[ Info: Opening kmer database ... 
[ Info: Finished the JLD load, building alleles list...
[ Info: Decompressing weight list...
[ Info: Building kmer index ...
[ Info: Sample: SRR6152708. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6397472. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398023. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398036. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Writing output ...
[ Info: Done.
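
Since the forward and reverse reads are passed as two separate lists, a missing or misnamed file could leave the two lists out of step. A minimal sketch of a pre-flight check, assuming the _1.fastq.gz / _2.fastq.gz naming used in this example; check_pairs is a hypothetical helper, not part of MentaLiST.

```shell
# Hypothetical helper, not part of MentaLiST.
# For each forward file named *_1.fastq.gz, check that the matching
# *_2.fastq.gz exists; report missing mates and return non-zero if any.
check_pairs() {
    missing=0
    for r1 in "$@"; do
        r2="${r1%_1.fastq.gz}_2.fastq.gz"
        if [ ! -f "$r2" ]; then
            echo "missing mate for $r1" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_pairs SRR6*_1.fastq.gz && echo "all pairs complete"
```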

2. Selecting the novel alleles

To do that, we will use a Python script from the scripts folder of your MentaLiST installation (it might already be on your PATH if you installed via conda).


In [94]:
# optional: select the python environment and/or PATH to run the scripts;
conda config --set changeps1 False # just avoid the PS1 change here on Jupyter, not needed in your console;
conda activate mentalist1
PATH=$PATH:/rhome/pfeijao/sfu/MentaLiST/scripts

The 'parse_novel_alleles.py' script collects all novel alleles, creates a report and also outputs a FASTA file with selected alleles, to include in an updated MLST scheme.


In [36]:
parse_novel_alleles.py -h


usage: parse_novel_alleles.py [-h] [-f F [F ...]] [-o O] [-t THRESHOLD]
                              [-m MUTATION]
                              [-ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}]

Given a list of FASTA files with novel alleles found with MentaLiST, output a
FASTA with a unique list of novel alleles.

optional arguments:
  -h, --help            show this help message and exit
  -f F [F ...]          Fasta files with novel alleles.
  -o O                  Output Fasta file with alleles above the threshold
                        requirement(s).
  -t THRESHOLD, --threshold THRESHOLD
                        Minimum number of different samples to appear, to
                        include a novel allele in the output fasta.
  -m MUTATION, --mutation MUTATION
                        Also include if novel allel has equal or less than
                        this number of mutations, regardless of times seen.
                        Disabled by default.
  -ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the logging level

You must pass the novel allele FASTA file(s) produced by MentaLiST with the -f parameter. If you ran all your samples at once (as we did in this example), there is a single FASTA file, but you can also combine FASTA files from different MentaLiST runs on the same MLST scheme.

A novel allele is included in the output file (parameter -o) if this exact allele is present in at least -t samples. In addition, according to the script's help text above, if the parameter -m is given, any novel allele with -m or fewer mutations is also included, regardless of how many samples it was seen in; this is useful if you want to keep novel alleles that are very close to existing alleles.
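
To make the -t rule concrete, here is a rough sketch that keeps only FASTA records reported in at least a given number of samples. It assumes the "Seen in N sample(s)." header format shown in the novel allele FASTA earlier; filter_by_samples is a hypothetical helper that only approximates the sample-count rule and is not how parse_novel_alleles.py is implemented.

```shell
# Hypothetical helper, not part of MentaLiST and not the script's own code.
# Keep only FASTA records whose header reports at least MIN samples.
# Assumes headers of the form: >Locus_N1 Seen in 3 sample(s).
filter_by_samples() {
    # $1 = minimum number of samples, $2 = novel allele FASTA file
    awk -v min="$1" '
        /^>/ { keep = ($2 == "Seen" && $4 + 0 >= min + 0) }
        keep { print }
    ' "$2"
}

# Example: filter_by_samples 2 my_dataset_calls1.txt.novel.fa
```

In practice the script should be preferred, since it also writes the reports and the FASTA that the next step expects.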

Let's run the script, choosing alleles present in at least 2 of the 4 samples:


In [38]:
parse_novel_alleles.py -f my_dataset_calls1.txt.novel.fa -o novel_alleles -t 2


02:10:09 PM (217 ms) -> INFO:Reading the new alleles  ...
02:10:09 PM (245 ms) -> INFO:Writing output ...
02:10:09 PM (248 ms) -> INFO:Done.

There are three files created as output:


In [49]:
ls novel_alleles*


novel_alleles.fa  novel_alleles.samples.txt  novel_alleles.txt

Both .txt files report on all novel alleles found in the dataset, including how many times each allele is seen, its number of mutations, and the samples in which it appears.


In [62]:
head -n 30 novel_alleles.txt


Locus	Alleles found	Samples x (mutations)
Rv0024	1	1x (1)
Rv0025	1	1x (1)
Rv0035	1	1x (2)
Rv0045c	1	1x (4)
Rv0049	1	2x (2)
Rv0051	1	2x (3)
Rv0058	1	1x (2)
Rv0063	1	1x (1)
Rv0101	1	1x (2)
Rv0104	1	1x (2)
Rv0134	1	1x (2)
Rv0139	1	2x (2)
Rv0165c	1	1x (3)
Rv0187	1	1x (2)
Rv0195	1	1x (2)
Rv0226c	1	1x (2)
Rv0236c	1	1x (2)
Rv0276	1	1x (2)
Rv0289	1	1x (2)
Rv0290	1	1x (2)
Rv0311	1	2x (3)
Rv0325	1	2x (2)
Rv0347	1	1x (2)
Rv0551c	2	2x (4), 1x (2)
Rv0574c	1	2x (2)
Rv0585c	1	2x (3)
Rv0592	1	1x (3)
Rv0634c	1	1x (2)
Rv0654	1	1x (2)

For instance, locus Rv0311 has one novel allele, which was seen in 2 samples and is 3 mutations away from an existing allele in the scheme. On the other hand, Rv0551c has 2 novel alleles: one was seen in two samples, and the other in just one.

In the .samples.txt file, we can see exactly which samples have each novel allele.


In [63]:
head -n30 novel_alleles.samples.txt


Locus	Count	Samples
Rv0024	1x	SRR6152708
Rv0025	1x	SRR6397472
Rv0035	1x	SRR6152708
Rv0045c	1x	SRR6152708
Rv0049	2x	SRR6398023,SRR6398036
Rv0051	2x	SRR6398023,SRR6398036
Rv0058	1x	SRR6398023
Rv0063	1x	SRR6152708
Rv0101	1x	SRR6152708
Rv0104	1x	SRR6397472
Rv0134	1x	SRR6152708
Rv0139	2x	SRR6398023,SRR6398036
Rv0165c	1x	SRR6152708
Rv0187	1x	SRR6397472
Rv0195	1x	SRR6152708
Rv0226c	1x	SRR6152708
Rv0236c	1x	SRR6397472
Rv0276	1x	SRR6152708
Rv0289	1x	SRR6397472
Rv0290	1x	SRR6152708
Rv0311	2x	SRR6398023,SRR6398036
Rv0325	2x	SRR6398023,SRR6398036
Rv0347	1x	SRR6397472
Rv0551c	2x	SRR6398023,SRR6398036
Rv0551c	1x	SRR6152708
Rv0574c	2x	SRR6398023,SRR6398036
Rv0585c	2x	SRR6398023,SRR6398036
Rv0592	1x	SRR6397472
Rv0634c	1x	SRR6397472
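
The same report can be turned into a per-sample count of novel alleles. This is a minimal sketch, assuming the three-column, tab-separated layout shown above with comma-separated sample names; novel_per_sample is a hypothetical helper, not part of MentaLiST.

```shell
# Hypothetical helper, not part of MentaLiST.
# Count how many novel alleles each sample carries, reading the
# comma-separated Samples column (3rd field) of the .samples.txt report.
novel_per_sample() {
    awk -F '\t' '
        NR > 1 { m = split($3, s, ","); for (i = 1; i <= m; i++) n[s[i]]++ }
        END    { for (k in n) print k "\t" n[k] }
    ' "$1" | sort
}

# Example: novel_per_sample novel_alleles.samples.txt
```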

The third file is the .fa FASTA, which has all the filtered novel allele sequences. Comparing with the original FASTA file, we can see that from the 172 unique alleles found by MentaLiST, we are keeping 56.


In [50]:
grep -c ">" my_dataset_calls1.txt.novel.fa


172

In [51]:
grep -c ">" novel_alleles.fa


56

We can check that locus Rv0551c has only one allele in this FASTA file, even though two novel alleles were found. This is because one of the alleles was seen in only one sample and was therefore filtered out.


In [65]:
grep "Rv0551c" -A1 novel_alleles.fa


>Rv0551c
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCACCGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTTCTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTCGACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGGTCCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGATCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGGCGCCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATTCTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGGTTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACTTGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGCTGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAGCCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGCGCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGATGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGTGGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT

We can run the script again, this time without the parameter -t 2, meaning that we don't filter any allele, and we get all novel alleles:


In [95]:
parse_novel_alleles.py -f my_dataset_calls1.txt.novel.fa -o all_novel_alleles


03:03:56 PM (935 ms) -> INFO:Reading the new alleles  ...
03:03:56 PM (1062 ms) -> INFO:Writing output ...
03:03:56 PM (1065 ms) -> INFO:Done.

In [67]:
grep -c ">" all_novel_alleles.fa


172

In [69]:
grep "Rv0551c" -A1 all_novel_alleles.fa


>Rv0551c
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCACCGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTTCTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTCGACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGGTCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGATCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGAGGCGCCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATTCTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGGTTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACTTGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGCTGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAGCCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGCGCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGATGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGTGGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT
>Rv0551c
CTAGCCGACCGCGCGCCCAGCGCCTTCCCAGAACCGTGCGCGCACGGCCTTCTTGTCCGGCTTTCCTAGACCGGTCAACGGCAAAGAGTCGACGACCACCACCCGCTTGGGTGCCTGCACCGATCCCTTGCGTTGTTTGACCGCTGCCTGGATCTCGGCGGTCATGGCCTCGATCGCGGGCTCATCGCGGGCCGCGTTGGAGCGCAACACCACCACCGCGGTGACGGCCTCGCCCCACTTCTCATCCGGCGCGCCAACCACGCACACCTGAGCAACCGCCGGATGCTCGGCCACCACGTCCTCGACCTCCCGGGGGAACACGTTGAAGCCGCCGGTGACGATCATGTCCTTGACGCGGTCGACGATGTAGTAGAAGCCATCGGAGTCCTCGCGGGCCAGGTCGCCGGTGTGCAGCCAGCCGTCTTTAAAAGTCCGCGACGTCTCGTCTGGCAGATTCCAGTAACCGCCCGCCAACAGCGGTCCGCTGACACAGATTTCGCCGACTTCGCCCTGCTTCACCGGCTTGCCATGCTCGTCTAACAGCGCGACGCGGGCGAACAGCGTCGGCCGCCCACATGAGGTCAGCCGCTTCTCGTCGTGATCGCCCTTGGCCAGATAGGTGATCACCATGGGCGCCTCGGATTGCCCGTAGTACTGGGCGAAGATTGGGCCGAACCGCCGGATCGCCTCGGCTAGTCGCACCGGGTTGATCGCCGGCGCCGTAGTAGACGGTTTCCAGCGACGACAGGTCCCGGGTGTGCGAATCCGGGTGGTCCAGCAGCGCGTACAGCATCGATGGCACCAACATGGTCGCTGTAATGCGTTGCTCCTCAATGATTCTGAGTACCTCGGCCGGGTCGAACTTCGCCAGCACTATCATCTCGCCGCCCTTGATCACCGTCGGCGTGAAAAACGCCGCGCCGGCGTGCGACAGCGGGGTGCACATTAAGAACCGCGGGTTGGCCGGCCACTCCCATTCGGCGAGCTGGATCGAGGTCATGGTGGCGATCGACTGCGCGGTGCCTATCACGCCCTTAGGCTTGCCGGTGGTGCCGCCGGTGTAAGTCAGGCCGATAACTTGGTCGGGTGGCAGGTCGGCGGCGACCAGCGGCTGCGGCTGGTATTTGGCGGCCTCGGCGGATAGGTCGACTGCCACATGCTTGAGCGCATCGGGCACCGGCCCAATGGTGAGGATTTGCTGCAGCGAGTCCACCTGCTCCAGCAGAGCCAGTGCGCGCTCGACGAACATCGGGTTGGGGTCGATGATCAGTGAGCTGATGCCGGCGTCGTTCAGCACGTAGGCGTGATCGGCCAGCGAGCCCAACGGGTGCAGCGCGGTGCGCCGATAACCGCGGGCCTGCCCGGCGCCGATGATCATCAAAACTTCAGGACGGTTGAGCGACAGCAGACCGACCGCCACCCCGGTGCCGGCACCTAGCGCCTCGAATGCCTGGATGTACTGGCTGATACGGTCCGCCAGCTGGCCACCGGTCAGCCTGGTGTCGCCGAGGAACAGCACCGGCTTGTTCTGGTGGCGCTTGAGCGCTCCCACTAGCAGATGGCCGTTGTGGGTCGGGCTGCGCAACAGCTCGCCCGAACAATCCTGGTCACGCATGGCGCCGCTCTCCCTCGCTAGCTGGGGTACCCCCACCGCATCGCTTCGTCCCCCGCAAGCGGGTGGTACCCCCACTGCATCGTCGCCGGCGGTGCTCAT

Even if you don't want to filter out any alleles, you still have to run the parse_novel_alleles.py script, because its output is used in the next step.

3. Create a new MLST scheme with the novel alleles

To create a new MLST scheme with the novel alleles included, provide the original MLST scheme FASTA files and the novel alleles FASTA file to the script 'create_new_scheme_with_novel.py'.


In [70]:
create_new_scheme_with_novel.py -h


usage: create_new_scheme_with_novel.py [-h] [-n NOVEL] [-o OUTPUT] [-i ID]
                                       [-ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                                       files [files ...]

Adds novel alleles to an existing MLST scheme.

positional arguments:
  files                 MLST Fasta files

optional arguments:
  -h, --help            show this help message and exit
  -n NOVEL, --novel NOVEL
                        FASTA with novel alleles.
  -o OUTPUT, --output OUTPUT
                        Output folder for new scheme.
  -i ID, --id ID        Start numbering new alleles on this value, later will
                        implement from last allele id +1.
  -ll {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the logging level

So, to add the novel alleles from the previous step to the MLST scheme from the initial example, we run:


In [71]:
create_new_scheme_with_novel.py mtb_cgmlst_fasta/*fasta -o MTB_novel_scheme -n all_novel_alleles.fa


02:33:14 PM (133 ms) -> INFO:Opening the novel alleles file ...
02:33:14 PM (158 ms) -> INFO:Opening the MLST schema and adding novel alleles ...
02:33:36 PM (21784 ms) -> INFO:Done.

We can see that the new scheme has more alleles, for instance, on locus Rv0551c:


In [80]:
grep ">" mtb_cgmlst_fasta/Rv0551c.fasta | tail -n5


>Rv0551c_364
>Rv0551c_365
>Rv0551c_366
>Rv0551c_367
>Rv0551c_368

In [81]:
grep ">" MTB_novel_scheme/Rv0551c.fasta | tail -n5


>Rv0551c_366
>Rv0551c_367
>Rv0551c_368
>Rv0551c_369
>Rv0551c_370
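
To see which loci gained alleles across the whole scheme rather than one locus at a time, the header counts of the two folders can be compared. A minimal sketch, assuming both folders hold one <locus>.fasta file per locus as in this example; added_alleles is a hypothetical helper, not part of MentaLiST.

```shell
# Hypothetical helper, not part of MentaLiST.
# For every locus FASTA present in both scheme folders, print the locus name
# and how many alleles were added in the new scheme.
added_alleles() {
    # $1 = original scheme folder, $2 = new scheme folder
    for new in "$2"/*.fasta; do
        locus=$(basename "$new" .fasta)
        old="$1/$locus.fasta"
        [ -f "$old" ] || continue
        n_old=$(grep -c '^>' "$old" || true)
        n_new=$(grep -c '^>' "$new" || true)
        if [ "$n_new" -gt "$n_old" ]; then
            printf '%s\t%s\n' "$locus" "$((n_new - n_old))"
        fi
    done
}

# Example: added_alleles mtb_cgmlst_fasta MTB_novel_scheme
```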

4. Run MentaLiST to create a new MLST database file

Now we run MentaLiST to create a new MLST scheme database, as was done before, but this time with the new MLST scheme so that the novel alleles are included. Unfortunately, there is currently no way to do this incrementally.


In [87]:
mentalist build_db --db mtb_novel_cgMLST.db -k 31 -f MTB_novel_scheme/*.fasta --threads 4


[ Info: Opening FASTA files ... 
[ Info: Combining results for each locus ...
[ Info: Saving DB ...
[ Info: Done!

5. MentaLiST call - second pass

Now, we can rerun the MLST calling phase with the new DB:


In [90]:
mentalist call -o my_dataset_novel_calls1.txt --db mtb_novel_cgMLST.db -1 SRR6*_1.fastq.gz  -2 SRR6*_2.fastq.gz


[ Info: Opening kmer database ... 
[ Info: Finished the JLD load, building alleles list...
[ Info: Decompressing weight list...
[ Info: Building kmer index ...
[ Info: Sample: SRR6152708. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6397472. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398023. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Sample: SRR6398036. Opening fastq file(s) and counting kmers ... 
[ Info: Voting for alleles ... 
[ Info: Calling alleles and novel alleles ...
[ Info: Writing output ...
[ Info: Done.

Comparing this call with the previous one, we can see that the novel alleles (marked as "N" in the first pass) now receive proper allele ids in the new output:


In [91]:
# OLD:
cut -f10-20 my_dataset_calls1.txt | column -ts $'\t'


Rv0023  Rv0024  Rv0025  Rv0033  Rv0034  Rv0035  Rv0036c  Rv0037c  Rv0038  Rv0039c  Rv0040c
1       N       1       1       2       N       4        1        1       2        2
1       1       N       1       2       1       1        1        1       1        2
1       1       1       1       2       1-      1        1        1       1        2
1       1       1       1       2       1-      1        1        1       1        2

In [92]:
# New:
cut -f10-20 my_dataset_novel_calls1.txt | column -ts $'\t'


Rv0023  Rv0024  Rv0025  Rv0033  Rv0034  Rv0035  Rv0036c  Rv0037c  Rv0038  Rv0039c  Rv0040c
1       314     1       1       2       562     4        1        1       2        2
1       1       123     1       2       1       1        1        1       1        2
1       1       1       1       2       1-      1        1        1       1        2
1       1       1       1       2       1-      1        1        1       1        2
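
For a full scheme, the comparison can be automated instead of read off by eye. The sketch below lists every first-pass "N" call with the allele id found at the same position in the second-pass table. It assumes both files are tab-separated, share the same header and sample order, and carry the sample name in the first column; resolved_novel_calls is a hypothetical helper, not part of MentaLiST.

```shell
# Hypothetical helper, not part of MentaLiST.
# List every call that was "N" in the first-pass table together with the
# value reported at the same row and column in the second-pass table.
# Assumes identical headers and row order, with the sample name in column 1.
resolved_novel_calls() {
    # $1 = first-pass call file, $2 = second-pass call file
    awk -F '\t' '
        NR == FNR { for (i = 1; i <= NF; i++) old[FNR, i] = $i; next }
        FNR == 1  { for (i = 1; i <= NF; i++) locus[i] = $i; next }
        { for (i = 2; i <= NF; i++) if (old[FNR, i] == "N") print $1 "\t" locus[i] "\t" $i }
    ' "$1" "$2"
}

# Example: resolved_novel_calls my_dataset_calls1.txt my_dataset_novel_calls1.txt
```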

Command summary:


In [ ]: