Repetitive DNA elements ("repeats") are DNA sequences prevalent in genomes, especially of higher eukaryotes. Repeats make up about 50% of the human genome and over 80% of the maize genome. Repeats can be categorized as interspersed, where similar DNA sequences are spread throughout the genome, or tandem, where similar sequences are adjacent (see Treangen and Salzberg). Some interspersed repeats are long segmental duplications, but most are relatively short transposons and retrotransposons. Though repeats are sometimes referred to as “junk,” they are involved in processes of current scientific interest, including genome expansion, speciation, and epigenetic regulation (see Fedoroff). Some are still actively expressed and duplicated, including in the human genome (see Witherspoon et al, Tyekucheva et al).
RepeatMasker is both a tool for identifying repeats in a genome sequence and a database of repeats that have been found. The database covers some well-known model species, such as human, chimpanzee, gorilla, rhesus, rat, mouse, horse, cow, cat, dog, chicken, zebrafish, bee, fruit fly and roundworm. People often use RepeatMasker to remove ("mask out") repetitive sequences from the genome so that they can be ignored (or otherwise treated specially) in later analyses, though that's not our goal here.
It's instructive to click on some of the species listed in the database and examine the associated bar and pie charts describing their repeat content. For example, note the differences between the bar charts for human and mouse, especially for SINE/Alu and LINE/L1.
Let's obtain and parse a RepeatMasker database. We'll start with roundworm because it's relatively small (only about 2.5 megabytes compressed).
In [3]:
import urllib
rm_site = 'http://www.repeatmasker.org'
fn = 'ce10.fa.out.gz'
urllib.urlretrieve("%s/genomes/ce10/RepeatMasker-rm405-db20140131/%s" % (rm_site, fn), fn)
Out[3]:
('ce10.fa.out.gz', <httplib.HTTPMessage instance at 0x10488ae18>)
In [5]:
import gzip
with gzip.open(fn) as fh:
    print ''.join(fh.readlines()[:10])
# Might want to adjust your browser window wider so you can see all the columns
   SW  perc perc perc  query    position in query            matching     repeat          position in repeat
score  div. del. ins.  sequence   begin    end      (left)   repeat       class/family     begin   end  (left)  ID

  508   0.0  0.0  0.0  chrI           1    432  (15071991) + (GCCTAA)n    Simple_repeat        1   432     (0)   1
 1226  10.0  0.0  0.0  chrI         566    595  (15071828) + (GCCTAA)n    Simple_repeat        1    41   (240)   2
  344  22.2  0.0  0.0  chrI         596    676  (15071747) C RCS5         Satellite         (41)  1387    1307   3
 1226  10.0  0.0  0.0  chrI         677    846  (15071577) + (GCCTAA)n    Simple_repeat       42   281     (0)   2
  432  21.9  2.4  0.0  chrI        1622   1744  (15070679) + LONGPAL1     DNA/MULE-MuDR      136   261  (2330)   4
 8509   0.6  0.0  0.1  chrI        2052   3026  (15069397) + PALTTTAAA3   DNA                  1   974   (529)   5
 4521   1.1  0.2  0.2  chrI        3124   3652  (15068771) + PALTTTAAA3   DNA                974  1502     (1)   6
Above are the first several lines of the .out.gz file for the roundworm (C. elegans). The column headers are somewhat helpful; more detail is available in the RepeatMasker documentation under "How to read the results". (Note that in addition to the 14 fields described in the documentation, there's also a 15th ID field.)
Here's an extremely simple class that parses a line from these files and stores the individual values in its fields:
In [33]:
class Repeat(object):
    def __init__(self, ln):
        # parse fields
        (self.swsc, self.pctdiv, self.pctdel, self.pctins, self.refid,
         self.ref_i, self.ref_f, self.ref_remain, self.orient, self.rep_nm,
         self.rep_cl, self.rep_prior, self.rep_i, self.rep_f, self.unk) = ln.split()
        # int-ize the reference coordinates
        self.ref_i, self.ref_f = int(self.ref_i), int(self.ref_f)
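To make the field layout concrete, here is a minimal sketch that parses one row copied from the output printed earlier. The field names mirror the Repeat class; `FIELDS`, `parse_line` and `length` are hypothetical helpers introduced only for this illustration, and the explicit field-count check is an addition (RepeatMasker output may carry an extra trailing `*` flag on some rows, which would change the count).

```python
# Field names follow the Repeat class; everything else is illustrative.
FIELDS = ('swsc', 'pctdiv', 'pctdel', 'pctins', 'refid', 'ref_i', 'ref_f',
          'ref_remain', 'orient', 'rep_nm', 'rep_cl', 'rep_prior', 'rep_i',
          'rep_f', 'unk')

def parse_line(ln):
    """Split one whitespace-delimited RepeatMasker row into a dict."""
    values = ln.split()
    if len(values) != len(FIELDS):
        raise ValueError('expected %d fields, got %d' % (len(FIELDS), len(values)))
    rec = dict(zip(FIELDS, values))
    # Reference coordinates are 1-based and inclusive
    rec['ref_i'], rec['ref_f'] = int(rec['ref_i']), int(rec['ref_f'])
    return rec

# First data row of the roundworm file shown earlier
sample = '508 0.0 0.0 0.0 chrI 1 432 (15071991) + (GCCTAA)n Simple_repeat 1 432 (0) 1'
rec = parse_line(sample)

# Because both ends are inclusive, the length is end - begin + 1
length = rec['ref_f'] - rec['ref_i'] + 1
```

Failing loudly on an unexpected field count is a design choice for this sketch; the tuple unpacking in the class above would also raise an error in that case, just with a less specific message.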
We can parse a file into a list of Repeat objects:
In [34]:
def parse_repeat_masker_db(fn):
    reps = []
    with gzip.open(fn) if fn.endswith('.gz') else open(fn) as fh:
        fh.readline()  # skip first header line
        fh.readline()  # skip second header line
        fh.readline()  # skip the blank line after the header
        for ln in fh:
            reps.append(Repeat(ln))
    return reps
In [35]:
reps = parse_repeat_masker_db('ce10.fa.out.gz')
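Once the rows are in a list, a natural first question is how the annotations break down by class. The sketch below tallies the number of annotations and the bases they cover per class/family. It uses a small namedtuple stand-in (`Rep`) and a few rows copied from the table above rather than the full `reps` list, and `summarize_by_class` is a hypothetical helper name; with the notebook's Repeat objects the same attribute names (`rep_cl`, `ref_i`, `ref_f`) should apply.

```python
from collections import Counter, namedtuple

# Stand-in for the notebook's Repeat objects; only the fields used here
Rep = namedtuple('Rep', ['refid', 'ref_i', 'ref_f', 'rep_cl'])

# A few rows copied from the table printed earlier
sample_reps = [
    Rep('chrI', 1, 432, 'Simple_repeat'),
    Rep('chrI', 566, 595, 'Simple_repeat'),
    Rep('chrI', 596, 676, 'Satellite'),
    Rep('chrI', 1622, 1744, 'DNA/MULE-MuDR'),
    Rep('chrI', 2052, 3026, 'DNA'),
]

def summarize_by_class(reps):
    """Return two Counters: annotations per class and bases per class."""
    counts, bases = Counter(), Counter()
    for r in reps:
        counts[r.rep_cl] += 1
        # 1-based inclusive coordinates, hence the + 1
        bases[r.rep_cl] += r.ref_f - r.ref_i + 1
    return counts, bases

counts, bases = summarize_by_class(sample_reps)
```

Note that this simply sums interval lengths, so any overlapping annotations would be counted more than once.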
Now let's obtain the genome for the roundworm in FASTA format. For more information on FASTA, see the FASTA notebook. As seen above, the name of the genome assembly used by RepeatMasker is ce10. We can get it from the UCSC server. It's around 30 MB.
In [9]:
ucsc_site = 'http://hgdownload.cse.ucsc.edu/goldenPath'
fn = 'chromFa.tar.gz'
urllib.urlretrieve("%s/ce10/bigZips/%s" % (ucsc_site, fn), fn)
Out[9]:
('chromFa.tar.gz', <httplib.HTTPMessage instance at 0x1037cd488>)
In [10]:
!tar zxvf chromFa.tar.gz
x chrI.fa
x chrII.fa
x chrIII.fa
x chrIV.fa
x chrM.fa
x chrV.fa
x chrX.fa
Let's load the chromosomes into memory, keyed by name, so that we can see the sequences of the repeats.
In [36]:
from collections import defaultdict
def parse_fasta(fns):
    ret = defaultdict(bytearray)
    for fn in fns:
        with open(fn) as fh:
            for ln in fh:
                if ln[0] == '>':
                    name = ln[1:].rstrip()
                else:
                    ret[name].extend(ln.rstrip())
    return ret
In [37]:
genome = parse_fasta(['chrI.fa', 'chrII.fa', 'chrIII.fa', 'chrIV.fa', 'chrM.fa', 'chrV.fa', 'chrX.fa'])
In [38]:
genome['chrI'][:1000] # printing just the first 1K nucleotides
Out[38]:
bytearray(b'gcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaaAAAATTGAGATAAGAAAACATTTTACTTTTTCAAAATTGTTTTCATGCTAAATTCAAAACGTTTTTTTTTTAGTGAAGCTTCTAGATATTTGGCGGGTACCTCTAATTTTGCCTGCCTGCCAACCTATATGCTCCTGTGTTtaggcctaatactaagcctaagcctaagcctaatactaagcctaagcctaagactaagcctaatactaagcctaagcctaagactaagcctaagactaagcctaagactaagcctaatactaagcctaagcctaagactaagcctaagcctaatactaagcctaagcctaagactaagcctaatactaagcctaagcctaagactaagcctaagactaagcctaagactaagcctaatactaagcctaagcctaagactaagcctaagcctaaAAGAATATGGTAGCTACAGAAACGGTAGTACACTCTTCTGAAAATACAAAAAATTTGCAATTTTTATAGCTAGGGCACTTTTTGTCTGCCCAAATATAGGCAACCAAAAATAATTGCCAAGTTTTTAATGATTTGTTGCATATTGAAAAAAACA')
Note the mix of lowercase and uppercase, which is directly relevant here: the lowercase stretches are repeats! The UCSC genome sequences use lowercase to mark where the repeats are, and they know this because they ran RepeatMasker on the genome beforehand. In this case, the two repeats you can see are both simple hexamer repeats. Also, note that their positions in the genome correspond, at least approximately, to the first few rows of the RepeatMasker database that we printed above.
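The lowercase convention can also be read back out of the sequence itself. The following sketch finds every lowercase run in a string and reports it as a 1-based, inclusive interval, the same convention as the begin/end columns in the RepeatMasker table. The function name `lowercase_runs` and the toy sequence are made up for illustration; the notebook stores chromosomes as bytearrays, so this assumes the sequence is first converted to a plain string (for example with `str(...)` under Python 2).

```python
import re

def lowercase_runs(seq):
    """Return 1-based, inclusive (begin, end) pairs for each lowercase run."""
    # m.start() is 0-based, so add 1; m.end() is exclusive, which already
    # equals the 1-based inclusive end.
    return [(m.start() + 1, m.end()) for m in re.finditer('[a-z]+', seq)]

# Toy sequence: two soft-masked stretches separated by uppercase bases
toy = 'gcctaagcctaaAAAATTGAGtaggcctaaTACG'
runs = lowercase_runs(toy)
```

Comparing intervals recovered this way with the rows of the .out file is one way to check how closely the soft-masking and the RepeatMasker table agree.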
Next we write a function that, given a Repeat and a dictionary containing the sequences of all the chromosomes in the genome, returns the sequence of that repeat.
In [64]:
def extract_repeat(rep, genome):
assert rep.refid in genome
return genome[rep.refid][rep.ref_i-1:rep.ref_f+20]
print reps[1].rep_cl, reps[1].rep_nm
Simple_repeat (GCCTAA)n
In [40]:
extract_repeat(reps[0], genome)
Out[40]:
bytearray(b'gcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaagcctaa')
In [65]:
print extract_repeat(reps[1], genome)
print reps[1].rep_cl, reps[1].rep_nm
CCTGTGTTtaggcctaatactaagcctaagcctaagcctaatactaagcc
Simple_repeat (GCCTAA)n
In [59]:
extract_repeat(reps[2], genome)
Out[59]:
bytearray(b'cctaagcctaatactaagcctaagcctaagactaagcctaatactaagcctaagcctaagactaagcctaagactaagcct')
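In the table printed earlier, the third row (RCS5, Satellite) has `C` rather than `+` in the orientation column, which RepeatMasker uses for matches on the complement strand. The extracted slice is always the forward-strand genome sequence, so to compare such a copy against the repeat's reference sequence one would reverse-complement it first. A minimal sketch, assuming plain strings containing only a/c/g/t/n in either case; `reverse_complement` and `oriented` are hypothetical helper names:

```python
# Complement table that preserves case, so soft-masking survives
COMPLEMENT = {'a': 't', 'c': 'g', 'g': 'c', 't': 'a', 'n': 'n',
              'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}

def reverse_complement(seq):
    """Reverse the sequence and complement each base."""
    return ''.join(COMPLEMENT[c] for c in reversed(seq))

def oriented(seq, orient):
    """Return seq as-is for '+' rows and reverse-complemented for 'C' rows."""
    return reverse_complement(seq) if orient == 'C' else seq

forward = oriented('gcctaa', '+')
flipped = oriented('gcctaa', 'C')
```

With the notebook's objects this might be called as `oriented(str(extract_repeat(rep, genome)), rep.orient)` under Python 2, though that combination is untested here.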
Let's specifically try to extract a repeat from the DNA/CMC-Chapaev family.
In [68]:
chapaevs = filter(lambda x: 'DNA/CMC-Chapaev' == x.rep_cl, reps)
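Before printing every copy in a family, it can help to look at how many there are and how long they are. The sketch below shows a list-comprehension equivalent of the `filter` call (which returns a list under both Python 2 and Python 3, whereas `filter` returns an iterator in Python 3) and picks out the longest copy. It runs on a namedtuple stand-in with rows copied from the earlier table, using the `DNA` class instead of DNA/CMC-Chapaev because no Chapaev rows appear in that table; `Rep`, `select_family` and `span` are hypothetical names.

```python
from collections import namedtuple

# Stand-in for the notebook's Repeat objects; only the fields used here
Rep = namedtuple('Rep', ['refid', 'ref_i', 'ref_f', 'rep_nm', 'rep_cl'])

def select_family(reps, rep_cl):
    """Keep only the annotations whose class/family matches exactly."""
    return [r for r in reps if r.rep_cl == rep_cl]

def span(r):
    """Length on the reference, using 1-based inclusive coordinates."""
    return r.ref_f - r.ref_i + 1

# Rows copied from the table printed earlier
rows = [
    Rep('chrI', 1622, 1744, 'LONGPAL1', 'DNA/MULE-MuDR'),
    Rep('chrI', 2052, 3026, 'PALTTTAAA3', 'DNA'),
    Rep('chrI', 3124, 3652, 'PALTTTAAA3', 'DNA'),
]

dna = select_family(rows, 'DNA')
longest = max(dna, key=span)
lengths = sorted(span(r) for r in dna)
```

The match is exact, so `'DNA'` does not also select `'DNA/MULE-MuDR'`; a prefix test would be needed for that.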
In [69]:
[extract_repeat(chapaev, genome) for chapaev in chapaevs]
Out[69]:
[bytearray(b'cacggcccggcggggggtacatggatgagaattctctaccgtattccaatttggctgactgcgtgctcaacgttgaatactcagtgtaaactttcgtacaccgttgcgtactgcacagcgcgcattttaattgacgacatttagcaaaaattgaacataagatttttcggaattatgaagctcaattttcacaaaaataatgagttttttgtagaatttatgaaaaaacgtgaatatatagattttttgttcatgatattcaagaaaaagcgatttttagttcttcacagaggaatcctctcgcatttcacttgctcatgatgttttttgctccactttaggacgataaaaatgcgaattgttgataaaatgaatgaataatataaaaa'),
bytearray(b'ggggctgctgaaaccaatgtcggcatgatgagagttccggtcttctgaatccatttcctgcgtgggctgtggcgacgagctgcacgtctgaaaatcaagtttttgtaatt'),
bytearray(b'tttgggcgcatgatatggagctgaatcattcgattttagaatcagcatgcttttattcatattttaggatctttttaaaaaatctggaccaacagttttcgaaaaaatttaatttttgttcagaaatgtgaatattcactaaatcgaaaaaaataattgcaaaatccgtcagctgaacattcaaaacttatcaatttgaaatcagcatatttcagtgtataattaaaaaagtttcaaaaattctgagaccaatttttattgagaaaaataatttttcgctcgaattattgaattttcactaaatgcaaaaaacagtaaacttgggcccatgctacaagcctgaatctttcaaattaagaaccagcatgattttttcaatattctaggacgtttaaaaaaaatctggaccaacagtttttgaggaacgtaattttttatacaaaaatgttctgatttttcactaaactcaaaaaaatagtcaagttgggcccatgctgtacacctaaatcattaaaattcagaaccgccatgtattttttcttaccaaaggctctttaaaaaaaatctggaccaacagtttttgagatatttagaaaaacaactcacttttcgacgtttttcgccttttcgtggctcacccggttgatttttgcggcgatttgtggtctttcgctgaaaatattatttttatttcaattattaacgaagaaaacaagaaaaaacgacgagaaaacatcaaaaaaacgcgaaaaaacatcgaaaaaccaccgcaacctcatgaacaaaaaaaaagcattgcagccgcgggactagttttcgcaactttctaggccatgtcccgttcgccgtgccgtg'),
bytearray(b'cacggcccggcggggggtacatggacgagaattctctaccgtattccaatttggctgactgcgtgctcaacgttgaatactcagtttaaagtttcgtacaccgttgcgtactgcacagcgcgcattttaattgacgaaatttcgcgaaaattaacagaagatttttttcggaattatagagctgaaattgaaaaaaaaactatcaaattttcatcgaatttgtgaaaaatcgtaagtatgaagatcttttcttcactatattcaaggaaaatcgatatttcgcttttcacagacgaatgatgtctcattttactcgatgaaagtttctgatgagctgtttttatcgatttttgagcgataaaaatgcgatttgttgataaaatggatcaattatataaagaaacaacatatattgctctgagattactttttgagaatcaattctttatttttcggtcattttaaattaagcattaaaataaaaatattagaaatcataataaaaaaaacagaaaatcgatatattactttttcttcggaatttcacgacttttttggacgaattttattctgtaaactttcttcttcgaatttgtgtccacgtggctttcagtcgaagaagattctgcagcactccttcttgcttgcccacaacttactcgaattttctaaaatttttaacttattgaaattgtcatttcacctttacactcacttcagctaaactattactgcatttcggaagttgataggatactggtggagcaacaagtggatggcttctagtgattggctggcttgtcgagcaagtttgtgtgattgcctgaaataatttttgatttcaattttgagttgatttaaagcagtgaacctaccaccgggttcggacgagaaagagcattactcggtagaccacggaatccaattttcgttgaattgcctccaaatgcaatagaagtttgtacgttttgtgagaagtcgggctgaaaattttcaaaatttgaaacttttcgagaaaaataaaaatctcaccacagcatttcgagattttgtcgattgtggaagccttttcctggagcgaaaattgattttttttttcgctaaattttttcttttttgggcagccgtgacgtcccgaataactgcttttgggtcccgaagatcattttgcgaagaaattggcagaactgttgcatcttttggtacgatggaaagaccgggaatggacgtgttctgaaatagttgtgtttttaagaatgcagaaatgtttttctgtaccaaaattaccatagtcatgtcattcatgatgttacgacacatgagctctctcagaacatggatgtaacgccttttcttgtcccggtaattgcaaaatctcctctcaagtgcattgaaaatcgcgtggacagattcaactccttgttctgtgatccttccaatgtttctcacatcttttgccatttgtggtgcatggtagaccaacaagtgcagctttaaaataattgtttcttcgggaaccgctactttcaaatcctccacaaatccgcgaatcgaattttgaagtattaagacgtcggaatcatttaaaaacttgtttcccgaaagtgacataatagttgaaagctttcccattgctgatttcaatccgagcaacattgggcataaatttgggccaaaaatgttgaaagtctcctctacaacagccggcgttagcagcaatttcaaatggtttccgcaaaatgattggaaccaagcctgcttgtccgctccaaacttagcccaacactgtcccattttttcaagtgttccttcgggagtaccattcacaattgtatcgagcaacaatttttccgattgaagtgctttcagttcagcatgcgactccaatttcatctttccggtggctccttgatacttttcttccgcacttttaattaggttaacagcgttttttagagttgcttttcgtgttttcaggataggaaaag
aagtagtgttatccaaagtatcagaatatttccagaggggattgaagatatatttgtcaaaaatacccatgataatgtgcagaagaggaatcaaatagaacatgatcgcaacgtgtggcagaagtggagtacatcctttgcgaacacccaagtcgccattttcacaacaagctttgtaaagatcgattgttcgtgggtggaatgtttcatcaacattcatatccttgattttcatcctctcttcagctccccgtggattctgtgcaaaacatttgaagcagaaattgtgggatgaatgtccttggtgtccaagaatatcagattgaaacttgcaatctccagttgcaatttgcacaatttttgcggttttttgaactcctttgtccaaatatcaaattttcgttagcttgccaagctgctcaagaacgtccggaatgaattttttcagagacgaataattgtcggatccgtcatatactgcaattaccataacgtgtctcgaagaattcggtcgagatacgtttccgattaccaatgccaactttgtgcttccacctccagcgtcaccaacgactccaatcttgattactcctttcgtgtatccgtcgtccacaaattgatttgaattgcatagaagctctattcgataggctaaaacttctgcaattttcatgcactgcacaatggtaatcacttttcctttattgtcgaacgaagtggaaactttgaaactggagatcattgataactggattgacaaatctcttgtgttctttaccgatggaagcaaatcatagccaatggcattagtcaaatagtttttgattttttccatctgacttagagataatccgcattttgataaaaagtcaacggcctcaaagtttgaaagcttgtttttgtagctttgattctcttctgaattcaggaattttgtgaattttcgaataaattgtccgacgtcatcctcgaggcagatttcgtgttgaagcaagtgaagagctttgcgaaatcgatttttgatacaacttttgcttcttagattcgaaatattaactttaaaagctgattttttaaggttttcaacttcttcggcgtgtctttgtagactcagaaccatagctttgccacttttcttcacatctgcacagcttctcaccaatcgaccttctataccactgacgatcgttcgtatattgcatacttccatttgcagcgaagaattagatgctcttatagtgatattttcatggcggactatttgcatttcttccgaaaacaccgcaaactcatcaatccgcttttgtatttcttctgatatttcatttttttcatttttcagtcgttcgatcgttagtcggagcattttgatctgcggaatttgctcaacattggagattattcgaaccctcggtgtactgaacgagtttcgtaaaggtgtcggtggaaatacgggattggagaatctcagcaaaatcatataatattagttttgaaatattgaaaaaaattacattgtgagaaaaagtcggaatttcgtcactaaaatccatttccacgtctctcgtcagaattccttcatccatattgaaacaatttgacgacctgcatgtagttgcggagctactggaagcaatgtcgggatggtgggagtttcgatcttctgaactgatttcctgattagcctgtggcgacgagctgcacgtctgaaaatcacgtttttgaagttagaacaaactactccaacttaattaaagttgacaaaattgagctgaacgaacctccactttcgaattgttcagttcttcctcttcagtttgatcttttgaaactccattagcactgttccttgctctctgggcatttgctaaaagaaggcctgcacaagatttttcttttcttttttgtttgaagtatacttttgtcatctggaaatattgcatgaatattataagggaaacaatttttaaatatcgattttcacgaaatttgaaaaaatc
aataatttgggcgcatgatattgagctgaatgtttcgaatttagaatcagcatgcttttattcatattttaggatctttttaaaaaatctggaccaacagtttttgaaaaaaaaatacttttcgttcagaaatgtactgattttccactgattttcacgaaatttgaaaaaatcaataatttaggcgcatgatattgagctgaatgttttgaatttagaatcagcatgcttttattcatattttaggatctttttaaaaaatctggaccaacagttttcgaaaaaattcaatttttgttcagaaatgtgaatattcactaaatcgaaaaaaataattgcaaaatccgtcggctgaacattcaaaacttatcaatttgaaatcagcatatttcagtgtataattaaaaaaggtttcaaaaattctgagaccaatttttgttgagaaaaataatttttcgttcgaattatcgatttttcacgaaatgccaaaaacagtaaacttgggcccatgctaaaagcctgaatctttcaaattaaaaaccagcatgattttttctatattctaagacgtttaaaaaaaatctggaccaacagttcttgaggaaagtaattttttatacaaaaatgtgctgatttttcactaaattcaaaaaaataatcaagttgggcccatgctatacacctaaatcattaaaattcagaaccgccatgtatgtattttttcataccataggctctttaaaaaaaatctggaccaacagtttttgagatatgtcaaaaaaaacaactcactttttgacgtttttcgccttttcgcggatgatgcggtcgatttttgcggcgatttgtggtctttcgctgaaaatattatttttatttcaatttttaacgaagaaaacaagaaaaaacgacgagaaaacatcaaaaaacacgaaaaaaacgtcgaaaaactcccgcaacctcatgaaaaaaaataaagcactgcagccgcgggactagttttcgcaactttctaggccatgtcccgttcgccgtgccgtg'),
bytearray(b'acgtggctgaagaaatttctacagtagtcccatttggctgactgaatattcaacgcgaataagttttgtacactattgcgtactttgcgtacgcgcattttatttgacgacaattcgtcaatatcagc'),
bytearray(b'aattcctaaattttttattaaaatcgaaaaaaaaaaatgaaatacgtgagattgagtttcgagacttttttattcagaatcagcatatatttctccatatttgagtaggttttcagaaatattgtaccataatttttggaaaaatgtaatttttaattcgaaattgcactgaatttctcgaatttttcactaaaatcgagaaaataaatatgaaatacgcgagattgaggttcaagactttttaattcggaatcagcatatatttttccatatttgagtagattttcagaaatattgtaccataatttttcgagatattttgaataataacttacttttcgacgttttttgcctttgtccggtttaatccatcgaatttcgaagcggtttgcgtagattagctgaaaacattatgcttattccacgtagtaacaagaaaaaacaagaaaaaataagaaaaaacgaagaaaa'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcagttaata'),
bytearray(b'cacggtatcacaaaaactagatctctcgtaaaatttgagaaagatctcgcaggtacgcagtgaaatggtccgcaatgtgtcatcgcggtgtttgcgtacttgcgtaccgtagtccgcaaaacattgcagcggcaaatagatttttgaagcaaattttagcagaaaaaaggcagaattaatagtttcaaggtgaaaaaaaaaataaaaaatagattttattattcaatttagtagctaacaattgagaattgaattattgaaacacagaaaaattaattttgaacagtaaaacaaagaaataatcgaaacggaagattgaaaattgaaaaatacatacaaatcgattcaaaaaacgaatttaatatgtggaatcagcctccttcttcgctttacggatgcaagttgaaatttttcttggtaaaactttcgaaaataaggaaacccggtgaaatttcttaattgcctcgactcctggtgaactttttgagataaatccatcgaaattttgctgagtagctcattgacaaggtttcaatctgaaatttcataaaatcaattattttcgaataattttaatcccaacaaaccgaaggaaatcctgaattttagcttttcgatagaatcctaaggtgtcacatcgacaattttccaagttgaaagaaaccatgtggagcatctccgcttataatatcctcgacaaaatattgatgagaaatccatcaaatttgcaccgagtatctgcttgtcaatgtttttacctgaaatttaatgaaattaataattttttaataattttaagcacaactaaccatagaaaatcccgaattttatgatattgaggaaatcccgaaattttaattgcgaaaaggttcagaattgtgtgagggagagcgctgggtggtttattgggaagacgaggcgtcctgcaaagaaaatgttacgatttgtttttttgaaaggttccacttacttcctccatatcagaaattagattgctacatgacagctccttagcaatgtacaggtatcttttccctgtgtttctaaccgagtggaagcgtctttcgaggcgattgaaaacggcatgaagcgattcgattccttgctccgaaagccttcccaaattcctttccgtttttgcgacttctactacgtgagcaaccaggacatgcaatttttgagtgattgattcttctgggtgggcagcttgtaggaattcaacgaattccttcatcgactcatccaattcgctgatgtcatcgtcagagagcaatcgattggccgacaatgacatgattttggacagcctgctcatcgcgttcttgacattcagaagcatcggagtcatgtggtttttcagatttttgaatgcagctgtgacacctttctcagacaaaatcagcttcgtgtgatttccagtgtacatttgaaaccacgctcggcgtgttgcaccaacttcttccagatcattctcaaattgtttaagatatccaccagcaagtccatccagcgtctcgtccaataacactttttcttctttcagagcagtaaactgtgctttcatctctctttttctcttaagtggtgaagcttcgaattttttgttcgcgtcttgaatcttcaaatcggcaactctttttgtctcttctttatttcttcgaatttcaaatgatgttttattgtctaaactcacaacggccatccaaatcggctcaaagatgtatttcgtgaacagtccgacaatcaagtgtagcattgcaggcaagtagtgttccaacttgacattttgaagaattggtccacttccgcatctgacgccaaagctaccgtatttagaattgagcttgtaagaattcattgttttcaagaaatatatcttttgcaagttcagatctttaagcttttgcatcagtcctcctcttggattggtttcgaaacaaaacgggcaaaaataggtggcacattgttttttatgtgacagtaaatca
catgtaaacttaaaatcaccgactactttttggacaacgccacgtgtcaccacccttccatcttccatataggtgatgctggtgaagttgttgatcttcacgataagatcagacaggtaggccatgataagttcccgcgaatcagagtcatcaaaaacagcgagaaggacgattcggtgcggcgagttcgcatgatcacaatttccgatcagaagacaaagtttcgtcgttcctcctcccgaatctccaccaattccgataacaattttgccatcggtataggaatcatggcgtaattgttttgatacagacaaacgctccagcttcggaatgacacttttctcgacatcgataattttgacaacggtcaccttttctccttttgatatatatgttgtggccttataattgtcgatagttgacattctcgttttccgttgcatcgtcaaattaatagttggcatgatttcaagatcggtgaacatcttatagttttgcttcacccgccgcaattgattgttagagaatctacatttttcttgaaaaataatcgtttgccaatgcgtcaattggacttggaaggactgcgaatcattttgttggagatatttctggaaatcaatcatgaaattacgaatatcctcttctcgactgattcgttgaagcaaatccagtgccaaatctatcctagattttccagagaacgattccttcttcgcaactctgatatccatattattctgtactttttcgcccttgacgtaatttgagcgatttgttttcttaatatcagattttcgttggttttccgttttgatgtttttcttattgaaattatctctttcctttgcatgaaaaattcatcagaaaaaacggcaaactcgtcgtctctctccatcaaccgatttagaagtgtctcttcgtcaatgtctgttcttggattgagctggaccgaatggtttgagctggacaggatggcggttctgaaatggaaaattgaataaacaatcaacaacaaagaaataaatctacctctgatatgctatccaaaatggaatcaacttctagaaatctcattcgttctctactcaagtcttctctaattcccttcacatatggtgtctcattcgaaaacacgccgacgctgtcactcgatgaatagagtggacttgacgatctttcaaaagagtgtggtgttcttggaggtgatgaagaagattccgcaatagaagaagttgatggttgtgtgtcgaacggctggaaaaattatttttaaaatattataatcgtcttaggatccgagttgtagcatggattaattcacttacaatatcagtttcaggtgttattgtcagattcaaatcgatattgcgtttcgactccacatccagattatcatcaaaatccacaaatggagacactcgtcttctctcaggagtgatctgcgcatctccgtttctgtcaagagcttctgtcaattctacattcgacaggtttaggtccagggaagcatttctttcagtatttgaattttcgtcttcttcatcttttttccatcgtcgtttccgagcatttagaagatttgaagaccatgctgattgatttgatggaggaggcatctgaaaaaaaaattttttactcaattttcagtgaaaaaaatttacttttggaaaattttaagtgaaaaatgtacgatttgtgtatatgtttgcctttatcttgtagagaatttttagctgattctaattcggcaataaaaataaaacattttgtgatttttttcgaaaaaaatcatttttcctccattttcagtgcaaaattttagtttcgaaaaaattttcttgaagaatgtacgatttctaaaaaattctctctgtatcttgcttaaaattaacagttctttctctttgaagcaaaaaactaacacattttgtgatttttttggcaaaaaaaattattttataaattcttattacaaaaaaaaattttttcgaaaaaat
tttaatgaaaaattaagatttctctatgaattgagttccatcttatacacaattaaattttgatcataaaacaactataaatcgtaaagagtttttgattttcttgaaaaagggaacgtttttaaaaactattttcagtgaaaaaaatttacttttggaaaattttaagtgaaaaatgtacgattcgtgtatatgtttgcctttatcttgtagagattttttagctgattctaattcggcaataaaaatataacattttgtgatttttttcgaaaaaaatcatttttcctccattttcagtgcaaaattttagtttcgaaaaaaaaattaatgaagaaggtacgatttctaaaaaattcggcccgtatcttgttcagaatttttagctctctctttttcaatcaaaaaaataaaatattttgtgattttttaagagaactcgtcaaaaaactcacttaattcgttttcctttctgcccacgccgacgctcctttgttttcctcgaattttttcgcttactacctgaaacaagacacactttttcgattttacaaaaaaaaataacaaaaaagatacggaaaaacggttaaaatcggcaaaaaacggagagaatcgatgccgagtgaaaggcttgaaatttaaacaattgttcgcaatagagcgtgtttgcctccatctagagattgaaccaccgtg'),
bytearray(b'tgctgaaaattgctgaaaatcgaaatttcgtcagctgatgtcgattattctgcgcgggggtacggtacgcaagtccgcaaacactgtcacgccaaattgcgga'),
bytearray(b'cacggtagcacagaaactagatctctcgtaaaatttgagaaagatctcgcaggtacgcagcgaaatggtccgcaatgtgtctcgcggtgtttgcgtacttgcgtaccgtagtccgcaaaacattgcagcggcaaatagatttttgaagcaaattttagcagaaaaaaggcagaattactagtttaaagtgaataaaattaaaaaaaaaagattttattattaaatttatcaactaacaattcaaaattaatgtattaaaacacagaaaagttgattttgaacagaaaaacggagtaatcatttaaaaagacaatattaaagtgaaaaaacacgcaaatcgattgaaaaaacgaatttaatatgtggaatcggccttctttttcgcttttcggctgcaagttggaatttttcttgggaaaaactttcgaaaatgaggaaatcagtgggaacttcttaatttcctcgactcctggtgaactttttgtgttaaatccatcgaaattgtgcggaatagctcattgacaaggtttcaatctgcaatttgatgaaattctatatttttaaattattttaatcacaacaaacctgaggaaatcccgaattcgatcctttcgataaaatccagagatgtcactttgccacttttccaaaatgaaggaaaccatgtggagcttctcagctttttgagttcctggccgaaatcatgatgaaaaactatcgaaattgacttgagcagcttcttggcaaggtttttgtctgaaattttaagattttaatgatttttgaacgtttttaacacaacgaaccaaaggaaatcctgaatttcacatttctgactgtttcctgggatgttacatcggcagttttccaaaatgaaggacatcatgtagagcatctccacttattgaaattctggtgaagttcttgccgacaaatccatcgacattacgttgaacgtcttctaggcaaggtgtttatctgaaaattcatga'),
bytearray(b'taagcagtttttgaaaagttttcgaaaaaaaAAAGAATTTCCGTTTTTTGAGATttaattttcagtgaaaaaaatttacttttggaaaatttcaagtgaaaaatgtacgattcgtgtatatgtttgcctttatcttgtagagaatttttagctgattctaattcggcaataaaaatataacattttgtgatcgttttcgaaaaaaaaatctttttctttatttttagtgcaaaattttagtttcgataatttttctatgaagaatgtacgatttctagaaaattctgcctgtatcttgctcaaaattaacagttctttctttttaaagcaaaaaattaacacattttgtgattttttggcaaaaaaaattattttataatttcttatttcaaaaaattttttttcgaaaaaatcttaatgaaaaattaagatttctctatgaatttagttccatcttatacaaaatttaatgctgatcataaaacaactataaaatgtgaagactttttgattttcttgaaaaatggaacgtttttaaaaactgttttcagtgaaaaaaatttacttttgaaaaattgtgaaaattgaaaaattgtgaataagtgaaatatgtacgatctctcaataattttgtcttcatcttgtagagaattgttagctgtttctgattcggcaagaaaaatacaacattttgtgatcgttttcgaaaaaaaaaatttttttcttaatttttagtgcaaaattttagtttcgaaaaaaaatttatgaagaaggtacgatttctagaaaattctgctcgtatcttgttcagaatttttagctctttctttttcataccaaaaaataaaatattttgtgattttttaagagaactcgtcaaaaaactcacttaattcgttttccttgccgcccacttcgacgttcctttgtttttctcgaattttttcgcttactacttgaaacaagacacactcttttgattttacaaaaaaaaattacaaaaaagatacggaaaaacggttaaaatcggcaaaaaacggagagaatcgatgccgagcgaaaggcttgaaatttaaacgattgttcgcaatagagcgtgtttaccgccatctagcgactgaaccaccgtg'),
bytearray(b'cacggtggttcagttgctagatgggtgcaaacgcgctccaccgaacaa'),
bytearray(b'cacggcccggcgaaagagacgtggccgcgagagctgcgccggctaggccaccgcctcctatggttaagatttttgaacgaataaacatttttaatttggctgctaagctcatttatctttgttttttctcgttttttctcatttttatcgataaaaatatattttttgttgcagaaaatcacaaaaccgcggcaaaacagcactcaaccgccaactgggaggaggaaaatccgaaaaaagagtttttt'),
bytearray(b'tgcgaaaaactgtttaaagtatcgattttcttcaatatcagcaacatacaatcctttaaaatgattattttttgtaaattcgataaaaattcatttatttttcacaacttctgcccgaaaattaccgaaataaccagcgtttctataactaagaaagtgtcgtcaattaaaatgccgcgtccgcaaaatgtcgtacgaaacttttcgctgagtatcaaacgttgaatattcagtcagccaaattttactacggtagagattttacagccacgtacggttcgccgggccgtg'),
bytearray(b'ttgttcggtggagtgcgtttgcacccatctagcaactgaaccaccgtg'),
bytearray(b'tttttcgtgtttttttatgtttttttatgttttttcgtcgttttttcttgttttcttcgttaataattgaatttaaaataatattttcagtaaaaggacttaaatcgccgcaaaaatcgaccgcgtgagccgcgaaacggcggaaaacgtctaaaagtgagttgtttt'),
bytearray(b'aaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaactcaacacattttcgatcatttttgaacaaaaaaattgttttctgaaaaatttgacgcttaatttttttgaaactgcactcaaatatgaagaaaagcatgattaaccagaaaaagtaacatttgagacccaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacactttttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttattttttttgaaactgcactcaaatataaagaaaaacatgcttgttctgaattagtaagaattgcgactcaagctcgcgtatctcatatttatttgttcgatttcagtggaaaatcgacacatttttgggaaaatttttttttttcgaaatattcgctcctaatttttaatgatttccagatgaca'),
bytearray(b'tctctatattattcattcattttatcaacaaacttctatcgccctaacgtcgatcaaaaaagctcatcagcaactgccgtcgagt'),
bytearray(b'tctcgtcgagtgaaatgcgatagaatttgtctgtgaaaaaccaaatatcgattttccttgaatatcgtgaagaacaaatcttcatacttacgattcttcacaaattcgatgaaaatctgatagttttttcaattttagctctataattccgaaaaaaatcttctgttaattttcgcgaaatgtcgtcaattaaaatgcgcgctgtgcagtacgcaacggtgtacgaaagtttacactgagtattcaacgttgagcacgcagtcagccaaattggaatacggtagagaatcctcgtccatgtaccccccgccgggccgtg'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcaattaata'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcaattaata'),
bytearray(b'ttgttcggtggagcgcgtttgcacccatctagcaactgaaccaccgtg'),
bytearray(b'cacggtggttcagttgctagatgggtgcaaacgcgctccaccgaacaa'),
bytearray(b'gttttttcttattttttcttgtttttttttgttactacgtggaataagcataatgttttcagctaatctacgcaaaccgcttcgaaattcgatggattaaaccggacaaaggtaagaaacgtcgaaaagtaagttattattcaaaatatctcgaaaaattatggtacaatatttctgaaaacctactcaaatatggaaaaatatatgctgattccggattgaaaagtcttgaacctcaatctcgcgtatttcttattgagtttctcgattttagtgaaaaattcgagaaattcagtgcaatttcgaattaaaaattaaatttttccaagaattatggtacaatatttctgaaaacccactcaaatatggaaaaatatatgctgattctgaattaaaaagtcttgaacctcaatctcgcgtatttcatatttagtttctcaattttagtgaaaaattcgagaaattcagtgcaatttcgaattaaaaattaaatttttccaaaaattatggtacaatatttctgaaaacccactcaaatatggaaaaatatatgctgattctgaattaaaaagtcttgaacctcaatctcgcgtatttcatatttagtttctcaattttagtgaaaaattcgagaaattcagtgcaatttcgaattaaaaattaaatttttccaaaaattatggtacaatatttctgaaaacccactcaaatatagaaaaatatatgctgattctgaattagtaacatttgaaactcaatctcacgtatttcataatttttttttcgattttaatgaaaaatatattcagtgcaattttcgattttttttcaaaa'),
bytearray(b'gctgatattgacgaattgtcgtcaaataaaatacgcgtacgcaaagtacgcaatagtgtacaaaacttattcgcgttgaatattcagtcagccaaatgggactactgtagaaatttcttcagccacgt'),
bytearray(b'tgtgggctacggtagtcaagtacgcaaacaccacgagcattttcacaattgcgtacaaaatttttttcaagcttt'),
bytearray(b'tattaattgctaaaatttatgtggactacggtagtcaagtccgcaaacaccacg'),
bytearray(b'ttttgaaaaaaagttactttttgttcgaaaatgtattgattttcacttattttcactagattcaaaaaaa'),
bytearray(b'ttggacccatgctatactcaacatttttttggaattctgaatcagcattctcttcataaattagacaatttctaaaaaatctggaccaa'),
bytearray(b'ttttccggtgatttgctaaaactataatttctatttcaattattaaccgagaaaaccagaaaaaa'),
bytearray(b'ttgttcggtggagcgcgtttgcacccatctagcaactgaaccaccgtg'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcagttaata'),
bytearray(b'tattaattgctaaaatttatgtggactacgatagccaagtccgcaaacaccacg'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcagttaatacttcccggtttcgttttacatgataatatcgttgaatttaagcaaaaaatgcagcattagtttcatgaaaaaaaaaataagaga'),
bytearray(b'atttttcctcaaaaactagaatttttcattgaaaaatggcttaaaaatcgatttttgttcgaaaaaa'),
bytearray(b'aaggttttcatcgataaagtcacgaatttgtcgaaatgctttggtgagttttatctttttcagaaaaaaaattcgaaaattttcag'),
bytearray(b'cacggtggttcagttgctagatgggtgcaaacgcgct'),
bytearray(b'attgttcggtagagcgcgtttgcactcatctagctgctgtaccaccgtg'),
bytearray(b'agaaatttctacagtagtcccatttggctgactgaatattcaacgcgaataagttttgtacactattgcgtactttgcgtacgcgcattttatttgacgacaattcg'),
bytearray(b'ttttggaaaaatttaatttttaattcgaaattgcactgaatttctcgaatttttcactaaaatcgaaaaaaaatatatgaaatacgcgagattgaggttcaaggcttttcaattcggaatcagcctatatttttccatacttcaatgggttttcagaaatattgtatgtgaatttttggaaaaatgtgatttttaatttgaaattgcactgaatttcttgaatttttcaataaaatcgagaaaataaatatgaaatacgcgagattgaggttcaagactttttaattcggaatcagcatatatttttctatatttgagtgggttttcagaaatattgtatgtgaattttttgagaaaa'),
bytearray(b'tttttcgaaaagtgttggaccagatttttttgatcttagaatatgaataaaagcatgctg'),
bytearray(b'caaggcccggcaaaccgtacgtggctgcaaaatctctaccgtagtaaaatttggctgactgaatattcaacgtttgatactcagtgaaaagattcgtac'),
bytearray(b'cacggtggttcagtcgctagatggcggtaaacacgctctattgcgaacaatcgtttaaatttcaagcctttcgctcggcatcgattctctccgttttttgccgattttaaccgtttttccgtattttttttgttatttttttttgcaaaattgaaaaagtgtgtcttgtttcaggtagtaatcgaaaaatttcgaggaaaacgaaggaacgtcaccgtgagcagcaaggaaaacgaattcagtgagttttttgactatttctctctgaaaaaattaaaaaatgtattactttttgattgaaaaagaaagagcaaaaaattctgaagaagatgtaacttaatttattcagaacaaaagttttttttgtctcgaaattacaatttttcgaccaaaaatagaaaatgatcattttctcgaaaatgaatcccaaatttatgtcagttttttatgaaaaaataatcagcgtaaacttctgtactagaaatactaccttttttttggattgaaagtttgtatcgtgctcaaaattttaaaagtaaaattattttacgctgaaaaatgcaaaatcataactttttttgaggagaaacccccaaatgttatcgattttgttatcaaattgaactcagcttaaaattatctacaagatgaagctaaattcattgaga'),
bytearray(b'atattaattaagttttttgtatcaaattgtgtgttttctcaattttatattgcctttttatttcataactttccttttctgttcaaaatcaacttttttttgtgttttaacacttcaattatcaattgttagtttataaatttcataataaactctgattttttattttttttcatcttgaaactattaattctgctgtttttctgctaaaatttgcttcaaaaatctatttgccgcttcattgttttgcggactacggtacgcaagtacgcaaacaccgcaacgacacattgcggaccatttcgctgcgtacgctgcgagatctttctcaaattttacgagagatctagtttctgtgctactgtg'),
bytearray(b'cacggcccggcgaaccgtacgtggctgtaaaatctctaccgtagtaaaatttagctgactgaatattcaacgtttgatactcagcgaaaagtttcatacgacattttgcggacgcggcattttaattgacgacactttcttagttatagaaacgctggttatttcggcaattttcgggcagaaattgtgaaaaataaatgaatttttatcgaatttacaaaaaataataattttaaaggattgtatgttgctgatattgaagaaaatcgatactttaaacagtttttcgca'),
bytearray(b'aaaaaaaactcacttttttcgaattttcctccttccagttggcggttgagtgctgttttgccgcggttttgtgattttctgcaacaaaaaaaatatttttatcgataaaaatgagaaaaaacgagaaaaaacgaagataaatgagcttagcagtcaaattaaaaatgtttattcgttcaaaaatcttaaccataggaggcggtggcctagccggcgcagctctcgcggccacgtctctttcgccgggccgtg'),
bytearray(b'ttttcttcgttttttcttattttttcttgttttttcttgttactacgtggaataagcataatgttttcagctaaactacgcaaatcgcttcgaaattcgatggattaaaccggacaaaggcaaaaaacgtcgaaaagtaagttattattcaaaatatctcgaaaaattctggtacaatatttctgaaaacccactgaaatatggatatgctgatttcgaattaaaaagtcttgaacctcaatctcgcgtatttcatatttattttctcgattttattgaaaaatttgagaaattcagtacaatttcgaattaaaaattaaatttttccaaaaattctggtacaatatttctgaaaacccactgaaatatggaaaaatatgctgatttcgaattaaaaagtcttgaacctcaatctcgcgtatttcatatttattttctcgattttattgaaaaattcaagaaattcagtgcaatttcaaattaaaaatcacatttttccacaa'),
bytearray(b'gctgatattgacgaattgtcgtcaaataaaatgcgcgtacgcaatgtacacaatagtgtaaaaaactttatcgcgttgaatattcagtcagccaaatgggactactgtagaaatttcttcagccacgt'),
bytearray(b'acgtggctgaagaaatttctacagtagtcccatttggctgactgaatattcaacgcgaataagttttgtacactattgcgtactttgcgtacgcgcattttatttgacgacaattcgtcaatatcagc'),
bytearray(b'aaaatgttgaagagaaaccagagaaattgatcgagtagattcttggcaagttttgaaattatatggttttaataagttttgaacatttttaaatacaactaaccatgga'),
bytearray(b'ttgtttggtggagcgcgtttgcaccaatctagcaactgaaccaccgtg'),
bytearray(b'tttttttaaatttttttcttggctgctttactgatgtttttttctcaattttttcttgttttctttgttactaatttaaattaaaaaaactattttcagcttatcacagcaaatcggagcgaaactcgaccgcgataacaggaaaaagtcgaaaagtgagttttttgccaaaatatctcgaaaaactcatattttgttttgaaaacagatgcaaataaaaagaaatacat'),
bytearray(b'atgtattaattgctaaaatttatgtggactacggtagtcaagtccgcaaacaccacg'),
bytearray(b'tttcgtaatgtttttttcgagtttttgattgttttttctcatttttttttgtttttctttattattagttaaaatataaaaactattttaagctaatcaacgcaaatcgaggcgaaaaccgatcgcagaaagaggaaaagtcgaaaagtgagtttttttgcaaaaatatttca'),
bytearray(b'acgtggctgaagaaatttctacagtagtcccatttggctgactgaatattcaacgcgaataagttttgtacactattgcgtactctgcgtacgcgcattttatttgacgacaattcgttaatatcagc'),
bytearray(b'ttcattaaaatcgaaaaaaaaattatgaaatacgtgagattgagtttcaaatgttactaattcagaatcagcatatatttttctatatttgagtgggttttcagaaatattgtaccataatttttggaaaaattaaattttaattcgaaattgcactgaatttctcgaatttttcactaaaatcgagaaactaaatatgaaatacgcgaaattgaggttcaagacttttaattcagaatcagcatatatttttctatatttgagtgggttttcagaaatattgtaccataatttttggaaaaatttaatttttaattcgaaattgcactgaatttctcgaatttttcactaaaatcgagaaactaaatatgaaatacgcgagattgagattcaagacttttaaattcggaatcagcacatatttttccatatttgagtaggttttcagaaatattgtaccatattttttcgagatattttgaataataacttacttttcgacgttttttgcctttgtccggtttaatccatcgaatttcgaagcggtttgcgtagattagctgaaaacattatgcttattcca'),
bytearray(b'gcaaacatactcttttgcgaataagcgatttttttgttttttttttttggtgttttccgttttttgcttgttttcaccgtttcccctctttttttttgtttttttttgtcaaatcgagaaagagtgtgtttttttttcaggtgttaaacaagatttgcgagcaaaacgagggcacaccatcgtaagaagcgaagaaaacgagaaaagtgagttttttgaagattcctctttaaaaaatagggaaatgttttagttttgagccaaaaaagaaagagctgaatttttcaaacaagatacatgc'),
bytearray(b'cacggcccggcgaaccgtacgtggctgtaaaatctctaccgtagtgaaatttggctgactgaatattcaacgtttgatactcagcgaaaagtttcgtacgacattttgcggacgcggcattttaattgacgacactttcttagttatagaaacgctggttatttcggcaatttcgggcagaaattgtgaaaaataaatggatttttatcgaatttacagaaaataattatttgaaagtattgtatgttgctgatattgaagaaaatcgatactttaaacagtttttcgca'),
bytearray(b'aaaaaactcacttttttcggattttccgcctcccagttggcggttgagtgctgttttgtcgcggttttttaattttctgcaacaaaatgtatatttttatcgataaaaatgagaaaaaacgagaaaaaac'),
bytearray(b'ttgttcggtggagcgcgtttgcacccatctagcagctgaaccaccgtg'),
bytearray(b'cacggtggttcacttgctagatgggtgcaaacgcgctccactgaacaa'),
bytearray(b'tattaattgctaaaatttatgtggactacggtagtcaagtccgcaaacaccacg'),
bytearray(b'gcgaaattctgcattttgtcgtgagatccgcggtgtttgcgtacttctggggctaccgtaacccggaaaa'),
bytearray(b'tcgagttttacattgaaaaaaaatggccaaaaatcggagaaaaatgggcaaaaaacggagagaattgatgacaaatcaaag'),
bytearray(b'tcgagttttacattgaaaaaaaatggccaaaaatcggagaaaaatgggcaaaaaacggagagaattgatgacaaatcaaag'),
bytearray(b'ccttaaaaggaagaaatttggtggaaaaatacaattttcgctctaaaaaattccgtaaattcgagaatttatgaaaaatactttggttttttat'),
bytearray(b'gcaaaattctgcaatatgtcgtcaaattcggtgtttgcgtattttcgacgctaccgtaccccgcggaa'),
bytearray(b'ttttcttcgttttttcttattttttcttgttttttcttgttactacgtggaataagcataatgttttcagctaatctacgcaaaccgcttcgaaattcgatggattaaaccggacaaaggtaaaaaacgtcgaaaagtaagttattattcaaaatatctcgaaaaattatggtacaatatttctgaaaacctactcaaatatggaaaaatatatgctgattccggattgaaaagtcttgaacctcaatctcgcgtatttcttattgagtttctcgattttagtgaaaaattcgagaaattcagtgcaatttcgaattaaaaattaaatttttccaaaaattatggtacaatatttctgaaaacccactcaaatatggaaaaatatatgctgattctgaattaaaaagtcttgaacctcaatctcgcgtatttcatatttagtttctcaattttagtgaaaaattcgagaaattcagtgcaatttcgaattaaaaattaaatttttccaaaaattatggtacaatatttctgaaaacccactcaaatatagaaaaatatatgctgattctgaattagtaacatttgaaactcaatctcacgtatttcataatttttttttcgattttaatgaaaaatctaagaattcagtgcaattttcgattttttttcaaaa'),
bytearray(b'gctgatattgacgaattgtcgtcaaataaaatgcgcgtacgcaaagtacgcaatagtgtacaaaacttattcgcgttgaatattcagtcagccaaatgggactactgtagaaatttcttcagccacgt'),
bytearray(b'actacggtgcgcaagtactcaaacactgcgacgtcagagcgcagac'),
bytearray(b'tttagacgtatttttcttttctctgctcttatgatcgattttcgcagaggtttttgattatccggtaaatattactagttattctaatttttcattaaaaaattacatcgaaaataacgaaaaaacatcgaaaaacgcgaaagatcaacgaaaccaattcatgaattaattcgaatttataattcagtacaaaagcgattcggtcgcgggactagattttgcaacttcctaggccatttccaatttgcagtgc'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcaattaata'),
bytearray(b'ggtggttcagttgctagatgggtgcaaacgcgctccaccgaacaa'),
bytearray(b'ttgttcggtggagctcgtttgcacccatctagcaactgaaccaccgtg'),
bytearray(b'cacggtgcttcagttgctagatgggtgcgaacgcgctccaccgaacaa'),
bytearray(b'ttgttcggtggagcgcgtttgcacccatctagcaactgaaccaccgtg'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcaattaata'),
bytearray(b'gcatggggcgtggccgaaaattctctactaccgtttaccaatttggctaatttgccaatcaacgttgaaaagttttgtacatcg'),
bytearray(b'cacggcccggcggggggtacatggacgagaattctctaccgtattccaatttgactgactgcgtgctcaacgttgagtactcagtttaaagtttcgtacaccgttgcgtactacacagcgcgcattttaattgacgacatttcgcgaaaattaacagaagattttttcggaattatagagctgaaattgaaaaaaaactatcaaattttcatcgaatttgtgaaaaatcgtaagtatgaagatcttttcttcactatattcaaggaaaatcgatatttagcttttcacagacgaatgatgtctcattttact'),
bytearray(b'gcctcattttactcgatggaagtttctgatgagctgtttttatcgatttttgagcgataaaaatgcgatttgttgataaaatggataaattatataaagaaacaacatatattgctctgagattactttttgagaatcaattctttatttttcggtcattttaaattaagcattaaaataaaaatattagaaatcataataaaaaaaacagaaaatcgatatattactatttcttcggaatttcacgacttttttggacgaattttagtctgtaaactttcttcttcgaatttgtgtccacgtggctttcagtcgaagaagattctgcagcactccttcttgcttgcccacaacttgctcgaattttctaaaatttttaacttattgaaattgtcatttcacctttacactctcttcagctaaactattactgcatttcggaagttgataagatactggtggagcaacaagtggatggcttctagtgattggctggcttgtcgagcaagtttgtgtgattgcctgaaataatttttgatttcaattttgagttgatttaaag'),
bytearray(b'gatttaaagcagtgaacctaccatcgggttcggacgagaaagagcattgctcggtagaccacggaatccaattttcgttgaattgcctccaaatgcaatagaagtttgtacgttttgtgagaagtcgggctggaaattttcaaaatttgaaacttttcgtgaaaaataaaaatctcaccacagcatttcgagattttgtcgattgtggaagccttttcttggagctaaaattgattt'),
bytearray(b'tacgatggaaagaccgggaatggacgtgttctgaaatagttgtgtttttaagaatgcataaatttttttctgtaccaaaattaccatagtcatgtcattcatgatgttacgacacatgagctctctcagaacatggatgtaacgccttttcttgtcccggtaattgcaaaatctcctctcaagtgcattgaaaatcgcgtggacagattcaactccttgttctgtgatccttccaatgtttctcacatcttttgccatttgtggtgcatggtagaccaacaagtgcagctttaaaataattgtttcttcgggaaccgctactttcaaatcctccacaaatccgcgaatcgaattttgaagtattaagacgtcggaatcatttaaaaacttgtttcccgaaagtgacataatagttgaaagctttcccattgctgatttcaatccgagcaacattgggcataaatttgggccaaaaatgttgaaagtctcctctacaacagccggcgttagcagcaatttcaaatggtttccgcaaaatgattggaaccaagcctgcttgtccgctccaaacttagcccaacactgttccattttttcaagtgttcctccgggagtaccattcacaattgtgtcgagcaacaatttttccgattgaagtgctttcagttcagcatgcgactccaatttcatctttccggtggctgcttgatacttttcttccgcacttttgattaggttaacagcgttttttagagttgcttttcgtgttttcaggatagggaaacaagtagtgttatccaaagtgacagaatatttccagaggggattgaagatatatttgtcaaaaatacccatgataatgtgcagaagaggaatcaaatagaacatgatcgcaacgtgtggcagaagtggagtacatcctttgcgaacacccaagtcgccattttcacaacaagctttgtaaagatcgattgttcgtgggtggaatgtttcatcaacattcatatccttgattttcatcctctcttcagctccccgtggattctgtgcaaaacatttgaagcagaaattgtgggatgaatgtccttggtgtccaagaatatcagattgaaacttgcaatctccagttgcaatttgcacaatttttgcggttttttgaactcctttgtccaaatatcgaattttcgttagcttgccaagctgctcaagaacgtccggaatgaattttttcagagacgagtaattgtcggatccgtcatatactgcaattaccataacgtgtctcgaagaattcggtcgagatacgtttccgattaccaatgccaactttgtgcttccacctccagcgtcaccaacgactccaatgttgattactcctttcgtgtatccgtcgtccacaaattgatttgaattgcatagaagctctattcgataggctaaaacttctgcaattttcatgcactgcacaatggtaatcacttttcctttattgtcgaacgaagtggaaactttgaaactggagatcattgataactggattggcaaatctcttgcgttctttaccgatggaagcaaatcatagccaatggcattagtcaaatagtttttgattttttccatctgacttagagataatccgcattttgataaaaagtcaacggcctcaaagtttgaaagcttgtttttgtagctttgattctcttctgaattcaggaattttgtaaattttcgaataaattgtccgacgtcatcctcgaggcagatttcgtgttgaagcaagtgaagagctttgcgaaatcgatttttgatacaacttttgtttcttagattcgaaaatttaactttaaaagctgattttttaaggttttcaacttcttcggcgtgtctttgtagactcagaaccatagctttgccacttttcttcacatctgcacagcttctcaccaatcgaccttctataccactgacgatcgttcgtatattgcata
cttccatttgcagcgaagaattagatgctcttatagtgatattttcatggcggactatttgtatttcttccgaaaacaccgcaaacgcatcattctgcttttgtatttcttctgatatttcatttttttcatttttcagtcgttcgatcgttagtcggagcattttgatctgcggaatttgctcaacattggagattattcgaaccctcggtgtactgaacgagtttcgtgaaggtgtcggtggaaatacgggattggagaatctctgcgaaatcatataatataatattagttttgaaatattgaaaaaaattacattgtgagaaaaagtcggaatatcgtcactaaaatccatttccacgtctctcgtcagaattccttcatccatattgaaacaatttgacgacctgcatgtagttgcggagctactggaagcaatgtcgggatggtgggagtttcgatcttctgaactgatttcctgattagcctgtggcgacgaactgcacgtctgaaaatcacgtttttgaagttagaacaaactactccaacttaattaaagtagacaaaattgagctgaacgaacctccactttcgaattgttcagttcttcctcttcagtttgatcttttgaaactccattagcactgttccttgctctctgggcatttgctaaaagaaggcctgcacaagatttttcttttcttttttgtttgaagtatacttttgtcatctggaaatattgcatgaatattataagggaaacaatttttaaatatcgattttcacgaaatttgaaaaaatcaataatttgggcgcatgatattgagctgaatgtttcgaatttagaatcagcatgcttttattcatattttaggatctttttaaaaaatctggaccaacagtttttgaaaaaaaatacttttcgttcagaaatgtactgattttccactgattttcacgaaatttgaaaaaatcaataatttgggcgcatgatattgagctgaatgtttcgaatttagaatcagcatgcttttattcatattttaggatctttttaaaaaatctggaccaacagttttcgaaaaaattcaatttttgttcagaaatgtgaatattcactaaatcgaaaaaaataattgcaaaatccgtcggctgagcattcaaaacttatcaatttgaaatcagcatatttcagtgtataattaaaaaagatttcaaaaattctgagaccaatttttgttgagaaaaataatttttcgttcgaattatcgatttttcacgaaatgccaaaaacagtaaacttgggcccatgctaaaagcctgaatctttcaaattaaaaaccaacatgattttttctatattctaagacgtttaaaaaaaatctggaccaacagttcttgaggaaagtaattttttatacaaaaatgtgctgatttttcactaaattcaaaaaaatagtcaagttgggcccatgctatacacctaaatcattaaaattcagaaccgccatgtattttttcataccataggctctttaaaaaaaatctggaccaacagtttttgagagatgtcaaaaaaacaactcacttttcgacgtttttcgtgtttccccggatgatgcggtcgatttttgctgcgatttgtggtctttcgctgaaaatattatttttatttcaatttttaacgaagaaaacaagaaaaaacgacgagaaaacatcaaaaaacacgaaaaaaacgtcgaaaaactcccgcaacctcatgaaaaaaaataaagcactgcagccgcgggactagttttcgcaactttctaggccatgtcccgttcgccgtgtcgtg'),
bytearray(b'aaaaaaactcacttttcgactttttcctgtttctgcgatcgggttttgcgtcgatttgtggtaattagctgaaaatataaactatagtttttatattttaactattaataaagaaaacaagagaaaagtgagaaaaaacaatcaaaaactcgaaaaa'),
bytearray(b'tattaattgctaaaatttatgtggactacggtagtcaagtccgcaaacaccacg'),
bytearray(b'TGCGAAAAACTGTTTAaagtatcgattttcttcaatatcagcaacatacaattctttaaaatgattattttttgtaaattcgataaaaattaatttatttttcacaatttctgcccgaaaattgccgaaatgaccagcgtttctagaactaaaacaagtgtcgtcaattaaaatgccgcgtccgcaaaatgtcgtacgaaacttttcgctgagtatcaaacgttgaatattcagtcagccaaattttactacggtagagattttacagccacgtacggttcgccgggccgtg'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcaattaata'),
bytearray(b'cacggtggttcagttgctagatgggtgcaaacgcgctccaccgaacaa'),
bytearray(b'cacggtggttcaatcgctagatggaggcaaacacgctctattgcgaacaattgtttaaatttcaagcctttcactcggcatcgattctctccgttttttgccgattttaaccgtttttccgtatcttttttgttatttttttttgtaaaatcgaaaaagtgtgtcttgtttcaggtagtaagcgaaaaaattcgaggaaaacaaaggagcgtcggcgtgggcagcaaggaaaacgaattaagtgagttttttgacgagttctcttaaaaaatcacaaaatattttatttttttgattgaaaaagagagagctaaaaattctgaacaagatacgggccgaattttttagaaatcgtaccttcttcattaattttttttcgaaactaaaattttgcactgaaaatggaggaaaaatgatttttttcgaaaaaaatcacaaaatgttatatttttattgccgaattagaatcagctaaaaaatctctacaagataaaggcaaacatatacacgaatcgtacatttttcacttaaaattttccaaaagtaaatttttttcactgaaaatagtttttaaaaacgttccctttttcaagaaaatcaaaaactctttacgatttatagttgttttatgatcaaaatttaattgtgtataagatggaactcaattcatagagaaatcttaatttttcattaaaattttttcgaaaaaatttttttttgtaataagaatttataaaataattttttttgccaaaaaaatcacaaaatgtgttagttttttgcttcaaagagaaagaactgttaattttaagcaagatacagagagaattttttagaaatcgtacattcttcaagaaaattttttcgaaactaaaattttgcactgaaaatggaggaaaaatgatttttttcgaaaaaaatcacaaaatgttttatttttattgccgaattagaatcagctaaaaattctctacaagataaaggcaaacatatacataaatcgtacatttttcacttaaaattttccaaaagtaaatttttttcactgaaaattgaggaaaaaatttttttttcagatgcctcctccatcaaatcaatcagcatggtcttcaaatcttctaaatgctcggaaacgacgatggaaaaaagatgaagaagacgaaaattcaaatactgaaagaaatgcttccctggacctaaacctgtcgaatgtagaattgacagaagctctcgacagaaacggagatgcgcagatcactcctgagagaagacgagtgtctccatttgtggattttgatgataatctggatgtggagtcgaaacgcaatatcgatttgaatctgacaataacacctgaaactgatattgtaagtgaattaatccatgctacaactcggatcctaagacgattataatattttaaaaatgatttttccagccgttcgacacacaaccatcaacttcttctattgcggaatcttcttcattacctccaagaacaccacactcttttgaaagatcgtcaagtccactctattcatcgagtgacagcgtcggcgtattttcgaatgagacaccatatgtgaagggaattagagaagacttgagtagagaacgaatgagattgctagaagttgattccattttggatagcatatcagaggtagatttatttctttgttgttgattgtttattcaattttccatttcagaaccttgacgccatcctgtccagctcaaaccattcggtccagctcaatccaagaacagacattgacgaagagacacttctaaatcggttgatggagagagacgacgagtttgccgttttttctgatgaatttttcatgcaaaggaaagagataatttcaataagaaaaacatcaaaacggaaaaccaacgaaaatctgatattaagaaaacaaatcgctcaaattacgccaagggcgaaaaagtacagaataatatggatatcagagttgcgaaga
aggaatcgttcacaaagccgaaacgttgtcacgcttcccaagtataccatcgaatggagcaagctcgaaaataagagctctggaaaatctaggatagatttggcactggatttgcttcaacgaatcagtcgagaagaggatattcgtaatttcatgattgatttccagaaatatctccaacaaaatgattcgcagtccttccaagtccaattgacgcattggcaaacgattatttttcaagaaaaatgtagattctctaacaatcaattgcggcgggtgaagcaatactataagatgttcaccgatcttgaaatcatgccaactattaatttgacgatgcaactgaaaacgagaatgtcaactatcgacaattataaggccacaacatatatatcaaaaggagaaaaggtgaccgttgtcaaaattatcgatgtcgagaaaagtgtcattccgaagctggagcgtttttctgtatcaaaacaattacgccatgattcctataccgatggcaaaattgttatcggaattggtggagattcgggaggaggaacgacgaaactttgtcttctgatcggaaattgtgatcatgcgaactcgccgcaccgaatcgtccttctcgctgtttttgatgactctgattcgcgggaacttatcatggcctacctgtctgatcttatcgtgaagatcaacaacttcaccagcatcacctatatggaagatggaagggtggtgacacgtggcgttgtccaaaaagtagtcggtgattttaagtttacatgtgatttactgtcacataaaaaacaatgtgccacctatttttgcccgttttgtttcgaaaccaatccaagaggaggactgatgcaaaagcttaaagatctgaacttgcaaaagatatatttcttgagaacaatgaattcttacaagctcaattctaaatacggtagctttggtgtcagatgcggaagtggaccaattcttcaaaatgtcaagttggaacactacttgcctgcaatgctacacttgattgtcggactgttcacgaaatacatctttgagccgatttggatggccgttgtgagtttagacaataaaacatcatttgaaattcgaagaaataaagaagagacaaaaagagttgccgatttgaagattcaagacgcgaacaaaaaattcgaagcttcaccacttaagagaaaaagagagatgaaagcacagtttactgctctgaaagaagaaaaagtgttattggacgagacgctggatggacttgctggtggatatcttaaacaatttgagaatgatctggaagaagttggtgcaacacgccgagcgtggtttcaaatgtacactggaaatcacacgaagctgattttgtctgagaaaggtgtcacagctgcattcaaaaatctgaaaaaccacatgactccgatgcttctgaatgtcaagaacgcgatgagcaggctgtccaaaatcatgtcattgtcggccaatcgattgctctctgacgatgacatcagcgaattggatgagtcgatgaaggaattcgttgaattcctacaagctgcccacccagaagaatcaatcactcaaaaattgcatgtcctggttgctcacgtagtagaagtcgcaaaaacggaaaggagctgggaaggctttcggagcaaggaatcgaatcgcttcatgccgttttcaatcgcctcgaaagacgcttccactcggttagaaacacagggaaaagatacctgtacattgctaaggagctgtcatgtagcaatctaatttctgatatggaggaagtaagtggaacctttcaaaaaaacaaatcgtaacattttctttgcaggacgcctcgtcttcccaataaaccacccagcgctctccctcacacaattctgaaccttttcgcaattaaaatttcgggatttcctcaatatcataaaattcgggattttctatggttagttgtgcttaaaattattaaaaaattattaatttcatta
aatttcaggtaaaaacattgacaagcagatactcggtgcaaatttgatggatttctcatcaatattttgtcgaggatttcataagcggagatgctccacatggtttctttcaacttggaaaattgtcgatgtgacaccttaggattctatcgaaaagctaaaattcaggatttccttcggtttgttgggattaaaattattcgaaaataattgattttatgaaatttcagattgaaaccttgtcaatgagctactcagcaaaatttcgatggatttatctcaaaaagttcaccaggagtcgaggcaattaagaaatttcaccgggtttccttattttcgaaagttttacctagaaaaatttcaacttgcatccgtaaagcgaagaaggaggctgattccacatattaaattcgttttttgaatcgatttgtatgtatttttcaattttcaatcttccgtttcgattatttctttgttttactgttcaaaattaatttttctgtgtttcaataattcaattctcaattgttagctactaaattgaataataaaatctattttttatttttttttcaccttgaaactattaattctgccttttttctgctaaaatttgcttcaaaaatctatttgccgctgcaatgttttgcggactacggtacgcaagtacgcaaacaccgcgatgacacattgcggaccatttcgctgcgtacctgcgagatctttctcaaattttacgagagatctagtttttgtgataccgtg'),
bytearray(b'ttgttcggtggagcgcgtttgcacccatctagcaactgaaccaccgtg'),
bytearray(b'cacggtggttcaatcgctagatggaggcaaacacgctctattgcgaacaattgtttaaatttcaagcctttcactcggcatcgattctctccgttttttgccgattttaaccgtttttccgtatcttttttgttatttttttttgtaaaatcgaaaaagtgtgtcttgtttcaggtagtaagcgaaaaaattcgaggaaaacaaaggagcgtcggcgtgggcagcaaggaaaacgaattaagtgagttttttgacgagttctcttaaaaaatcacaaaatattttatttttttgattgaaaaagagagagctaaaaattctgaacaagatacgggccgaattttttagaaatcgtaccttcttcattaattttttttcgaaactaaaattttgcactgaaaatggaggaaaaatgatttttttcgaaaaaaatcacaaaatgttatatttttattgccgaattagaatcagctaaaaaatctctacaagataaaggcaaacatatacacgaatcgtacatttttcacttaaaattttccaaaagtaaatttttttcactgaaaatagtttttaaaaacgttccctttttcaagaaaatcaaaaactctttacgatttatagttgttttatgatcaaaatttaattgtgtataagatggaactcaattcatagagaaatcttaatttttcattaaaattttttcgaaaaaatttttttttgtaataagaatttataaaataattttttttgccaaaaaaatcacaaaatgtgttagttttttgcttcaaagagaaagaactgttaattttaagcaagatacagagagaattttttagaaatcgtacattcttcaagaaaattttttcgaaactaaaattttgcactgaaaatggaggaaaaatgatttttttcgaaaaaaatcacaaaatgttttatttttattgccgaattagaatcagctaaaaattctctacaagataaaggcaaacatatacataaatcgtacatttttcacttaaaattttccaaaagtaaatttttttcactgaaaattgaggaaaaaatttttttttcagatgcctcctccatcaaatcaatcagcatggtcttcaaatcttctaaatgctcggaaacgacgatggaaaaaagatgaagaagacgaaaattcaaatactgaaagaaatgcttccctggacctaaacctgtcgaatgtagaattgacagaagctctcgacagaaacggagatgcgcagatcactcctgagagaagacgagtgtctccatttgtggattttgatgataatctggatgtggagtcgaaacgcaatatcgatttgaatctgacaataacacctgaaactgatattgtaagtgaattaatccatgctacaactcggatcctaagacgattataatattttaaaaatgatttttccagccgttcgacacacaaccatcaacttcttctattgcggaatcttcttcattacctccaagaacaccacactcttttgaaagatcgtcaagtccactctattcatcgagtgacagcgtcggcgtattttcgaatgagacaccatatgtgaagggaattagagaagacttgagtagagaacgaatgagattgctagaagttgattccattttggatagcatatcagaggtagatttatttctttgttgttgattgtttattcaattttccatttcagaaccttgacgccatcctgtccagctcaaaccattcggtccagctcaatccaagaacagacattgacgaagagacacttctaaatcggttgatggagagagacgacgagtttgccgttttttctgatgaatttttcatgcaaaggaaagagataatttcaataagaaaaacatcaaaacggaaaaccaacgaaaatctgatattaagaaaacaaatcgctcaaattacgccaagggcgaaaaagtacagaataatatggatatcagagttgcgaaga
aggaatcgttcacaaagccgaaacgttgtcacgcttcccaagtataccatcgaatggagcaagctcgaaaataagagctctggaaaatctaggatagatttggcactggatttgcttcaacgaatcagtcgagaagaggatattcgtaatttcatgattgatttccagaaatatctccaacaaaatgattcgcagtccttccaagtccaattgacgcattggcaaacgattatttttcaagaaaaatgtagattctctaacaatcaattgcggcgggtgaagcaatactataagatgttcaccgatcttgaaatcatgccaactattaatttgacgatgcaactgaaaacgagaatgtcaactatcgacaattataaggccacaacatatatatcaaaaggagaaaaggtgaccgttgtcaaaattatcgatgtcgagaaaagtgtcattccgaagctggagcgtttttctgtatcaaaacaattacgccatgattcctataccgatggcaaaattgttatcggaattggtggagattcgggaggaggaacgacgaaactttgtcttctgatcggaaattgtgatcatgcgaactcgccgcaccgaatcgtccttctcgctgtttttgatgactctgattcgcgggaacttatcatggcctacctgtctgatcttatcgtgaagatcaacaacttcaccagcatcacctatatggaagatggaagggtggtgacacgtggcgttgtccaaaaagtagtcggtgattttaagtttacatgtgatttactgtcacataaaaaacaatgtgccacctatttttgcccgttttgtttcgaaaccaatccaagaggaggactgatgcaaaagcttaaagatctgaacttgcaaaagatatatttcttgagaacaatgaattcttacaagctcaattctaaatacggtagctttggtgtcagatgcggaagtggaccaattcttcaaaatgtcaagttggaacactacttgcctgcaatgctacacttgattgtcggactgttcacgaaatacatctttgagccgatttggatggccgttgtgagtttagacaataaaacatcatttgaaattcgaagaaataaagaagagacaaaaagagttgccgatttgaagattcaagacgcgaacaaaaaattcgaagcttcaccacttaagagaaaaagagagatgaaagcacagtttactgctctgaaagaagaaaaagtgttattggacgagacgctggatggacttgctggtggatatcttaaacaatttgagaatgatctggaagaagttggtgcaacacgccgagcgtggtttcaaatgtacactggaaatcacacgaagctgattttgtctgagaaaggtgtcacagctgcattcaaaaatctgaaaaaccacatgactccgatgcttctgaatgtcaagaacgcgatgagcaggctgtccaaaatcatgtcattgtcggccaatcgattgctctctgacgatgacatcagcgaattggatgagtcgatgaaggaattcgttgaattcctacaagctgcccacccagaagaatcaatcactcaaaaattgcatgtcctggttgctcacgtagtagaagtcgcaaaaacggaaaggagctgggaaggctttcggagcaaggaatcgaatcgcttcatgccgttttcaatcgcctcgaaagacgcttccactcggttagaaacacagggaaaagatacctgtacattgctaaggagctgtcatgtagcaatctaatttctgatatggaggaagtaagtggaacctttcaaaaaaacaaatcgtaacattttctttgcaggacgcctcgtcttcccaataaaccacccagcgctctccctcacacaattctgaaccttttcgcaattaaaatttcgggatttcctcaatatcataaaattcgggattttctatggttagttgtgcttaaaattattaaaaaattattaatttcatta
aatttcaggtaaaaacattgacaagcagatactcggtgcaaatttgatggatttctcatcaatattttgtcgaggatttcataagcggagatgctccacatggtttctttcaacttggaaaattgtcgatgtgacaccttaggattctatcgaaaagctaaaattcaggatttccttcggtttgttgggattaaaattattcgaaaataattgattttatgaaatttcagattgaaaccttgtcaatgagctactcagcaaaatttcgatggatttatctcaaaaagttcaccaggagtcgaggcaattaagaaatttcaccgggtttccttattttcgaaagttttacctagaaaaatttcaacttgcatccgtaaagcgaagaaggaggctgattccacatattaaattcgttttttgaatcgatttgtatgtatttttcaattttcaatcttccgtttcgattatttctttgttttactgttcaaaattaatttttctgtgtttcaataattcaattctcaattgttagctactaaattgaataataaaatctattttttatttttttttcaccttgaaactattaattctgccttttttctgctaaaatttgcttcaaaaatctatttgccgctgcaatgttttgcggactacggtacgcaagtacgcaaacaccgcgatgacacattgcggaccatttcgctgcgtacctgcgagatctttctcaaattttacgagagatctagtttttgtgataccgtg'),
bytearray(b'cacggcccggcggggggtacatggacgagaattctctaccgtattccaatttggctgactgcgtgctcaacgttgaatactcagtgtaaactttcgtacaccgttgcgtactgcacagcgcgcattttaattgacgacatttagcaaaaattgaacagaagatttttcggaattatgaagctcaattttcacaaaaataatgagttttttgtagaatttatgaaaaaacgtgaatatatagattttttgttcatgatattcaagaaaaatcgatttttagttcttcacagagtaatcctatcgcatttcacttgctcatgatgtttttgctcgactttaggacgataaaaatgcgaattgttgataaaatgaatgaacaatataaagaa'),
bytearray(b'ggggctgctggaaccaatgtcggcatgacgagagttccggtcttctggatccatttcctgcgtgggctgtggcgacgagctgcacgtctgaaaatcaagtttttgtaatt'),
bytearray(b'tttgggcgcatgatatggagctgaatcattcgattttagaatcagcaagcttttattcatattttaggatctttttaaaaaatctggaccaacagtttttgaaaaaaaatacttttcgttcagaaatgtactgattttccactgattttcacgaaatttgaaaaaatcaataatttgggcgcatgatattgagctgaatgtttcgaatttagaatcagcatgcttttattcatattttaggatctttttaaaaaatctggcccaacagttttcgaaaaaatttaatttttgttcagaaatgtgaatattcacgaaatcgaaaaaaataattgcaaaatccgtcagctgaacattcaaaacttatcaatttgaaatcagcatatttcagtgtataattaaaaaaggtttcaaaaattctgagaccaatttttattgagaaaaataatttttcgctcgaattattgaattttcactaaatgcaaaaaacagtaaacttgggcccgtgctacaagcctgaatctttcaaattaaaaaccagcatgattttttcaatattctaggacgtttaaaaaaaatctggaccaacagtttttgaggaacgtaattttttatacaaaaatgtactgatttttcactaaactcaaaaaaatagtcaagttgggcccatgctatacacctaaattattaaaattcagaaccgccatgtattttttcatactataggctctttaaaaaaaatctggaccaacagtttttgagatatttagaaaaacaactcacttttcgacgtttttcgccttttcgcggctcacccggtcgatttttgcggcgatttgtgttctttcgctgaaaatattatttttatttcaattattaacgaagaaaacaagaaaaaacgacgagaaaacatcaaaaaaacgcgaaaaaacatcgaaaaaccaccgaaacctcatgaaaaaaataaagcattgcagccgcgggattagttttcgcaactttctaggccatgtcccgttcgccgtgccgtg'),
bytearray(b'aactagatctctcgtaaaatttgagaaagatctcgcaggtacgcagcgaaatggtccgcaatgtgtcatcgcggtgtttgcgtacttgcgtaccgtagtccgcaaaacattgcagcggcaaatagatttttgaagcaaattttagcagaaaaaaggcagaattaatagtttcaaggtgaaaaaaaaaaaaatagattttattattcaatttattagctaacaattgagaattgaattatcaaaacacagaaaaattaattttgaacagtaaaacaaagaaataatcgaaacggtagattgaaaattgaaaaatacatacaaaacgattcaaaaaacgaattaatatgtggaatcggcctccttcttcgctttacggatgcaagttgagatttttcttggaaaaactttcgaaaataaggaaatcagtgggaacttcttatttcctcgactcctgcaggatcctggtgaactttttctgttaaatccatcgaaattgtgcggagtagctcattgacaaggtttcaatctgaaattttgtgaaattttatatttttgaataattttaatcacagcaaacctagggaaatcccgaattcgagcctttcgataaaatccagagatgtcacatcgccacttttccaaaatgaaggaaaccaggtggagcttctcagctttttggcttcctggtcgaaatcttgatgaaaaaaccatcgaaatttacttgagcagcttcttggcaaggtttttgtctgaaattttaggattttaatgatttttaacatttttaaacacaactaaccataaacaatccggattttttcggttttgactgaatccttggatttatgtagaaaacatgcccagaaatcaaggaacgaggtggaacatctcatttttttgaaattctggtgtaattcttgatgaaaaatccatcgacattacgttgaacgtcttcttggcaaggtgttttcttctgaaaattcatga'),
bytearray(b'ctgtaacatctaagcagtttttgaaaagttttcgaaaaaaaaataaatttcagtttttgagatttaattttcagtgaaaaaaatttacttttggaaaattttaagtgaaaaatgtaccgtttctgaaaatgtttgcttttatcgtgtagagaatttttagctggttctaatccggcaagaaaaacagaacattttgtgatcgttttcgaaaaaaaaatttttttctttaattttaagtgcaaaattttagtttcgataatttttctgtgaagaatgtacgatttctagaaaattctgcctgtatcttgcttaaaatgaacagttctttctttttaaagcaaaaaactaacacattttgtgattttttttggcaaaaaaaattattttataatttcttatttcaaaaaatgttttttcgaaaaaattttaatgaaaaattaatatttctttatgaacttagttccgtcttatacaaaatttaatgctgattataaaataactataaaacgtgaaga'),
bytearray(b'cacggcccggcgaaagagacttggccgcgagagctgcgccggctaggccaccgcctcctatggttaagatttttgaacgaataaacatttttaatttggctgctaagctcatttattttcgttttttctcgtttttttctcatttttatcgataaaaatatattttttgttgcagaaaatcaaaaaaccgcgacaaaacagcactcaaccgccaactgggaggaggaaaatccgaaaaaagtgagtttttt'),
bytearray(b'tgcgaaaaactgtttaaagtatcgattttcttcaatatcagcaacatacaatcctttaaaatgattattttttgtaaattcgataaaaattaatttatttttcacaatttctgcccgaaaattgccgaaataaccagcgtttctataactaaaacaagtgtcgtcaattaaaatgccgcatccgcaaaatgtcgtacgaaacttttcgctgagtatcaaacgttgaatattcagtcagccaaattttactacggtagagattttacagccacgtacggttcgccgggccgtg'),
bytearray(b'ttgttcagtggagcgtatttgcataaatctagcaactgaaccaccatg'),
bytearray(b'attgaccaaaatcgagaaacattgcgaaaaactgtttaaagtgtcgattttcttcaatatcagcaacatacaatactttcaaatgattattttctgtaaattcgataaaaatccatttatttttcacaatttctgcccgaaaattgccgaaataaccagcgtttctataactaagaaagcgtcgtcaattaaaatgccgcgtccgcaaaatgtcgtacgaaacttttcgctgagtatcaaacgttgaatattcagtcagccaaattttactacggtagagaatttacagccacgtacggttcgccgggccgtg'),
bytearray(b'cacggcccggcgaaccgtacgtggctgtaaaatctctaccgtagtaaaatttggctgactgaatattcaacgtttgatactcagcgaaaagtttcgtacgacattttgcggacgcggcattttaattgacgacacttgttttagttatagaaacgctggttatttcggcaattttcgggcagaaattgtgacaaataaatgaatttttatcgaatttacaaaaaataatcattttaaaggattgtatgttgctgatattgaagaaaatcgatactttaaacagtttttcgca'),
bytearray(b'aaattgtagtcagtatcactgcagatgctggagcaggaatcacaaagttttgtctgattatcgagaattgt'),
bytearray(b'ttttcgagatattttggcaaaaacctcacttttcgtcgttttcctcctactgcgatcgattttcgccccgatgattagctgaaaataattttatatgttagttagtaacaaagaaaataagaaaaattgagaaaaaacaatcaaaaactcgagaaaa'),
bytearray(b'cgaattgtcgtcaaattaaatgcgcgtacgcaaagtacgcaatagtgtacaaaacttcttcgcgttgaatattcagtcgcgactagtcagccaaatgggactactgtagaaatttcctcggccacgttccaaacgccgctccgtg'),
bytearray(b'ttgttcgatggagcgcgtttgaacccatctagcaactgaaccaccgtg'),
bytearray(b'cacggtggttcagttgctagttgggtgcaaacgcgctccaccgaacaa'),
bytearray(b'cacggtggttcagttgctagttgggtgcaaacgcgctccaccgaacaa'),
bytearray(b'cacggtagcacagaaactagatctctcgtaaaatttgagaaagatctcgcagcgtacgcagcgaaatggtccgcaatgtgtcatcgcggtgtttgcgtacttgcgtaccgtagtccgcaaaacaatgaagcggcaaatagatttttgaagcaaattttaacagaaaaacagcagaattaatagtttcaagatgaaaaaaaaaataaaaaatcagagtttattatgaaatttataaactaacaattgataattgaagtgttaaaacacaaaaaaagttgattttgaacagaaaaggaaagttatgaaataaaaaggcaatataaaattgagaaaacacacaatttgatacaaaaaacttaattaatat'),
bytearray(b'tgcataaattcaatttcatcttgcccagaattttaagctgattcgaattcatatcaaaatctgaagattcttcaattttttttatcgaaaaatttttttttctaaattttgtgaaaaatttt'),
bytearray(b'tttagcttcatcttgtagataactttaagctgagttcaatttgataacaaaatcgataacatttggggatttctccacataaaaaattacgatttttaattttttagtgaaaaatatttttactttcgaaattttgagcatgacacaaactttcaatcgaaaaaaaagcgtacatatctggtacagaattttatgtagattattttttcataaaaaactgatccaattttgggattcgttttcgagatagtgatcattttctatttttggtctaaaatttttgatttcgaggcaaaaaaaaattattctgaataaattaagttacatcttattcagaatttttagcttaatttctagctcgaattaaataaaaaatctgaagattcttcaatttttttttaccaaaaacagttttttttctaaactttgtgaaaaaatttttttaattcaaatggatctgacaaatttttttcaatctcaattattttagctttctcttgtacagaattttaagctgttttcaattcaataacaaaatcgataacatttgggggtttctcctcaaaaaaagttatgattttgcatttttcagcgtaaaataattttacttttaaaattttgagcacgatacaaactttcaatccaaaaaaaggtagtatttctagtacagaattttacgctgattattttttcataaaaaacttacataaatttgggattcattttcgagaaaatgatgattttctatttttggccgaaaaattgtaatttcgagacaaaaaaaaaccttttgttctgaataaattaagttatatcttcttcagaattttttgctctttctttttcaatcgaaaagtaatacattttttaattttttcaaagagaaatagtcaaaaaactcactgaattcgttttccttgttgctcacggtgacgttccttcgttttcctcgaaatgtttcgattactacctaaaacaagacacactttttcaattttgcaaaaaaaaataacaaaaaagatacggaaaaacggttaaaatcggcaaaaaacggagagaatccatgccgagcgaaaggcttgaaatttaaacgattgttcgcaatagagcgtgtttgccgccatctagcgactgaaccaccgtg'),
bytearray(b'tcagttttcacaaaatgtcgtcaattaaaatgcgcgctgtgcagtacgcattcggtgtataaattgtaaac'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcaattaata'),
bytearray(b'tttgctggttacggtaaaaaagtacgcaaacaccaaacgtgaagtgcagacattgcgttttaccgtacttccgtgttcttttt'),
bytearray(b'cacggcacggcgaacgggacatggcctagaaagttgcgaaaactagtcccgcggctgcaatgctttatttttttcatttggttgcggtgggcttttcgatgttttttcgtgttttttgaagttttttcgtcgttttttcttgttttcttcgttaataattgaatttaaaataatattttcagtaaaaggccacaaatcgccgcaaaaatcgaccgcgtgagccgcgaaacggcggaaaacgtctaaaagtgagttttttttctaaatatctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttgaaactgcactcaaatatgaagaaaagcatgattaaccagaaaaagtaacatttgagacccaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaa'),
bytearray(b'tagaaagatttgagactcaagctcgcgtatttcagcttatttttttcatttggttgcggtgggcttttcgatgttttttcgtgttttttgaagttttttcgtcgttttttcttgttttcttcgttaataattgaatttaaaataatattttcagtaaaaggccacaaatcgccgcaaaaatcgaccgcgtgagccgcgaaacggcggaaaacgtctaaaagtgagttttttttctaaatatctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttgaaactgcactcaaatatgaagaaaagcatgattaaccagaaaaagtaacatttgagacccaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttattttttttgaaactgcactcaaataagaagaaaagcatgattaaccagaaaaagtaacatttgagacccaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttgaaactgcactcaaatatgaagaaaagcatgattaaccagaaaaagtaacatttgagacccaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttcggaattagaaagatttgagactcaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaattgttttctcaaaaatttgacgcttattttttttgaaactgcactcaaataagaagaaaagcatgattaaccagaaaaagtaacatttgagacccaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaaattgttttctcaaaaatttgacgcttattttttttg
aaactgcactcaaatatgaagaaaagcatgattaaccagaaaaagtaacagttgagacccaagctcgcgtatttcatattttttttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaaattgttttctcaaaaatttgacgcttttttttttgaaactgcactcaaatatgaagaaaagcatgattaaccagaaaaagtaacatttgagactcaagctcgcgtatttcattttttttttcgattttagcacaaaatcaacactcttttgaaaaaaatattcgaaatattcgctcctaatttttaatgatttccagatgacaagctggttcacaccacattctactgaggaaatcgcttcaaataaccgaaatcatcccgacattggctccagcagctccgcaactacctacagctcgtcaaattgtttgaatatggatgaaggaattctgacgagagacgttgaaatggctttcagtgatgaaattccgactttttctaaaaccgtaatttttttaaaaatttcaaaaataacattatatgattttgcagaggacctccaattccgtgtttcctcctgcaccttatcgaaattcgttcagt'),
bytearray(b'ttctttacattattcattcattttatcaacaattcgcacttctatcgctctaaagtcgatcaaaaaagcttatcagcaactgccgtcgagtgaaatgcgatacaatttgtctgtgaaaaaacaaatatcgattttccttgaatatcgtgaagaacaaatcttcatacttacgattattcacaaattcgatgaaaatctgatagttttttccaatttcagctctataattccaaaaaaaaaccttctgttaattttcgcgaaatgtcgtcaattaaaatgcgcgctgtgcagtacgcaacggtgtacgaaactttaaactgagtattcaacgttgagcacgcagtcagccaaattggaatacggtagagaattctcgtccatgtaccccccgccgggccgtg'),
bytearray(b'gctgatattgacgaattgtcgtcgaataaaatgcgcgtacgcaaagtacgcaatagtgtacaaaacttattcgcgttgaatattcagtcagccaaatgggactactgtagaaatttcttcggccacgt'),
bytearray(b'cacggtggttcaatcgctagatggaggcaaacacgctctattgcgaacaattgtttaaatttcaagcctttcactcggcatcgattctctccgttttttgccgattttaaccgtttttccgtatcttttttgttatttttttttgtaaaatcgaaaaagtgtgtcttgtttcaggtagtaagcgaaaaaattcgaggaaaacaaaggagcgtcggcgtgggcagcaaggaaaacgaattaagtgagttttttgacgagttctcttaaaaaatcacaaaatattttatttttttgattgaaaaagagagagctaaaaattctgaacaagatacgggccgaattttttagaaatcgtaccttcttcattaatttttttttcgaaactaaaattttgcactgaaaatggaggaaaaatgatttttttcgaaaaaaatcacaaaatgttatatttttattgccgaattagaatcagctaaaaaatctctacaagataaaggcaaacatatacacgaatcgtacatttttcacttaaaattttccaaaagtaaattattttcactgaaaatagtttttaaaaacgttccctttttcaagaaaatcaaaaactctttacgatttatagttgttttatgatcaaaatttaattgtgtataagatggaactcaattcatagagaaatcttaatttttcattaaaattttttcgaaaaaatttttttttgtaataagaatttataaaataattttttttgccaaaaaaatcacaaaatgtgttagttttttgcttcaaagagaaagaactgtttattttaagcaagatacagagagaattttttagaaatcgtacattcttcaagaaaattttttcgaaactaaaattttgcactgaaaatggaggaaaaatgatttttttcgaaaaaaatcacaaaatgttttatttttattgccgaattagaatcagctaaaaattctctacaagataaaggcaaacatatacacaaatcgtacatttttcacttaaaattttccaaaagtaaatttttttcactgaaaattgaggaaaaaatttttttttcagatgcctcctccatcaaatcaatcagcatggtcttcaaattttttaaatgctcggaaacgacgatggaaaaaagatgaagaagacgaaaattcaaatactgaaagaaatgcttccctggacctaaacctgtcgaatgtagaattgacagaagctcttgacagaaacggagatgcgccgatcactcctgagagaagacgagtgtctccatttgtggattttggtgataatctggatgtggagtcgaaacgcaatattgatttgaatctgacaataacacctgaaactgatattgtaagtgaattaatccatgctacaactcggatcctaagacgattataatattttaaaaataattttcccagccgttcgacacacaaccatcaacttcttctattgcggaatcttcttcatcacctccaagaacaccacactcttttgaaagatcgtcaagtccactctattcatcgagtgacagcgtcggcgtattttcgaatgagacaccatatgtgaagggaattagagaagacttgagtagagaacgaatgagattgctagaagttgattccattttggatagcatatcagaggtagatttatttctttgttgttgattgtttattcaattttccatttcagaaccttgacgccatcctgtccagctcaaaccattcggtccagctcaatccaagaacagacattgacgaagagacacttctaaatcggttgatggagagagacgacgagtttgccgttttttctgatgaatttttcatgcaaaggaaagagataatttcaataagaaaaacatcaaaacggaaaaccaacgaaaatctgatattaagaaaacaaatcgctcaaattacgtcaagggcgaaaaagtacagaataatatggatatcagagttgcgaag
aaggaatcgttctcaaagccgaaacgttgtcacgcttcccaagtatatcatcgaatggagcaagctcgaaaataagagctctggaaaatctaggatagatttggcactggatttgcttcaacgaatcagtcgagaagaggatattcgtaatttcatgattgatttccagaaatatctccaacaaaatgattcgcagtccttccaagtccaattgacgcattggcaaacgattatttttcaagaaaaatgtagattctctaacaatcaattgcggcgggtgaagcaatactataagatgttcaccgatcttgaaatcatgccaactattaatttgacgatgcaactgaaaacgagaatgtcaactatcgacaattataaggccacaacatatatatcaaaaggagaaaaggtgaccgttgtcaaaattatcgatgtcgagaaaagtgtcattccgaagctggagcgtttgtctgtatcaaaacaattacgccatgattcctataccgatggcaaaattgttatcggaattggtggagattcgggaggaggaacgacgaaactttgtcttctgatcggaaattgtgatcatgcgaactcgccgcaccgaattgtccttctcgctgtttttgatgactctgattcgcgggaacttatcatggcctacctgtctgatcttatcgtgaagatcaacaacttcaccagcatcacctatatggaagatggaagggtggtgacacgtggcgttgtccaaaaagtagtcggtgattttaagtttacatgtgatttactgtcacataaaaaacaatgtgccacctatttttgcccgttttgtttcgaaaccaatccaagaggaggactgatgcaaaacttaaagatctgaacttgcaaaagatatatttcttgagaacaatgaattcttacaagctcaattctaaatacggtagctttggtgtcagatgcggaagtggaccaattcttcaaaatgtcaagttggaacactacttgcctgcaatgctacacttgattgtcggactgttcacgaaatacatctttgagccgatttggatggccgttgtgagtttagacaataaaacatcatttgaaattcgaagaaataaagaagagacaaaaagagttgccgatttgaagattcaagacgcgaacaaaaaattcgaagcttcaccacttaagagaaaaagagagatgaaagcacagtttactgctctgaaagaagaaaaagtgttattggacgagacgctggatggacttgctggtggatatcttaaacaatttgagaatgatctggaagaagttggtgcaacacgccgagcgtggtttcaaatgtacactggagctgattttgtctgagaaagctgattttgtctgagaaaggtgtcacagctgcattcaaaaatctgaaaaaccacatgactccgatgcttctgaatgtcaagaacgcgatgagcaggctgtccaaaatcatgtcattgtcggccaatcgattgctctctgacgatgacatcagcgaattggatgagtcgatgaaggaattcgttgaattcctacaagctgcccacccagaagaatcaatcactcaaaaattgcatgtcctggttgctcacgtagtagaagtcgcaaaaacggaaaggaatttgggaaggctttcggagcaaggaatcgaatcgcttcatgccgttttcaatcgcctcgaaagacgcttccactcggttagaaacacagggaaaagatacctgtacattgctaaggagctgtcatgtagcaatctaatttctgatatggaggaagtaagtggaacctttcaaaaaaacaaatcgtaacattttctttgcaggacgcctcgtcttcccaataaaccacccagcgctctccctcacacaattctgaaccttttcgcaattaaaatttcgggatttcctcaatatcataaaattcgggattttctatggttagttgtgcttaaaattattaaaaaattatta
atttcattaaatttcaggtaaaaacattgacaagcagatactcggtgcaaatttgatggatttctcatcaatattttgtcgaggatttcataagcggagatgctccacatggtttctttcaacttggaaaattgtcgatgtgacaccttaggattctatcgaaaagctaaaattcaggatttccttcggtttgttgggattaaaattattcgaaaataattgattttatgaaatttcagattgaaaccttgtcaatgagctactcagcaaaatttcgatggatttatctcaaaaagttcaccaggagtcgaggcaattaagaaatttcaccgggtttccttattttcgaaagttttaccaagaaaaatttcaacttgcatccgtaaagcgaagaaggaggctgattccacatattaaattcgttttttgaatcgatttgtatgtatttttcaattttcaatcttccgtttcgattatttctttgttttactgttcaaaattaatttttctgtgtttcaataattcaattctcaattgttagctactaaattgaataataaaatctattttttatttttttttcaccttgaaactattaattctgccttttttctgctaaaatttgcttcaaaaatctatttgccgctgcaatgttttgcggactacggtacgcaagtacgcaaacaccgcgatgacacattgcggaccatttcgctgcgtacctgcgagatctttctcaaattttacgagagatctagtttttgtgataccgtg'),
bytearray(b'atgtcgtcaactaaaatgccgcggtacacgtccgcagtcggtgaacaaaacttttcgttgaatactcagtcagccaaatttaactactgtagaaatttctccccacacgtcgcgattgccgctccgtg'),
bytearray(b'ttgtttaaaataagtttccttttttttgaatacgcaaagaacttgatttttccaaaaaaaaaaattgttttcgaatttttatgataaaaaaaaatttttt'),
bytearray(b'cacggcacggcgaacgggacatggcctagaaagttgcgaaaactagtcccgcggctgcaatgctttattttttcatttggttgcggtgggtttttcgatgttttttcgtgtttttttaatgttttttcgtcgttttttcttgttttcttcgttaataattgaatttaaaataatattttcagcaaaaggccacaaatcgccgcaaaaatcgaccgcgtgagccgcgaaacggcggaaaacgtctaaaagtgagttgtttttctaaatatctcaaaaatttgacgcttatttttttcgaaaaagctctcaaatatggagaaaaacatgtttgttctgaattagaaagatttgagactcaagctcgcgtatttcaaatttattttttcgattttagcgaaaaatcaacacattttcgatcatttttgaacaaaaaaattgttttctcaaaaatttgacgcttattttttttgaaactgcactcaaatatgaagaaaattttgattaaccggaaaaagtaacatttgagacccaagctcgcgtatttcatatttattttttcgattttagcgaaaaatcaacacattttcgatcattttcgaacaaaaaattcattttctcaaaaattcaacgcttattttttttgaaactgcactcaaatatgaagaaaaacatgcttgttctgaattagtaagaattgcgactcaagctcgcgtatctcatatttatttgttcgatttcagtggaaaatcaacacatttttgggaaaaattttttttcgaaa'),
bytearray(b'ttctaatttttaatgattttcagatgacacgccggttcacaccacattctactgaggagatcgcttcaaataaccgaaaccatcccgacattggctcca'),
bytearray(b'ctctatattattcattcattttatcaacaattcgcacttctatcgccctaacgtcgatcaaaaaagctcatcagcaactgccgtcgagtgaaatgcgatagaatttgtctgtgaataaccaaatatcgattttccttgtatatcgtgaagaacaaatcttcatatttacgattcttcacaaattcgatgaaaatctgatagttttttcaatttcagctctataattccaaaaaaaatcttttgtcaatttcgcgaaatgtcgtcaattaaaatgcgcgctgtgcagtacgcaacggtgtacaaaactttaaactgagtattcaacgttgagcacgcagtcagccaaattggaatacggtagagaattctcgtctatgtaccccccgccgggccgtg'),
bytearray(b'gctgctgcagtgttttggcgactacggtacgctgttacgcaaaccgcgcaatgacaacatt'),
bytearray(b'attgaacagggcatgaaaggattcaatgccctgctccgatgatcttccc'),
bytearray(b'cacggcccggcgaaagagacgtggccgcgagagctgcgccggctaggccaccgcctcctatggttaagatttttgaacggataaaaatttttaatttggctgctaagctcatttatcttcgttttttctcgttttttctcatttttatcgataaaaatatattttttgttgcagaaaatcaaaaaaccacgacaaaacagcactcaaccgccaactgggaggaggaaaatccgaaaaaagtgagtttttt'),
bytearray(b'tgcgaaaaactgtttaaagtatcgattttcttagtaaatatcagcatcataaaattatttaaaattattattttctgtaaattcgataaaaatccatttatttttcacaatttctgcccgaaaattaataaccagcgtttctataactaagaaggtgtcgtcaattaaaatgccgcgtccgcaaaatgtcgtacgaaacttttcgctgagtatcaaacgttgaatattcagtcagccaaattttactacggtagagattttacagccacgtacggttcgccgggccgtg'),
bytearray(b'gttcgttcagctcaattttgtctactttaattaagttggagtagtttgttctaacttcaaaaacgtgattttcagacgtgcagctcgtcgccacaggctaatcaggaaatcagttcagaagatcgaaactcccaccatcccgacattgcttccagtagctccgcaactacatgcaggtcgtcaaattgtttcaatatggatgaaggaattctgacgagagacgtggaaatggattttagtgacgaaattccgactttttctcacaatgtaattttttttcaatatttcaaaactaatattatatgattttgcagagattctccaatcccgtgtttccaccgacaccttcacgaaactcgttcagtacaccgagggttcgaataatctccaatgttgagcaaattccgcagatcaaaatgctccgactaacgatcgaacgactgaaaaatgaaaaaatgaaatatcagaagaaatacaaaagcggaatggtgagtttgcggtgttttcggaagaaatactaatagtccgccaagaaaatatcactgtaagagcatcgaattcttcgctgcaaatggaagtatgcaatatgcgaacgatcgtcagtggtatagaaggtcgattggtgagaagctgggcagatgtgaagaaaagtggcaaagctatggttctgagtctacaaagacacgccgaagaaattgaaaacctcaaaaaatcagcttttaaagttaaattttcgaatctaagaaacaaaagttgtatcaaaaatcgatttcgcaaagctcttcacttgcttcaacacgaaatctgcctcgaggatgacgtcggacaatttattcgaaaattcacaaaattcctgaattcagaagagaatcaaagctacaaaaacaagctttcaaactttgaggccgttgactttttatcaaaatgcggattatctctaagtcagatggaaaaaatcaaaaactatttgactaattccattggctatgatttgcttccattggtaaagaacacaagagatttgtcaatccagttatcaatgatctccagtttcaaagtttccacttcgttcgacaataaaggaaaagtgattaccattgtgcagtgcatgaaaattgcagaagttttagcctatcgaatagagcttctatgcaattcaaatcaatttgtggacgatggctacacgaaaggagtaatcaagattggagtcgttggtgacgctggaggtggaagcacaaagttggcattggtaatcggaatgtatctcgaccgaattcttcgagacacgttatggtaattgcagtatatgacggatccgacaattactcgtgtctgaaaaaattcattcccgacgttcttgagcagcttggcaagctgacaaaaattcgatatttggacaaaggagttcaaaaaaccgcaaaaattgtgcaaattgcaactggagattgcaagtttcaatttgatattcttggacaccaaggacattcatcccacaatttctgcttcaagtgttttgcccagaatccacggggagctgaagagaggatgaaaatcaaggatatgaatgttgatgaaacattccacccacgaacaatcgatctttacaaagcttgttgtactccacttctgccacacgttgcgatcatgttctatttgattcctcttctgcacattatcatgggtatttttgacaaatatatcttcaatcccctctggaaatattctgtcactttggataacactacttgtttccctatcctgaaaacgcgaaaagcaactctaaaaaacgctgttaacctaatcaaaagtgcggaagaaaagtatcaagcagccaccggaaagatgaaattggagtcgcatgctgaactgaaagcacttcaatcggaaaaattgttgctcgacacaattgtgaatggtactcccggaggaacacttgaaaaaatggaacagtgttgggctaagtttggagcggacaagcaggcttggtt
ccaatcattttgcggaaaccatttgaaattgctgctaacgccggctattgtagaggagactttcaacatttttggcccaaatttatgcccaatgttgctcggattgaaatcagcaatgggaaagctttcaactattatgtcactttcgggaaacaagtttttaaatgattccgacgtcttaatacttcaaaattcgattcgcggatttgtggaggatttgaaagtagcggttcccgaagaaacaattattttaaagctgcacttggtggtctaccatgcaccacaaatggcaaaagatgtgagaaacattggaaggatcacagaaccaggagttgaatctgtccacgcgattttcaatgcacttgagaggagattttgcaattaccgggacaagaaaaggcgttacatccatgttctgagagagctcatgtgtcgtaacatcatgaatgacatgactatggtaatttttgtacagaaaaaaatttctgcattcttaaaaacacaactatttcagaacacgtccattcccggtctttccatcgtaccaaaagatgcaacagttctgccaatttcttcgcaaaatgatcttcgggacccaaaagtagttattcgggacgtcacggctgcccaaaaaagaaaaaatttagcgaaaaaaaaaatcaattttagctccaagaaaaggcttccacaatcgacaaaatctcgaaatgctgtggtgagatttttatttttcacgaaaagtttcaaattttgaacattttcagcccgacttctcacaaaacgtacaaacttctattgcatttggaggcaattcaacgaaaattggattccgtggtctaccgagcaatgctctttctcgtccgaacccgatggtaggttcactgctttaaatcaactcaaaattgaaatcaaaaattatttcaggcaatcacacaaacttgctcgacaagccagccaatcactagaagccatccacttgttgctccaccagtatcttatcaacttccgaaatgcagtaatagtttagctgaagtgagtgtaaaggtgaaatgacaatttcaataagttaaaaattttagaaaattcgagcaagttgtgggcaagcaagaaggagtgctgcagaatcttcttcgactgaaagccacgtggacacaaattcgaagaagaaagtttacagactaaaattcgtccaaaaaagtcgtgaaattccgaagaaatagtaatatatcgattttctgttttttttattatgatttctaatatttttattttaatgcttaatttaaaatgaccgaaaataaagaattgattctcaaaaagtaatctcagagcaatatatgttgtttctttatataatttatccattttatcaacaaatcgcatttttatcgctcaaaaatcgataaaaacagctcatcagaaactttcatcgagtaaaatgagacatcattcgtctgtgaaaagctaaatatcgattttccttgaatatagtgaagaaaagatcttcatacttacgatttttcacaaattcgatgaaaatttgatagttttttttcaatttcagctctataattccgaaaaaaatcttctgataattttcgcgaaatgtcgtcaattaaaatgcgcgctgtgcagtacgcaacggtgtacgaaactttaaactgagtattcaacgttgagcacgcagtcagccaaattggaatacggtagagaattctcgtccatgtaccccccgccgggccgtg'),
bytearray(b'ttgttcggtggatcgcgtttgcacccatctagcaactgaaccacagtg'),
bytearray(b'GAAGCCAATGTCGGGATGGTTTCGGTTATTTGAAGCGATTTCCTTAGTAGAATGTGGTGTGAACCGGCGtgccatctggaaatc'),
bytearray(b'tgccatctggaaatccttaaaaatttggtgcgaatatttcgaaaaaaagttttccaaaaatgtgttgattttccactaaaatcgaaaaaataaatatgaaatacgcgagcttgagtctcaattcttactaattcagaacaagcatttttttctccatattcgagtgcagtttcaaaaaaattaaacgttgaatttttgagaaaataatttttttgttcgaaaatgtgttgattttccactaacataaaatcgaaaaagt'),
bytearray(b'cacggtggttcagttgctagatgggtgcaaacgcgctccaccgaataa'),
bytearray(b'gtcttaatacttcaaaattcgattcgcggatttgtggaggatttgaaagtagctgttcccgaagaaacaattattttaaagctgcacttgttgatctaccatgcaccacaaatgacaaaagatctgagaaacattggaaagatcacagaacaaggagttgaatctgtccacgcgattttcaatgcacttgagaggagattttgcaattaccgggacaagaaaagacgttacatccatgttctgagagagctcatgtgtcgtaacatcatgaatggtaattttggtacagaaaaacatttctgcattcttaaaaacacaactatttcagaacacgtccattcccggtctttccatcgtaccaaaatatgcaacagttctgccaatttcttcgcaaaatgatcttcgggacccaaaagcagttatacgggacgtcacggctgcccaaaaaagaaaaaatttagcgaaaaaaaaaaatcaattttcgctccaagaaaaggcttccacaatcgacaaaatctcgaaatgctgtggtgagatttttatttttcacgaaaagttttaaattttgaaaattttcagcccgacttctcacaaaacgtacaaacttctatttcatttggaggcaattcgacgaaaattggattccgtggtctaccgagcaatgctctttctcgtccgaacccgatggtacgttcactgctttaaatccactcaaaattgaaatcaaaaattatttcagccaatcacacaaacttgctcaacaagccagcaaatcactagaagccatccacttgttgctccaccagtatcttatcaacttccgaaatgcagtaatagtttagctgaagtgagtgtaaacgcgaaatgacaatttcaataagttaaaaattttagaaaattcgagcgagttgtgggcaagcaagaaggagtgctgcagaatcttcttcgactcaaagccacgtggacacaaattcgaagaagaaagtttacagactaaaattcgttcaaaaaagtcgtgaaattccgaagaaatagtaatatatcgattttctgtttaaatatttatgatttctaatatttttattttaatgcttcatttaaaatgaccgaaaaataaagaatagattttcaaaaagtaatctcagagcaatatatgttgtttctttatataatttatccattttatcaacaaatcgcatttttatccctcaaaaatcgataaaaacagctca'),
bytearray(b'ttgttcggtggagcgcgtttgcacccatttagcaactgaaccaccgtg'),
bytearray(b'actacggtgtgcaagtacgcaaacaccgcggcggcaatttgc'),
bytearray(b'ttcttcaaaaaaacttcttcgaaattcaaattttgcaccaaaaa'),
bytearray(b'ttgttcggtggagcgcgtttgcacctatttaacaactgaaccaccgtg'),
bytearray(b'tattaattgctaaaatttatgtggactacggtagtcaagtccgcaaacaccacg'),
bytearray(b'cgtggtgtttgcggacttgactaccgtagtccacataaattttagcaattaata'),
bytearray(b'aaattatacacgtttgttcggtggagcgagtttgcttccatctagcaactgaaccaccgtg'),
bytearray(b'gtgtgcaacttgccgccgcggtgtttgcgtacttgcacaccgtagt')]
In [70]:
from operator import attrgetter
' '.join(map(attrgetter('rep_cl'), reps[:60]))
Out[70]:
'Simple_repeat Simple_repeat Satellite Simple_repeat DNA/MULE-MuDR DNA DNA DNA Simple_repeat Unknown Unknown DNA/PiggyBac? DNA DNA DNA DNA DNA DNA DNA DNA/TcMar-Tc1? DNA/TcMar-Tc1? DNA DNA Simple_repeat DNA/hAT DNA/hAT DNA/MULE-MuDR DNA/hAT DNA/TcMar-Tc1? DNA/TcMar-Tc1? DNA/TcMar-Tc1? DNA/hAT DNA/MULE-MuDR Simple_repeat DNA/hAT Unknown Unknown Unknown DNA DNA DNA/CMC-Chapaev DNA/CMC-Chapaev DNA/CMC-Chapaev RC/Helitron Simple_repeat DNA DNA/TcMar-Tc1? DNA/TcMar-Tc1? DNA/TcMar-Tc1? DNA/TcMar-Tc1? DNA/TcMar-Pogo Simple_repeat DNA Simple_repeat DNA DNA DNA/TcMar-Tc1 DNA/TcMar-Tc1 DNA/hAT Simple_repeat'
You'll notice a few things: (1) the family names seem to have some hierarchical relationships, e.g. DNA/TcMar-Tc1 seems to be more specific than DNA; (2) some of them end in a question mark; (3) some of them are Unknown. I don't really know what these mean or how best to handle them, so you'll have to navigate that issue yourself. You can often look up the family names on RepeatMasker's site and find more detailed info (e.g., here are the details for DNA/TcMar-Tc1).
Dfam is an alternative. Note that Dfam ultimately relies on Repbase for its "seed alignments." Also, only the human genome has a pre-built Dfam database, as far as I can tell.
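One way to cope with the class labels printed above is to split each one into parts before analyzing it. The sketch below is only an illustration: it assumes the slash separates a broad class from a more specific family, and that a trailing question mark marks a tentative assignment. Neither assumption is confirmed in this notebook. The helper name `split_rep_class` and the short `labels` list are made up for the example; in the notebook you would pass in the `rep_cl` values from `reps` instead.

```python
from collections import Counter

def split_rep_class(rep_cl):
    """Split a label like 'DNA/TcMar-Tc1?' into (class, family, uncertain).

    Assumptions (not confirmed by the notebook): '/' separates class from
    family, and a trailing '?' flags a tentative classification.
    """
    uncertain = rep_cl.endswith('?')
    name = rep_cl.rstrip('?')
    cls, _, family = name.partition('/')
    # Labels with no '/' (e.g. 'DNA', 'Unknown') have no family part
    return cls, (family or None), uncertain

# Hypothetical sample of labels, copied from the output above
labels = ['Simple_repeat', 'DNA/TcMar-Tc1?', 'DNA/TcMar-Tc1',
          'DNA', 'Unknown', 'RC/Helitron']

parsed = [split_rep_class(label) for label in labels]

# Tally repeats by broad class, ignoring family and uncertainty
class_counts = Counter(cls for cls, _, _ in parsed)
```

Grouping by the first element lets you treat `DNA`, `DNA/TcMar-Tc1` and `DNA/TcMar-Tc1?` as one broad class, while the third element lets you drop or flag the tentative calls if your analysis needs that.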
Content source: mmcco/bioinformatics