The objective is to assemble three 50 bp sequences into one circular sequence.
We will use the assembly_fragments function and the Assembly class.
In [1]:
from pydna.all import *
The sequences below were generated here.
In [2]:
frags = parse('''
>1|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 50 bp
ccagaatacagtgccttagatctacggatcgtatctgcgatttggccgat
>2|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 50 bp
gccctgcttggtagatcaggcgagccaataacattctatagtgtagcctt
>3|random sequence|A: 0.25|C: 0.25|G: 0.25|T: 0.25|length: 50 bp
gagagcgctcctgtttcaatgcttgcaaactctagcagctatactgtagg ''' )
In [3]:
frags
Out[3]:
We make a list of amplicons (sequences with pairs of primers) from the Dseqrecords.
In [4]:
amplicons = [primer_design(f) for f in frags]
We need a list of golden gate linkers; these could also be generated automatically in some other way.
In [5]:
golden_gate_linkers = [Dseqrecord(lnk) for lnk in "GAAT GATC AATT GAAT".split()]
In [6]:
golden_gate_linkers
Out[6]:
In [7]:
from itertools import chain, zip_longest
We zip together the golden gate linkers and the sequences into a flat list.
In [8]:
seqlist = list( chain.from_iterable( zip_longest(golden_gate_linkers, amplicons)))[:-1]
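The interleaving step can be tried in isolation. The sketch below uses plain string labels as hypothetical stand-ins for the linkers and amplicons (the names linkers, fragments, pairs and flat are illustrative only). It shows why the trailing [:-1] is needed: with four linkers and three amplicons, zip_longest pads the last pair with None.

```python
from itertools import chain, zip_longest

# Hypothetical stand-ins: plain labels instead of Dseqrecord and amplicon objects.
linkers = ["L1", "L2", "L3", "L4"]
fragments = ["A", "B", "C"]

# zip_longest pads the shorter list with None, so the last linker is kept.
pairs = list(zip_longest(linkers, fragments))

# chain.from_iterable flattens the pairs; [:-1] drops the single trailing None.
flat = list(chain.from_iterable(pairs))[:-1]
print(flat)
```

The result alternates linker, fragment, linker and ends with the last linker, which is the layout seqlist has in the cell above.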
In [9]:
seqlist
Out[9]:
The optional settings below are important. Sequences with a size equal to or shorter than maxlink will be incorporated in the primers. overlap controls the overlap between the sequences in the assembly.
In [10]:
a,b,c = assembly_fragments( seqlist, maxlink=4, overlap=4 )
We get only three sequences, since the golden gate linkers are incorporated in the primers. Let's give them nicer names:
In [11]:
a.locus, b.locus, c.locus = "sequenceA", "sequenceB", "sequenceC"
In [12]:
a.figure()
Out[12]:
In [13]:
b.figure()
Out[13]:
In [14]:
c.figure()
Out[14]:
We can assemble these by setting limit to 4 and algorithm to terminal_overlap. With such a short homology limit, we need to consider only terminal overlaps; otherwise we would get many irrelevant results.
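To make the idea of a terminal overlap concrete, here is a toy sketch in plain Python. It is not pydna's implementation; the function terminal_overlaps and the two example strings are hypothetical. It only reports stretches where the end of one sequence matches the start of the other, and the comparison is case-sensitive.

```python
def terminal_overlaps(first, second, limit):
    """Return suffixes of `first` that are also prefixes of `second`, at least `limit` long."""
    longest = min(len(first), len(second))
    return [first[-n:] for n in range(limit, longest + 1) if second.startswith(first[-n:])]

# Hypothetical toy fragments that share the 4-nt end "GATC".
left = "ccagaataGATC"
right = "GATCgccctgct"

print(terminal_overlaps(left, right, limit=4))   # end of left vs start of right
print(terminal_overlaps(right, left, limit=4))   # end of right vs start of left
```

Restricting the search to the ends ignores 4-nt matches that occur by chance inside the fragments, which is the reason given above for considering only terminal overlaps at this limit.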
In [15]:
from pydna.assembly import terminal_overlap
In [16]:
asm = Assembly((a,b,c), limit=4, algorithm=terminal_overlap)
asm
Out[16]:
We got three circular products. The second one should be the same as the theoretical one below:
In [17]:
correct = Dseqrecord("")
for s in seqlist[1:]:
    correct += s
correct = correct.looped()
In [18]:
correct.cseguid()
Out[18]:
In [19]:
candidate = asm.assemble_circular()[1]
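Comparing checksums of circular sequences amounts to asking whether two sequences describe the same circle regardless of where the origin is placed and which strand is written down. The sketch below illustrates that idea with a canonical form in plain Python. It is an assumption-laden toy, not the cseguid algorithm itself, and all names in it are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def smallest_rotation(seq):
    """Lexicographically smallest rotation (simple quadratic approach)."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))

def canonical_circular(seq):
    """Representation that does not depend on origin or strand."""
    seq = seq.upper()
    return min(smallest_rotation(seq), smallest_rotation(reverse_complement(seq)))

original = "GAATccagaatacagtGATCgccctgct"   # hypothetical circular sequence
rotated = original[10:] + original[:10]      # same circle, different origin
flipped = reverse_complement(rotated)        # same circle, other strand

print(canonical_circular(original) == canonical_circular(rotated))
print(canonical_circular(original) == canonical_circular(flipped))
```

Two sequences with the same canonical form are the same circular molecule; a checksum of that form then gives a short identifier that is convenient to compare.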
In [20]:
candidate.cseguid()
Out[20]:
The candidate and the correct sequence have the same cseguid, so they represent the same circular sequence. We need to add the BsaI restriction enzyme recognition sequence (plus one nucleotide to get the cut right) to the primers:
In [21]:
from Bio.Restriction import BsaI
In [22]:
BsaI.site
Out[22]:
In [23]:
for f in (a,b,c):
    f.forward_primer = BsaI.site + "a" + f.forward_primer
    f.reverse_primer = BsaI.site + "a" + f.reverse_primer
    print(f.name)
    print(f.forward_primer.format("tab"))
    print(f.reverse_primer.format("tab"))
    print(f.figure())
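The extra nucleotide matters because BsaI cuts outside its recognition sequence: one nucleotide downstream on the top strand and five on the bottom strand, leaving a 4-nt 5' overhang. The sketch below is a plain-Python illustration, not a replacement for Bio.Restriction. The function bsai_overhang and the example primer are hypothetical, and the primer is assumed to start with the 4-nt linker.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence

def bsai_overhang(top_strand):
    """Return the 4-nt overhang left by the first forward-oriented BsaI site, or None."""
    pos = top_strand.upper().find(BSAI_SITE)
    if pos == -1:
        return None
    cut = pos + len(BSAI_SITE) + 1          # top-strand cut: one nucleotide past the site
    overhang = top_strand[cut:cut + 4]      # bottom strand is cut four nucleotides further
    return overhang if len(overhang) == 4 else None

# Hypothetical primer: site + one spacer nucleotide + linker + annealing part.
primer = "GGTCTC" + "a" + "GAAT" + "ccagaatacagtgcc"
print(bsai_overhang(primer))
```

With the single spacer in place, the overhang is exactly the linker, so the sticky ends produced by digestion are the same 4-nt sequences used as overlaps in the assembly.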
In [24]:
first_prod = pcr(a.forward_primer, a.reverse_primer, a.template)
In [25]:
first_prod.figure()
Out[25]:
In [26]:
first_prod.cut(BsaI)
Out[26]:
In [27]:
first_prod.cut(BsaI)[1].seq
Out[27]: