A strawberry flavour gene vector for Saccharomyces cerevisiae

This Jupyter notebook describes the simulated cloning of the alcohol acyltransferase gene SAAT from strawberry (Fragaria × ananassa) and the construction of an S. cerevisiae expression vector for this gene.

The SAAT gene is involved in the production of the strawberry fragrance. cDNA must be produced first, a process that is not described in this notebook. Here is a recent protocol for the extraction of nucleic acids from strawberry.



In [1]:

    
# Import the pydna package functions
from pydna.all import *



In [2]:

    
# Give your email address to Genbank, so they can contact you.
# This is a requirement for using their services
gb=Genbank("bjornjobb@gmail.com")
# download the SAAT CDS from Genbank
# We know from inspecting the 
saat = gb.nucleotide("AF193791 REGION: 78..1895")



In [3]:

    
# The representation of the saat Dseqrecord object contains a link to Genbank
saat









    Out[3]:




AF193791  78-1895
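The region string "78..1895" uses GenBank coordinates, which are 1-based and inclusive at both ends. The following is a minimal pure-Python sketch of how such a region maps onto a Python slice and how long it is; the helper name genbank_region_to_slice is hypothetical and not part of pydna.

```python
def genbank_region_to_slice(start, end):
    """Convert a 1-based, inclusive GenBank region into a 0-based,
    half-open Python slice (hypothetical helper for illustration)."""
    return slice(start - 1, end)


region = genbank_region_to_slice(78, 1895)

# Number of bases covered by the region
region_length = region.stop - region.start

# A coding sequence should span a whole number of codons
codons, remainder = divmod(region_length, 3)

print(region_length, codons, remainder)
```

The region covers 1818 bases, a multiple of three, which is consistent with a complete coding sequence. It also agrees with the 1820 bp amplicon reported further down, since the forward primer carries a two-base "aa" tail.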



In [4]:

    
# design two new primers for SAAT
saat_amplicon = primer_design(saat)



In [5]:

    
fw="aa"+saat_amplicon.forward_primer
rv=saat_amplicon.reverse_primer



In [6]:

    
# We can set the primer identities to something descriptive
fw.id, rv.id = "fw_saat_cds", "rv_saat_cds"



In [7]:

    
saat_pcr_prod = pcr(fw,rv, saat)



In [8]:

    
# The result is an object of the Amplicon class 
saat_pcr_prod









    Out[8]:




Amplicon(1820)



In [9]:

    
# The object has several useful methods like .figure() 
# which shows how the primers anneal 
saat_pcr_prod.figure()









    Out[9]:





  5ATGGACACCAAGATTGG...CCCACCTAATCCTCAGTAA3
                       ||||||||||||||||||| tm 53.4 (dbd) 58.3
                      3GGGTGGATTAGGAGTCATT5
5aaATGGACACCAAGATTGG3
   ||||||||||||||||| tm 52.2 (dbd) 58.5
  3TACCTGTGGTTCTAACC...GGGTGGATTAGGAGTCATT5
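The figure above shows that only the 3' part of the forward primer anneals, while the "aa" tail becomes part of the product. This can be sketched in plain Python as shown below. This is not how pydna implements pcr(); the functions anneal_3prime and simulate_pcr are hypothetical, and the middle of the toy template is made up because the real sequence is elided in the figure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (case is preserved)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def anneal_3prime(primer, template, min_anneal):
    """Find the longest 3' part of `primer` that occurs in `template`.

    Returns (tail, position): the non-annealing 5' tail and the 0-based
    position of the first occurrence of the annealing part.
    """
    p, t = primer.upper(), template.upper()
    for n in range(len(p), min_anneal - 1, -1):
        pos = t.find(p[len(p) - n:])
        if pos != -1:
            return primer[:len(p) - n], pos
    raise ValueError("primer does not anneal to the template")


def simulate_pcr(fw, rv, template, min_anneal=12):
    """Return the top strand of the expected PCR product."""
    fw_tail, fw_pos = anneal_3prime(fw, template, min_anneal)
    # The reverse primer anneals to the top strand, so search for it
    # on the reverse complement of the template.
    rv_tail, rv_pos = anneal_3prime(rv, reverse_complement(template), min_anneal)
    end = len(template) - rv_pos  # end coordinate on the top strand
    if end <= fw_pos:
        raise ValueError("primers do not face each other")
    return fw_tail + template[fw_pos:end] + reverse_complement(rv_tail)


# Template ends are taken from the figure; the middle is an invented filler.
toy_middle = "GTCGACCTGCAGGCATGCAAGCTTGGCACTGGCC"
toy_template = "ATGGACACCAAGATTGG" + toy_middle + "CCCACCTAATCCTCAGTAA"

fw = "aa" + "ATGGACACCAAGATTGG"   # forward primer with the 5' tail
rv = "TTACTGAGGATTAGGTGGG"        # reverse primer, written 5' to 3'

product = simulate_pcr(fw, rv, toy_template)
print(len(toy_template), len(product))
```

The product is two bases longer than the template, which mirrors the step from the 1818 bp region to the 1820 bp amplicon.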



In [10]:

    
# read the cloning vector from a local file
pYPKa=read("pYPKa.gb")



In [11]:

    
# This is a GenbankFile object; its representation includes a link to the local file:
pYPKa









    Out[11]:




pYPKa.gb



In [12]:

    
# import the restriction enzyme AjiI from Biopython
from Bio.Restriction import AjiI



In [13]:

    
# cut the vector with the .linearize method. This will give an error if more than one 
# fragment is formed
pYPKa_AjiI = pYPKa.linearize(AjiI)



In [14]:

    
# The result from the digestion is a linear Dseqrecord object
pYPKa_AjiI









    Out[14]:





Dseqrecord(-3128)



In [15]:

    
# clone the PCR product by adding the linearized vector to the insert
# and close it using the .looped() method.
pYPKa_A_saat = ( pYPKa_AjiI + saat_pcr_prod ).looped()
pYPKa_A_saat









    Out[15]:





Dseqrecord(o4948)
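The size of the circular product is the sum of the blunt vector and the insert (3128 + 1820 = 4948). The sketch below illustrates this with plain strings, assuming blunt ends on both molecules. It also shows that a blunt insert can in principle ligate in either orientation, whereas the cell above simulates only one. The function blunt_clone and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (case is preserved)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def blunt_clone(linear_vector, insert):
    """Return both possible circular products of a blunt ligation.

    Each product is represented as a string whose last base is
    understood to be joined to its first base.
    """
    forward = linear_vector + insert
    reverse = linear_vector + reverse_complement(insert)
    return forward, reverse


toy_vector = "GTCTTAAGGCCTAAATTCCGGAC"  # invented blunt, linearized vector
toy_insert = "aaATGGACACCAAGTAA"        # invented blunt PCR product

forward, reverse = blunt_clone(toy_vector, toy_insert)

# Sizes reported in the notebook: linear vector plus amplicon
expected_size = 3128 + 1820

print(len(forward), len(reverse), expected_size)
```

Both orientations have the same length, so size alone cannot tell them apart.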



In [16]:

    
# read promoter vector from a local file
pYPKa_Z_prom = read("pYPKa_Z_TEF1.gb")
# read terminator vector from a local file
pYPKa_E_term = read("pYPKa_E_TPI1.gb")



In [17]:

    
pYPKa_Z_prom









    Out[17]:




pYPKa_Z_TEF1.gb



In [18]:

    
pYPKa_E_term









    Out[18]:




pYPKa_E_TPI1.gb



In [19]:

    
[pYPKa_Z_prom,pYPKa_Z_prom]









    Out[19]:





[File(-)(o3721), File(-)(o3721)]

In the cell below, the standard primers of the Yeast Pathway Kit are read into six sequence objects. These are similar to the primer objects created in cells [4] and [5].



In [20]:

    
# Standard primers
p567,p577,p468,p467,p568,p578  =  parse_primers('''

>567_pCAPsAjiIF (23-mer)
GTcggctgcaggtcactagtgag
>577_crp585-557 (29-mer)
gttctgatcctcgagcatcttaagaattc

>468_pCAPs_release_fw (25-mer)
gtcgaggaacgccaggttgcccact
>467_pCAPs_release_re (31-mer) 
ATTTAAatcctgatgcgtttgtctgcacaga

>568_pCAPsAjiIR (22-mer) 
GTGCcatctgtgcagacaaacg
>578_crp42-70 (29-mer)
gttcttgtctcattgccacattcataagt''')



In [21]:

    
p567









    Out[21]:





567_pCAPsAjiIF 23-mer:5'-GTcggctgcaggtca..gag-3'
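The primers above are given as FASTA-formatted text: a header line starting with ">" followed by the sequence. As a rough illustration of what reading such text involves, here is a minimal pure-Python parser. The function parse_fasta is hypothetical and does not reproduce what parse_primers returns in pydna, which are sequence objects rather than tuples.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) tuples.

    The name is the first word after '>'; blank lines are ignored.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].split()[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Two of the six primers from the cell above
primers = parse_fasta('''
>567_pCAPsAjiIF (23-mer)
GTcggctgcaggtcactagtgag

>568_pCAPsAjiIR (22-mer)
GTGCcatctgtgcagacaaacg''')

for name, seq in primers:
    print(name, len(seq))
```

The parsed lengths can be compared with the "(23-mer)" and "(22-mer)" annotations in the headers as a quick sanity check.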



In [22]:

    
# Promoter amplified using p577 and p567
p = pcr(p577, p567, pYPKa_Z_prom)



In [23]:

    
# Gene amplified using p468 and p467
g = pcr(p468, p467, pYPKa_A_saat)



In [24]:

    
# Terminator amplified using p568 and p578
t = pcr(p568, p578, pYPKa_E_term)



In [25]:

    
# Yeast backbone vector read from a local file
pYPKpw = read("pYPKpw.gb")



In [26]:

    
from Bio.Restriction import ZraI



In [27]:

    
# Vector linearized with ZraI
pYPKpw_lin = pYPKpw.linearize(ZraI)
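Linearizing a plasmid only makes sense if the enzyme cuts exactly once. The sketch below shows the idea with plain strings, assuming the ZraI recognition sequence GACGTC with a blunt cut in the middle (GAC^GTC). The functions and the toy plasmid are hypothetical; pydna and Biopython handle this through their own objects.

```python
def find_sites_circular(seq, site):
    """Return the 0-based start positions of `site` in a circular sequence.

    The sequence is extended by len(site) - 1 bases so that a site
    spanning the origin is also found. Only one strand is searched,
    which is sufficient for a palindromic site such as GACGTC.
    """
    s, site = seq.upper(), site.upper()
    extended = s + s[:len(site) - 1]
    return [i for i in range(len(s)) if extended.startswith(site, i)]


def linearize_at_unique_site(seq, site, cut_offset):
    """Open a circular sequence at its single recognition site.

    `cut_offset` is the number of bases from the start of the site to
    the blunt cut. A ValueError is raised unless there is exactly one site.
    """
    positions = find_sites_circular(seq, site)
    if len(positions) != 1:
        raise ValueError(f"expected exactly one site, found {len(positions)}")
    cut = (positions[0] + cut_offset) % len(seq)
    return seq[cut:] + seq[:cut]


toy_plasmid = "AATTCCGGACGTCTTAAGGCCTA"  # invented circular sequence, one GACGTC

linear = linearize_at_unique_site(toy_plasmid, "GACGTC", cut_offset=3)
print(linear)
```

The linear molecule starts with GTC and ends with GAC, the two halves of the recognition site.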



In [28]:

    
# Assembly simulation between four linear DNA fragments:
# plasmid, promoter, gene and terminator
# Only one circular product is formed (8769 bp)
asm = Assembly( (pYPKpw_lin, p, g, t) )



In [29]:

    
asm









    Out[29]:





Assembly
fragments..: 5603bp 811bp 1907bp 922bp
limit(bp)..: 25
G.nodes....: 8
algorithm..: common_sub_strings



In [30]:

    
# Inspect the only circular product
candidate = asm.assemble_circular()[0]
candidate.figure()









    Out[30]:





 -|pYPKpw_lin|124
|             \/
|             /\
|             124|811bp_PCR_prod|50
|                                \/
|                                /\
|                                50|1907bp_PCR_prod|37
|                                                   \/
|                                                   /\
|                                                   37|922bp_PCR_prod|242
|                                                                     \/
|                                                                     /\
|                                                                     242-
|                                                                        |
 ------------------------------------------------------------------------
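The figure above shows four fragments joined head to tail through shared sequences, with the last fragment overlapping the first to close the circle. A much simplified sketch of that idea follows. It assumes each overlap sits at the very ends of the fragments and that the fragments are already in the right order and orientation, which pydna's graph-based Assembly does not require. All names and toy sequences are hypothetical.

```python
def terminal_overlap(a, b, limit):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Only overlaps of at least `limit` bases count; otherwise 0 is returned.
    """
    for n in range(min(len(a), len(b)), limit - 1, -1):
        if a[len(a) - n:].upper() == b[:n].upper():
            return n
    return 0


def assemble_circular(fragments, limit):
    """Join ordered fragments through terminal overlaps and close the circle."""
    product = fragments[0]
    for previous, current in zip(fragments, fragments[1:]):
        n = terminal_overlap(previous, current, limit)
        if n == 0:
            raise ValueError("adjacent fragments share no terminal overlap")
        product += current[n:]
    # The end of the last fragment must overlap the start of the first one
    closing = terminal_overlap(fragments[-1], fragments[0], limit)
    if closing == 0:
        raise ValueError("the assembly cannot be circularized")
    return product[:len(product) - closing]


# Invented 6 bp overlaps (upper case) and fillers (lower case)
backbone   = "CATATG" + "a" * 8 + "ACGTAC"
promoter   = "ACGTAC" + "c" * 8 + "GGATCC"
gene       = "GGATCC" + "g" * 8 + "TTGCAA"
terminator = "TTGCAA" + "t" * 8 + "CATATG"

fragments = [backbone, promoter, gene, terminator]
circular = assemble_circular(fragments, limit=6)

total = sum(len(f) for f in fragments)
print(total, len(circular))
```

In this toy case each overlap is counted once in the product, so the circle is shorter than the summed fragment lengths by the total overlap length.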



In [31]:

    
# Synchronize vectors
pYPK0_TDH3_FaPDC_TEF1 = candidate.synced(pYPKa)
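A circular sequence has no natural start, so two files describing the same plasmid may begin at different positions. Synchronizing rotates one sequence so that it starts where a reference starts. The sketch below shows one simple way to do this with strings, assuming both sequences are in the same orientation and share the first bases of the reference. It is not a description of how pydna implements synced(); the function sync_rotation and the toy sequences are hypothetical.

```python
def sync_rotation(circular, reference, window=20):
    """Rotate `circular` so that it begins with the first `window` bases
    of `reference`. Raises ValueError if that probe is not found."""
    probe = reference[:window].upper()
    # Doubling the sequence lets the probe match across the origin
    doubled = (circular + circular).upper()
    pos = doubled.find(probe)
    if pos == -1 or pos >= len(circular):
        raise ValueError("reference start not found in circular sequence")
    return circular[pos:] + circular[:pos]


reference = "GATTACAGATTACCAGGTCCC"                         # invented
circular = "TTTTGGGG" + "GATTACAGATTACCAGGT" + "AAAA"       # invented

rotated = sync_rotation(circular, reference, window=12)
print(rotated)
```

Rotation changes only the starting point, so the length and base composition stay the same.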



In [32]:

    
# Write new vector to local file
pYPK0_TDH3_FaPDC_TEF1.write("pYPK0_TDH3_FaPDC_TPI1.gb")









    




pYPK0_TDH3_FaPDC_TPI1.gb

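The final vector, written to pYPK0_TDH3_FaPDC_TPI1.gb, has 8769 bp. Note that the TDH3 and FaPDC labels in the variable and file names do not match the parts used here: the construct assembled above carries the TEF1 promoter, the SAAT gene and the TPI1 terminator. The sequence can be inspected via the hyperlink above.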

The restriction enzyme PvuI cuts twice in the plasmid backbone and once in the SAAT gene.



In [33]:

    
from Bio.Restriction import PvuI



In [35]:

    
#PYTEST_VALIDATE_IGNORE_OUTPUT
%matplotlib inline

from pydna.gel import Gel, weight_standard_sample

standard = weight_standard_sample('1kb+_GeneRuler')

Gel( [ standard, 
       pYPKpw.cut(PvuI),
       pYPK0_TDH3_FaPDC_TEF1.cut(PvuI) ] ).run()

The gel above shows that the empty vector (pYPKpw) is easily distinguishable from the expected final construct by digestion with PvuI.
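The diagnostic digest works because a circular molecule cut at n sites yields n linear fragments: two for the empty vector and three for the construct, given the PvuI sites described above. A minimal sketch of the fragment-length calculation follows, assuming the palindromic PvuI recognition sequence CGATCG. The function and the toy sequences are hypothetical and much shorter than the real plasmids.

```python
def circular_fragment_lengths(seq, site):
    """Return the fragment lengths, longest first, after cutting a
    circular sequence at every occurrence of a palindromic `site`.

    The position of the cut within the site does not change the
    lengths, as long as it is the same at every site.
    """
    s, site = seq.upper(), site.upper()
    extended = s + s[:len(site) - 1]  # catch a site spanning the origin
    positions = [i for i in range(len(s)) if extended.startswith(site, i)]
    if not positions:
        return [len(s)]  # uncut: the molecule stays circular
    lengths = []
    for i, pos in enumerate(positions):
        nxt = positions[(i + 1) % len(positions)]
        # A single site gives a distance of 0, meaning the full circle
        lengths.append((nxt - pos) % len(s) or len(s))
    return sorted(lengths, reverse=True)


# Invented toy plasmids: two sites in the backbone, one more in the insert
toy_empty = "CGATCG" + "A" * 10 + "CGATCG" + "T" * 20
toy_construct = toy_empty + "GGGGG" + "CGATCG" + "CCCCC"

empty_fragments = circular_fragment_lengths(toy_empty, "CGATCG")
construct_fragments = circular_fragment_lengths(toy_construct, "CGATCG")

print(empty_fragments, construct_fragments)
```

The fragment lengths always add up to the size of the circle, and the extra site in the insert changes both the number and the sizes of the bands.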