Mumberg 32 expression vectors

This pydna notebook automatically assembles the sequences of the 32 expression vectors described in:

Mumberg,D., Müller,R. and Funk,M. (1995) Yeast vectors for the controlled expression of heterologous proteins in different genetic backgrounds. Gene, 156, 119–122.

These plasmids were made from the pRS series of yeast vectors.

An XhoI-KpnI fragment containing the ScCYC1 terminator was cloned into the same sites of pRS416, resulting in p416CYC1t.

PCR-generated SacI-XbaI fragments carrying the ScTDH3, ScTEF1, ScADH1 and ScCYC1 promoters were cloned into the same sites of p416CYC1t, resulting in:

p416GPD  sequenced; SacI, XbaI, XhoI and KpnI are unique
p416TEF  sequenced; SacI, XbaI, XhoI and KpnI are unique
p416ADH
p416CYC  sequenced; SacI, XbaI and KpnI are unique

The SacI-KpnI promoter + terminator fragment was then transferred to the other pRS vectors using a strategy that is not disclosed in the paper.

ScTDH3pr-ScCYC1t
ScTEF1pr-ScCYC1t
ScADH1pr-ScCYC1t
ScCYC1pr-ScCYC1t

SacI and KpnI were probably used. pRS423 and pRS425 each contain two KpnI sites, so a partial digest may have been needed for these two vectors.
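Whether a partial digest would be needed depends on how many KpnI sites a backbone carries. The sketch below counts a palindromic recognition site on a circular sequence using only the standard library. The helper name `count_sites_circular` and the toy sequence are illustrative assumptions; they are not part of pydna and not taken from the pRS sequences.

```python
def count_sites_circular(seq, site):
    """Count occurrences of a palindromic recognition site on a circular sequence.

    The sequence is extended by len(site) - 1 bases so that a site spanning
    the arbitrary start/end of the circle is also found.
    """
    seq = seq.upper()
    site = site.upper()
    extended = seq + seq[: len(site) - 1]
    count = 0
    start = extended.find(site)
    while start != -1:
        count += 1
        start = extended.find(site, start + 1)
    return count


KPNI = "GGTACC"

# Toy circular sequence (hypothetical): one internal KpnI site and one
# that is split across the origin ("...GG" at the end, "TACC..." at the start).
toy_plasmid = "TACCAAAAGGTACCTTTTGG"

linear_count = toy_plasmid.count(KPNI)
circular_count = count_sites_circular(toy_plasmid, KPNI)
print(linear_count, circular_count)
```

A count above one for an enzyme that is supposed to open the vector only once is the situation described for pRS423 and pRS425.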

                        this part is the opposite direction in the paper (this version here is correct!)
                           <--------------->
mcs:
     7 SpeI      19 SmaI     31 EcoRI    43 HindIII           64 XhoI
1 XbaI      13 BamHI    25 PstI     37 EcoRV    49 ClaI  58 SalI
|     |     |     |     |     |     |     |     |        |     |     
TCTAGAACTAGTGGATCCCCCGGGCTGCAGGAATTCGATATCAAGCTTATCGATACCGTCGACCTCGAG
TCTAGAACTAGTGGATCCCCCGGGCTGCAGC...|
                          ---------
                          |...GAATTCGATATCAAGCTTATCGATACCGTCG               



TCTAGAACTAGTGGATCCCCCGGGCTGCAGAatggttttcggtaacaggca

       ccgttcggtctttagatagttaaGAATTCGATATCAAGCTTATCGATACCGTCGACCTCGAG

>primerfw
TCTAGAACTAGTGGATCCCCCGGGCTGCAGA...
>primerrev
CGACGGTATCGATAAGCTTGATATCGAATTC...
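The coordinates in the MCS map above can be re-derived from the sequence itself. The following is a small standard-library sketch; the recognition sequences are the standard ones for these enzymes, while the helper name `site_positions` is an assumption made for illustration.

```python
# The multiple cloning site exactly as written in the map above.
MCS = "TCTAGAACTAGTGGATCCCCCGGGCTGCAGGAATTCGATATCAAGCTTATCGATACCGTCGACCTCGAG"

RECOGNITION_SITES = {
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "BamHI": "GGATCC",
    "SmaI": "CCCGGG",
    "PstI": "CTGCAG",
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "HindIII": "AAGCTT",
    "ClaI": "ATCGAT",
    "SalI": "GTCGAC",
    "XhoI": "CTCGAG",
}


def site_positions(seq, site):
    """Return the 1-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)
    return positions


positions = {name: site_positions(MCS, site) for name, site in RECOGNITION_SITES.items()}

# Print the enzymes in the order in which they appear along the MCS.
for name, found in sorted(positions.items(), key=lambda item: item[1]):
    print(name, found)
```

Each enzyme is found once, at the position shown in the map.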

            XbaI SpeI BamHI SmaI PstI EcoRI EcoRV HindIII ClaI SalI XhoI

p413 HIS3   XbaI SpeI BamHI SmaI      EcoRI EcoRV         ClaI SalI XhoI
p414 TRP1   XbaI SpeI BamHI SmaI PstI EcoRI               ClaI SalI XhoI
p415 LEU2   XbaI SpeI BamHI SmaI PstI             HindIII      SalI XhoI
p416 URA3   XbaI SpeI BamHI SmaI      EcoRI       HindIII ClaI SalI XhoI

p423 HIS3        SpeI BamHI SmaI      EcoRI EcoRV         ClaI SalI XhoI
p424 TRP1        SpeI BamHI SmaI PstI EcoRI               ClaI SalI XhoI
p425 LEU2        SpeI BamHI SmaI PstI             HindIII      SalI XhoI
p426 URA3        SpeI BamHI SmaI      EcoRI       HindIII ClaI SalI XhoI

XbaI is unique only in the p41XXXX vectors (all centromeric vectors, but none of the episomal ones).

XhoI is not unique in the p4XXCYC vectors (any vector carrying the ScCYC1 promoter).
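The table of sites can be queried programmatically, for example to find the sites that are usable in every backbone. This is a sketch that transcribes the table above into Python sets, reading each row as the set of usable MCS sites for that backbone (an interpretation, since the table carries no caption). The dictionary name `USABLE_SITES` is hypothetical.

```python
# Sites listed for the centromeric backbones in the table above.
USABLE_SITES = {
    "p413": {"XbaI", "SpeI", "BamHI", "SmaI", "EcoRI", "EcoRV", "ClaI", "SalI", "XhoI"},
    "p414": {"XbaI", "SpeI", "BamHI", "SmaI", "PstI", "EcoRI", "ClaI", "SalI", "XhoI"},
    "p415": {"XbaI", "SpeI", "BamHI", "SmaI", "PstI", "HindIII", "SalI", "XhoI"},
    "p416": {"XbaI", "SpeI", "BamHI", "SmaI", "EcoRI", "HindIII", "ClaI", "SalI", "XhoI"},
}

# In the table, each episomal row equals its centromeric counterpart minus XbaI.
for centromeric, episomal in (("p413", "p423"), ("p414", "p424"),
                              ("p415", "p425"), ("p416", "p426")):
    USABLE_SITES[episomal] = USABLE_SITES[centromeric] - {"XbaI"}

common_all = set.intersection(*USABLE_SITES.values())
common_centromeric = set.intersection(
    *(USABLE_SITES[name] for name in ("p413", "p414", "p415", "p416"))
)

print(sorted(common_all))
print(sorted(common_centromeric))
```

The table describes the backbones only; the extra XhoI site introduced by the ScCYC1 promoter is not captured by this lookup.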

M#  vector          ori         marker
--- -------         ----------  ------
M1  p413GPD         CEN6/ARSH4  HIS3
M2  p423GPD          2 um       HIS3
M3  p414GPD         CEN6/ARSH4  TRP1
M4  p424GPD          2 um       TRP1
M5  p415GPD         CEN6/ARSH4  LEU2
M6  p425GPD          2 um       LEU2
M7  p416GPD         CEN6/ARSH4  URA3 sequenced http://www.ncbi.nlm.nih.gov/nuccore/82548122
M8  p426GPD          2 um       URA3 sequenced http://www.ncbi.nlm.nih.gov/nuccore/76496266
M9  p413TEF         CEN6/ARSH4  HIS3
M10 p423TEF          2 um       HIS3
M11 p414TEF         CEN6/ARSH4  TRP1
M12 p424TEF          2 um       TRP1
M13 p415TEF         CEN6/ARSH4  LEU2
M14 p425TEF          2 um       LEU2
M15 p416TEF         CEN6/ARSH4  URA3 sequenced http://www.ncbi.nlm.nih.gov/nuccore/124365231
M16 p426TEF          2 um       URA3 assemb
M17 p413ADH         CEN6/ARSH4  HIS3
M18 p423ADH          2 um       HIS3
M19 p414ADH         CEN6/ARSH4  TRP1
M20 p424ADH          2 um       TRP1
M21 p415ADH         CEN6/ARSH4  LEU2
M22 p425ADH          2 um       LEU2 <- bad, no longer sold by ATCC
M23 p416ADH         CEN6/ARSH4  URA3
M24 p426ADH          2 um       URA3
M25 p413CYC         CEN6/ARSH4  HIS3
M26 p423CYC          2 um       HIS3
M27 p414CYC         CEN6/ARSH4  TRP1
M28 p424CYC          2 um       TRP1
M29 p415CYC         CEN6/ARSH4  LEU2
M30 p425CYC          2 um       LEU2
M31 p416CYC         CEN6/ARSH4  URA3 sequenced
M32 p426CYC          2 um       URA3
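The vector names in the table follow a regular pattern: `p4`, a digit for the origin (1 = CEN6/ARSH4, 2 = 2 um), a digit for the marker (3 = HIS3, 4 = TRP1, 5 = LEU2, 6 = URA3) and the promoter name. The sketch below regenerates the 32 rows from that pattern; the function name `mumberg_table` is an assumption made for illustration.

```python
PROMOTERS = ["GPD", "TEF", "ADH", "CYC"]
MARKERS = {3: "HIS3", 4: "TRP1", 5: "LEU2", 6: "URA3"}
ORIGINS = {1: "CEN6/ARSH4", 2: "2 um"}


def mumberg_table():
    """Return (M number, vector name, origin, marker) tuples in table order."""
    rows = []
    number = 1
    for promoter in PROMOTERS:
        for marker_digit, marker in MARKERS.items():
            for origin_digit, origin in ORIGINS.items():
                name = f"p4{origin_digit}{marker_digit}{promoter}"
                rows.append((f"M{number}", name, origin, marker))
                number += 1
    return rows


rows = mumberg_table()
for row in rows:
    print("{:<4}{:<16}{:<12}{}".format(*row))
```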

In [1]:
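from pydna.all import *  # provides Genbank, Dseqrecord and eq, which are used below

# Genbank takes a contact email address for the NCBI services
gb = Genbank("bjornjobb@gmail.com")

# The DNA fragments below were obtained from the Saccharomyces Genome Database
# www.sgd.org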


cyc1term =Dseqrecord("ctcgagtcatgtaattagttatgtcacgcttacattcacgccctccccccacat"
                            "ccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctattt"
                            "atttttttatagttatgttagtattaagaacgttatttatatttcaaatttttc"
                            "ttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgc"
                            "ttgagaaggttttgggacgctcgaaggctttaatttgcggccggtaccn")

tefprom =Dseqrecord("gagctcatagcttcaaaatgtttctactccttttttactcttccagattttctcgga"
                           "ctccgcgcatcgccgtaccacttcaaaacacccaagcacagcatactaaatttcccc"
                           "tctttcttcctctagggtgtcgttaattacccgtactaaaggtttggaaaagaaaaa"
                           "agagaccgcctcgtttctttttcttcgtcgaaaaaggcaataaaaatttttatcacg"
                           "tttctttttcttgaaaattttttttttgatttttttctctttcgatgacctcccatt"
                           "gatatttaagttaataaacggtcttcaatttctcaagtttcagtttcatttttcttg"
                           "ttctattacaactttttttacttcttgctcattagaaagaaagcatagcaatctaat"
                           "ctaagttttctaga")

cycprom =Dseqrecord("gagctcatttggcgagcgttggttggtggatcaagcccacgcgtaggcaatcctcga"
                           "gcagatccgccaggcgtgtatatatagcgtggatggccaggcaactttagtgctgac"
                           "acatacaggcatatatatatgtgtgcgacgacacatgatcatatggcatgcatgtgc"
                           "tctgtatgtatataaaactcttgttttcttcttttctctaaatattctttccttata"
                           "cattaggacctttgcagcataaattactatacttctatagacacgcaaacacaaata"
                           "cacacactaatctaga")

adhprom =Dseqrecord("gagctcTAAAACAAGAAGAGGGTTGACTACATCACGATGAGGGGGATCGAAGAAATG"
                           "ATGGTAAATGAAATAGGAAATCAAGGAGCATGAAGGCAAAAGACAAATATAAGGGTC"
                           "GAACGAAAAATAAAGTGAAAAGTGTTGATATGATGTATTTGGCTTTGCGGCGCCGAA"
                           "AAAACGAGTTTACGCAATTGCACAATCATGCTGACTCTGTGGCGGACCCGCGCTCTT"
                           "GCCGGCCCGGCGATAACGCTGGGCGTGAGGCTGTGCCCGGCGGAGTTTTTTGCGCCT"
                           "GCATTTTCCAAGGTTTACCCTGCGCTAAGGGGCGAGATTGGAGAAGCAATAAGAATG"
                           "CCGGTTGGGGTTGCGATGATGACGACCACGACAACTGGTGTCATTATTTAAGTTGCC"
                           "GAAAGAACCTGAGTGCATTTGCAACATGAGTATACTAGAAGAATGAGCCAAGACTTG"
                           "CGAGACGCGAGTTTGCCGGTGGTGCGAACAATAGAGCGACCATGACCTTGAAGGTGA"
                           "GACGCGCATAACCGCTAGAGTACTTTGAAGAGGAAACAGCAATAGGGTTGCTACCAG"
                           "TATAAATAGACAGGTACATACAACACTGGAAATGGTTGTCTGTTTGAGTACGCTTTC"
                           "AATTCATTTGGGTGTGCACTTTATTATGTTACAATATGGAAGGGAACTTTACACTTC"
                           "TCCTATGCACATATATTAATTAAAGTCCAATGCTAGTAGAGAAGGGGGGTAACACCC"
                           "CTCCGCGCTCTTTTCCGATTTTTTTCTAAACCGTGGAATATTTCGGATATCCTTTTG"
                           "TTGTTTCCGGGTGTACAATATGGACTTCCTCTTTTCTGGCAACCAAACCCATACATC"
                           "GGGATTCCTATAATACCTTCGTTGGTCTCCCTAACATGTAGGTGGCGGAGGGGAGAT"
                           "ATACAATAGAACAGATACCAGACAAGACATAATGGGCTAAACAAGACTACACCAATT"
                           "ACACTGCCTCATTGATGGTGGTACATAACGAACTAATACTGTAGCCCTAGACTTGAT"
                           "AGCCATCATCATATCGAAGTTTCACTACCCTTTTTCCATTTGCCATCTATTGAAGTA"
                           "ATAATAGGCGCATGCAACTTCTTTTCTTTTTTTTTCTTTTCTCTCTCCCCCGTTGTT"
                           "GTCTCACCATATCCGCAATGACAAAAAAATGATGGAAGACACTAAAGGAAAAAATTA"
                           "ACGACAAAGACAGCACCAACAGATGTCGTTGTTCCAGAGCTGATGAGGGGTATCTCG"
                           "AAGCACACGAAACTTTTTCCTTCCTTCATTCACGCACACTACTCTCTAATGAGCAAC"
                           "GGTATACGGCCTTCCTTCCAGTTACTTGAATTTGAAATAAAAAAAAGTTTGCTGTCT"
                           "TGCTATCAAGTATAAATAGACCTGCAATTATTAATCTTTTGTTTCCTCGTCATTGTT"
                           "CTCGTTCCCTTTCTTCCTTGTTTCTTTTTCTGCACAATATTTCAAGCTATACCAAGC"
                           "ATACAATCAACTATCTCATATACAtctaga")

gpdprom =Dseqrecord("gagctcagtttatcattatcaatactcgccatttcaaagaatacgtaaataattaat"
                           "agtagtgattttcctaactttatttagtcaaaaaattagccttttaattctgctgta"
                           "acccgtacatgcccaaaatagggggcgggttacacagaatatataacatcgtaggtg"
                           "tctgggtgaacagtttattcctggcatccactaaatataatggagcccgctttttaa"
                           "gctggcatccagaaaaaaaaagaatcccagcaccaaaatattgttttcttcaccaac"
                           "catcagttcataggtccattctcttagcgcaactacagagaacaggggcacaaacag"
                           "gcaaaaaacgggcacaacctcaatggagtgatgcaacctgcctggagtaaatgatga"
                           "cacaaggcaattgacccacgcatgtatctatctcattttcttacaccttctattacc"
                           "ttctgctctctctgatttggaaaaagctgaaaaaaaaggttgaaaccagttccctga"
                           "aattattcccctacttgactaataagtatataaagacggtaggtattgattgtaatt"
                           "ctgtaaatctatttcttaaacttcttaaattctacttttatagttagtctttttttt"
                           "agttttaaaacaccagaacttagtttcgacggattctaga")

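The promoter fragments defined above are all written with a SacI site at the 5' end and an XbaI site at the 3' end. The remark that XhoI is not unique in the ScCYC1 promoter vectors can also be checked directly on `cycprom`. The sketch below works on a plain string copy of that fragment and uses only the standard library; the variable names are illustrative.

```python
# Plain-string copy of the cycprom fragment defined in the cell above.
cycprom_seq = (
    "gagctcatttggcgagcgttggttggtggatcaagcccacgcgtaggcaatcctcga"
    "gcagatccgccaggcgtgtatatatagcgtggatggccaggcaactttagtgctgac"
    "acatacaggcatatatatatgtgtgcgacgacacatgatcatatggcatgcatgtgc"
    "tctgtatgtatataaaactcttgttttcttcttttctctaaatattctttccttata"
    "cattaggacctttgcagcataaattactatacttctatagacacgcaaacacaaata"
    "cacacactaatctaga"
).upper()

SACI, XBAI, XHOI = "GAGCTC", "TCTAGA", "CTCGAG"

starts_with_saci = cycprom_seq.startswith(SACI)
ends_with_xbai = cycprom_seq.endswith(XBAI)
has_internal_xhoi = XHOI in cycprom_seq

print(starts_with_saci, ends_with_xbai, has_internal_xhoi)
```

An XhoI site inside the promoter adds to the XhoI site of the polylinker, which explains why XhoI cannot be used as a unique cloning site in the CYC vectors.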
In [2]:
from Bio.Restriction import XhoI, KpnI, SacI, XbaI

# The Mumberg vectors depend on these pRS plasmids

pRS413 = gb.nucleotide("U03447")
pRS414 = gb.nucleotide("U03448")
pRS415 = gb.nucleotide("U03449")
pRS416 = gb.nucleotide("U03450")

pRS423 = gb.nucleotide("U03454")
pRS424 = gb.nucleotide("U03453")
pRS425 = gb.nucleotide("U03452")
pRS426 = gb.nucleotide("U03451")

In [3]:
# The CYC1 terminator was cloned in pRS416 using XhoI and KpnI

p416cyc1term = ( pRS416.cut(XhoI, KpnI)[0].rc() + cyc1term.cut(XhoI, KpnI)[1] ).looped()

In [4]:
p416cyc1term


Out[4]:
Dseqrecord(o5141)
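The index `[1]` in `cyc1term.cut(XhoI, KpnI)[1]` selects the middle fragment of a linear digest: one short piece is released before the XhoI cut, one after the KpnI cut, and the insert lies between them. The sketch below illustrates this on the top strand only, with a toy sequence that mimics the layout of `cyc1term` (XhoI site at the start, KpnI site followed by one extra base at the end). It does not model overhangs and is not how pydna is implemented; `digest_linear` and the toy sequence are assumptions.

```python
# Recognition site and top-strand cut offset: XhoI cuts C^TCGAG, KpnI cuts GGTAC^C.
ENZYMES = {"XhoI": ("CTCGAG", 1), "KpnI": ("GGTACC", 5)}


def digest_linear(seq, enzyme_names):
    """Split a linear sequence at the top-strand cut positions of the enzymes."""
    upper = seq.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        start = upper.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = upper.find(site, start + 1)
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Toy stand-in for cyc1term (hypothetical sequence, same site layout).
toy_terminator = "ctcgag" + "atatat" + "ggtaccn"

fragments = digest_linear(toy_terminator, ["XhoI", "KpnI"])
print(fragments)
```

With this layout the digest gives three pieces and the insert is the second one, which matches the use of index 1 in the cell above.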

In [5]:
# The p416cyc1term was opened using SacI and XbaI

stuffer, bb = p416cyc1term.cut(SacI, XbaI)
bb, stuffer


Out[5]:
(Dseqrecord(-5121), Dseqrecord(-28))

In [6]:
# Each promoter was cloned in front of the CYC1 terminator


p416GPD = ( bb + gpdprom.cut(SacI, XbaI)[1] ).looped().synced("gacgaaagggcctcgtgatacgccta")
p416TEF = ( bb + tefprom.cut(SacI, XbaI)[1] ).looped().synced("gacgaaagggcctcgtgatacgccta")
p416ADH = ( bb + adhprom.cut(SacI, XbaI)[1] ).looped().synced("gacgaaagggcctcgtgatacgccta")
p416CYC = ( bb + cycprom.cut(SacI, XbaI)[1] ).looped().synced("gacgaaagggcctcgtgatacgccta")
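Each plasmid above is passed through `.synced(...)` with the same reference sequence, so that all circular sequences start at the same point and the resulting files are easy to compare. The idea can be sketched on a plain string as a rotation of the circle. This sketch ignores the reverse strand and is not pydna's implementation; `rotate_to` is a hypothetical helper.

```python
def rotate_to(circular_seq, reference):
    """Rotate a circular sequence so that it starts with the reference motif.

    The sequence is doubled so that a motif spanning the current start/end
    of the circle is found as well.
    """
    doubled = circular_seq + circular_seq
    index = doubled.find(reference)
    if index == -1 or index >= len(circular_seq):
        raise ValueError("reference not found in circular sequence")
    return circular_seq[index:] + circular_seq[:index]


# Toy example: the motif "AAC" spans the end and the start of the string.
toy_plasmid = "CCTAGGAA"
rotated = rotate_to(toy_plasmid, "AAC")
print(rotated)
```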

In [7]:
# Write the first four plasmids to files

p416GPD.write("p416GPD.gb")
p416TEF.write("p416TEF.gb")
p416ADH.write("p416ADH.gb")
p416CYC.write("p416CYC.gb")





In [8]:
# Cut out each promoter + terminator cassette with SacI and KpnI

gpdcyc  = p416GPD.cut(SacI, KpnI)[0]
gpdcyc.name = "GPD"
tefcyc  = p416TEF.cut(SacI, KpnI)[0]
tefcyc.name = "TEF"
adhcyc  = p416ADH.cut(SacI, KpnI)[0]
adhcyc.name = "ADH"
cyccyc  = p416CYC.cut(SacI, KpnI)[0]
cyccyc.name = "CYC"

In [9]:
from IPython.display import FileLink

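cassettes = [ gpdcyc, tefcyc, adhcyc, cyccyc ]

# pRS416 is left out on purpose: the four p416 plasmids were already built above,
# so 7 backbones x 4 cassettes = 28 plasmids are made here (32 in total).
pRS       = [ pRS413, pRS423, pRS414, pRS424, pRS415, pRS425, pRS426 ]

for cassette in cassettes:
    for v in pRS:
        # Replace the SacI-KpnI region of the backbone with the cassette
        pl = (v.cut(SacI, KpnI)[0].rc() + cassette ).looped().synced("gacgaaagggcctcgtgatacgccta")
        # e.g. pRS413 + GPD -> p413GPD.gb
        new_name = "p{no}{prom}.gb".format(no = v.name[3:], prom = cassette.name)
        pl.write(new_name)
        # Keep p426GPD for the comparison with the Genbank record below
        if new_name == "p426GPD.gb":
            p426GPD = pl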





In [10]:
# Download the four vector sequences that are available from Genbank

p416GPD_genbank = gb.nucleotide("DQ269148")
p426GPD_genbank = gb.nucleotide("DQ019861").looped() # This Genbank record is linear although it describes a circular plasmid
p416TEF_genbank = gb.nucleotide("EF210199")
p416CYC_genbank = gb.nucleotide("EF210198")

In [11]:
# verify that the sequences made here and the ones 
# available from genbank are equivalent

assert eq(p416GPD_genbank, p416GPD)
assert eq(p416TEF_genbank, p416TEF)
assert eq(p416CYC_genbank, p416CYC)
assert eq(p426GPD_genbank, p426GPD)

print("done!")


done!
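The `eq` checks above treat two plasmid sequences as equivalent even if they start at different positions or are written on opposite strands. A minimal sketch of such a comparison for plain strings is given below. It only illustrates the idea and is not pydna's implementation of `eq`; `circular_equal` and `revcomp` are hypothetical helpers.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def circular_equal(a, b):
    """True if circular sequence b is a rotation of a or of a's reverse complement."""
    a, b = a.upper(), b.upper()
    if len(a) != len(b):
        return False
    doubled = a + a  # every rotation of a is a substring of a + a
    return b in doubled or revcomp(b) in doubled


plasmid = "AACGTTTG"
same_rotated = plasmid[3:] + plasmid[:3]   # same circle, different start
same_other_strand = revcomp(plasmid)       # same circle, other strand
different = "AACGTTTC"                     # one base changed

print(circular_equal(plasmid, same_rotated))
print(circular_equal(plasmid, same_other_strand))
print(circular_equal(plasmid, different))
```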