Demonstration of the Dseqrecord object



In [1]:

    
from pydna.dseqrecord import Dseqrecord

A small Dseqrecord object can be created directly from a string. The Dseqrecord class is a double-stranded version of the Biopython SeqRecord class.



In [2]:

    
mysequence = Dseqrecord("GGATCCAAA")

The representation below indicates the size of the sequence and the fact that it is linear (the - symbol).



In [3]:

    
mysequence









    Out[3]:





Dseqrecord(-9)

The Dseqrecord class is, together with the Dseq class, the main pydna data type. The sequence itself is held by an internal Dseq object that is accessible through the .seq property:



In [4]:

    
mysequence.seq









    Out[4]:





Dseq(-9)
GGATCCAAA
CCTAGGTTT
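
The two lines printed under Dseq(-9) are the two strands: the lower line is the base-by-base complement of the upper one. A minimal sketch in plain Python, independent of pydna, reproduces that display. The helper name `complement` is hypothetical and chosen only for illustration; this is not how pydna builds its representation.

```python
# Illustrative only: a plain-Python view of the two-strand display.
# The name `complement` is a hypothetical helper, not part of pydna.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the base-by-base complement, aligned under the input strand."""
    return strand.upper().translate(COMPLEMENT)


watson = "GGATCCAAA"          # upper strand, written 5' -> 3'
crick = complement(watson)    # lower strand, written 3' -> 5'

print(watson)
print(crick)
```

Printing the two strings one above the other gives the same pair of lines as the Dseq output above.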



In [5]:

    
from pydna.readers import read

Dseqrecords can be read from local files in several formats:



In [6]:

    
read_from_fasta = read("fastaseq.fasta")



In [7]:

    
read_from_gb = read("gbseq.gb")



In [8]:

    
read_from_embl = read("emblseq.emb")

The sequence files above all contain the same sequence. We can print each sequence through its .seq property.



In [9]:

    
print(read_from_fasta.seq)
print(read_from_gb.seq)
print(read_from_embl.seq)









    



GGATCCAAA
GGATCCAAA
GGATCCAAA

We can also read from a string defined directly in the code:



In [10]:

    
read_from_string = read('''

>seq_from_string
GGATCCAAA

''')
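
The string above is in FASTA format: a header line starting with ">" that carries the sequence name, followed by the sequence itself. As a rough illustration of what such text contains, here is a minimal standard-library parser. The function `parse_fasta` is a hypothetical helper written for this sketch; it is not pydna's reader and handles only the simple layout shown here.

```python
# Illustrative only: a tiny FASTA parser using the standard library.
# `parse_fasta` is a hypothetical helper, not part of pydna.
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # ignore blank lines
        if line.startswith(">"):
            if name is not None:          # emit the previous record
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)           # sequence may span several lines
    if name is not None:
        yield name, "".join(chunks)


fasta_text = '''

>seq_from_string
GGATCCAAA

'''

records = list(parse_fasta(fasta_text))
print(records)
```

The leading and trailing blank lines in the string are skipped, so the text yields a single record.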

We can also download sequences from GenBank if we know the accession number. The plasmid pUC19 has the accession number L09137. We have to give pydna a valid email address before we use GenBank in this way. Please change the email address to your own when executing this script. GenBank needs to be able to contact its users if there is a problem.



In [11]:

    
from pydna.genbank import Genbank



In [12]:

    
gb = Genbank("bjornjobb@gmail.com")
pUC19 = gb.nucleotide("L09137")

This molecule is circular, so the representation below begins with an "o". The size is 2686 bp.



In [13]:

    
pUC19









    Out[13]:




L09137



In [14]:

    
from pydna.download import download_text

We can also read sequences remotely from other websites, for example this sequence for YEplac181:



In [15]:

    
text = download_text("https://gist.githubusercontent.com/BjornFJohansson/e445e5039d61bdcdf933c435438b4585/raw/a6d57a8d5cffcbf0ab76307c82746e5b7265d0c8/YEPlac181snapgene.gb")



In [16]:

    
YEplac181 = read(text)



In [17]:

    
YEplac181









    Out[17]:





Dseqrecord(o5741)

Dseqrecord supports the same kind of digestion and ligation functionality as shown for Dseq.



In [18]:

    
from Bio.Restriction import BamHI



In [19]:

    
a, b = mysequence.cut(BamHI)



In [20]:

    
a









    Out[20]:





Dseqrecord(-5)



In [21]:

    
b









    Out[21]:





Dseqrecord(-8)



In [22]:

    
a.seq









    Out[22]:





Dseq(-5)
G
CCTAG



In [23]:

    
b.seq









    Out[23]:





Dseq(-8)
GATCCAAA
    GTTT
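
The staggered strands in the two fragments above follow from the way BamHI cuts its recognition site GGATCC: after the first G on the upper strand and four nucleotides further along on the lower strand, leaving a GATC overhang. The sketch below reproduces the fragment strands in plain Python under that assumption. The function `bamhi_fragments` and the constants are hypothetical names used for illustration; this is not the implementation used by pydna or Biopython, and it handles only the first site on a linear sequence.

```python
# Illustrative only: where BamHI cuts each strand of a linear sequence.
# `bamhi_fragments` and the constants are hypothetical, not pydna's API.
SITE = "GGATCC"     # BamHI recognition site
TOP_CUT = 1         # upper strand is cut after the first G (G^GATCC)
BOTTOM_CUT = 5      # lower strand is cut four nucleotides further along

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def bamhi_fragments(top):
    """Cut at the first BamHI site.

    Returns two (upper, lower) strand pairs. Lower strands are written
    3' -> 5', so they line up under the upper strand.
    """
    bottom = top.translate(COMPLEMENT)
    start = top.find(SITE)
    if start == -1:
        raise ValueError("no BamHI site found")
    left = (top[:start + TOP_CUT], bottom[:start + BOTTOM_CUT])
    right = (top[start + TOP_CUT:], bottom[start + BOTTOM_CUT:])
    return left, right


left, right = bamhi_fragments("GGATCCAAA")
print(left)
print(right)
```

The strand pairs match the two Dseq displays above: a short upper strand over a longer lower strand in the first fragment, and the reverse in the second.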



In [24]:

    
a+b









    Out[24]:





Dseqrecord(-9)
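
Adding the two fragments restores the original 9 bp molecule because their single-stranded overhangs are complementary. A minimal sketch of that compatibility check follows, in plain Python. The function `ligate` is a hypothetical helper; it assumes the left fragment has a protruding lower strand and the right fragment a protruding upper strand, as in the BamHI fragments above, and it is not how pydna implements the + operator.

```python
# Illustrative only: joining two fragments with complementary sticky ends.
# `ligate` is a hypothetical helper, not part of pydna.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def ligate(left, right):
    """Join two (upper, lower) strand pairs if their overhangs pair up.

    Lower strands are written 3' -> 5', aligned under the upper strand.
    """
    left_top, left_bottom = left
    right_top, right_bottom = right
    # Single-stranded tail of the left fragment (lower strand).
    left_tail = left_bottom[len(left_top):]
    # Single-stranded tail of the right fragment (upper strand).
    right_tail = right_top[:len(right_top) - len(right_bottom)]
    if left_tail.translate(COMPLEMENT) != right_tail:
        raise ValueError("sticky ends are not compatible")
    return left_top + right_top, left_bottom + right_bottom


a_strands = ("G", "CCTAG")          # strands of fragment a, from Out[22]
b_strands = ("GATCCAAA", "GTTT")    # strands of fragment b, from Out[23]

joined = ligate(a_strands, b_strands)
print(joined)
```

The joined strands are the same as those of the original mysequence.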

Finally, we can save a Dseqrecord to a local file. The default format is GenBank.



In [25]:

    
mysequence.write("new_sequence.gb")









    




new_sequence.gb