Demonstration of the Dseqrecord object


In [1]:
from pydna.dseqrecord import Dseqrecord

A small Dseqrecord object can be created directly from a string. The Dseqrecord class is a double-stranded version of the Biopython SeqRecord class.


In [2]:
mysequence = Dseqrecord("GGATCCAAA")

The representation below indicates the size of the sequence (9 bp) and the fact that it is linear (the "-" symbol).


In [3]:
mysequence


Out[3]:
Dseqrecord(-9)

The Dseqrecord class is the main pydna data type together with the Dseq class. The sequence information is actually held by an internal Dseq object that is accessible from the .seq property:


In [4]:
mysequence.seq


Out[4]:
Dseq(-9)
GGATCCAAA
CCTAGGTTT
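
The two lines printed above are the upper strand and its base-paired lower strand. As a rough illustration of that relationship, the pure-Python sketch below derives the lower strand from the upper one. The names COMPLEMENT, lower_strand and reverse_complement are made up for this example and are not part of pydna, which handles this internally in its own way.

```python
# Illustrative sketch only; not how pydna implements Dseq.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def lower_strand(upper):
    """Return the complementary strand written 3'->5', so that it
    lines up base by base underneath the upper strand."""
    return upper.translate(COMPLEMENT)


def reverse_complement(upper):
    """Return the complementary strand written in the usual 5'->3' direction."""
    return lower_strand(upper)[::-1]


upper = "GGATCCAAA"
print(upper)                 # upper strand, 5'->3'
print(lower_strand(upper))   # lower strand, aligned under the upper strand
```

Printing the two strands this way gives the same two-line picture as the Dseq representation.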

In [5]:
from pydna.readers import read

Dseqrecords can be read from local files in several formats:


In [6]:
read_from_fasta = read("fastaseq.fasta")

In [7]:
read_from_gb = read("gbseq.gb")

In [8]:
read_from_embl = read("emblseq.emb")

The sequence files above all contain the same sequence. We can print the sequence through the .seq property.


In [9]:
print(read_from_fasta.seq)
print(read_from_gb.seq)
print(read_from_embl.seq)


GGATCCAAA
GGATCCAAA
GGATCCAAA

We can also read from a string defined directly in the code:


In [10]:
read_from_string = read('''

>seq_from_string
GGATCCAAA

''')
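
To make clearer what a FASTA-formatted string like the one above contains, here is a minimal pure-Python sketch that splits such text into (name, sequence) pairs. The function parse_fasta is hypothetical and written only for this example. It is not the parser pydna uses, and it ignores everything a real parser has to handle (other formats, annotations, topology).

```python
def parse_fasta(text):
    """Split FASTA-formatted text into a list of (name, sequence) tuples.

    Blank lines are skipped, and sequence lines belonging to the same
    record are joined together.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if name is not None:          # close the previous record
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)           # sequence line of current record
    if name is not None:                  # close the last record
        records.append((name, "".join(chunks)))
    return records


print(parse_fasta('''

>seq_from_string
GGATCCAAA

'''))
```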

We can also download sequences from GenBank if we know the accession number. The plasmid pUC19 has the accession number L09137. We have to give pydna a valid email address before we use GenBank in this way. Please change the email address to your own when executing this script. GenBank needs to be able to contact its users if there is a problem.


In [11]:
from pydna.genbank import Genbank

In [12]:
gb = Genbank("bjornjobb@gmail.com")
pUC19 = gb.nucleotide("L09137")

This molecule is circular, so the representation below begins with an "o". The size is 2686 bp.


In [13]:
pUC19


Out[13]:

In [14]:
from pydna.download import download_text

We can also read sequences remotely from other web sites, for example this sequence for YEplac181:


In [15]:
text = download_text("https://gist.githubusercontent.com/BjornFJohansson/e445e5039d61bdcdf933c435438b4585/raw/a6d57a8d5cffcbf0ab76307c82746e5b7265d0c8/YEPlac181snapgene.gb")

In [16]:
YEplac181 = read(text)

In [17]:
YEplac181


Out[17]:
Dseqrecord(o5741)

Dseqrecord supports the same kind of digestion / ligation functionality as shown for Dseq.


In [18]:
from Bio.Restriction import BamHI

In [19]:
a, b = mysequence.cut(BamHI)

In [20]:
a


Out[20]:
Dseqrecord(-5)

In [21]:
b


Out[21]:
Dseqrecord(-8)

In [22]:
a.seq


Out[22]:
Dseq(-5)
G
CCTAG

In [23]:
b.seq


Out[23]:
Dseq(-8)
GATCCAAA
    GTTT

In [24]:
a+b


Out[24]:
Dseqrecord(-9)
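
The staggered ends shown for a and b follow from where BamHI cuts each strand of its GGATCC site (G^GATCC, leaving a 4 nt 5' overhang). The pure-Python sketch below reproduces that picture for the first site found in a linear sequence. The names SITE, TOP_CUT, BOTTOM_CUT and cut_bamhi are invented for this illustration. Real digestion should be done with the cut method shown above, which handles multiple sites, circular molecules and other enzymes.

```python
# Illustrative sketch only; not pydna's or Biopython's implementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITE = "GGATCC"   # BamHI recognition site
TOP_CUT = 1       # upper strand is cut after the first G
BOTTOM_CUT = 5    # lower strand (aligned 3'->5') is cut after CCTAG


def cut_bamhi(upper):
    """Cut a linear sequence at its first BamHI site.

    Each fragment is returned as (upper, lower), with the lower strand
    padded with spaces so both strands stay aligned.
    """
    lower = upper.translate(COMPLEMENT)
    pos = upper.find(SITE)
    if pos == -1:
        return [(upper, lower)]           # no site: nothing to cut
    top, bottom = pos + TOP_CUT, pos + BOTTOM_CUT
    left = (upper[:top], lower[:bottom])
    right = (upper[top:], " " * (bottom - top) + lower[bottom:])
    return [left, right]


for top_strand, bottom_strand in cut_bamhi("GGATCCAAA"):
    print(top_strand)
    print(bottom_strand)
    print()
```

Because the two overhangs are complementary, joining the fragments again restores the original 9 bp sequence, which is what a+b shows.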

Finally, we can save Dseqrecords to a local file. The default format is GenBank.


In [25]:
mysequence.write("new_sequence.gb")