In [1]:
from pydna.dseqrecord import Dseqrecord
In [2]:
mysequence = Dseqrecord("GGATCCAAA")
The representation below indicates the size of the sequence and the fact that it is linear (- symbol).
In [3]:
mysequence
Out[3]:
The Dseqrecord class is the main pydna data type, together with the Dseq class. The sequence information is held by an internal Dseq object that is accessible through the .seq property:
In [4]:
mysequence.seq
Out[4]:
In [5]:
from pydna.readers import read
Dseqrecords can be read from local files in several formats:
In [6]:
read_from_fasta = read("fastaseq.fasta")
In [7]:
read_from_gb = read("gbseq.gb")
In [8]:
read_from_embl = read("emblseq.emb")
The sequence files above all contain the same sequence. We can print the sequence using the .seq property.
In [9]:
print(read_from_fasta.seq)
print(read_from_gb.seq)
print(read_from_embl.seq)
We can also read from a string defined directly in the code:
In [10]:
read_from_string = read('''
>seq_from_string
GGATCCAAA
''')
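To make clear what the string above contains, here is a minimal sketch of how a single FASTA record can be split into an identifier and a sequence using plain Python. The function name parse_single_fasta is hypothetical and this is not how pydna's read function is implemented; it only illustrates the layout of the FASTA text (a header line starting with ">" followed by sequence lines).

```python
def parse_single_fasta(text):
    """Return (identifier, sequence) for a string holding one FASTA record.

    Hypothetical helper for illustration only, not part of pydna.
    """
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    header_words = lines[0][1:].split()
    # The identifier is the first word after the '>' character.
    identifier = header_words[0] if header_words else ""
    # The sequence may span several lines, so join them.
    sequence = "".join(lines[1:])
    return identifier, sequence


name, sequence = parse_single_fasta('''
>seq_from_string
GGATCCAAA
''')
print(name, sequence, len(sequence))
```

The leading and trailing newlines inside the triple-quoted string are ignored, so the record can be written on its own lines as in the cell above.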
We can also download sequences from GenBank if we know the accession number. The plasmid pUC19 has the accession number L09137. We have to give pydna a valid email address before we use GenBank in this way. Please change the email address to your own when executing this script. GenBank needs to be able to contact its users if there is a problem.
In [11]:
from pydna.genbank import Genbank
In [12]:
gb = Genbank("bjornjobb@gmail.com")
pUC19 = gb.nucleotide("L09137")
This molecule is circular, so the representation below begins with an "o". The size is 2686 bp.
In [13]:
pUC19
Out[13]:
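To show why an email address accompanies such a download, here is a minimal sketch that builds a request URL for the NCBI E-utilities efetch service with the standard library. It only constructs the URL and does not contact the server. The function name build_efetch_url and the example address are hypothetical, and this text does not describe how the pydna Genbank class issues its request internally, so the sketch should not be read as pydna's implementation.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, email):
    """Build an efetch URL for one nucleotide record in GenBank flat file format.

    Hypothetical helper for illustration only, not part of pydna.
    """
    params = {
        "db": "nuccore",    # nucleotide database
        "id": accession,    # accession number, e.g. L09137 for pUC19
        "rettype": "gb",    # GenBank flat file
        "retmode": "text",  # plain text
        "email": email,     # lets NCBI contact the user if there is a problem
    }
    return EFETCH_URL + "?" + urlencode(params)


url = build_efetch_url("L09137", "someone@example.com")
print(url)
```

The email address travels with the request as an ordinary query parameter, which is why it should be replaced with your own.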
In [14]:
from pydna.download import download_text
We can also read sequences remotely from other web sites, for example this sequence for YEplac181:
In [15]:
text = download_text("https://gist.githubusercontent.com/BjornFJohansson/e445e5039d61bdcdf933c435438b4585/raw/a6d57a8d5cffcbf0ab76307c82746e5b7265d0c8/YEPlac181snapgene.gb")
In [16]:
YEplac181 = read(text)
In [17]:
YEplac181
Out[17]:
Dseqrecord supports the same kind of digestion / ligation functionality as shown for Dseq.
In [18]:
from Bio.Restriction import BamHI
In [19]:
a, b = mysequence.cut(BamHI)
In [20]:
a
Out[20]:
In [21]:
b
Out[21]:
In [22]:
a.seq
Out[22]:
In [23]:
b.seq
Out[23]:
In [24]:
a+b
Out[24]:
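The digestion above can be pictured with a small plain Python sketch. BamHI recognizes GGATCC and cuts the top strand after the first G (G^GATCC). The function cut_top_strand is hypothetical and handles only the top strand of a linear sequence; it ignores the bottom strand and the four-nucleotide 5' overhang (GATC) that the real fragments carry, so it is a simplification and not pydna's implementation.

```python
BAMHI_SITE = "GGATCC"
BAMHI_TOP_CUT = 1  # G^GATCC: the top strand is cut after the first G


def cut_top_strand(sequence, site=BAMHI_SITE, offset=BAMHI_TOP_CUT):
    """Split the top strand of a linear sequence at every occurrence of site.

    Hypothetical helper for illustration only, not part of pydna.
    """
    sequence = sequence.upper()
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + offset
        fragments.append(sequence[start:cut])
        start = cut
        # Look for the next recognition site further along the sequence.
        position = sequence.find(site, position + 1)
    # Whatever remains after the last cut is the final fragment.
    fragments.append(sequence[start:])
    return fragments


fragments = cut_top_strand("GGATCCAAA")
print(fragments)
# Joining the fragments in order restores the original top strand.
print("".join(fragments) == "GGATCCAAA")
```

Joining the fragments in order gives back the original top strand, which mirrors what a+b does for the two Dseqrecord fragments above.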
Finally, we can save Dseqrecords to a local file. The default format is GenBank.
In [25]:
mysequence.write("new_sequence.gb")