A SequenceVariant consists of an accession (a string), a sequence type (a string), and a PosEdit, like this:
var = hgvs.sequencevariant.SequenceVariant(ac='NM_01234.5', type='c', posedit=...)
Unsurprisingly, a PosEdit consists of separate position and Edit objects. A position is generally an Interval, which in turn is composed of SimplePosition or BaseOffsetPosition objects. An edit is an instance of a subclass of Edit, such as NARefAlt (for substitutions, deletions, and insertions) or Dup (for duplications).
Importantly, each of the objects we're building has a rule in the parser, which means that you have the tools to serialize and deserialize (parse) each of the components that we're about to construct.
In [1]:
import hgvs.location
import hgvs.posedit
In [2]:
start = hgvs.location.BaseOffsetPosition(base=200,offset=-6,datum=hgvs.location.Datum.CDS_START)
start, str(start)
Out[2]:
In [3]:
end = hgvs.location.BaseOffsetPosition(base=22,datum=hgvs.location.Datum.CDS_END)
end, str(end)
Out[3]:
In [4]:
iv = hgvs.location.Interval(start=start,end=end)
iv, str(iv)
Out[4]:
In [5]:
import hgvs.edit, hgvs.posedit
In [6]:
edit = hgvs.edit.NARefAlt(ref='A',alt='T')
edit, str(edit)
Out[6]:
In [7]:
posedit = hgvs.posedit.PosEdit(pos=iv,edit=edit)
posedit, str(posedit)
Out[7]:
In [8]:
import hgvs.sequencevariant
In [9]:
var = hgvs.sequencevariant.SequenceVariant(ac='NM_01234.5', type='c', posedit=posedit)
var, str(var)
Out[9]:
Important: It is possible to construct bogus variants with the hgvs package. For example, the interval above is incompatible with an SNV. See hgvs.validator.Validator for validation options.
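To illustrate the kind of mismatch meant here, the following is a minimal standalone sketch that does not use the hgvs package. `ToyPosition`, `ToyInterval`, and `is_plausible_snv` are hypothetical names invented for this example, and the toy ignores the datum entirely. Real checks should go through hgvs.validator.Validator, which is likely to be far more thorough.

```python
from dataclasses import dataclass


@dataclass
class ToyPosition:
    """Hypothetical stand-in for a base/offset position (not an hgvs class)."""
    base: int
    offset: int = 0


@dataclass
class ToyInterval:
    """Hypothetical stand-in for an interval with start and end positions."""
    start: ToyPosition
    end: ToyPosition


def is_plausible_snv(interval: ToyInterval, ref: str, alt: str) -> bool:
    """Return True only if a one-base ref>alt change sits on a one-position interval."""
    single_base_edit = len(ref) == 1 and len(alt) == 1
    single_position = interval.start == interval.end
    return single_base_edit and single_position


# Same shape as the interval built above: start and end differ,
# so a single-base A>T substitution cannot describe it.
wide = ToyInterval(ToyPosition(200, -6), ToyPosition(22))
point = ToyInterval(ToyPosition(200, -6), ToyPosition(200, -6))

wide_ok = is_plausible_snv(wide, "A", "T")
point_ok = is_plausible_snv(point, "A", "T")
```

Here `wide_ok` is False and `point_ok` is True. The check is deliberately crude: it only compares start and end for equality, which is enough to flag the example above but not a substitute for proper validation.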
Stringification happens on the fly: the string is built from the variant's current components each time it is requested, so you can update a component and see the effect immediately.
In [10]:
import copy
In [11]:
var2 = copy.deepcopy(var)
var2.posedit.pos.start.base=456
str(var2)
Out[11]:
In [12]:
var2 = copy.deepcopy(var)
var2.posedit.edit.alt='CT'
str(var2)
Out[12]:
In [13]:
var2 = copy.deepcopy(var)
str(var2)
Out[13]:
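The behaviour shown in the cells above can be mimicked with a small standalone model. This is only a sketch under stated assumptions: `ToyEdit` and `ToyVariant` are hypothetical classes, the position is reduced to a plain string, and the formatting rule in `ToyEdit` is a simplification chosen for illustration rather than a description of how hgvs formats edits. The point is that `__str__` rebuilds the string from the current attributes instead of caching it.

```python
from dataclasses import dataclass


@dataclass
class ToyEdit:
    """Hypothetical edit: a reference sequence and its replacement."""
    ref: str
    alt: str

    def __str__(self) -> str:
        # Toy rule: a one-for-one change prints as a substitution,
        # anything else prints as a delins.
        if len(self.ref) == 1 and len(self.alt) == 1:
            return f"{self.ref}>{self.alt}"
        return f"delins{self.alt}"


@dataclass
class ToyVariant:
    """Hypothetical variant: accession, sequence type, position, edit."""
    ac: str
    type: str
    pos: str  # simplified to a string for this sketch
    edit: ToyEdit

    def __str__(self) -> str:
        # Nothing is cached: the string is rebuilt from the current attributes.
        return f"{self.ac}:{self.type}.{self.pos}{self.edit}"


v = ToyVariant(ac="NM_01234.5", type="c", pos="100", edit=ToyEdit("A", "T"))
before = str(v)

# Mutating a nested component changes the next string that is produced.
v.edit.alt = "CT"
after = str(v)
```

In this toy, `before` is `NM_01234.5:c.100A>T` and `after` is `NM_01234.5:c.100delinsCT`. Because every object formats itself on demand, a change deep in the structure shows up at the top level without any explicit refresh step.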