Parse and Format a Variant


In [2]:
import hgvs.parser
hp = hgvs.parser.Parser()

g_hgvs = 'NC_000001.10:g.150550916_150550920delGACAAinsCAATACC'
g_var = hp.parse_hgvs_variant(g_hgvs)
g_var


Out[2]:
SequenceVariant(ac=NC_000001.10, type=g, posedit=150550916_150550920delGACAAinsCAATACC)
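
The repr above shows the three top-level parts of an HGVS string: accession, type, and position-plus-edit. As a rough illustration of that layout only, here is a stdlib sketch that splits the same string with a regular expression. The pattern and the split_hgvs helper are assumptions made for this example; they are not the grammar hgvs.parser uses and they accept far less than the real parser.

```python
import re

# Simplified, hypothetical pattern: accession, one-letter type, then the rest.
# Illustration only; hgvs.parser uses a full grammar, not this regex.
HGVS_RE = re.compile(r"^(?P<ac>[^:]+):(?P<type>[cgmnpr])\.(?P<posedit>.+)$")


def split_hgvs(s):
    """Return a dict with the 'ac', 'type' and 'posedit' parts of s."""
    m = HGVS_RE.match(s)
    if m is None:
        raise ValueError("not a recognisable HGVS string: {!r}".format(s))
    return m.groupdict()


parts = split_hgvs("NC_000001.10:g.150550916_150550920delGACAAinsCAATACC")
print(parts["ac"], parts["type"], parts["posedit"])
```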

Map variants between sequences


In [3]:
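import hgvs.dataproviders.uta
import hgvs.variantmapper

# Connect to a UTA database, which supplies the transcripts and alignments
hdp = hgvs.dataproviders.uta.connect()

# Map against GRCh37 using splign transcript-to-genome alignments
am = hgvs.variantmapper.AssemblyMapper(hdp, assembly_name='GRCh37', alt_aln_method='splign')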


INFO:hgvs.dataproviders.uta:connected to postgresql://anonymous:anonymous@localhost/uta_dev/uta_20150827

In [ ]:
c_vars = [am.g_to_c(g_var, ac) for ac in am.relevant_transcripts(g_var)]
c_vars

In [ ]:
p_vars = [am.c_to_p(c_var) for c_var in c_vars]
p_vars

Normalize (Shuffle and Rewrite) Variants


In [ ]:
import hgvs.normalizer
hn = hgvs.normalizer.Normalizer(hdp)
hn.normalize(hp.parse_hgvs_variant('NM_021960.4:c.735_736insT'))
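
To make the "shuffle" part concrete, here is a toy sketch of sliding an insertion to its 3'-most equivalent position in a short made-up sequence. The function names, the 0-based indexing, and the example sequence are assumptions for illustration. It covers insertions only, assumes a non-empty inserted string, and leaves out everything else the library does, such as fetching the reference sequence or rewriting the edit.

```python
def shift_insertion_3prime(seq, pos, ins):
    """Slide an insertion rightwards while the resulting sequence is unchanged.

    pos is a 0-based index: ins is inserted immediately before seq[pos].
    """
    while pos < len(seq) and seq[pos] == ins[0]:
        # Rotate the inserted bases and step one position to the right.
        ins = ins[1:] + ins[0]
        pos += 1
    return pos, ins


def apply_insertion(seq, pos, ins):
    """Return seq with ins inserted immediately before seq[pos]."""
    return seq[:pos] + ins + seq[pos:]


seq = "GATTTC"  # made-up sequence with a run of T
new_pos, new_ins = shift_insertion_3prime(seq, 2, "T")
print(new_pos, new_ins)
```

Both placements give the same sequence, which is why a single canonical position is useful when comparing variants.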

Validate Variants


In [ ]:
import hgvs.validator
hv = hgvs.validator.Validator(hdp)
try:
    hv.validate(hp.parse_hgvs_variant('NM_021960.4:c.736_740delATGTCinsGGTATTG'))
except Exception as e:
    print(e)
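
One check that needs no database is whether the deleted reference string is as long as the interval it claims to replace. The sketch below does only that, for plain integer positions. The regex and the ref_length_matches helper are hypothetical and are not how hgvs.validator is implemented; the library also compares against the actual reference sequence, which this example does not attempt.

```python
import re

# Hypothetical pattern for "start_end del REF [ins ALT]" with plain integers.
# Offsets such as intronic positions are deliberately not handled.
DELINS_RE = re.compile(
    r"^(?P<start>\d+)_(?P<end>\d+)del(?P<ref>[ACGT]+)(?:ins(?P<alt>[ACGT]+))?$"
)


def ref_length_matches(posedit):
    """True if the stated reference bases fill the stated interval exactly."""
    m = DELINS_RE.match(posedit)
    if m is None:
        raise ValueError("unsupported posedit: {!r}".format(posedit))
    span = int(m.group("end")) - int(m.group("start")) + 1
    return span == len(m.group("ref"))


ok = ref_length_matches("736_740delATGTCinsGGTATTG")   # 5 positions, 5 bases
bad = ref_length_matches("736_740delATGinsGGTATTG")    # 5 positions, 3 bases
print(ok, bad)
```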

SequenceVariant instances are structured variant representations


In [ ]:
g_var.ac, g_var.type, g_var.posedit

In [ ]:
g_var.posedit.pos.start

In [ ]:
g_var.posedit.edit.ref, g_var.posedit.edit.alt
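
Structured access makes simple arithmetic on a variant straightforward. As a sketch, the values below are copied by hand from g_hgvs as plain ints and strings (not the objects hgvs returns) to show the interval length and the net length change of the delins.

```python
# Values copied by hand from
# 'NC_000001.10:g.150550916_150550920delGACAAinsCAATACC'
start, end = 150550916, 150550920
ref, alt = "GACAA", "CAATACC"

span = end - start + 1             # number of reference positions replaced
net_change = len(alt) - len(ref)   # positive means the sequence grows

print(span, len(ref), net_change)
```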

Format variants simply by "stringifying" them with print or format


In [ ]:
print("\n".join(["Your variant {v} was mapped to:".format(v=g_var)]
                + ["  {c_var} ({p_var})".format(c_var=c_var, p_var=p_var)
                   for c_var,p_var in zip(c_vars,p_vars)]))
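
Stringifying works with both print and format because Python falls back to __str__ when no format spec is given. A minimal sketch with a hypothetical ToyVariant class, which is a stand-in for illustration and not hgvs's SequenceVariant:

```python
class ToyVariant:
    """Hypothetical stand-in for a structured variant."""

    def __init__(self, ac, type, posedit):
        self.ac = ac
        self.type = type
        self.posedit = posedit

    def __str__(self):
        # Reassemble the accession:type.posedit layout.
        return "{}:{}.{}".format(self.ac, self.type, self.posedit)


v = ToyVariant("NC_000001.10", "g", "150550916_150550920delGACAAinsCAATACC")
line = "Your variant {v} was parsed".format(v=v)
print(line)
```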

NCBI SeqViewer for MCL1 region on GRCh37.p13