Adding optional relationships changes the dcnt value

SYNOPSIS: For GO:0019012 (virion), the descendant count, dcnt, is:

  • 0 when using is_a relationships (the default) and
  • 48 when adding the optional relationship, part_of.

Table of Contents:

  1. Download Ontologies, if necessary
  2. Depth-01 term, virion, has dcnt=0 through is_a relationships (default)
  3. Depth-01 term, virion, dcnt value is higher using all relationships
  4. Depth-01 term, virion, dcnt value is higher using part_of relationships
  5. Descendants under virion
  6. Plot some descendants of virion

1) Download Ontologies, if necessary


In [1]:
from goatools.base import get_godag
godag = get_godag("go-basic.obo", optional_attrs={'relationship'})
go_leafs = set(o.item_id for o in godag.values() if not o.children)


  EXISTS: go-basic.obo
go-basic.obo: fmt(1.2) rel(2019-06-10) 47,442 GO Terms; optional_attrs(relationship)

2) Depth-01 term, GO:0019012 (virion) has dcnt=0 through is_a relationships (default)

GO:0019012 (virion) has no GO terms below it through the is_a relationship, so its default dcnt value is zero, even though it sits near the top of the DAG at depth=01.


In [2]:
virion = 'GO:0019012'

In [3]:
from goatools.gosubdag.gosubdag import GoSubDag
gosubdag_r0 = GoSubDag(go_leafs, godag)


INITIALIZING GoSubDag: 27482 sources in 44990 GOs rcnt(True). 0 alt GO IDs
             GoSubDag: namedtuple fields: NS level depth GO alt GO_name dcnt D1 id
             GoSubDag: relationships: set()

Notice that dcnt=0 for GO:0019012 (virion), even though the term sits near the top of the DAG hierarchy (depth=1). No GO IDs are reachable under GO:0019012 when only the is_a relationship is followed.


In [4]:
nt_virion = gosubdag_r0.go2nt[virion]
print(nt_virion)
print('THE VALUE OF dcnt IS: {dcnt}'.format(dcnt=nt_virion.dcnt))


NtGo(NS='CC', level=1, depth=1, GO='GO:0019012', alt='', GO_name='virion', dcnt=0, D1='T', id='GO:0019012')
THE VALUE OF dcnt IS: 0
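
The effect of the relationship set on a descendant count can be sketched without GOATOOLS. The toy ontology below is hypothetical: the term names, the EDGES list, and the get_descendants helper are illustrative assumptions, not real GO IDs or part of the GOATOOLS API. It only shows how a term can have zero descendants through is_a edges yet several once part_of edges are also followed.

```python
from collections import defaultdict

# Hypothetical toy ontology: (child, relationship, parent) edges.
# Term names are illustrative and are not real GO IDs.
EDGES = [
    ("capsid", "part_of", "virion"),
    ("icosahedral_capsid", "is_a", "capsid"),
    ("neck_fiber", "part_of", "icosahedral_capsid"),
    ("virion", "is_a", "cellular_component"),
]

def get_descendants(term, edges, relationships=frozenset()):
    """Return all terms below `term`, following is_a plus the given relationships."""
    followed = {"is_a"} | set(relationships)
    # Index the edges that are followed as parent -> set of children
    parent2children = defaultdict(set)
    for child, rel, parent in edges:
        if rel in followed:
            parent2children[parent].add(child)
    # Depth-first walk downward from the term
    descendants = set()
    stack = [term]
    while stack:
        for child in parent2children[stack.pop()]:
            if child not in descendants:
                descendants.add(child)
                stack.append(child)
    return descendants

print(len(get_descendants("virion", EDGES)))               # is_a only
print(len(get_descendants("virion", EDGES, {"part_of"})))  # is_a and part_of
```

In this toy graph the first call finds no descendants and the second finds three, mirroring the 0 versus 48 pattern that this notebook reports for the real ontology.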

3) Depth-01 term, GO:0019012 (virion) dcnt value is higher using all relationships

Load all relationships into GoSubDag using relationships=True


In [5]:
gosubdag_r1 = GoSubDag(go_leafs, godag, relationships=True)


INITIALIZING GoSubDag: 27482 sources in 44990 GOs rcnt(True). 0 alt GO IDs
             GoSubDag: namedtuple fields: NS level depth reldepth GO alt GO_name dcnt D1 childcnt REL REL_short rel id
             GoSubDag: relationships: {'part_of', 'negatively_regulates', 'positively_regulates', 'regulates'}

In [6]:
nt_virion = gosubdag_r1.go2nt[virion]
print(nt_virion)
print('THE VALUE OF dcnt IS: {dcnt}'.format(dcnt=nt_virion.dcnt))


NtGo(NS='CC', level=1, depth=1, reldepth=1, GO='GO:0019012', alt='', GO_name='virion', dcnt=48, D1='Q', childcnt=0, REL='....', REL_short='', rel='p...', id='GO:0019012')
THE VALUE OF dcnt IS: 48

4) Depth-01 term, GO:0019012 (virion) dcnt value is higher using part_of relationships

Load only the part_of relationship into GoSubDag using relationships={'part_of'}


In [7]:
gosubdag_partof = GoSubDag(go_leafs, godag, relationships={'part_of'})


INITIALIZING GoSubDag: 27482 sources in 44990 GOs rcnt(True). 0 alt GO IDs
             GoSubDag: namedtuple fields: NS level depth reldepth GO alt GO_name dcnt D1 childcnt REL REL_short rel id
             GoSubDag: relationships: {'part_of'}

In [8]:
nt_virion = gosubdag_partof.go2nt[virion]
print(nt_virion)
print('THE VALUE OF dcnt IS: {dcnt}'.format(dcnt=nt_virion.dcnt))


NtGo(NS='CC', level=1, depth=1, reldepth=1, GO='GO:0019012', alt='', GO_name='virion', dcnt=48, D1='Q', childcnt=0, REL='.', REL_short='', rel='p', id='GO:0019012')
THE VALUE OF dcnt IS: 48

5) Descendants under GO:0019012 (virion)


In [9]:
virion_descendants = gosubdag_partof.rcntobj.go2descendants[virion]
print('{N} descendants of virion were found'.format(N=len(virion_descendants)))


48 descendants of virion were found

6) Plot some descendants of virion


In [10]:
from goatools.gosubdag.plot.gosubdag_plot import GoSubDagPlot

# Choose two virion descendants to keep the plot small
virion_capsid_fiber = {'GO:0098033', 'GO:0098032'}
# Print each chosen GO term with its descendant count and depth
nts = gosubdag_partof.prt_goids(
    virion_capsid_fiber,
    '{NS} {GO} dcnt({dcnt}) D-{depth:02} {GO_name}')


CC GO:0098032 dcnt(0) D-04 icosahedral viral capsid, collar fiber
CC GO:0098033 dcnt(0) D-04 icosahedral viral capsid, neck fiber

In [11]:
# Limit the plot size: build a sub-DAG containing only the two
# chosen virion descendants and their ancestors
pltdag = GoSubDag(virion_capsid_fiber, godag, relationships={'part_of'})
pltobj = GoSubDagPlot(pltdag)
pltobj.plt_dag('virion_capsid_fiber.png')


INITIALIZING GoSubDag:   2 sources in  11 GOs rcnt(True). 0 alt GO IDs
             GoSubDag: namedtuple fields: NS level depth reldepth GO alt GO_name dcnt D1 childcnt REL REL_short rel id
             GoSubDag: relationships: {'part_of'}
    2 usr  11 GOs  WROTE: virion_capsid_fiber.png
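
The log line above reports more GO terms than source terms because the sub-DAG also holds the ancestors of the sources. A minimal sketch of that ancestor closure follows. It uses a hypothetical toy ontology, and the PARENTS mapping and get_subdag_terms helper are illustrative names rather than GOATOOLS functions, so the counts differ from the real output.

```python
# Hypothetical toy ontology: child -> list of (relationship, parent) pairs.
# Term names are illustrative and are not real GO IDs.
PARENTS = {
    "neck_fiber": [("part_of", "icosahedral_capsid")],
    "collar_fiber": [("part_of", "icosahedral_capsid")],
    "icosahedral_capsid": [("is_a", "capsid")],
    "capsid": [("part_of", "virion")],
    "virion": [("is_a", "cellular_component")],
    "tail_fiber": [("part_of", "virion")],
    "cellular_component": [],
}

def get_subdag_terms(sources, parents, relationships=frozenset()):
    """Return the sources plus every ancestor reachable through followed edges."""
    followed = {"is_a"} | set(relationships)
    terms = set(sources)
    stack = list(sources)
    while stack:
        # Walk upward from each term to its parents
        for rel, parent in parents.get(stack.pop(), []):
            if rel in followed and parent not in terms:
                terms.add(parent)
                stack.append(parent)
    return terms

sources = {"neck_fiber", "collar_fiber"}
subdag = get_subdag_terms(sources, PARENTS, {"part_of"})
print("{S} sources in {N} terms".format(S=len(sources), N=len(subdag)))
```

Terms that are not ancestors of a source, such as tail_fiber in this toy graph, are left out, which is what keeps the plotted graph small.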

The descendant count is the number next to the d in the GO term boxes.

The c0 seen in some boxes, such as GO:0019012 (virion), indicates that the term has no children through the default is_a relationship but does have children when additional optional relationships are followed.
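
The difference between the two numbers can be sketched on a toy graph. Everything here is an assumption made for illustration: the edges, the helper names, and the label layout are hypothetical and do not reproduce the text that GOATOOLS writes into the plot boxes. The sketch counts direct is_a children for c and all descendants through the followed relationships for d.

```python
# Hypothetical toy edges: (child, relationship, parent).
# Term names are illustrative and are not real GO IDs.
EDGES = [
    ("capsid", "part_of", "virion"),
    ("icosahedral_capsid", "is_a", "capsid"),
    ("neck_fiber", "part_of", "icosahedral_capsid"),
]

def count_isa_children(term, edges):
    """Count direct children attached to `term` by an is_a edge."""
    return sum(1 for _child, rel, parent in edges
               if parent == term and rel == "is_a")

def count_descendants(term, edges, relationships):
    """Count all terms below `term` through is_a plus the given relationships."""
    followed = {"is_a"} | set(relationships)
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        for child, rel, parent in edges:
            if parent == current and rel in followed and child not in seen:
                seen.add(child)
                stack.append(child)
    return len(seen)

def box_label(term, edges, relationships):
    """Build an illustrative 'term cN dM' label (not the GOATOOLS format)."""
    return "{T} c{C} d{D}".format(
        T=term,
        C=count_isa_children(term, edges),
        D=count_descendants(term, edges, relationships))

for term in ("virion", "capsid"):
    print(box_label(term, EDGES, {"part_of"}))
```

In this toy graph, virion gets c0 with a nonzero d because its only child is attached through part_of, while capsid has one is_a child.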