ArangoDB with Graphistry

This notebook explores Game of Thrones data stored in ArangoDB to show how Arango's graph support interoperates with Graphistry.

This tutorial shares two sample transforms:

  • Visualize the result of a traversal query
  • Visualize the full graph

Each fetches data from ArangoDB via python-arango, converts it to pandas DataFrames, and plots it with Graphistry.

Setup


In [ ]:
!pip install python-arango --user -q

In [1]:
from arango import ArangoClient
import pandas as pd
import graphistry

In [3]:
def paths_to_graph(paths, source='_from', destination='_to', node='_id'):
    """Convert traversal paths (dicts with 'vertices' and 'edges') into a Graphistry plotter."""
    nodes_df = pd.DataFrame()
    edges_df = pd.DataFrame()
    # Each path repeats the vertices and edges it shares with other paths
    for graph in paths:
        nodes_df = pd.concat([ nodes_df, pd.DataFrame(graph['vertices']) ], ignore_index=True)
        edges_df = pd.concat([ edges_df, pd.DataFrame(graph['edges']) ], ignore_index=True)
    nodes_df = nodes_df.drop_duplicates([node])
    # Edges are keyed by their own Arango '_id', whatever column identifies nodes
    edges_df = edges_df.drop_duplicates(['_id'])
    return graphistry.bind(source=source, destination=destination, node=node).nodes(nodes_df).edges(edges_df)

def graph_to_graphistry(graph, source='_from', destination='_to', node='_id'):
    """Download every vertex and edge collection of a named graph into a Graphistry plotter."""
    nodes_df = pd.DataFrame()
    # Iterating a collection yields its documents as dicts
    for vc_name in graph.vertex_collections():
        nodes_df = pd.concat([nodes_df, pd.DataFrame(list(graph.vertex_collection(vc_name)))], ignore_index=True)
    edges_df = pd.DataFrame()
    for edge_def in graph.edge_definitions():
        edges_df = pd.concat([edges_df, pd.DataFrame(list(graph.edge_collection(edge_def['edge_collection'])))], ignore_index=True)
    return graphistry.bind(source=source, destination=destination, node=node).nodes(nodes_df).edges(edges_df)
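As a rough illustration of the input paths_to_graph expects, the sketch below builds a few hand-written paths in the shape the function indexes (a list of dicts, each with a 'vertices' list and an 'edges' list) and deduplicates them by _id in plain Python. The sample documents and the dedupe_paths helper are hypothetical; they are not part of python-arango or Graphistry.

```python
def dedupe_paths(paths, key='_id'):
    """Collect unique vertices and edges across paths, keeping first occurrences."""
    vertices, edges = {}, {}
    for path in paths:
        for v in path['vertices']:
            vertices.setdefault(v[key], v)   # later duplicates are ignored
        for e in path['edges']:
            edges.setdefault(e[key], e)
    return list(vertices.values()), list(edges.values())

# Hypothetical sample data: three overlapping paths from the same start vertex
a = {'_id': 'Characters/1', 'name': 'A'}
b = {'_id': 'Characters/2', 'name': 'B'}
c = {'_id': 'Characters/3', 'name': 'C'}
ab = {'_id': 'ChildOf/10', '_from': 'Characters/1', '_to': 'Characters/2'}
bc = {'_id': 'ChildOf/11', '_from': 'Characters/2', '_to': 'Characters/3'}

sample_paths = [
    {'vertices': [a], 'edges': []},
    {'vertices': [a, b], 'edges': [ab]},
    {'vertices': [a, b, c], 'edges': [ab, bc]},
]

vertices, edges = dedupe_paths(sample_paths)
print(len(vertices), len(edges))
```

The two resulting lists of dicts can be passed to pd.DataFrame(...) to get node and edge tables of the same kind the helper builds, with one DataFrame construction per table instead of one concat per path.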

Connect


In [ ]:
#graphistry.register(key="...", protocol="https", server="www.site.com", api=1)

In [4]:
client = ArangoClient(protocol='http', host='localhost', port=8529)
db = client.db('GoT', username='root', password='1234')

Demo 1: Traversal viz

  • Use python-arango's traverse() call to find the descendants of Ned Stark
  • Convert result paths to pandas and Graphistry
  • Plot, labeling each node with the character's first name instead of its raw Arango vertex ID

In [7]:
paths = db.graph('theGraph').traverse(
    start_vertex='Characters/4814',
    direction='outbound',
    strategy='breadthfirst'
)['paths']

In [8]:
g = paths_to_graph(paths)
g.bind(point_title='name').plot()


Out[8]:
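A similar traversal can probably be written in AQL and run through python-arango's db.aql.execute(). The sketch below only builds the query text and bind variables. The helper name build_traversal_query and the depth bound of 10 are illustrative assumptions, and the execute call is left commented out because it needs a live server. It also assumes that each returned path object carries 'vertices' and 'edges' keys, which is the shape paths_to_graph indexes.

```python
def build_traversal_query(start_vertex, graph_name, max_depth=10):
    """Return (aql, bind_vars) for an outbound named-graph traversal that returns paths."""
    depth = int(max_depth)
    if depth < 1:
        raise ValueError('max_depth must be at least 1')
    # The depth range is inlined as a validated integer; the start vertex and
    # graph name are passed as bind variables rather than spliced into the string.
    query = (
        f"FOR v, e, p IN 1..{depth} OUTBOUND @start GRAPH @graph "
        "RETURN p"
    )
    bind_vars = {'start': start_vertex, 'graph': graph_name}
    return query, bind_vars

query, bind_vars = build_traversal_query('Characters/4814', 'theGraph', max_depth=3)
print(query)

# With the `db` handle from the Connect section (requires a running ArangoDB):
# cursor = db.aql.execute(query, bind_vars=bind_vars)
# g = paths_to_graph(list(cursor))
```

Unlike traverse(), this form makes the depth limit explicit, which helps keep the result small on larger graphs.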

Demo 2: Full graph

  • Use python-arango on a graph to identify and download the involved vertex/edge collections
  • Convert the results to pandas and Graphistry
  • Plot, labeling each node with the character's first name instead of its raw Arango vertex ID

In [11]:
g = graph_to_graphistry( db.graph('theGraph') )
g.bind(point_title='name').plot()


Out[11]:
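Because the full-graph download reads vertex and edge collections separately, it can be worth checking that every edge endpoint refers to a downloaded vertex before plotting. A minimal pandas sketch on hand-made tables follows; the sample rows and the find_dangling_edges helper are hypothetical, and the column names assume the Arango defaults used above.

```python
import pandas as pd

def find_dangling_edges(nodes_df, edges_df, source='_from', destination='_to', node='_id'):
    """Return the edges whose source or destination is missing from the node table."""
    known = set(nodes_df[node])
    missing = ~edges_df[source].isin(known) | ~edges_df[destination].isin(known)
    return edges_df[missing]

# Hypothetical sample tables: the second edge points at a vertex that was not downloaded
nodes_df = pd.DataFrame([
    {'_id': 'Characters/1', 'name': 'A'},
    {'_id': 'Characters/2', 'name': 'B'},
])
edges_df = pd.DataFrame([
    {'_id': 'ChildOf/10', '_from': 'Characters/1', '_to': 'Characters/2'},
    {'_id': 'ChildOf/11', '_from': 'Characters/2', '_to': 'Characters/9'},
])

dangling = find_dangling_edges(nodes_df, edges_df)
print(dangling)
```

The same check could be applied to the nodes_df and edges_df built inside graph_to_graphistry, just before they are bound to the plotter.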
