Dense Datasets

  • This notebook is used for benchmarking and debugging dense datasets, where every edge carries a value in every attribute column

Import the necessary libraries


In [ ]:
import random
import graphistry as g
import pandas as pd

Check the version of the Graphistry module


In [ ]:
g.__version__

Set your API key and Graphistry Server Location

  • To use our public server at labs.graphistry.com, you must have a valid API key

In [ ]:
API_KEY = 'Go to www.graphistry.com/api-request to get your API key'

In [ ]:
# Pass the key defined above; without it, API_KEY is never used
g.register(api=2, key=API_KEY)

100 dense columns with 100K edges (restricted set of integer values 0-99)

Each edge has 100 integer attributes. Column fld<n> on edge i holds (n + i) % 100, so values range from 0 to 99


In [ ]:
edges = [{'src': x, 'dst': (x + 1) % 100000} for x in range(0, 100000)]
for i, edge in enumerate(edges):
    for fld in range(0, 100):
        edge['fld' + str((fld))] = (fld + i) % 100
edges = pd.DataFrame(edges)
edges[:3]

In [ ]:
g.edges(edges).bind(source='src', destination='dst').plot()
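
The loop above fills one dictionary per edge, which can be slow at this size. As a rough sketch of an alternative, the same ring of edges and the same (fld + i) % 100 values can be built column by column with NumPy. The helper name make_dense_int_edges and its parameters are illustrative assumptions, not part of the original notebook, and column order may differ from the loop-built table on older pandas versions.

```python
import numpy as np
import pandas as pd

def make_dense_int_edges(n_edges, n_cols, modulus=100):
    """Hypothetical helper: ring of n_edges edges with n_cols integer columns."""
    idx = np.arange(n_edges)
    # Edge i goes from node i to node i + 1, wrapping around at the end
    data = {'src': idx, 'dst': (idx + 1) % n_edges}
    for fld in range(n_cols):
        # Same rule as the loop version: (column number + edge index) % modulus
        data['fld' + str(fld)] = (fld + idx) % modulus
    return pd.DataFrame(data)

# Small instance for inspection; the notebook's sizes would be (100000, 100)
small = make_dense_int_edges(n_edges=5, n_cols=3)
```

Calling make_dense_int_edges(100000, 100) should give a table with the same values as the cell above, which can then be passed to g.edges(...) in the same way.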

100 dense columns with 100K edges (random floats)

Each edge has 100 attributes, each of which is a randomly selected float in [0, 1)


In [ ]:
edges = [{'src': x, 'dst': (x + 1) % 100000} for x in range(0, 100000)]
for i, edge in enumerate(edges):
    for fld in range(0, 100):
        edge['fld' + str((fld))] = random.random()
edges = pd.DataFrame(edges)
edges[:3]

In [ ]:
g.edges(edges).bind(source='src', destination='dst').plot()
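
Because the attribute values come from random.random(), each run of the notebook produces a different table. For benchmarking it can help to make runs comparable, so here is a minimal sketch that uses a seeded random.Random instance. The helper name make_dense_float_edges and the seed parameter are assumptions added for illustration.

```python
import random
import pandas as pd

def make_dense_float_edges(n_edges, n_cols, seed=0):
    """Hypothetical helper: same layout as above, but reproducible for a given seed."""
    rng = random.Random(seed)  # private generator; the global random state is untouched
    rows = []
    for x in range(n_edges):
        row = {'src': x, 'dst': (x + 1) % n_edges}
        for fld in range(n_cols):
            row['fld' + str(fld)] = rng.random()
        rows.append(row)
    return pd.DataFrame(rows)

# Two calls with the same seed yield identical tables
a = make_dense_float_edges(n_edges=4, n_cols=3, seed=42)
b = make_dense_float_edges(n_edges=4, n_cols=3, seed=42)
```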

100 dense columns with 100K edges (random strings)

Each edge has 100 attributes, each of which is a randomly generated string


In [ ]:
edges = [{'src': x, 'dst': (x + 1) % 100000} for x in range(0, 100000)]
for i, edge in enumerate(edges):
    for fld in range(0, 100):
        edge['fld' + str((fld))] = 'String' + str(random.random())
edges = pd.DataFrame(edges)
edges[:3]

In [ ]:
g.edges(edges).bind(source='src', destination='dst').plot()
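
String columns are usually much heavier than float columns, which matters when comparing these benchmarks. The sketch below measures this on a small single-column sample using pandas' DataFrame.memory_usage with deep=True. The sample size and variable names are illustrative assumptions, and exact byte counts depend on the pandas version and string storage.

```python
import random
import pandas as pd

n_rows = 1000  # small sample; the notebook uses 100K rows and 100 columns

floats = pd.DataFrame({'fld0': [random.random() for _ in range(n_rows)]})
strings = pd.DataFrame({'fld0': ['String' + str(random.random()) for _ in range(n_rows)]})

# deep=True also counts the string contents, not only the references to them
float_bytes = int(floats.memory_usage(deep=True, index=False).sum())
string_bytes = int(strings.memory_usage(deep=True, index=False).sum())

print('float column bytes: ', float_bytes)
print('string column bytes:', string_bytes)
```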

10 dense columns with 800K edges (restricted set of integers 0-99)


In [ ]:
edges = [{'src': (x % 300), 'dst': ((x + 1) % 800)} for x in range(0, 800000)]
for i, edge in enumerate(edges):
    for fld in range(0, 10):
        edge['fld' + str((fld))] = (fld + i) % 100
edges = pd.DataFrame(edges)
edges[:3]

In [ ]:
g.edges(edges).bind(source='src', destination='dst').plot()
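
Unlike the 100K-edge ring, this dataset reuses a small set of nodes: src is x % 300 and dst is (x + 1) % 800, so many edges share the same endpoints. The following sketch counts the distinct nodes and distinct (src, dst) pairs. The variable names are illustrative, and the attribute columns are omitted because they do not affect the graph shape.

```python
import numpy as np
import pandas as pd

x = np.arange(800000)
pairs = pd.DataFrame({'src': x % 300, 'dst': (x + 1) % 800})

# Distinct node ids appearing as either endpoint
n_nodes = np.union1d(pairs['src'].unique(), pairs['dst'].unique()).size
# Distinct (src, dst) combinations; the pattern repeats every lcm(300, 800) = 2400 edges
n_pairs = len(pairs.drop_duplicates())

print(n_nodes, n_pairs)  # 800 2400
```

So the 800K edges collapse onto 800 nodes and 2,400 distinct endpoint pairs, with each pair repeated several hundred times.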

10 dense columns with 800K edges (random float)


In [ ]:
edges = [{'src': (x % 300), 'dst': ((x + 1) % 800)} for x in range(0, 800000)]
for i, edge in enumerate(edges):
    for fld in range(0, 10):
        edge['fld' + str((fld))] = random.random()
edges = pd.DataFrame(edges)
edges[:3]

In [ ]:
g.edges(edges).bind(source='src', destination='dst').plot()

10 dense columns with 800K edges (random strings)


In [ ]:
edges = [{'src': (x % 300), 'dst': ((x + 1) % 800)} for x in range(0, 800000)]
for i, edge in enumerate(edges):
    for fld in range(0, 10):
        edge['fld' + str((fld))] = 'String + ' + str(random.random())
edges = pd.DataFrame(edges)
edges[:3]

In [ ]:
g.edges(edges).bind(source='src', destination='dst').plot()