Visualizing data is awesome. In this post, I use D3 in an IPython notebook to visualize the "network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand".
There is something therapeutically beautiful about force-directed layouts that you can pull and push around.
The first thing, after downloading the dolphins dataset, was to wrangle the data into a workable format.
In [1]:
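import networkx as nx

# Written for Python 2 and networkx 1.x (G.edge, G.node, dict.iteritems).
G = nx.read_gml('dolphins.gml')  # downloaded from the link above

# Bucket each node by its number of neighbors.
category = {}
for i, k in G.edge.iteritems():
    if len(k) < 4:
        category[i] = '< 4 neighbors'
    elif len(k) < 11:
        # Note: a node with exactly 4 neighbors also falls in this bucket.
        category[i] = '5-10 neighbors'
    else:
        category[i] = '> 10 neighbors'

# Node ids run from 0 to 61, so they double as indices into the nodes array.
_nodes = []
for i in range(0, 62):
    profile = G.node[i]
    _nodes.append({'name': profile['label'].encode("utf-8"),
                   'group': category[i]})

_edges = [{'source': i[0], 'target': i[1]} for i in G.edges()]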
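The bucketing and the node/link records above can be tried without the dataset. Here is a minimal sketch on a made-up adjacency mapping; `adjacency`, `degree_category`, and the `node-%d` names are hypothetical and only stand in for the real graph. It is written for Python 3.

```python
# Hypothetical adjacency mapping (node id -> set of neighbor ids); not the dolphins data.
adjacency = {
    0: {1, 2},
    1: {0, 2, 3, 4, 5},
    2: {0, 1},
    3: {1},
    4: {1},
    5: {1},
}

def degree_category(n_neighbors):
    """Return the legend label for a node with `n_neighbors` neighbors."""
    if n_neighbors < 4:
        return '< 4 neighbors'
    elif n_neighbors < 11:
        # A node with exactly 4 neighbors also lands here, as in the cell above.
        return '5-10 neighbors'
    return '> 10 neighbors'

category = {node: degree_category(len(nbrs)) for node, nbrs in adjacency.items()}

# D3's v3 force layout accepts link endpoints as indices into the nodes array,
# so the node list is built in id order.
nodes = [{'name': 'node-%d' % i, 'group': category[i]} for i in sorted(adjacency)]

# Keep each undirected edge once (a < b).
links = [{'source': a, 'target': b}
         for a in sorted(adjacency) for b in sorted(adjacency[a]) if a < b]
```

Keeping the labels in one function makes it easier to keep them in sync with the color scale's domain in the D3 code.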
Initially, I thought JSON (code to do this below) was the way to go, but I later decided to keep this post simple. The D3 code was also adapted from other code, originally used with PHP to pull data from a MySQL database, which was not meant to take in JSON-formatted data.
import json

with open('dolphins.json', 'w') as out:
    dat = {"nodes": _nodes,
           "links": _edges}
    json.dump(dat, out)
Therefore, I pre-processed the nodes and links variables into JavaScript format and wrote them to dolphins.js.
In [2]:
import sys

datfile = 'dolphins.js'

def print_list_JavaScript_format(x, dat, out=sys.stdout):
    # Write `dat` as a JavaScript array literal assigned to the variable `x`.
    out.write('var %s = [\n' % x)
    for i in dat:
        out.write('%s,\n' % i)
    out.write('];\n')

with open(datfile, 'w') as out:
    print_list_JavaScript_format('nodes', _nodes, out)
    print_list_JavaScript_format('links', _edges, out)
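The cell above relies on Python's printed form of a dict happening to be valid JavaScript. One possible alternative, shown here only as a sketch and not what this post uses, is to let `json.dumps` produce the array literal. The helper `write_js_var` and the sample records are hypothetical, and the snippet assumes Python 3.

```python
import io
import json

def write_js_var(name, records, out):
    """Write `var <name> = <JSON array>;` to the file-like object `out`."""
    out.write('var %s = %s;\n' % (name, json.dumps(records, indent=1)))

# Hypothetical records in the same shape as `_nodes` and `_edges`.
sample_nodes = [{'name': 'A', 'group': '< 4 neighbors'},
                {'name': 'B', 'group': '< 4 neighbors'}]
sample_links = [{'source': 0, 'target': 1}]

# An in-memory buffer stands in for dolphins.js.
buf = io.StringIO()
write_js_var('nodes', sample_nodes, buf)
write_js_var('links', sample_links, buf)
js_text = buf.getvalue()
```

Since `json.dumps` handles quoting and escaping itself, names containing quotes should not break the generated script.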
Next, I wrote the D3 JavaScript code to fdg-dolphins.html. I also added a <!--ADD-DATASET--> comment as a placeholder, to be replaced later with the contents of dolphins.js.
The D3 code was inspired by the Force-Directed Graph of character co-occurrence in Les Misérables. I also added a legend that groups nodes by their number of neighbors (i.e. is this dolphin friendly?).
In [3]:
%%writefile fdg-dolphins.html
<!DOCTYPE html>
<html>
<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min.js"></script>
<style>
.node {
stroke: #fff;
stroke-width: 1.5px;
}
.link {
stroke: #999;
stroke-opacity: .6;
}
</style>
<body>
<div class="chart">
<script>
<!--ADD-DATASET-->
var width = 640,
height = 480;
var color = d3.scale.category10()
.domain(['< 4 neighbors', '5-10 neighbors', '> 10 neighbors']);
var svg = d3.select('.chart').append('svg')
.attr('width', width)
.attr('height', height);
var force = d3.layout.force()
.size([width, height])
.charge(-120)
.linkDistance(50)
.nodes(nodes)
.links(links);
var link = svg.selectAll('.link')
.data(links)
.enter().append('line')
.attr('class', 'link')
.style("stroke-width", function(d) { return Math.sqrt(d.value || 1); });
var node = svg.selectAll('.node')
.data(nodes)
.enter().append('circle')
.attr('class', 'node')
.attr("r", 5)
.style("fill", function(d) { return color(d.group); })
.call(force.drag);
node.append("title")
.text(function(d) { return d.name; });
force.on("tick", function() {
link.attr("x1", function(d) { return d.source.x; })
.attr("y1", function(d) { return d.source.y; })
.attr("x2", function(d) { return d.target.x; })
.attr("y2", function(d) { return d.target.y; });
node.attr("cx", function(d) { return d.x; })
.attr("cy", function(d) { return d.y; });
});
force.start();
//Legend
var legend = svg.selectAll(".legend")
.data(color.domain())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function(d, i) { return "translate(0," + i * 20 + ")"; });
legend.append("rect")
.attr("x", width - 18)
.attr("width", 18)
.attr("height", 18)
.style("fill", color);
legend.append("text")
.attr("x", width - 24)
.attr("y", 9)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function(d){return d});
</script>
</div>
</body>
</html>
The following bit of code replaces <!--ADD-DATASET--> in fdg-dolphins.html with the contents of dolphins.js.
In [4]:
htmlfile = 'fdg-dolphins.html'

with open(datfile) as f:
    dat = f.read()

# Plain string replacement: the placeholder is a literal, so no regex is needed.
with open(htmlfile) as f:
    dat = f.read().replace('<!--ADD-DATASET-->', dat)

with open(htmlfile, 'w') as f:
    f.write(dat)
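The placeholder substitution can be sketched on in-memory strings. This small example shows why a literal replacement is a safe choice when the inserted text may contain backslashes: `re.sub` interprets escape sequences in its replacement string, while `str.replace` does not. The `template` and `dataset` strings are made up for illustration.

```python
import re

template = '<script>\n<!--ADD-DATASET-->\nvar width = 640;\n</script>'

# Hypothetical data containing a backslash followed by "n" (two characters).
dataset = "var nodes = [{'name': 'a\\nb'}];"

# Literal replacement: the dataset text is inserted unchanged.
literal = template.replace('<!--ADD-DATASET-->', dataset)

# Regex replacement: "\n" in the replacement string becomes a real newline.
via_regex = re.sub('<!--ADD-DATASET-->', dataset, template)

# Passing a callable makes re.sub insert the text unchanged as well.
via_callable = re.sub('<!--ADD-DATASET-->', lambda m: dataset, template)
```

Here `literal` and `via_callable` contain the dataset exactly as written, whereas `via_regex` ends up with a line break inside the quoted name.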
Finally, the D3 dolphins network is visualized in the IPython notebook!
In [5]:
from IPython.display import IFrame
IFrame(htmlfile,650,500)
Out[5]:
From the network, we can see that 3 dolphins, the ones in the "> 10 neighbors" group, are more friendly/popular than the others.
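A reading like this can also be checked by counting nodes per group instead of eyeballing the colors. Below is a minimal sketch, assuming records shaped like `_nodes`; the `sample_nodes` list and its names are invented and are not the dolphins data.

```python
from collections import Counter

# Hypothetical node records in the same shape as `_nodes`.
sample_nodes = [
    {'name': 'A', 'group': '< 4 neighbors'},
    {'name': 'B', 'group': '5-10 neighbors'},
    {'name': 'C', 'group': '> 10 neighbors'},
    {'name': 'D', 'group': '< 4 neighbors'},
    {'name': 'E', 'group': '> 10 neighbors'},
]

# Number of nodes in each legend group.
group_counts = Counter(n['group'] for n in sample_nodes)

# Names of the nodes in the most connected group.
popular = [n['name'] for n in sample_nodes if n['group'] == '> 10 neighbors']
```

Running the same two lines on the real `_nodes` list would give the names behind the most connected circles.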