Symmetry Metrics for Balanced Tree Forests

The goal is to understand how nauty output and symmetry metrics behave when we consider a forest with multiple disconnected trees. This situation would occur if we calculated symmetries across the entire set of trait trees when capturing statistics.

Some statistics, such as orbits, might crosscut trait trees if two trait trees shared adjacency relationships. In other situations, I'd expect them to be a disjoint union of results, and thus the statistics might simply be additive (and thus needing to be normalized by the number of configured trait trees). In other cases, such as $|\textit{Aut}(G)|$, the relationship might be multiplicative and combinatorial.

I proceed by constructing a union graph of 2 balanced trait trees, then 3 and 4, and examining how the nauty output statistics scale. Then I look at a combination of different balanced trees.


In [1]:
import networkx as nx
import madsenlab.axelrod.analysis as stats
import madsenlab.axelrod.utils as utils


/Users/mark/anaconda/lib/python2.7/site-packages/pytz/__init__.py:31: UserWarning: Module argparse was already imported from /Users/mark/anaconda/python.app/Contents/lib/python2.7/argparse.pyc, but /Users/mark/anaconda/lib/python2.7/site-packages is being added to sys.path
  from pkg_resources import resource_stream
/Users/mark/anaconda/lib/python2.7/site-packages/pytz/__init__.py:31: UserWarning: Module logging was already imported from /Users/mark/anaconda/python.app/Contents/lib/python2.7/logging/__init__.pyc, but /Users/mark/anaconda/lib/python2.7/site-packages is being added to sys.path
  from pkg_resources import resource_stream
/Users/mark/anaconda/lib/python2.7/site-packages/pytz/__init__.py:31: UserWarning: Module multiprocessing was already imported from /Users/mark/anaconda/python.app/Contents/lib/python2.7/multiprocessing/__init__.pyc, but /Users/mark/anaconda/lib/python2.7/site-packages is being added to sys.path
  from pkg_resources import resource_stream
/Users/mark/anaconda/lib/python2.7/site-packages/pytz/__init__.py:31: UserWarning: Module uuid was already imported from /Users/mark/anaconda/python.app/Contents/lib/python2.7/uuid.pyc, but /Users/mark/anaconda/lib/python2.7/site-packages is being added to sys.path
  from pkg_resources import resource_stream

In [2]:
conf = utils.TreeStructuredConfiguration("dummyconf.json")
conf.branching_factor = 4
conf.depth_factor = 4

In [3]:
sym = stats.BalancedTreeAutomorphismStatistics(conf)

In [4]:
g1 = nx.balanced_tree(4,4)
g2 = nx.balanced_tree(4,4)

In [5]:
print sym.calculate_graph_symmetries(g1)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-a7656a549444> in <module>()
----> 1 print sym.calculate_graph_symmetries(g1)

/Users/mark/anaconda/lib/python2.7/site-packages/madsenlab/axelrod/analysis/trait_tree_statistics.pyc in calculate_graph_symmetries(self, graph)
    134 
    135         # TODO: Figure out how to handle density and radius for multi-component graphs
--> 136         results['remainingdensity'] = float(g.number_of_nodes()) / (float(self.n_per_tree) * float(self.num_trees))
    137 
    138 

TypeError: float() argument must be a string or a number

In [ ]:
order_tree = stats.num_nodes_balanced_tree(4,4)

In [ ]:
starting_vertex = order_tree

In [ ]:
g2A = nx.convert_node_labels_to_integers(g2, first_label = starting_vertex)

In [ ]:
graphs = []
graphs.append(g1)
graphs.append(g2A)

In [ ]:
forest = nx.union_all(graphs)

In [ ]:
forest.number_of_nodes()

In [ ]:
forestd = sym._format_graph_as_nauty(forest)

Below is the output for a forest with two trees with $r=4, h=4$. We can see that the orbit number is the same, the multiplicities are simply different because the members of each orbit in each tree are interchangeable, so each multiplicity is twice the value in a single tree. The group size is much larger than twice, however, but we should expect there to be a multiplicative effect given that automorphisms scale as $n!$ for the complete graphs $K_n$.


In [ ]:
print sym._get_raw_nauty_output(forestd)

In [ ]:
g3 = nx.balanced_tree(4,4)

In [ ]:
g3a = nx.convert_node_labels_to_integers(g3, first_label=(order_tree * 2))

In [ ]:
graphs.append(g3a)

In [ ]:
forest2 = nx.union_all(graphs)

In [ ]:
f2d = sym._format_graph_as_nauty(forest2)

In [ ]:
print sym._get_raw_nauty_output(f2d)

Adding a third balanced tree follows the same pattern. The only concern is the scaling of group size. Adding a fourth tree to give me some data points to work with.


In [ ]:
g4 = nx.balanced_tree(4,4)

In [ ]:
g4a = nx.convert_node_labels_to_integers(g4, first_label=(order_tree * 3))

In [ ]:
graphs.append(g4a)
forest3 = nx.union_all(graphs)

In [ ]:
f3d = sym._format_graph_as_nauty(forest3)
print sym._get_raw_nauty_output(f3d)

The exponent is definitely going up as the power of the number of trees: ${10^{117}}^2 = 10^{234}$, and so on. But the other values are rising as well, and not by a simple multiplicative factor (which looked like 2). It looks like $n!$.

So, with a forest of identical trees, with n trees, and $s = |\textit{Aut}(g)|$ for each tree, the total automorphism group size for the entire forest is thus $n! s^n$. This is true for identical tree copies, but obviously more complex for non-identical components. We would not also see orbit multiplicities which are a simple factor of the base orbit multiplicity, and we'd expect that the orbit number would not be constant, of course.

So how do the orbit numbers etc behave with multiple non-identical components? I'm not going to get a general answer here (but of course that's possible), but what follows are some simple examples to guide intuition.


In [7]:
gx1 = nx.balanced_tree(3,3)
nx1 = stats.num_nodes_balanced_tree(3,3)
gx2 = nx.balanced_tree(4,4)
gx2a = nx.convert_node_labels_to_integers(gx2, first_label=nx1)

In [8]:
graphs2 = []
graphs2.append(gx1)
graphs2.append(gx2a)

In [9]:
foresth1 = nx.union_all(graphs2)

In [10]:
print sym._get_raw_nauty_output(sym._format_graph_as_nauty(foresth1))


9 orbits; grpsize=2.715950187930e127; 281 gens; 36753 nodes; maxlev=225
cpu time = 0.06 seconds
 0; 1:3 (3); 4:12 (9); 13:39 (27); 40; 41:44 (4); 45:60 (16); 61:124 (64);
    125:380 (256);

So...since the graph components are different, the orbits are disjoint and restricted to their own component. The total orbit number is simply the sum of the orbit numbers for the two components, and the orbit multiplicites and vertex membership are obviously restricted to each component (easy to see given the sequential vertex numbering). The group size is greater than that of a single tree, but many orders of magnitude smaller than the combined group size of two identical trees. This likely reflects there being very few automorphisms between trees.

In [11]:
print sym._get_raw_nauty_output(sym._format_graph_as_nauty(gx1))


4 orbits; grpsize=1.306069401600e10; 26 gens; 336 nodes; maxlev=21
cpu time = 0.00 seconds
 0; 1:3 (3); 4:12 (9); 13:39 (27);


In [12]:
print sym._get_raw_nauty_output(sym._format_graph_as_nauty(gx2))


5 orbits; grpsize=2.079483819622e117; 255 gens; 30388 nodes; maxlev=205
cpu time = 0.04 seconds
 0; 1:4 (4); 5:20 (16); 21:84 (64); 85:340 (256);


In [ ]: