Identifying instructions from chat messages in the Radiation Response Game
The datasets from this application are provided in the folder rrg. Each CSV file, depgraphs-$k$.csv with $k = 1 \ldots 18$, is a table whose rows correspond to individual chat messages in RRG. The label column holds the manual classification of each message (e.g., instruction, information, requests, etc.).
In [1]:
import pandas as pd
In [2]:
filepath = lambda k: "rrg/depgraphs-%d.csv" % k
In [3]:
# An example of reading the data file
df = pd.read_csv(filepath(5), index_col=0)
df.head()
Out[3]:
In [4]:
label = lambda l: 'other' if l != 'instruction' else l
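As a quick check of this mapping (a small usage sketch; the example labels are taken from the list above), instruction is kept as-is and every other label is collapsed into other, turning the task into a binary classification:
# 'instruction' is kept; any other manual label is collapsed into 'other'
assert label('instruction') == 'instruction'
assert label('information') == 'other'
assert label('requests') == 'other'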
In [5]:
df.label = df.label.apply(label).astype('category')
df.head()
Out[5]:
In [6]:
# Examine the balance of the dataset
df.label.value_counts()
Out[6]:
Since both labels have roughly the same number of data points, we decide not to balance the RRG datasets.
We now run the cross validation tests on the 18 datasets ($k = 1 \ldots 18$) using all the features (combined), only the generic network metrics (generic), and only the provenance-specific network metrics (provenance). For each dataset, we run the cross validation tests and store the accuracy scores into results and the feature importance into importances.
Please refer to Cross Validation Code.ipynb for the detailed description of the cross validation code.
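For a rough idea of what each test involves, the sketch below is a simplified illustration only, not the actual implementation in the analytics module: it assumes a list of feature columns is given and shows a single iteration that trains a decision tree classifier on a random train/test split, recording its accuracy and feature importances.
# A simplified sketch of one cross validation iteration (for illustration only):
# train a decision tree on a random train/test split of the chosen metric columns,
# then record its accuracy and the importance it assigns to each feature.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def one_iteration(df, feature_columns):
    X, y = df[feature_columns], df.label
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    clf = DecisionTreeClassifier()
    clf.fit(X_train, y_train)
    accuracy = clf.score(X_test, y_test)
    importances = dict(zip(feature_columns, clf.feature_importances_))
    return accuracy, importances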
In [7]:
from analytics import test_classification
In [8]:
results = pd.DataFrame()
importances = pd.DataFrame()
for k in range(1, 19):
    df = pd.read_csv(filepath(k), index_col=0)
    df.label = df.label.apply(label).astype('category')
    res, imps = test_classification(df, n_iterations=1000, test_id=str(k))
    res['$k$'] = k
    imps['$k$'] = k
    # storing the results and importance of features
    results = pd.concat([results, res], ignore_index=True)
    importances = pd.concat([importances, imps], ignore_index=True)
Optionally, we can save the test results so that they can be reloaded later without rerunning the tests:
In [9]:
results.to_pickle("rrg/results.pkl")
importances.to_pickle("rrg/importances.pkl")
Next time, we can reload the results as follows:
In [10]:
import pandas as pd
results = pd.read_pickle("rrg/results.pkl")
importances = pd.read_pickle("rrg/importances.pkl")
results.shape, importances.shape
Out[10]:
In [11]:
%matplotlib inline
import seaborn as sns
sns.set_style("whitegrid")
sns.set_context("paper", font_scale=1.4)
For this application, with so many configurations to chart, it is difficult to determine which configuration yields the best accuracy from a figure. Instead, we determine this from the data: we group the performance of all classifiers by the set of metrics they used and the $k$ value, then calculate the mean accuracy of each group.
In [12]:
results['Accuracy'] = results['Accuracy'] * 100 # converting accuracy values to percent
In [13]:
# define a function to calculate the mean and its confidence interval from a group of values
import scipy.stats as st
def calc_means_ci(group):
    mean = group.mean()
    ci_low, ci_high = st.t.interval(0.95, group.size - 1, loc=mean, scale=st.sem(group))
    return pd.Series({
        'mean': mean,
        'ci_low': ci_low,
        'ci_high': ci_high
    })
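The interval computed by st.t.interval above is the standard $t$-based 95% confidence interval for the mean of a group of $n$ accuracy values:
$$\bar{x} \pm t_{0.975,\,n-1} \cdot \frac{s}{\sqrt{n}}$$
where $\bar{x}$ is the group mean, $s$ the sample standard deviation (so $s/\sqrt{n}$ is the standard error returned by st.sem), and $t_{0.975,\,n-1}$ the 97.5th percentile of the $t$-distribution with $n - 1$ degrees of freedom.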
In [14]:
accuracy_by_metrics_k = results.groupby(["Metrics", "$k$"]) # grouping results by metrics sets and k
In [15]:
# Calculate the means and the confidence intervals over the grouped data (using the calc_means_ci function above)
results_means_ci = accuracy_by_metrics_k.Accuracy.apply(calc_means_ci).unstack()
results_means_ci = results_means_ci[['mean', 'ci_low', 'ci_high']] # reorder the column
results_means_ci
Out[15]:
Next, we sort the mean accuracy values within each set of metrics and find the $k$ value that yields the highest accuracy for each set (i.e. combined, generic, and provenance).
In [16]:
# Looking at only the means in each set of metrics
results_means_ci['mean'].unstack()
Out[16]:
In [17]:
# Finding the highest accuracy value in each row (i.e. each set of metrics)
highest_accuracy_configurations = [
    (row_name, row.idxmax())  # the index (i.e. k value) of the highest accuracy in this metrics set
    for row_name, row in results_means_ci['mean'].unstack().iterrows()
]
highest_accuracy_configurations
Out[17]:
In [18]:
results_means_ci.loc[highest_accuracy_configurations, :]
Out[18]:
The results above show that the generic metrics with $k = 13$ yield the highest accuracy level: 85.24%. Using all the metrics or only the provenance-specific metrics with $k = 11$ yields comparable levels of accuracy (within the confidence interval of the highest accuracy).
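As a quick check of this claim, the sketch below (our own filter over the results_means_ci table computed earlier, for illustration) lists the configurations whose mean accuracy falls within the confidence interval of the best configuration:
# Find the best configuration and list every configuration whose mean accuracy
# lies within its 95% confidence interval, i.e. with comparable accuracy.
best = results_means_ci.loc[results_means_ci['mean'].idxmax()]
results_means_ci[results_means_ci['mean'].between(best['ci_low'], best['ci_high'])]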
For a visual comparison of all the configurations tested, we chart their accuracy next.
In [19]:
pal = sns.light_palette("seagreen", n_colors=3, reverse=True)
plot = sns.barplot(x="$k$", y="Accuracy", hue='Metrics', palette=pal, errwidth=1, capsize=0.04, data=results)
plot.figure.set_size_inches((10, 4))
plot.legend(loc='upper center', bbox_to_anchor=(0.5, 1.02), ncol=3)
plot.set_ylabel('Accuracy (%)')
plot.set_ylim(50, 95)
# drawing a line at the highest accuracy for visual comparison between configurations
highest_accuracy = results_means_ci['mean'].max()
plot.axes.plot([0, 17], [highest_accuracy, highest_accuracy], 'g')
highest_accuracy
Out[19]:
The chart shows that the configurations that yield the highest accuracy are: $k = 11$ with combined, generic, or provenance metrics; $k = 13$ with generic metrics; and $k = 15$ with combined metrics. The accuracy level seems to decrease for $k > 15$.
Saving the chart above to Fig6.eps to be included in the paper:
In [20]:
plot.figure.savefig("figures/Fig6.eps")
In [21]:
# Rename the columns with Math notation for consistency with the metrics symbols in the paper
feature_name_maths_mapping = {
"entities": "$n_e$", "agents": "$n_{ag}$", "activities": "$n_a$", "nodes": "$n$", "edges": "$e$",
"diameter": "$d$", "assortativity": "$r$", "acc": "$\\mathsf{ACC}$",
"acc_e": "$\\mathsf{ACC}_e$", "acc_a": "$\\mathsf{ACC}_a$", "acc_ag": "$\\mathsf{ACC}_{ag}$",
"mfd_e_e": "$\\mathrm{mfd}_{e \\rightarrow e}$", "mfd_e_a": "$\\mathrm{mfd}_{e \\rightarrow a}$",
"mfd_e_ag": "$\\mathrm{mfd}_{e \\rightarrow ag}$", "mfd_a_e": "$\\mathrm{mfd}_{a \\rightarrow e}$",
"mfd_a_a": "$\\mathrm{mfd}_{a \\rightarrow a}$", "mfd_a_ag": "$\\mathrm{mfd}_{a \\rightarrow ag}$",
"mfd_ag_e": "$\\mathrm{mfd}_{ag \\rightarrow e}$", "mfd_ag_a": "$\\mathrm{mfd}_{ag \\rightarrow a}$",
"mfd_ag_ag": "$\\mathrm{mfd}_{ag \\rightarrow ag}$", "mfd_der": "$\\mathrm{mfd}_\\mathit{der}$", "powerlaw_alpha": "$\\alpha$"
}
importances.rename(columns=feature_name_maths_mapping, inplace=True)
In [22]:
grouped = importances.groupby("$k$")  # Grouping the importance values by k
In [23]:
# Calculate the mean importance of each feature for each data type
imp_means = grouped.mean()
In [24]:
three_most_relevant_metrics = pd.DataFrame(
    {row_name: row.nlargest(3).index.to_numpy()  # the three metrics with the highest importance in each row
     for row_name, row in imp_means.iterrows()}
)
three_most_relevant_metrics
Out[24]:
The table above shows the most important metrics as reported by the decision tree classifiers during their training for each value of $k$.
Apart from $k = 1$, whose performance is no better than the random baseline, we count the occurrences of the most relevant metrics in cases where $k \geq 2$ to find the most common metrics in the table above.
In [25]:
metrics_occurrences = three_most_relevant_metrics.loc[:,2:].apply(pd.value_counts, axis=1).fillna(0) # excluding k = 1
metrics_occurrences
Out[25]:
In [26]:
# sorting the sum of the metrics occurrences
pd.DataFrame(metrics_occurrences.sum().sort_values(ascending=False), columns=['occurrences'])
Out[26]:
As shown above, the number of edges $e$, the number of entities $n_e$, and the maximum finite distance between entities and activities $\mathrm{mfd}_{e \rightarrow a}$ are the most common metrics in the table of the most relevant metrics.
In this extra experiment, we run the same experiment as above but on the full dependency graphs of messages (similar to the experiments in Application 2), i.e. without restricting a dependency graph to $k$ edges away from a message entity. The provenance network metrics of those dependency graphs are provided in rrg/depgraphs.csv, which has the same format as the other CSV files provided in this application.
In [27]:
# Reading the data
df = pd.read_csv("rrg/depgraphs.csv", index_col=0)
# Generate the label for classification
df.label = df.label.apply(label).astype('category')
In [28]:
res, imps = test_classification(df, n_iterations=1000)
Results: The above accuracy levels are very low (compared with the 50% baseline accuracy of random selection between two labels), indicating that the provenance network metrics of full dependency graphs of RRG messages do not correlate well with the nature of the messages.
The reason for this is that an RRG provenance graph captures all the activities in a game, which are all connected. As an RRG provenance graph evolves linearly along the lifeline of a game, the size of a message's dependency graph varies greatly depending on when in the game the message was sent; messages sent at the beginning of a game have significantly more (potential) dependants than those sent later. This is shown in the histograms of the number of nodes and edges below. As a result, their network metrics are similarly noisy and are not a good predictor of the message type.
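To quantify this variation before plotting the histograms, a quick look at the summary statistics of the graph sizes (a simple check on the same dataframe) shows how widely the number of nodes and edges spread across messages:
# Summary statistics (count, mean, std, quartiles) of the dependency graph sizes
df[['nodes', 'edges']].describe()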
In [29]:
df.nodes.hist()
Out[29]:
In [30]:
df.edges.hist()
Out[30]: