In [1]:
%matplotlib inline
import trappy
from matplotlib import pyplot as plt
trace = trappy.FTrace("./trace_stats.dat")
In [2]:
# Execute to view it
trappy.plotter.plot_trace(trace)
A trigger is the combination of the following:
* A TRAPpy event
* A pivot for the event
* A set of filters
* A value
The example below explains how to create a trigger for the event when a particular process is switched in or out.
This uses the trappy.sched.SchedSwitch event in the trace object. This event looks for the unique_word sched_switch in the trace file.
Here is a sample line in the text trace that corresponds to the SchedSwitch event:
sched_switch: prev_comm=trace-cmd prev_pid=4731 prev_prio=120 prev_state=1 next_comm=trace-cmd next_pid=4730 next_prio=120
We can see that this event is populated in the above trace object as:
In [3]:
trace.sched_switch.data_frame.head()
Out[3]:
Now, let us suppose we want to create a trigger for the event when the task trace-cmd (pid=4729)
is switched in and assign a value of 1 to the Trigger signal. In pseudo code:
if next_pid == 4729:
    Trigger_In(t) = 1
We can similarly create a Trigger for the event when the task is switched out:
if prev_pid == 4729:
    Trigger_Out(t) = -1
A pivot is the column along which the data is orthogonal. If the data is pivotable, it is a super-position of smaller data sets, one for each unique pivot value. In our case the pivot column is "__cpu", and the data can be split into the scheduler switches happening on each CPU.
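To make the roles of the filter and the pivot concrete, here is a small standalone sketch using plain pandas with a synthetic sched_switch-style DataFrame (the pids, timestamps and CPU numbers are made up). Filtering selects the event of interest; grouping by the pivot column splits the result into one signal per CPU, which is conceptually what a Trigger does:

```python
import pandas as pd

# Synthetic sched_switch-style data; timestamps are the index, as in a
# TRAPpy data_frame.
df = pd.DataFrame({
    "__cpu":    [0, 1, 0, 1, 0],
    "next_pid": [4729, 100, 200, 4729, 4729],
    "prev_pid": [300, 4729, 4729, 400, 500],
}, index=[0.1, 0.2, 0.3, 0.4, 0.5])

# Apply the filter: keep only the rows where the task is switched in...
switch_ins = df[df["next_pid"] == 4729]

# ...then split along the pivot column "__cpu": one signal per CPU.
per_cpu = {cpu: group for cpu, group in switch_ins.groupby("__cpu")}

print(sorted(per_cpu.keys()))   # the unique pivot values
print(len(per_cpu[0]))          # switch-ins seen on cpu0
```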
In [4]:
from trappy.stats.Trigger import Trigger

task_pid = 4729

trigger_switch_in = Trigger(trace,
                            trappy.sched.SchedSwitch,
                            pivot="__cpu",
                            filters={"next_pid": task_pid},
                            value=1)

trigger_switch_out = Trigger(trace,
                             trappy.sched.SchedSwitch,
                             pivot="__cpu",
                             filters={"prev_pid": task_pid},
                             value=-1)
A topology can be described as a collection of different arrangements of groups of nodes. These arrangements are called levels, and each level contains multiple groups of nodes. For example, a CPU topology can have the following levels:
CPU

[
    [cpu0],
    [cpu1],
    .
    .
    [cpuN],
]

Cluster

[
    [cluster1_cpu1, cluster1_cpu2, ... cluster1_cpuM],
    .
    .
    .
    [clusterK_cpu1, clusterK_cpu2, ... clusterK_cpuP]
]

System

[
    [cpu0,
     cpu1,
     .
     .
     .
     cpuN]
]
In [5]:
from trappy.stats.Topology import Topology
cluster_0 = [0, 3, 4, 5]
cluster_1 = [1, 2]
clusters = [cluster_0, cluster_1]
topology = Topology(clusters=clusters)
An aggregation function (aggfunc) accepts a pandas.Series as an input. For example:

def aggfunc(series):
    return modify(series)

The pivot of the triggers defined above is "__cpu", so the Topology corresponds to a CPU topology.
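As a concrete illustration of an aggfunc, here is a standalone sketch (the signal values are hand-written, not taken from a real trace). It transforms a +1/-1 switch signal into a "task is running" square wave via a cumulative sum:

```python
import pandas as pd

# A toy trigger signal: +1 on switch-in, -1 on switch-out, indexed by time.
signal = pd.Series([1, -1, 1, -1], index=[0.1, 0.4, 0.7, 0.9])

# An aggfunc is just a transformation of a pandas.Series. This one turns
# the event signal into a running/not-running square wave.
def running_aggfunc(series):
    return series.cumsum()

print(running_aggfunc(signal).tolist())   # [1, 0, 1, 0]
```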
In [6]:
topology.flatten()
Out[6]:
Here is informal pseudo-code for this base aggregation:

base_signals = {}
for node in topology.flatten():
    node_signal = init_signal()
    for trigger in triggers:
        node_signal.union(trigger.get_signal_for_pivot_val(node))
    base_signals[node] = node_signal
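The base aggregation can be sketched with plain pandas Series standing in for trigger signals. Everything here is hand-written illustration, not the TRAPpy API: the per-pivot dictionaries play the role of trigger.get_signal_for_pivot_val, and the union step becomes a concat sorted by time:

```python
import pandas as pd

# Stand-in trigger signals, keyed by pivot value (CPU number). In TRAPpy
# these would come from the Trigger objects; here they are hand-written.
switch_in_signals  = {0: pd.Series([1],  index=[0.1]),
                      1: pd.Series([1],  index=[0.4])}
switch_out_signals = {0: pd.Series([-1], index=[0.3]),
                      1: pd.Series([-1], index=[0.6])}
triggers = [switch_in_signals, switch_out_signals]

flattened_topology = [0, 1]   # stand-in for topology.flatten()

# Build the base signal for each node by taking the union of all trigger
# signals for that pivot value, ordered by timestamp.
base_signals = {}
for node in flattened_topology:
    parts = [trigger[node] for trigger in triggers]
    base_signals[node] = pd.concat(parts).sort_index()

print(base_signals[0].tolist())   # [1, -1]
```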
The Aggregator.aggregate method accepts a level parameter that selects which groups of nodes to aggregate over. These groups can be accessed as:

topology.get_level(level)
In [7]:
topology.get_level("cluster")
Out[7]:
Here is informal pseudo-code for this aggregation over the Topology:

agg_signal = []
for group_of_nodes in topology.get_level(level):
    group_signal = init_signal()
    for node in group_of_nodes:
        group_signal += aggfunc(base_signals[node])
    agg_signal.append(group_signal)
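The level aggregation can be sketched as runnable code with hand-written stand-ins for base_signals and topology.get_level (none of this is the TRAPpy API). A scalar aggfunc counts switch-ins per node, and the loop sums those counts within each cluster group:

```python
import pandas as pd

# Stand-in base signals: +1/-1 event series per CPU.
base_signals = {
    0: pd.Series([1, -1, 1],        index=[0.1, 0.2, 0.3]),
    1: pd.Series([1, -1],           index=[0.15, 0.25]),
    2: pd.Series([1],               index=[0.4]),
    3: pd.Series([1, -1, 1, -1],    index=[0.5, 0.6, 0.7, 0.8]),
}

# A scalar aggfunc: count the switch-ins in a node's signal.
def num_switch_ins(series):
    return len(series[series == 1])

cluster_level = [[0, 3], [1, 2]]   # stand-in for topology.get_level("cluster")

agg_signal = []
for group_of_nodes in cluster_level:
    group_signal = 0               # init_signal() for a scalar aggregation
    for node in group_of_nodes:
        group_signal += num_switch_ins(base_signals[node])
    agg_signal.append(group_signal)

print(agg_signal)   # one value per cluster: [4, 2]
```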
In [8]:
from trappy.stats.Aggregator import MultiTriggerAggregator

def no_operation_aggfunc(series):
    return series

triggers = [
    trigger_switch_in,
    trigger_switch_out,
]

vector_agg = MultiTriggerAggregator(triggers, topology, no_operation_aggfunc)
In [9]:
level = "cluster"
result = vector_agg.aggregate(level=level)

# Utility code for viewing the data
clusters = (str(c) for c in topology.get_level(level))
for series, cluster in zip(result, clusters):
    plt.figure(figsize=(15, 7))
    plt.plot(series.index, series.values)
    plt.title(cluster)
In [10]:
level = "cpu"
result = vector_agg.aggregate(level=level)

# Utility code for viewing the data
cpus = (str(c) for c in topology.get_level(level))
for series, cpu in zip(result, cpus):
    plt.figure(figsize=(15, 7))
    plt.plot(series.index, series.values)
    plt.title(cpu)
An aggregator function can also return a scalar, for example the number of "switch ins". Each value in the result has a one-to-one correspondence with the groups in topology.get_level(level).
In [11]:
def num_switch_ins(series):
    return len(series[series == 1])

scalar_agg = MultiTriggerAggregator(triggers, topology, num_switch_ins)
print(scalar_agg.aggregate(level="cpu"))
print(scalar_agg.aggregate(level="cluster"))
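As a final illustration of what can be built on top of such signals, here is a standalone sketch (not part of the TRAPpy API; the signal is hand-written) that derives the total time a task spent running from a +1/-1 switch signal by pairing each switch-in timestamp with the following switch-out:

```python
import pandas as pd

# A +1/-1 switch signal for one CPU: the task runs between each
# switch-in (+1) and the following switch-out (-1).
signal = pd.Series([1, -1, 1, -1], index=[0.1, 0.3, 0.6, 1.0])

def residency(series):
    # Pair each switch-in timestamp with the next switch-out timestamp
    # and sum the running intervals.
    ins = series[series == 1].index
    outs = series[series == -1].index
    return sum(t_out - t_in for t_in, t_out in zip(ins, outs))

print(residency(signal))   # total time the task spent running
```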