visISC Example: Visualizing Anomalous Frequency Data with Hierarchical Data

In this example, we show how to analyse frequency counts when the data is organized in a hierarchy. This is the case, for instance, when you analyse message or alarm rates over time and there are many different types of messages or alarms, including higher-level alarms.


In [ ]:
import pyisc
import visisc
import numpy as np
import datetime
from scipy.stats import poisson, norm, multivariate_normal
%matplotlib wx

Event Frequency Data

In this example, similarly to the previous example with a flat structure, we create a data set with a set of sources and a set of Poisson distributed event frequency counts, but with many more event columns:


In [ ]:
n_sources = 10
n_events = 100
num_of_normal_days = 200
num_of_anomalous_days = 10
data = None
days_list = [num_of_normal_days, num_of_anomalous_days]
dates = []
for state in [0,1]: # normal, anomalous data
    num_of_days = days_list[state]
    for i in range(n_sources):
        data0 = None
        for j in range(n_events):
            if state == 0:# Normal
                po_dist = poisson(int((10+2*(n_sources-i))*(float(j)/n_events/2+0.75))) # from 0.75 to 1.25
            else: # anomalous
                po_dist = poisson(int((20+2*(n_sources-i))*(float(j)/n_events+0.5))) # from 0.5 to 1.5

            tmp = po_dist.rvs(num_of_days)
            if data0 is None:
                data0 = tmp
            else:
                data0 = np.c_[data0,tmp]

        tmp =  np.c_[
                    [i] * (num_of_days), # Sources
                    [ # Timestamp
                        datetime.date(2015, 2, 24) + datetime.timedelta(int(d))
                        for d in np.array(range(num_of_days)) + (0 if state==0 else num_of_normal_days)
                    ], 
                    [1] * (num_of_days), # Measurement period
                    data0, # Event frequency counts
                    
                    ]
        
        if data is None:
            data = tmp
        else:
            data = np.r_[
                tmp,
                data
            ]

# Column index into the data
source_column = 0
date_column = 1
period_column = 2
first_event_column = 3
last_event_column = first_event_column + n_events
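
To see which rates the generator above draws from, the following sketch recomputes the Poisson rate parameters without sampling. The helper name event_rate is an assumption made for illustration and is not part of pyisc or visisc; it only restates the two expressions used in the loop.

```python
def event_rate(i, j, anomalous, n_sources=10, n_events=100):
    """Poisson rate for source i and event j (hypothetical helper restating the loop above)."""
    if anomalous:
        # Base rate 20 + 2*(n_sources - i), scaled from 0.5 up to just under 1.5 across events
        return int((20 + 2 * (n_sources - i)) * (float(j) / n_events + 0.5))
    # Base rate 10 + 2*(n_sources - i), scaled from 0.75 up to just under 1.25 across events
    return int((10 + 2 * (n_sources - i)) * (float(j) / n_events / 2 + 0.75))

# Compare normal and anomalous rates for source 0 at the first, middle and last event
for j in (0, 50, 99):
    print(j, event_rate(0, j, anomalous=False), event_rate(0, j, anomalous=True))
```

For source 0, the first event has a slightly lower rate in the anomalous period (20 versus 22), while the last event rises from 37 to 59, so the change is concentrated in the higher-numbered event columns.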

Hierarchical Event Data Model

Next, we create an event data model that describes how our events are organized in a type hierarchy. In this case, we assume a hierarchical structure for the events, where the path of an event is returned by event_path (given the event column index). Likewise, severity_level returns a severity level for the event, which is used to evaluate its importance.


In [ ]:
def event_path(x): # Returns a list of strings with 3 elements
    # Floor division keeps the type numbers integral on both Python 2 and 3
    return ["Type_%i"%(x//N) for N in [50, 10, 2]]

def severity_level(x): # returns 3 different severity levels: 0, 1, 2
    return x % 3

model = visisc.EventDataModel.hierarchical_model(
    event_columns=range(first_event_column,last_event_column),
    get_event_path = event_path,
    get_severity_level = severity_level,
    num_of_severity_levels=3
)

data_object = model.data_object(
    data,
    source_column = source_column,
    class_column = source_column,
    period_column=period_column,
    date_column=date_column
)

anomaly_detector = model.fit_anomaly_detector(data_object,poisson_onesided=True)
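
As a quick check of what the two callbacks return, the sketch below evaluates them for a few column indices. It is a standalone Python 3 restatement of event_path and severity_level and assumes, as the text states, that they receive the event column index (3 to 102 in this data set); it does not call visisc.

```python
def event_path(x):
    """Three-level path for an event column index, coarsest level first."""
    return ["Type_%i" % (x // n) for n in [50, 10, 2]]

def severity_level(x):
    """Severity in {0, 1, 2}, cycling with the column index."""
    return x % 3

# Print path and severity for the first two, one middle and the last event column
for column in (3, 4, 57, 102):
    print(column, event_path(column), severity_level(column))
```

Under that assumption, each leaf type covers two adjacent columns (x // 2), and the same label such as Type_1 can occur at several levels, where only its position in the path tells the levels apart.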

Visualization

Finally, we can visualize the event frequency data using the EventVisualization class. However, due to an incompatibility between the 3D engine used and Jupyter notebook, we have to run the notebook as a script. Note that on Windows it has to be run in a command window: remove the '!' and run the command from the docs directory inside the visisc directory.

vis = visisc.EventVisualization(model, 13.8,start_day=209)


In [ ]:
!ipython --matplotlib=wx --gui=wx -i visISC_hierachical_frequency_data_example.py

Class Level Visualization

Now, you should see a window similar to the picture shown below. This is very similar to what we got with the flat model example. However, in this case, we also have different shades of red to indicate different severity levels. Darker red indicates more severe events and lighter red indicates less severe events. The height of each column shows the total number of events for each source (or event type in the next pictures), and its color shows the most anomalous severity level.

Root Level Visualization

However, now when we click on a source label, only the event type levels below the root level are shown.

Middle Event Level Visualization

It is now also possible to click on the event types to drill down in the event hierarchy and find where the anomalies originate. By clicking on the event types below the root, we get to the middle-level event types shown below.

Ground Level Visualization

Finally, by clicking on the middle level event types we get to the leaf nodes of the hierarchy. Similarly to the flat model case, the anomalies are almost only visible at higher levels of the hierarchy.
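
One rough way to see why the anomalies stand out mainly at higher levels is to compare how far the anomalous rate is shifted, in standard deviations of the normal Poisson rate, for single events and for their sum. The sketch below does this for source 0 using the rates of the data generator. It is a back-of-the-envelope argument with hypothetical helper names (rates, shift_in_std), not the anomaly score that pyisc or visisc computes.

```python
import math

N_SOURCES, N_EVENTS = 10, 100

def rates(i, anomalous):
    """Poisson rates of all events for source i, restating the data generator."""
    if anomalous:
        return [int((20 + 2 * (N_SOURCES - i)) * (float(j) / N_EVENTS + 0.5))
                for j in range(N_EVENTS)]
    return [int((10 + 2 * (N_SOURCES - i)) * (float(j) / N_EVENTS / 2 + 0.75))
            for j in range(N_EVENTS)]

def shift_in_std(normal_rate, anomalous_rate):
    """Shift of the mean, in standard deviations of the normal Poisson rate."""
    return (anomalous_rate - normal_rate) / math.sqrt(normal_rate)

normal, anomalous = rates(0, False), rates(0, True)

# Shift for each individual (leaf) event
leaf_shifts = [shift_in_std(n, a) for n, a in zip(normal, anomalous)]

# A sum of independent Poisson counts is Poisson with the summed rate
total_shift = shift_in_std(sum(normal), sum(anomalous))

print("largest leaf shift: %.1f" % max(leaf_shifts))
print("shift of the summed counts: %.1f" % total_shift)
```

The summed counts are shifted by considerably more standard deviations than any single event, which is consistent with the aggregated columns being clearly colored while individual leaf events look close to normal.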

