visISC Example: Interactive Query Dialog with Visualization

In this example, we show how you can use the GUI component EventSelectionDialog together with EventSelectionQuery to let the user select which events to visualize. We start by creating a data set similar to the one in the previous example, Visualizing Anomalous Frequency Data with Hierarchical Data, but one that also includes source classes (for instance, machine types). The data set therefore becomes quite large, so we need to be able to select the subset of the data that we are most interested in comparing.


In [ ]:
import pyisc
import visisc
import numpy as np
import datetime
from scipy.stats import poisson, norm, multivariate_normal
%matplotlib wx
%gui wx

n_sources = 10
n_source_classes = 10
n_events = 100
num_of_normal_days = 200
num_of_anomalous_days = 10
data = None
days_list = [num_of_normal_days, num_of_anomalous_days]
dates = []
for state in [0,1]: # normal, anomalous data
    num_of_days = days_list[state]
    for k in range(n_source_classes):
        for i in range(n_sources):
            data0 = None
            for j in range(n_events):
                if state == 0:# Normal
                    po_dist = poisson(int((10+2*(n_source_classes-k))*(float(j)/n_events/2+0.75))) # from 0.75 to 1.25
                else: # anomalous
                    po_dist = poisson(int((20+2*(n_source_classes-k))*(float(j)/n_events+0.5))) # from 0.5 to 1.5

                tmp = po_dist.rvs(num_of_days)
                if data0 is None:
                    data0 = tmp
                else:
                    data0 = np.c_[data0,tmp]

            tmp =  np.c_[
                        [k*n_sources+i] * (num_of_days), # Sources
                        [k] * (num_of_days), # Source classes
                        [ # Timestamp
                            datetime.date(2015, 2, 24) + datetime.timedelta(int(d))
                            for d in np.array(range(num_of_days)) + (0 if state==0 else num_of_normal_days)
                        ], 
                        [1] * (num_of_days), # Measurement period
                        data0, # Event frequency counts

                        ]

            if data is None:
                data = tmp
            else:
                data = np.r_[
                    tmp,
                    data
                ]

# Column index into the data
source_column = 0
class_column = 1
date_column = 2
period_column = 3
first_event_column = 4
last_event_column = first_event_column + n_events
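
To get a feeling for what the generated data looks like, the Poisson rates used in the loop above can be computed separately. The following is a minimal sketch that only mirrors the two rate expressions and the resulting table size; the helper names normal_rate and anomalous_rate are introduced here for illustration and are not part of pyisc or visisc.

```python
# Sizes copied from the cell above.
n_sources = 10
n_source_classes = 10
n_events = 100
num_of_normal_days = 200
num_of_anomalous_days = 10

def normal_rate(k, j):
    """Poisson rate for source class k and event j on a normal day (hypothetical helper)."""
    return int((10 + 2 * (n_source_classes - k)) * (float(j) / n_events / 2 + 0.75))

def anomalous_rate(k, j):
    """Poisson rate for source class k and event j on an anomalous day (hypothetical helper)."""
    return int((20 + 2 * (n_source_classes - k)) * (float(j) / n_events + 0.5))

# One row per source and day; four leading columns plus one column per event.
n_rows = n_sources * n_source_classes * (num_of_normal_days + num_of_anomalous_days)
n_columns = 4 + n_events

for j in (0, 50, 99):
    print("class 0, event %i: normal rate %i, anomalous rate %i"
          % (j, normal_rate(0, j), anomalous_rate(0, j)))
print("rows: %i, columns: %i" % (n_rows, n_columns))
```

The anomalous rates are spread over a wider range than the normal ones, so the difference between normal and anomalous days depends on both the source class and the event.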

As before, we need to create an event path function and a severity level function.


In [ ]:
event_names = ["event_%i"%i for i in range(n_events)]

def event_path(x): # Returns a list of strings with 3 elements
    return ["Type_%i"%(x//N) for N in [50, 10]]+[event_names[x]]

def severity_level(x): # Returns one of 3 severity levels: 0, 1, 2
    return x % 3
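
As a quick illustration of what these two functions return, the sketch below evaluates them for a single index. It is a standalone copy of the definitions above, written with integer division (//) and the modulo operator so that it gives the same result on Python 2 and Python 3.

```python
n_events = 100
event_names = ["event_%i" % i for i in range(n_events)]

def event_path(x):
    # Two grouping levels (blocks of 50 and blocks of 10) followed by the event name.
    return ["Type_%i" % (x // N) for N in [50, 10]] + [event_names[x]]

def severity_level(x):
    # Cycles through the levels 0, 1, 2.
    return x % 3

print(event_path(57))      # ['Type_1', 'Type_5', 'event_57']
print(severity_level(57))  # 0
```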

Next, we need to create a subclass or an instance of visisc.EventSelectionQuery. This class uses the Traits library, which is also used by Mayavi, the 3D visualization library that we use for visualizing the data. When initializing an instance, we need to set four Trait lists: list_of_source_ids, list_of_source_classes, list_of_event_names, and list_of_event_severity_levels. In addition, we need to set period_start_date and period_end_date. In the current version, we also need to set selected_list_of_source_ids programmatically. Finally, we need to implement the execute_query method, along the lines shown below. Inside execute_query, the user's selection is available through selected_list_of_source_ids, selected_list_of_source_classes, selected_list_of_event_names, and selected_list_of_event_severity_levels.


In [ ]:
class MySelectionQuery(visisc.EventSelectionQuery):
    def __init__(self):
        self.list_of_source_ids = [i for i in range(n_sources*n_source_classes)]
        # Below: a list of pairs with id and name, where the name is shown in the GUI while the id is put into the selection.
        self.list_of_source_classes = [(i, "class_%i"%i) for i in range(n_source_classes)]
        self.list_of_event_names = event_names
        # Below: a list of pairs with id and name, where the name is shown in the GUI while the id is put into the selection.
        self.list_of_event_severity_levels = [(i, "Level %i"%i) for i in range(3)]
        self.period_start_date = data.T[date_column].min()
        self.period_end_date = data.T[date_column].max()
    
    def execute_query(self):
        query = self
        query.selected_list_of_source_ids = query.list_of_source_ids

        data_query = np.array(
            [
            data[i] for i in range(len(data)) if 
                data[i][source_column] in query.selected_list_of_source_ids and
                data[i][class_column] in query.selected_list_of_source_classes and
                data[i][date_column] >= query.period_start_date and
                data[i][date_column] <= query.period_end_date
            ]
        )

        event_columns = [first_event_column+event_names.index(e) for e in query.selected_list_of_event_names
             if severity_level(first_event_column+event_names.index(e)) in query.selected_list_of_event_severity_levels]

        model = visisc.EventDataModel.hierarchical_model(
            event_columns=event_columns,
            get_event_path = event_path,
            get_severity_level = severity_level,
            num_of_severity_levels=3
        )

        data_object = model.data_object(
            data_query,
            source_column = source_column,
            class_column = class_column,
            period_column=period_column,
            date_column=date_column
        )

        anomaly_detector = model.fit_anomaly_detector(data_object,poisson_onesided=True)

        vis = visisc.EventVisualization(model, 13.8,
                                 start_day=query.period_end_date,# yes confusing, start day in the EventVisualization is backward looking
                                 precompute_cache=True) # Precompute all anomaly calculation in order to speed up visualization.
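
The event column selection inside execute_query can be tried out in isolation. The sketch below is a simplified, standalone version of that list comprehension with only ten events; the function name select_event_columns is made up for this illustration. Note that, as in the query class above, the severity level is computed from the column index, not from the event index.

```python
first_event_column = 4
event_names = ["event_%i" % i for i in range(10)]

def severity_level(x):
    return x % 3

def select_event_columns(selected_names, selected_levels):
    """Return the data columns of the selected events that have a selected severity level."""
    columns = []
    for name in selected_names:
        column = first_event_column + event_names.index(name)
        if severity_level(column) in selected_levels:
            columns.append(column)
    return columns

# Select all event names but only severity level 0.
cols = select_event_columns(event_names, [0])
names = [event_names[c - first_event_column] for c in cols]
print(cols)   # [6, 9, 12]
print(names)  # ['event_2', 'event_5', 'event_8']
```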

Given that we have the query class, we can now create and open a query selection dialog where it is possible to customize the labels for source classes and the severity levels.


In [ ]:
query = MySelectionQuery()

dialog = visisc.EventSelectionDialog(
    query,
    source_class_label="Select Machine Types",
    severity_level_label="Select Event Severity Types"
)

To open the window, we then call dialog.configure_traits(), as shown below. However, similarly to the previous visualization examples, we have to run it outside the Jupyter notebook by calling ipython directly.

dialog.configure_traits()


In [ ]:
!ipython --matplotlib=wx --gui=wx -i visISC_query_dialog_example.py

The result from running the above statement will look similar to what is shown below.

By selecting severity level 0 and class 0 and then pressing the run query button, we get a window similar to those in the previous examples:

In addition, we can select which events to visualize by typing search-engine-like queries, using the following syntax:
Allowed characters: alphanumeric, '_' and '.'
Spaces separate OR-combined queries
'?' = matches any character
'*' = matches any number of characters
'^' = matches beginning of event name
'$' = matches end of event name

In the example above, the query "1$ 2$" matches all event names ending with 1 or 2.
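
One way to think about this query syntax is as a thin layer on top of regular expressions. The sketch below is not how visisc implements the search; it is a hypothetical translation (the function query_to_regex is invented here) that assumes a term without '^' or '$' may match anywhere in the event name.

```python
import re

def query_to_regex(query):
    """Translate a space-separated wildcard query into a compiled regular expression.

    Hypothetical helper: assumes unanchored terms match anywhere in the name.
    """
    alternatives = []
    for term in query.split():
        parts = []
        for ch in term:
            if ch == "?":
                parts.append(".")        # any single character
            elif ch == "*":
                parts.append(".*")       # any number of characters
            elif ch in "^$":
                parts.append(ch)         # anchors keep their regex meaning
            elif ch.isalnum() or ch in "_.":
                parts.append(re.escape(ch))  # literal character
            else:
                raise ValueError("character not allowed in query: %r" % ch)
        alternatives.append("(?:%s)" % "".join(parts))
    # Space-separated terms are combined with OR.
    return re.compile("|".join(alternatives))

event_names = ["event_%i" % i for i in range(100)]
pattern = query_to_regex("1$ 2$")
matched = [name for name in event_names if pattern.search(name)]
print(len(matched))  # 20
print(matched[:4])   # ['event_1', 'event_2', 'event_11', 'event_12']
```

Under the same assumptions, a query such as "^event_1?$" would select event_10 through event_19, since '?' stands for exactly one character.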

