This document explains how the judging will be done for the "Best Classification" system.
Read the Step 1: Get Data tutorial for information on where to get the training and test data sets.
Your job is to create a CSV file that holds the scores for each of the signals found in the test data set.
Each line of your CSV "scorecard" file will contain the signal's UUID, followed by a score for each signal class. The order of the scores is absolutely critical.
The signal class with the highest score is your model's class estimate. Typically, these scores are your model's probability estimates for each class.
For each data file in the test set, generate the appropriate spectrogram, and then pass that to your signal classifier (machine-learning model) to calculate the scores for each class.
For example, each line of your CSV scorecard file should look something like:
abdefgadbc1223234123123cvaf, 0.1, 0.023, 0.451, 0.232, 0.001, 0.07, 0.0083
THE ORDER OF THE SCORES IN EACH ROW OF YOUR CSV FILE MUST BE:
brightpixel, narrowband, narrowbanddrd, noise, squarepulsednarrowband, squiggle, squigglesquarepulsednarrowband
This is in alphabetical order.
In [ ]:
import csv
import zipfile

import ibmseti
# import whatever libraries your classifier needs (tensorflow, Watson, etc.)

# my_model is your trained signal classification model
# mydatafolder is the folder holding your data (see the Step 1: Get Data tutorial)

my_output_results = mydatafolder + '/signal_class_results.csv'

zz = zipfile.ZipFile(mydatafolder + '/' + 'primary_testset_preview_v3.zip')

for fn in zz.namelist():
    data = zz.open(fn).read()
    aca = ibmseti.compamp.SimCompamp(data)
    uuid = aca.header()['uuid']

    # whatever signal processing code you need goes in your `draw_spectrogram` function
    spectrogram = draw_spectrogram(aca)

    # cr = class results. In this example it's a dictionary, but in your case it
    # could be something else, such as a simple list.
    cr = my_model.classify(spectrogram)

    with open(my_output_results, 'a') as csvfile:
        fwriter = csv.writer(csvfile, delimiter=',')
        fwriter.writerow([uuid, cr['brightpixel'], cr['narrowband'], cr['narrowbanddrd'],
                          cr['noise'], cr['squarepulsednarrowband'], cr['squiggle'],
                          cr['squigglesquarepulsednarrowband']])
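If your model returns a plain list or array of probabilities (for example, the output of a softmax layer) rather than a dictionary keyed by class name, you will need to reorder the values yourself. Below is a minimal sketch of one way to do that; the names row_from_probabilities, class_probs, and model_class_order are illustrative assumptions, not part of any required API.
In [ ]:
# Required column order for the scorecard (alphabetical).
SCORECARD_CLASSES = ['brightpixel', 'narrowband', 'narrowbanddrd', 'noise',
                     'squarepulsednarrowband', 'squiggle',
                     'squigglesquarepulsednarrowband']

def row_from_probabilities(uuid, class_probs, model_class_order):
    # class_probs: scores in the order your model emits them.
    # model_class_order: the class names in that same order.
    scores = dict(zip(model_class_order, class_probs))
    return [uuid] + [scores[c] for c in SCORECARD_CLASSES]

# Example usage: fwriter.writerow(row_from_probabilities(uuid, probs, model_class_order))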
The Scoreboard for the Preview Test is here. You can submit up to 10 scorecards to this scoreboard. Also, the Preview test set UUID and class labels are now found in the results folder of this repository. The results folder also contains some code that you can use to score your own scorecard. (This means you could submit a perfect score to the Preview scoreboard -- but please don't do this. Your score will be deleted.)
The Scoreboard for the Final Test is here. You can submit only one scorecard to this scoreboard. The UUID and class labels for the final test set will not be published, so you can use this as a final test of your model.
Please read this walkthrough to sign up for the Scoreboard system, form your team, and submit an example result. An example scorecard is found below.
The scores in this example file are random values between 0 and 1. Typically, your scores will be your classification model's probability estimates for each class. (As such, they should sum to 1.0. To be safe, however, the Log-loss calculator in our Scoreboard will normalize your scores to ensure the values sum to 1.0.)
Example Preview Test Set Scorecard
With this scorecard, you should get exactly the same values as "TeamRandom", which is currently on the Preview Scoreboard.
In this contest we are using the Log-Loss function as a measure of your model's performance.
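For reference, here is a minimal sketch of a multi-class log-loss calculation, assuming true_labels is a list of class-name strings and score_rows holds the seven scores per signal in the scorecard column order. The exact clipping and normalization details used by the Scoreboard's calculator may differ.
In [ ]:
import numpy as np

CLASS_ORDER = ['brightpixel', 'narrowband', 'narrowbanddrd', 'noise',
               'squarepulsednarrowband', 'squiggle', 'squigglesquarepulsednarrowband']

def multiclass_log_loss(true_labels, score_rows, class_order=CLASS_ORDER):
    eps = 1e-15
    total = 0.0
    for label, scores in zip(true_labels, score_rows):
        probs = np.clip(np.asarray(scores, dtype=float), eps, None)
        probs = probs / probs.sum()  # normalize each row so it sums to 1.0
        total += np.log(probs[class_order.index(label)])
    return -total / len(true_labels)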
If you are running your analysis on IBM DSX (IBM Apache Spark), you'll need to get your .csv file to a local machine in order to submit your results to the Scoreboard.
One way is to move your .csv file to the Object Storage account that is provided in DSX. This tutorial shows you the basic steps to move data to and from your Object Storage instance. Then, from DSX (or Bluemix), navigate to your Object Storage container and you can download the file to your local machine with a click.
Another good option is to use Pixiedust. Among its many features, Pixiedust can load a .csv file into a Pandas or Spark DataFrame and display that data in your Jupyter notebook. The display includes an icon that lets you download the data directly. This is probably the easier option, though you will need to "pip install --user pixiedust" and restart your kernel.
In [ ]:
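import pixiedust
import pandas as pd

# Load the scorecard into a Pandas DataFrame (the file has no header row).
df = pd.read_csv(my_output_results, header=None)

# Pixiedust renders the DataFrame with an interactive widget; use the
# download icon in the widget to save the data to your local machine.
display(df)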