(1) average the scores between the _0,_1,_2,_3 directions to get average score per image in each HOG configuration.

(2) In each HOG configuration, calculate the Precision and Recall values.

(3) "Bootstrap" or "jacknife" to get an error on the AUC for each HOG configuration, describe how you bootstrapped it in words.

(4) Output should look like:

HOG config | Precision | Recall | AUC | AUCerr


In [1]:
import glob
import numpy as np
import pandas as pd
from sklearn.metrics import precision_score, recall_score, roc_auc_score

In [ ]:


In [2]:
def get_data(datadir):
    """
    Read the data files from different subdirectories of datadir corresponding
    to different HOG configurations.
    
    Inputs
    
    datadir: top level directory in which there are subdirectories corresponding
             to different HOG configurations
    
    Output
    
    data: {hogname: list(pd.DataFrame)} where each key corresponds to a
          different subdirectory (HOG configuration) and the value is
          a list of dataframes read from each of the files in that
          subdirectory
    """
    hognames = [s.split('/')[-1] for s in glob.glob(datadir + '/*')]
    return {hogname: [pd.read_csv(filename, sep=None)
                      for filename in glob.glob('{}/{}/filenames_*.txt'.format(datadir, hogname))]
            for hogname in hognames}

In [3]:
def get_average_scores(dataframes):
    """
    Average the scores from several different rotations.
    
    Inputs
    
    dataframes: list(pd.DataFrame['filename', 'score', 'label'])
    
    Output
    
    df_out: pd.DataFrame['filename', 'score', 'label'] where 'score'
            is the average over all of the input dataframes and
            'label' is taken arbitrarily from the first input dataframe
    """
    dataframes = [df.rename(columns={'score': 'score_{}'.format(idx),
                                     'label': 'label_{}'.format(idx)})
                  for idx, df in enumerate(dataframes)]
    merged_df = reduce(lambda df1, df2: pd.merge(df1, df2, on='filename'), dataframes)
    assert all(df.shape[0] == merged_df.shape[0] for df in dataframes), \
    'Not all keys are the same in the data sets'
    
    merged_df['score'] = sum(merged_df['score_{}'.format(idx)] for idx, _ in enumerate(dataframes))
    merged_df['label'] = merged_df['label_0']
    return merged_df[['filename', 'score', 'label']]

In [4]:
def bootstrap(df, func, num_samples, sample_size_frac=1):
    """
    Returns the bootstrap average and standard deviation when applying
    func to df.  It is assumed that applying func to df returns a scalar.
    
    In each iteration, sample_size_frac*N rows are drawn from df at
    random with replacement, where N is the number of rows in df.
    In this way a DataFrame df_sample is created of the same type
    as df, with possible a different number of rows.  The calculation
    of interest is done on df_sample by applying func and returning
    a number.  This number is collected into an array, and this
    process is repeated for num_samples iterations.  Finally, the
    mean and standard deviation of the array of length num_samples
    is returned.  The standard deviation is an estimate of the error
    (due to finite sample size) that you would get when applying
    func to the full DataFrame df to get a number.
    
    Inputs
    
    df: pd.DataFrame of any type
    func: function that takes in df and returns a scalar
    num_samples: number of bootstrap samples/iterations,
                 see description above
    sample_size_frac: in each bootstrap sample, the number
                      of rows sampled is this fraction of
                      the actual number of rows in df
                      
    Outputs
    
    mean: mean of the bootstrap values.  Should be close to
          func(df) if num_samples is large enough.
    std: standard deviation of the bootstrap values. This is
         an estimate of the error (due to finite sample size)
         of func(df).
    """
    N = df.shape[0]
    sample_size = int(N*sample_size_frac)
    bootstrap_values = [func(df.iloc[np.random.randint(N, size=sample_size)])
                        for _ in range(num_samples)]
    return np.mean(bootstrap_values), np.std(bootstrap_values)

In [5]:
def main(datadir, num_boot_samples, bands=None):
    """
    For each HOG configuration, average scores from different rotations and
    output metrics: precision, recall, AUC, and standard deviation of the AUC
    from the bootstrap analysis.  Details of the bootstrap analysis described
    in the bootstrap function.
    
    Inputs
    
    datadir: directory name in which there are subdirectories corresponding
             to different HOG configurations
    num_boot_samples: number of bootstrap samples to create in the bootstrap
                      analysis (see bootstrap function)
    bands: list of bands to analyze separately.  If None, don't separate out
           bands.
                      
    Output
    
    pd.DataFrame['HOG_config', 'Precision', 'Recall', 'AUC',
                 'AUC_boot_avg', 'AUC_boot_std']
                 
    OR
    
    pd.DataFrame['HOG_config', 'Band', 'Precision', 'Recall', 'AUC',
                 'AUC_boot_avg', 'AUC_boot_std']
    """
    data = get_data(datadir)
    columns = ['HOG_config',
               'Precision',
               'Recall',
               'AUC',
               'AUC_boot_avg',
               'AUC_boot_std']
    if bands is not None:
        columns = columns[:1] + ['Band'] + columns[1:]
    output = {k: [] for k in columns}

    for hogname, dataframes in data.iteritems():
        scores_all_bands = get_average_scores(dataframes)
        if bands is not None:
            scores_all_bands['band'] = scores_all_bands['filename'].apply(lambda s: s.split('_')[2])
        # filter filenames further here if needed
        for band in (bands if bands is not None else ['']):
            if bands is not None:
                scores = scores_all_bands[scores_all_bands['band'] == band]
                output['Band'].append(band)
            else:
                scores = scores_all_bands
            output['HOG_config'].append(hogname)
            output['Precision'].append(precision_score(scores['label'], scores['score'] > 0.5))
            output['Recall'].append(recall_score(scores['label'], scores['score'] > 0.5))
            output['AUC'].append(roc_auc_score(scores['label'], scores['score']))
            boot_avg, boot_std = bootstrap(scores, lambda sc: roc_auc_score(sc['label'], sc['score']),
                                           num_boot_samples)
            output['AUC_boot_avg'].append(boot_avg)
            output['AUC_boot_std'].append(boot_std)
    
    return pd.DataFrame(output)[columns]

In [ ]:

Test on Mock


In [182]:
main('/path/to/data/directory', 10000)


/Users/f566998/anaconda/lib/python2.7/site-packages/ipykernel_launcher.py:5: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support sep=None with delim_whitespace=False; you can avoid this warning by specifying engine='python'.
  """
Out[182]:
HOG_config Precision Recall AUC AUC_boot_avg AUC_boot_std
0 ppc16cpb1 0.580585 0.933 0.837152 0.837301 0.008851
1 ppc16cpb3 0.698332 0.963 0.947882 0.947874 0.004986
2 ppc10cpb3 0.729282 0.924 0.925003 0.925028 0.005971
3 ppc6cpb3 0.713396 0.916 0.901193 0.901253 0.006918
4 ppc8cpb4 0.731240 0.955 0.942853 0.942810 0.005231
5 ppc12cpb3 0.723715 0.943 0.932294 0.932276 0.005629
6 ppc8cpb2 0.670968 0.936 0.907659 0.907594 0.006544
7 ppc8cpb3 0.711940 0.954 0.932361 0.932293 0.005674

Test on SLACS


In [6]:
main('/path/to/data/directory', 10000)


/Users/f566998/anaconda/lib/python2.7/site-packages/ipykernel_launcher.py:21: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support sep=None with delim_whitespace=False; you can avoid this warning by specifying engine='python'.
Out[6]:
HOG_config Precision Recall AUC AUC_boot_avg AUC_boot_std
0 ppc16cpb1 0.781818 0.710744 0.508855 0.509372 0.055635
1 ppc16cpb3 0.797872 0.619835 0.579693 0.579594 0.051548
2 ppc10cpb3 0.787234 0.611570 0.565998 0.566449 0.052739
3 ppc6cpb3 0.740741 0.495868 0.469658 0.469583 0.052430
4 ppc8cpb4 0.795181 0.545455 0.573554 0.574829 0.051678
5 ppc12cpb3 0.817073 0.553719 0.611806 0.611472 0.051830
6 ppc8cpb2 0.780952 0.677686 0.540260 0.540837 0.053210
7 ppc8cpb3 0.776596 0.603306 0.566234 0.566288 0.052566

Test on SLACS separating out different bands


In [7]:
main('/path/to/data/directory', 10000, bands=['435', '814'])


/Users/f566998/anaconda/lib/python2.7/site-packages/ipykernel_launcher.py:21: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support sep=None with delim_whitespace=False; you can avoid this warning by specifying engine='python'.
Out[7]:
HOG_config Band Precision Recall AUC AUC_boot_avg AUC_boot_std
0 ppc16cpb1 435 0.666667 0.500000 0.474359 0.475056 0.108194
1 ppc16cpb1 814 0.769231 0.750000 0.480682 0.480990 0.066764
2 ppc16cpb3 435 0.400000 0.083333 0.439103 0.437226 0.115580
3 ppc16cpb3 814 0.786667 0.737500 0.575568 0.574227 0.063030
4 ppc10cpb3 435 0.500000 0.208333 0.432692 0.432912 0.111085
5 ppc10cpb3 814 0.788732 0.700000 0.548295 0.547827 0.067872
6 ppc6cpb3 435 0.333333 0.125000 0.310897 0.312192 0.098854
7 ppc6cpb3 814 0.758065 0.587500 0.470455 0.471318 0.066270
8 ppc8cpb4 435 0.333333 0.083333 0.423077 0.422859 0.117460
9 ppc8cpb4 814 0.800000 0.650000 0.578409 0.578828 0.063724
10 ppc12cpb3 435 0.333333 0.041667 0.503205 0.505153 0.106680
11 ppc12cpb3 814 0.808824 0.687500 0.606818 0.607309 0.065499
12 ppc8cpb2 435 0.562500 0.375000 0.416667 0.417233 0.107811
13 ppc8cpb2 814 0.789474 0.750000 0.543182 0.542849 0.065817
14 ppc8cpb3 435 0.538462 0.291667 0.426282 0.425760 0.113388
15 ppc8cpb3 814 0.779412 0.662500 0.572727 0.572643 0.065960

In [ ]: