In [5]:
%matplotlib qt
import pandas as pd
import numpy as np
import re
import mia

In [6]:
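# DataFrame.from_csv was removed from pandas; read_csv with index_col=0 keeps the image name as the index
hologic = pd.read_csv('../2015-03-28-real-texture.csv', index_col=0)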
hologic.head()


Out[6]:
                          contrast  dissimilarity     homogeneity        energy
p214-010-60001-cl.png  120768596.0     8260433.75  7615467.955415  52137.295714
p214-010-60001-cr.png   96740019.0     6551713.75  7780004.100804  48991.829472
p214-010-60001-ml.png  150263119.0     9750372.00  7490687.595096  43423.847163
p214-010-60001-mr.png  155084857.5     9615812.75  7532493.632325  37962.063566
p214-010-60005-cl.png  109416700.5     8403287.50  7470991.621762  55686.265538

In [7]:
hologic_meta = mia.analysis.create_hologic_meta_data(hologic, '../data/BIRADS.csv')
hologic_meta.head()


Out[7]:
                        patient_id  side  view               img_name  BIRADS  img_number
p214-010-60001-cl.png  21401060001     c     l  p214-010-60001-cl.png       3           1
p214-010-60001-cr.png  21401060001     c     r  p214-010-60001-cr.png       3           1
p214-010-60001-ml.png  21401060001     m     l  p214-010-60001-ml.png       3           1
p214-010-60001-mr.png  21401060001     m     r  p214-010-60001-mr.png       3           1
p214-010-60005-cl.png  21401060005     c     l  p214-010-60005-cl.png       4           5
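
The patient_id, side and view columns above look like they are derived from the image file name. How mia.analysis.create_hologic_meta_data actually does this is not shown here, so the following is only a sketch that reproduces the displayed columns; the regular expression and the helper name parse_image_name are hypothetical.

```python
import re

# Hypothetical pattern inferred from the names shown above:
# "p<digits>-<digits>-<digits>-<side><view>.png"
IMG_NAME_RE = re.compile(r"^p(\d+)-(\d+)-(\d+)-([a-z])([a-z])\.png$")


def parse_image_name(img_name):
    """Split an image file name into the fields shown in the metadata table."""
    match = IMG_NAME_RE.match(img_name)
    if match is None:
        raise ValueError("unexpected image name: %r" % img_name)
    # The first three groups are the digit blocks, the last two are single letters.
    *id_parts, side, view = match.groups()
    return {
        "patient_id": int("".join(id_parts)),  # digit groups concatenated
        "side": side,
        "view": view,
    }


print(parse_image_name("p214-010-60001-cl.png"))
# {'patient_id': 21401060001, 'side': 'c', 'view': 'l'}
```

The BIRADS and img_number columns cannot be recovered from the name and presumably come from ../data/BIRADS.csv.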

Real Image Analysis


In [8]:
mapping = mia.analysis.tSNE(hologic, n_components=2, verbose=2, learning_rate=300)


[t-SNE] Computing pairwise distances...
[t-SNE] Computed conditional probabilities for sample 360 / 360
[t-SNE] Mean sigma: 0.304437
[t-SNE] Iteration 10: error = 16.1901435, gradient norm = 0.1620562
[t-SNE] Iteration 20: error = 12.9699795, gradient norm = 0.1491592
[t-SNE] Iteration 30: error = 13.1406574, gradient norm = 0.1355151
[t-SNE] Iteration 40: error = 12.3628987, gradient norm = 0.1413517
[t-SNE] Iteration 50: error = 12.4745548, gradient norm = 0.1306971
[t-SNE] Iteration 60: error = 11.4025488, gradient norm = 0.1548845
[t-SNE] Iteration 70: error = 11.2954892, gradient norm = 0.1524526
[t-SNE] Iteration 80: error = 11.3902441, gradient norm = 0.1377847
[t-SNE] Iteration 83: did not make any progress during the last 30 episodes. Finished.
[t-SNE] Error after 83 iterations with early exaggeration: 11.566643
[t-SNE] Iteration 90: error = 0.6098006, gradient norm = 0.0209932
[t-SNE] Iteration 100: error = 0.3526824, gradient norm = 0.0085577
[t-SNE] Iteration 110: error = 0.3127755, gradient norm = 0.0030294
[t-SNE] Iteration 120: error = 0.3029141, gradient norm = 0.0011162
[t-SNE] Iteration 130: error = 0.2984669, gradient norm = 0.0007474
[t-SNE] Iteration 140: error = 0.2959664, gradient norm = 0.0006636
[t-SNE] Iteration 150: error = 0.2945749, gradient norm = 0.0006405
[t-SNE] Iteration 160: error = 0.2937845, gradient norm = 0.0006263
[t-SNE] Iteration 170: error = 0.2933301, gradient norm = 0.0006197
[t-SNE] Iteration 180: error = 0.2930650, gradient norm = 0.0006158
[t-SNE] Iteration 190: error = 0.2929085, gradient norm = 0.0006136
[t-SNE] Iteration 200: error = 0.2928156, gradient norm = 0.0006123
[t-SNE] Iteration 210: error = 0.2927602, gradient norm = 0.0006116
[t-SNE] Iteration 220: error = 0.2927271, gradient norm = 0.0006111
[t-SNE] Iteration 230: error = 0.2927074, gradient norm = 0.0006109
[t-SNE] Iteration 240: error = 0.2926956, gradient norm = 0.0006107
[t-SNE] Iteration 250: error = 0.2926885, gradient norm = 0.0006106
[t-SNE] Iteration 260: error = 0.2926842, gradient norm = 0.0006106
[t-SNE] Iteration 270: error = 0.2926817, gradient norm = 0.0006105
[t-SNE] Iteration 280: error = 0.2926802, gradient norm = 0.0006105
[t-SNE] Iteration 284: error difference 0.000000. Finished.
[t-SNE] Error after 284 iterations: 0.292680
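
The log format above matches scikit-learn's verbose t-SNE output, so mia.analysis.tSNE may be a thin wrapper around sklearn.manifold.TSNE. That is an assumption, as is the standardisation step below: the raw columns differ by several orders of magnitude, and a mean sigma of about 0.3 suggests the features were rescaled somewhere before the embedding. A minimal sketch on synthetic data (not the real texture table):

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the texture table: 100 images x 4 features,
# with column scales that roughly mimic contrast, dissimilarity,
# homogeneity and energy.
rng = np.random.RandomState(0)
features = rng.normal(size=(100, 4)) * [1e8, 1e7, 1e7, 1e4]

# Assumption: standardise first so no single column dominates the distances.
scaled = StandardScaler().fit_transform(features)

# Same n_components and learning_rate as the call above.
tsne = TSNE(n_components=2, learning_rate=300.0, random_state=0)
embedding = tsne.fit_transform(scaled)

print(embedding.shape)  # one 2-D point per image
```

Fixing random_state makes the layout repeatable between runs; without it, t-SNE can produce a visibly different arrangement each time.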

In [9]:
mia.plotting.plot_scatter_2d(mapping, [0,1], hologic_meta.BIRADS)


Out[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x112147990>

We can see from the scatter matrix (plotted in the next cell) that the data again splits along the homogeneity attribute.


In [11]:
h = hologic.copy()
h['BIRADS'] = hologic_meta.BIRADS
mia.plotting.plot_scattermatrix(h, 'BIRADS')
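
The split on homogeneity noted above can also be checked numerically by summarising that column per BIRADS class. The sketch below uses only the five rows displayed in Out[6] and Out[7], which is far too few to draw any conclusion; it just shows the shape of the check.

```python
import pandas as pd

# The five rows displayed in Out[6], with BIRADS taken from Out[7].
sample = pd.DataFrame(
    {
        "homogeneity": [
            7615467.955415,
            7780004.100804,
            7490687.595096,
            7532493.632325,
            7470991.621762,
        ],
        "BIRADS": [3, 3, 3, 3, 4],
    },
    index=[
        "p214-010-60001-cl.png",
        "p214-010-60001-cr.png",
        "p214-010-60001-ml.png",
        "p214-010-60001-mr.png",
        "p214-010-60005-cl.png",
    ],
)

# Per-class summary of the feature the scatter matrix appears to split on.
summary = sample.groupby("BIRADS")["homogeneity"].agg(["count", "mean", "min", "max"])
print(summary)
```

On the full data set the same groupby could be applied to h directly, since h already carries the BIRADS column.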