In [1]:
%matplotlib inline
In [2]:
import numpy as np
import matplotlib.pyplot as plt
import keras.backend as K
from keras.layers import Input, Dense
from keras.models import Sequential, Model
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_blobs
In [3]:
'TensorFlow version: ' + K.tf.__version__
Out[3]:
In [4]:
n_samples = 100
n_features = 2
n_classes = 2
seed = 42
rng = np.random.RandomState(seed)
In [5]:
x_test, y_test = make_blobs(n_samples=n_samples, centers=n_classes, random_state=rng)
In [6]:
# class labels are balanced
np.sum(y_test)
Out[6]:
In [7]:
fig, ax = plt.subplots(figsize=(7, 5))
cb = ax.scatter(*x_test.T, c=y_test, cmap='coolwarm')
fig.colorbar(cb, ax=ax)
ax.set_xlabel('$x_1$')
ax.set_ylabel('$x_2$')
plt.show()
In [8]:
classifier = Sequential([
    Dense(16, input_dim=n_features, activation='relu'),
    Dense(32, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy')
In [9]:
loss = classifier.evaluate(x_test, y_test)
loss
Out[9]:
An alternative specification isolates the positive and negative samples and explicitly requires a separate input for each class. The loss that is optimized is then the sum of the binary cross-entropy losses for the two classes. Since the corresponding target labels are fixed to all ones or all zeros, each per-class binary cross-entropy reduces to one of the two complementary terms, and averaging them recovers the usual binary cross-entropy loss over all samples with their true labels.
This trick is crucial for many model specifications in keras-adversarial.
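To see this numerically, here is a minimal sketch (not part of the original notebook) with hypothetical predicted probabilities: the mean of the two fixed-target binary cross-entropy losses equals the ordinary binary cross-entropy over all samples with their true labels, assuming the classes are balanced.

import numpy as np

# hypothetical predictions for positive and negative samples (illustration only)
p_pos = np.array([0.9, 0.8])
p_neg = np.array([0.3, 0.1])

bce_pos = -np.mean(np.log(p_pos))      # targets fixed to all ones
bce_neg = -np.mean(np.log(1 - p_neg))  # targets fixed to all zeros

# ordinary binary cross-entropy over all samples with their true labels
p_all = np.concatenate([p_pos, p_neg])
y_all = np.array([1, 1, 0, 0])
bce_all = -np.mean(y_all * np.log(p_all) + (1 - y_all) * np.log(1 - p_all))

assert np.isclose(0.5 * (bce_pos + bce_neg), bce_all)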
In [10]:
pos = Input(shape=(n_features,))
neg = Input(shape=(n_features,))
# make use of the classifier defined earlier
y_pred_pos = classifier(pos)
y_pred_neg = classifier(neg)
# define a multi-input, multi-output model
classifier_alt = Model([pos, neg], [y_pred_pos, y_pred_neg])
classifier_alt.compile(optimizer='rmsprop', loss='binary_crossentropy')
In [11]:
losses = classifier_alt.evaluate(
    [
        x_test[y_test == 1],
        x_test[y_test == 0]
    ],
    [
        np.ones(n_samples // 2),
        np.zeros(n_samples // 2)
    ]
)
losses
Out[11]:
The loss that actually gets optimized is the first value above, which is the sum of the subsequent values. Their mean is the required binary cross-entropy loss.
In [12]:
.5 * losses[0]
Out[12]:
In [13]:
# alternatively
np.mean(losses[1:])
Out[13]:
In [14]:
np.allclose(loss, np.mean(losses[1:]))
Out[14]:
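As a final sanity check (not in the original run), the combined loss reported by the multi-output model should also equal the sum of the two per-output losses, since Keras sums per-output losses when no loss weights are given:

np.allclose(losses[0], np.sum(losses[1:]))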