Softmax classification: Multinomial classification
Logistic regression review
Multinomial classification
$ \begin{bmatrix} w_{A1} & w_{A2} & w_{A3} \\ w_{B1} & w_{B2} & w_{B3} \\ w_{C1} & w_{C2} & w_{C3} \end{bmatrix} \times \begin{bmatrix} x_{1}\\ x_{2}\\ x_{3} \end{bmatrix} = \begin{bmatrix} w_{A1}x_{1} + w_{A2}x_{2} + w_{A3}x_{3} \\ w_{B1}x_{1} + w_{B2}x_{2} + w_{B3}x_{3} \\ w_{C1}x_{1} + w_{C2}x_{2} + w_{C3}x_{3} \end{bmatrix} = \begin{bmatrix} \bar{y}_{A}\\ \bar{y}_{B}\\ \bar{y}_{C} \end{bmatrix} $
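The matrix product above can be checked with a small NumPy sketch; the weight and input values below are made-up examples, not from the lecture:

```python
import numpy as np

# Hypothetical weights: row r holds (w_r1, w_r2, w_r3) for class r in (A, B, C)
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
x = np.array([1.0, 2.0, 3.0])  # made-up input (x1, x2, x3)

y_bar = W @ x  # one raw score per class: (y_A, y_B, y_C)
print(y_bar)   # [14. 32. 50.]
```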
Sigmoid
$ WX = \bar{y}: \begin{bmatrix} 2.0\\ 1.0\\ 0.1 \end{bmatrix} \xrightarrow{\text{softmax}} \begin{bmatrix} 0.7\\ 0.2\\ 0.1 \end{bmatrix} \xrightarrow{\text{one-hot}} \begin{bmatrix} 1.0\\ 0.0\\ 0.0 \end{bmatrix} $
Softmax
$S(y_i) = \frac{e^{y_i}}{\sum_{j}e^{y_j}}$
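A minimal NumPy version of this formula, applied to the example scores above (2.0, 1.0, 0.1):

```python
import numpy as np

def softmax(y):
    # Subtract the max score for numerical stability; the result is unchanged
    e = np.exp(y - np.max(y))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)        # roughly [0.66 0.24 0.10], close to the 0.7/0.2/0.1 above
print(probs.sum())  # 1.0 -- softmax outputs form a probability distribution
```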
Cost function
$D(S,L) = -\sum_{i}L_{i}\log (S_{i}) = -\sum_{i}L_{i}\log (\bar{y}_{i}) = \sum_{i}L_{i} \cdot \big(-\log(\bar{y}_{i})\big) $
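A quick numeric check with made-up values: if the one-hot label picks class A and the softmax output assigns it probability 0.659, only the true class's term survives and the cross-entropy reduces to $-\log(0.659)$:

```python
import numpy as np

L = np.array([1.0, 0.0, 0.0])            # one-hot label: class A
y_bar = np.array([0.659, 0.242, 0.099])  # example softmax output
D = -np.sum(L * np.log(y_bar))
print(D)  # ~0.417, i.e. -log(0.659); the other terms are zeroed by the label
```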
Homework: the logistic cost function and the cross-entropy cost function turn out to be the same — think about why.
$C(H(x), y) = -y\log(H(x)) - (1 - y)\log(1 - H(x))$
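One way to see the homework claim, sketched numerically with made-up values: a two-class cross-entropy over the label vector $(y, 1-y)$ and prediction vector $(H, 1-H)$ expands to exactly the logistic cost.

```python
import numpy as np

H, y = 0.8, 1.0  # made-up prediction and label

# Logistic cost
logistic = -y * np.log(H) - (1 - y) * np.log(1 - H)

# Two-class cross-entropy: L = (y, 1-y), S = (H, 1-H)
L = np.array([y, 1 - y])
S = np.array([H, 1 - H])
cross_entropy = -np.sum(L * np.log(S))

print(logistic, cross_entropy)  # both ~0.223, i.e. -log(0.8)
```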
Gradient descent
$-\alpha \nabla C(w_1, w_2)$
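One update step, sketched on a toy cost $C(w_1, w_2) = w_1^2 + w_2^2$ (an assumption for illustration; the cost in this lecture is the cross-entropy above):

```python
import numpy as np

def grad_C(w):
    # Gradient of the toy cost C(w1, w2) = w1**2 + w2**2
    return 2 * w

w = np.array([1.0, -2.0])
alpha = 0.1
w = w - alpha * grad_C(w)  # step against the gradient, scaled by alpha
print(w)  # [ 0.8 -1.6]: both weights move toward the minimum at (0, 0)
```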
import tensorflow as tf
import numpy as np
import os
print(os.getcwd())

xy = np.loadtxt('softmax.in', unpack=True, dtype='float')
x_data = np.transpose(xy[0:3])
y_data = np.transpose(xy[3:])

# Input placeholders: 3 features, 3 classes
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])

# Set model weights
W = tf.Variable(tf.zeros([3, 3]))

# Construct model.
# The matmul order is X * W (not W * X), which is why x_data is
# transposed above so that each sample is a row.
hypothesis = tf.nn.softmax(tf.matmul(X, W))

# Learning rate
learning_rate = 0.001

# Cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), reduction_indices=1))

# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Init
init = tf.global_variables_initializer()

# Launch
with tf.Session() as sess:
    sess.run(init)
    for step in range(2001):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))

    # Test: argmax over the softmax output gives the predicted class index
    a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7]]})
    print(a, sess.run(tf.argmax(a, 1)))

    b = sess.run(hypothesis, feed_dict={X: [[1, 3, 4]]})
    print(b, sess.run(tf.argmax(b, 1)))

    c = sess.run(hypothesis, feed_dict={X: [[1, 1, 0]]})
    print(c, sess.run(tf.argmax(c, 1)))

    all_preds = sess.run(hypothesis, feed_dict={X: [[1, 11, 7], [1, 3, 4], [1, 1, 0]]})
    print(all_preds, sess.run(tf.argmax(all_preds, 1)))