Image Captioning with LSTMs

In the previous exercise you implemented a vanilla RNN and applied it to image captioning. In this notebook you will implement the LSTM update rule and use it for image captioning.


In [1]:
# As usual, a bit of setup

import time, os, json
import numpy as np
import matplotlib.pyplot as plt

from cs231n.gradient_check import eval_numerical_gradient, eval_numerical_gradient_array
from cs231n.rnn_layers import *
from cs231n.captioning_solver import CaptioningSolver
from cs231n.classifiers.rnn import CaptioningRNN
from cs231n.coco_utils import load_coco_data, sample_coco_minibatch, decode_captions
from cs231n.image_utils import image_from_url

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

def rel_error(x, y):
  """ returns relative error """
  return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

Load MS-COCO data

As in the previous notebook, we will use the Microsoft COCO dataset for captioning.


In [2]:
# Load COCO data from disk; this returns a dictionary
# We'll work with dimensionality-reduced features for this notebook, but feel
# free to experiment with the original features by changing the flag below.
data = load_coco_data(pca_features=True)

# Print out all the keys and values from the data dictionary
for k, v in data.iteritems():
  if type(v) == np.ndarray:
    print k, type(v), v.shape, v.dtype
  else:
    print k, type(v), len(v)


idx_to_word <type 'list'> 1004
train_captions <type 'numpy.ndarray'> (400135, 17) int32
val_captions <type 'numpy.ndarray'> (195954, 17) int32
train_image_idxs <type 'numpy.ndarray'> (400135,) int32
val_features <type 'numpy.ndarray'> (40504, 512) float32
val_image_idxs <type 'numpy.ndarray'> (195954,) int32
train_features <type 'numpy.ndarray'> (82783, 512) float32
train_urls <type 'numpy.ndarray'> (82783,) |S63
val_urls <type 'numpy.ndarray'> (40504,) |S63
word_to_idx <type 'dict'> 1004

LSTM

If you read recent papers, you'll see that many people use a variant of the vanilla RNN called the Long Short-Term Memory (LSTM) RNN. Vanilla RNNs can be tough to train on long sequences due to vanishing and exploding gradients caused by repeated matrix multiplication. LSTMs mitigate this problem by replacing the simple update rule of the vanilla RNN with a gating mechanism as follows.
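
To make the "repeated matrix multiplication" point concrete, here is a toy sketch (not part of the assignment code). It multiplies a unit vector by the same matrix many times, which is roughly what happens to a gradient flowing back through a linear recurrence. The helper name `repeated_product_norms` and the choice of a scaled orthogonal matrix are assumptions made purely for illustration; a real RNN also has nonlinearities in the loop.

```python
import numpy as np

def repeated_product_norms(scale, steps=50, dim=8, seed=0):
    """Norm of a unit vector after each of `steps` multiplications by W = scale * Q.

    Hypothetical helper for illustration only.
    """
    rng = np.random.RandomState(seed)
    # Q is orthogonal, so multiplying by Q alone preserves the norm;
    # `scale` therefore sets every singular value of W.
    Q, _ = np.linalg.qr(rng.randn(dim, dim))
    W = scale * Q
    v = rng.randn(dim)
    v /= np.linalg.norm(v)
    norms = []
    for _ in range(steps):
        v = W.dot(v)
        norms.append(np.linalg.norm(v))
    return np.array(norms)

shrinking = repeated_product_norms(scale=0.9)  # norm decays like 0.9 ** t
growing = repeated_product_norms(scale=1.1)    # norm grows like 1.1 ** t
```

With singular values below 1 the signal decays geometrically, and above 1 it grows geometrically. This is the behaviour that the additive cell-state update of the LSTM is designed to soften.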

Similar to the vanilla RNN, at each timestep we receive an input $x_t\in\mathbb{R}^D$ and the previous hidden state $h_{t-1}\in\mathbb{R}^H$; the LSTM also maintains an $H$-dimensional cell state, so we also receive the previous cell state $c_{t-1}\in\mathbb{R}^H$. The learnable parameters of the LSTM are an input-to-hidden matrix $W_x\in\mathbb{R}^{4H\times D}$, a hidden-to-hidden matrix $W_h\in\mathbb{R}^{4H\times H}$ and a bias vector $b\in\mathbb{R}^{4H}$.

At each timestep we first compute an activation vector $a\in\mathbb{R}^{4H}$ as $a=W_xx_t + W_hh_{t-1}+b$. We then divide this into four vectors $a_i,a_f,a_o,a_g\in\mathbb{R}^H$ where $a_i$ consists of the first $H$ elements of $a$, $a_f$ is the next $H$ elements of $a$, etc. We then compute the input gate $i\in\mathbb{R}^H$, forget gate $f\in\mathbb{R}^H$, output gate $o\in\mathbb{R}^H$ and block input $g\in\mathbb{R}^H$ as

$$ \begin{align*} i = \sigma(a_i) \hspace{2pc} f = \sigma(a_f) \hspace{2pc} o = \sigma(a_o) \hspace{2pc} g = \tanh(a_g) \end{align*} $$

where $\sigma$ is the sigmoid function and $\tanh$ is the hyperbolic tangent, both applied elementwise.

Finally we compute the next cell state $c_t$ and next hidden state $h_t$ as

$$ c_{t} = f\odot c_{t-1} + i\odot g \hspace{4pc} h_t = o\odot\tanh(c_t) $$

where $\odot$ is the elementwise product of vectors.
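
The update rule above can be sketched in NumPy as follows. This is a minimal sketch, not the reference solution: the names `sigmoid` and `lstm_step_sketch` are hypothetical, and no cache for the backward pass is returned. It assumes the minibatch layout used by the test cells in this notebook, where `x` has shape (N, D) and the weights are stored transposed relative to the equations, i.e. `Wx` has shape (D, 4H) and `Wh` has shape (H, 4H).

```python
import numpy as np

def sigmoid(z):
    """Elementwise logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_sketch(x, prev_h, prev_c, Wx, Wh, b):
    """One LSTM timestep for a minibatch (hypothetical helper).

    x: (N, D), prev_h and prev_c: (N, H),
    Wx: (D, 4H), Wh: (H, 4H), b: (4H,)
    """
    H = prev_h.shape[1]
    # Activation vector a, one row per example: shape (N, 4H)
    a = x.dot(Wx) + prev_h.dot(Wh) + b
    # Split a into the four blocks a_i, a_f, a_o, a_g, in that order
    i = sigmoid(a[:, 0:H])            # input gate
    f = sigmoid(a[:, H:2 * H])        # forget gate
    o = sigmoid(a[:, 2 * H:3 * H])    # output gate
    g = np.tanh(a[:, 3 * H:4 * H])    # block input
    next_c = f * prev_c + i * g
    next_h = o * np.tanh(next_c)
    return next_h, next_c
```

A quick sanity check: with all weights and biases set to zero, every gate equals 0.5 and the block input is 0, so the cell state is simply halved at each step.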

In the rest of the notebook we will implement the LSTM update rule and apply it to the image captioning task.

LSTM: step forward

Implement the forward pass for a single timestep of an LSTM in the lstm_step_forward function in the file cs231n/rnn_layers.py. This should be similar to the rnn_step_forward function that you implemented in the previous notebook, but using the LSTM update rule instead.

Once you are done, run the following to perform a simple test of your implementation. You should see errors around 1e-8 or less.


In [8]:
N, D, H = 3, 4, 5
x = np.linspace(-0.4, 1.2, num=N*D).reshape(N, D)
prev_h = np.linspace(-0.3, 0.7, num=N*H).reshape(N, H)
prev_c = np.linspace(-0.4, 0.9, num=N*H).reshape(N, H)
Wx = np.linspace(-2.1, 1.3, num=4*D*H).reshape(D, 4 * H)
Wh = np.linspace(-0.7, 2.2, num=4*H*H).reshape(H, 4 * H)
b = np.linspace(0.3, 0.7, num=4*H)

next_h, next_c, cache = lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)

expected_next_h = np.asarray([
    [ 0.24635157,  0.28610883,  0.32240467,  0.35525807,  0.38474904],
    [ 0.49223563,  0.55611431,  0.61507696,  0.66844003,  0.7159181 ],
    [ 0.56735664,  0.66310127,  0.74419266,  0.80889665,  0.858299  ]])
expected_next_c = np.asarray([
    [ 0.32986176,  0.39145139,  0.451556,    0.51014116,  0.56717407],
    [ 0.66382255,  0.76674007,  0.87195994,  0.97902709,  1.08751345],
    [ 0.74192008,  0.90592151,  1.07717006,  1.25120233,  1.42395676]])

print 'next_h error: ', rel_error(expected_next_h, next_h)
print 'next_c error: ', rel_error(expected_next_c, next_c)


next_h error:  5.70541304045e-09
next_c error:  5.81431230888e-09

LSTM: step backward

Implement the backward pass for a single LSTM timestep in the function lstm_step_backward in the file cs231n/rnn_layers.py. Once you are done, run the following to perform numeric gradient checking on your implementation. You should see errors around 1e-8 or less.
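
As a guide, the gradients for one timestep follow from the forward equations by the chain rule. The notation here is an assumption introduced for this sketch: $\delta h_t$ and $\delta c_t$ are the upstream gradients with respect to $h_t$ and $c_t$ (dnext_h and dnext_c in the code), and the formulas are written for a single example in the column-vector convention of the equations above. Since $c_t$ affects the loss both directly and through $h_t = o\odot\tanh(c_t)$, the total gradient on the cell state is

$$ \delta c = \delta c_t + \delta h_t\odot o\odot\left(1-\tanh^2(c_t)\right) $$

From $c_t = f\odot c_{t-1} + i\odot g$ and $h_t = o\odot\tanh(c_t)$ we get

$$ \delta o = \delta h_t\odot\tanh(c_t) \hspace{2pc} \delta f = \delta c\odot c_{t-1} \hspace{2pc} \delta i = \delta c\odot g \hspace{2pc} \delta g = \delta c\odot i \hspace{2pc} \delta c_{t-1} = \delta c\odot f $$

Passing back through the sigmoid and tanh nonlinearities gives

$$ \delta a_i = \delta i\odot i\odot(1-i) \hspace{2pc} \delta a_f = \delta f\odot f\odot(1-f) \hspace{2pc} \delta a_o = \delta o\odot o\odot(1-o) \hspace{2pc} \delta a_g = \delta g\odot(1-g^2) $$

Stacking these four blocks into $\delta a\in\mathbb{R}^{4H}$ and using $a=W_xx_t + W_hh_{t-1}+b$:

$$ \delta x_t = W_x^T\delta a \hspace{2pc} \delta h_{t-1} = W_h^T\delta a \hspace{2pc} \delta W_x = \delta a\,x_t^T \hspace{2pc} \delta W_h = \delta a\,h_{t-1}^T \hspace{2pc} \delta b = \delta a $$

For a minibatch, the parameter gradients are summed over the examples.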


In [10]:
N, D, H = 4, 5, 6
x = np.random.randn(N, D)
prev_h = np.random.randn(N, H)
prev_c = np.random.randn(N, H)
Wx = np.random.randn(D, 4 * H)
Wh = np.random.randn(H, 4 * H)
b = np.random.randn(4 * H)

next_h, next_c, cache = lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)

dnext_h = np.random.randn(*next_h.shape)
dnext_c = np.random.randn(*next_c.shape)

fx_h = lambda x: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[0]
fh_h = lambda h: lstm_step_forward(x, h, prev_c, Wx, Wh, b)[0]
fc_h = lambda c: lstm_step_forward(x, prev_h, c, Wx, Wh, b)[0]
fWx_h = lambda Wx: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[0]
fWh_h = lambda Wh: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[0]
fb_h = lambda b: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[0]

fx_c = lambda x: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[1]
fh_c = lambda h: lstm_step_forward(x, h, prev_c, Wx, Wh, b)[1]
fc_c = lambda c: lstm_step_forward(x, prev_h, c, Wx, Wh, b)[1]
fWx_c = lambda Wx: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[1]
fWh_c = lambda Wh: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[1]
fb_c = lambda b: lstm_step_forward(x, prev_h, prev_c, Wx, Wh, b)[1]

num_grad = eval_numerical_gradient_array

dx_num = num_grad(fx_h, x, dnext_h) + num_grad(fx_c, x, dnext_c)
dh_num = num_grad(fh_h, prev_h, dnext_h) + num_grad(fh_c, prev_h, dnext_c)
dc_num = num_grad(fc_h, prev_c, dnext_h) + num_grad(fc_c, prev_c, dnext_c)
dWx_num = num_grad(fWx_h, Wx, dnext_h) + num_grad(fWx_c, Wx, dnext_c)
dWh_num = num_grad(fWh_h, Wh, dnext_h) + num_grad(fWh_c, Wh, dnext_c)
db_num = num_grad(fb_h, b, dnext_h) + num_grad(fb_c, b, dnext_c)

dx, dh, dc, dWx, dWh, db = lstm_step_backward(dnext_h, dnext_c, cache)

print 'dx error: ', rel_error(dx_num, dx)
print 'dh error: ', rel_error(dh_num, dh)
print 'dc error: ', rel_error(dc_num, dc)
print 'dWx error: ', rel_error(dWx_num, dWx)
print 'dWh error: ', rel_error(dWh_num, dWh)
print 'db error: ', rel_error(db_num, db)


dx error:  4.45839572389e-10
dh error:  1.38987651018e-10
dc error:  3.74918879842e-11
dWx error:  1.65772018958e-09
dWh error:  2.22204898907e-09
db error:  8.19363645669e-09

LSTM: forward

Implement the lstm_forward function in the file cs231n/rnn_layers.py to run an LSTM forward over an entire timeseries of data.

When you are done, run the following to check your implementation. You should see an error around 1e-7.
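
Running the LSTM over a sequence amounts to applying the single-step update once per timestep while carrying the hidden and cell states along. Below is a minimal sketch under stated assumptions: the helper names are hypothetical, the step function is repeated so that the block is self-contained, nothing is cached for the backward pass, and the initial cell state is taken to be zero because the call in the cell below passes only `h0`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_sketch(x, prev_h, prev_c, Wx, Wh, b):
    """Single LSTM timestep; x: (N, D), states: (N, H), Wx: (D, 4H), Wh: (H, 4H)."""
    H = prev_h.shape[1]
    a = x.dot(Wx) + prev_h.dot(Wh) + b
    i = sigmoid(a[:, 0:H])
    f = sigmoid(a[:, H:2 * H])
    o = sigmoid(a[:, 2 * H:3 * H])
    g = np.tanh(a[:, 3 * H:4 * H])
    next_c = f * prev_c + i * g
    return o * np.tanh(next_c), next_c

def lstm_forward_sketch(x, h0, Wx, Wh, b):
    """Hidden states for every timestep; x: (N, T, D), returns (N, T, H)."""
    N, T, D = x.shape
    H = h0.shape[1]
    h = np.zeros((N, T, H))
    prev_h = h0
    prev_c = np.zeros_like(h0)  # assumption: the cell state starts at zero
    for t in range(T):
        prev_h, prev_c = lstm_step_sketch(x[:, t, :], prev_h, prev_c, Wx, Wh, b)
        h[:, t, :] = prev_h
    return h
```

Only the hidden states are returned. The cell state is internal to the LSTM and is never passed to the layers that follow.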


In [12]:
N, D, H, T = 2, 5, 4, 3
x = np.linspace(-0.4, 0.6, num=N*T*D).reshape(N, T, D)
h0 = np.linspace(-0.4, 0.8, num=N*H).reshape(N, H)
Wx = np.linspace(-0.2, 0.9, num=4*D*H).reshape(D, 4 * H)
Wh = np.linspace(-0.3, 0.6, num=4*H*H).reshape(H, 4 * H)
b = np.linspace(0.2, 0.7, num=4*H)

h, cache = lstm_forward(x, h0, Wx, Wh, b)

expected_h = np.asarray([
 [[ 0.01764008,  0.01823233,  0.01882671,  0.0194232 ],
  [ 0.11287491,  0.12146228,  0.13018446,  0.13902939],
  [ 0.31358768,  0.33338627,  0.35304453,  0.37250975]],
 [[ 0.45767879,  0.4761092,   0.4936887,   0.51041945],
  [ 0.6704845,   0.69350089,  0.71486014,  0.7346449 ],
  [ 0.81733511,  0.83677871,  0.85403753,  0.86935314]]])

print 'h error: ', rel_error(expected_h, h)


h error:  8.61053745211e-08

LSTM: backward

Implement the backward pass for an LSTM over an entire timeseries of data in the function lstm_backward in the file cs231n/rnn_layers.py. When you are done, run the following to perform numeric gradient checking on your implementation. You should see errors around 1e-8 or less.
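
Numeric gradient checking compares an analytic gradient against a centered finite-difference estimate. The sketch below shows the idea on a small tanh layer. The name `numeric_grad_array` is a hypothetical stand-in written for this illustration and is not the course's `eval_numerical_gradient_array`, whose exact implementation is not reproduced here.

```python
import numpy as np

def rel_error(x, y):
    """Maximum relative error, as defined in the setup cell."""
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

def numeric_grad_array(f, x, df, eps=1e-5):
    """Centered-difference estimate of d(sum(f(x) * df)) / dx.

    Perturbs x in place, one entry at a time, and restores it afterwards.
    """
    grad = np.zeros_like(x)
    for ix in np.ndindex(*x.shape):
        old = x[ix]
        x[ix] = old + eps
        pos = f(x).copy()
        x[ix] = old - eps
        neg = f(x).copy()
        x[ix] = old
        grad[ix] = np.sum((pos - neg) * df) / (2 * eps)
    return grad

rng = np.random.RandomState(1)
x = rng.randn(3, 4)
W = rng.randn(4, 5)
f = lambda x: np.tanh(x.dot(W))

out = f(x)
dout = rng.randn(*out.shape)
# Analytic gradient of tanh(x W) with respect to x
dx_analytic = (dout * (1 - out ** 2)).dot(W.T)
dx_num = numeric_grad_array(f, x, dout)
err = rel_error(dx_num, dx_analytic)
```

Because the helper perturbs the array in place, the function being checked must read that same array, which is why the lambdas in these cells close over the variables they are differentiated against.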


In [28]:
from cs231n.rnn_layers import lstm_forward, lstm_backward

N, D, T, H = 2, 3, 10, 6

x = np.random.randn(N, T, D)
h0 = np.random.randn(N, H)
Wx = np.random.randn(D, 4 * H)
Wh = np.random.randn(H, 4 * H)
b = np.random.randn(4 * H)

out, cache = lstm_forward(x, h0, Wx, Wh, b)

dout = np.random.randn(*out.shape)

dx, dh0, dWx, dWh, db = lstm_backward(dout, cache)

fx = lambda x: lstm_forward(x, h0, Wx, Wh, b)[0]
fh0 = lambda h0: lstm_forward(x, h0, Wx, Wh, b)[0]
fWx = lambda Wx: lstm_forward(x, h0, Wx, Wh, b)[0]
fWh = lambda Wh: lstm_forward(x, h0, Wx, Wh, b)[0]
fb = lambda b: lstm_forward(x, h0, Wx, Wh, b)[0]

dx_num = eval_numerical_gradient_array(fx, x, dout)
dh0_num = eval_numerical_gradient_array(fh0, h0, dout)
dWx_num = eval_numerical_gradient_array(fWx, Wx, dout)
dWh_num = eval_numerical_gradient_array(fWh, Wh, dout)
db_num = eval_numerical_gradient_array(fb, b, dout)

print 'dx error: ', rel_error(dx_num, dx)
print 'dh0 error: ', rel_error(dh0_num, dh0)
print 'dWx error: ', rel_error(dWx_num, dWx)
print 'dWh error: ', rel_error(dWh_num, dWh)
print 'db error: ', rel_error(db_num, db)


dx error:  1.30950486388e-09
dh0 error:  2.94495469347e-09
dWx error:  9.07014083141e-10
dWh error:  7.85200111904e-08
db error:  8.94957204976e-10

LSTM captioning model

Now that you have implemented an LSTM, update the implementation of the loss method of the CaptioningRNN class in the file cs231n/classifiers/rnn.py to handle the case where self.cell_type is lstm. This should require adding fewer than 10 lines of code.

Once you have done so, run the following to check your implementation. You should see a difference of less than 1e-10.


In [30]:
N, D, W, H = 10, 20, 30, 40
word_to_idx = {'<NULL>': 0, 'cat': 2, 'dog': 3}
V = len(word_to_idx)
T = 13

model = CaptioningRNN(word_to_idx,
          input_dim=D,
          wordvec_dim=W,
          hidden_dim=H,
          cell_type='lstm',
          dtype=np.float64)

# Set all model parameters to fixed values
for k, v in model.params.iteritems():
  model.params[k] = np.linspace(-1.4, 1.3, num=v.size).reshape(*v.shape)

features = np.linspace(-0.5, 1.7, num=N*D).reshape(N, D)
captions = (np.arange(N * T) % V).reshape(N, T)

loss, grads = model.loss(features, captions)
expected_loss = 9.82445935443

print 'loss: ', loss
print 'expected loss: ', expected_loss
print 'difference: ', abs(loss - expected_loss)


loss:  9.82445935443
expected loss:  9.82445935443
difference:  2.26840768391e-12

Overfit LSTM captioning model

Run the following to overfit an LSTM captioning model on the same small dataset as we used for the RNN in the previous notebook.


In [31]:
small_data = load_coco_data(max_train=50)

small_lstm_model = CaptioningRNN(
          cell_type='lstm',
          word_to_idx=data['word_to_idx'],
          input_dim=data['train_features'].shape[1],
          hidden_dim=512,
          wordvec_dim=256,
          dtype=np.float32,
        )

small_lstm_solver = CaptioningSolver(small_lstm_model, small_data,
           update_rule='adam',
           num_epochs=50,
           batch_size=25,
           optim_config={
             'learning_rate': 5e-3,
           },
           lr_decay=0.995,
           verbose=True, print_every=10,
         )

small_lstm_solver.train()

# Plot the training losses
plt.plot(small_lstm_solver.loss_history)
plt.xlabel('Iteration')
plt.ylabel('Loss')
plt.title('Training loss history')
plt.show()


(Iteration 1 / 100) loss: 72.956343
(Iteration 11 / 100) loss: 40.014447
(Iteration 21 / 100) loss: 21.908289
(Iteration 31 / 100) loss: 11.110767
(Iteration 41 / 100) loss: 4.216231
(Iteration 51 / 100) loss: 1.805185
(Iteration 61 / 100) loss: 0.883786
(Iteration 71 / 100) loss: 0.340748
(Iteration 81 / 100) loss: 0.154977
(Iteration 91 / 100) loss: 0.089899

LSTM test-time sampling

Modify the sample method of the CaptioningRNN class to handle the case where self.cell_type is lstm. This should take fewer than 10 lines of code.

When you are done, run the following to sample from your overfit LSTM model on some training and validation set samples.
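
At test time the LSTM is unrolled one step at a time, feeding the word chosen at each step back in as the next input. The sketch below illustrates greedy decoding under stated assumptions. Every name in it (`greedy_sample_sketch`, `W_embed`, `W_vocab`, `b_vocab`, `start_idx`) is hypothetical and does not claim to match the parameter names inside CaptioningRNN. The initial hidden state `h0` is assumed to have been computed from the image features already, and the initial cell state is assumed to be zero.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_sketch(x, prev_h, prev_c, Wx, Wh, b):
    """Single LSTM timestep; x: (N, W), states: (N, H), Wx: (W, 4H), Wh: (H, 4H)."""
    H = prev_h.shape[1]
    a = x.dot(Wx) + prev_h.dot(Wh) + b
    i = sigmoid(a[:, 0:H])
    f = sigmoid(a[:, H:2 * H])
    o = sigmoid(a[:, 2 * H:3 * H])
    g = np.tanh(a[:, 3 * H:4 * H])
    next_c = f * prev_c + i * g
    return o * np.tanh(next_c), next_c

def greedy_sample_sketch(h0, W_embed, Wx, Wh, b, W_vocab, b_vocab,
                         start_idx, max_length):
    """Greedy decoding (hypothetical helper).

    h0: (N, H), W_embed: (V, W), W_vocab: (H, V), b_vocab: (V,)
    Returns integer word indices of shape (N, max_length).
    """
    N = h0.shape[0]
    captions = np.zeros((N, max_length), dtype=np.int32)
    prev_h = h0
    prev_c = np.zeros_like(h0)                     # assumption: zero initial cell state
    word = np.full(N, start_idx, dtype=np.int32)   # every caption begins with the start token
    for t in range(max_length):
        x = W_embed[word]                          # look up embeddings, shape (N, W)
        prev_h, prev_c = lstm_step_sketch(x, prev_h, prev_c, Wx, Wh, b)
        scores = prev_h.dot(W_vocab) + b_vocab     # vocabulary scores, shape (N, V)
        word = scores.argmax(axis=1)               # pick the highest-scoring word
        captions[:, t] = word
    return captions
```

The only difference from sampling with a vanilla RNN is that the cell state has to be carried between steps alongside the hidden state.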


In [35]:
for split in ['train', 'val']:
  minibatch = sample_coco_minibatch(small_data, split=split, batch_size=2)
  gt_captions, features, urls = minibatch
  gt_captions = decode_captions(gt_captions, data['idx_to_word'])

  sample_captions = small_lstm_model.sample(features)
  sample_captions = decode_captions(sample_captions, data['idx_to_word'])

  for gt_caption, sample_caption, url in zip(gt_captions, sample_captions, urls):
    plt.imshow(image_from_url(url))
    plt.title('%s\n%s\nGT:%s' % (split, sample_caption, gt_caption))
    plt.axis('off')
    plt.show()


Train a good captioning model!

Using the pieces you have implemented in this and the previous notebook, try to train a captioning model that gives decent qualitative results (better than the random garbage you saw with the overfit models) when sampling on the validation set. You can subsample the training set if you want; we just want to see samples on the validation set that are better than random.

Don't spend too much time on this part; we don't have any explicit accuracy thresholds you need to meet.


In [59]:
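import sys

# Keep references to the original streams so they can be restored later.
_ORG_STDOUT = sys.stdout
_ORG_STDERR = sys.stderr

class captoutput:
    """Context manager that copies everything written to stdout/stderr into log.txt."""
    def __init__(self):
        self.log = open("log.txt", "w")
        self.term = _ORG_STDOUT
        self.terr = _ORG_STDERR
    
    def __enter__(self):
        sys.stdout = self
        sys.stderr = self
        return self
    
    def __exit__(self, exc_type, exc_value, traceback):
        # Restore the original streams even if the block raised an exception.
        sys.stdout = _ORG_STDOUT
        sys.stderr = _ORG_STDERR
        self.close()
        
    def write(self, msg):
        # Echo to the terminal and append to the log file.
        self.term.write(msg)
        self.log.write(msg)
        self.log.flush()
        
    def flush(self):
        self.term.flush()
        
    def close(self):
        if self.log:
            self.log.close()
            self.log = None
            
    def read(self):
        with open("log.txt") as f:
            txt = f.read()
        return txt
        
#sys.stdout = captoutput()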
#small_data = load_coco_data(max_train=50000)

#small_lstm_model = CaptioningRNN(
#          cell_type='lstm',
#          word_to_idx=data['word_to_idx'],
#          input_dim=data['train_features'].shape[1],
#          hidden_dim=1024,
#          wordvec_dim=512,
#          dtype=np.float32,
#        )

#small_lstm_solver = CaptioningSolver(small_lstm_model, small_data,
#           update_rule='adam',
#           num_epochs=50,
#           batch_size=50,
#           optim_config={
#             'learning_rate': 5e-3,
#           },
#           lr_decay=0.9,
#           verbose=True, print_every=10,
#         )

#small_lstm_solver.train()

# Since I was running the code on inek's, I needed to close the browser, so I
# saved the output to a log file. The model was trained with the parameters and
# code commented out above.

with open("log.txt", "r") as f:
    print f.read()


(Iteration 1 / 50000) loss: 30.926130
(Iteration 11 / 50000) loss: 29.070883
(Iteration 21 / 50000) loss: 31.601230
(Iteration 31 / 50000) loss: 29.873266
(Iteration 41 / 50000) loss: 27.608146
(Iteration 51 / 50000) loss: 30.443934
(Iteration 61 / 50000) loss: 29.725397
(Iteration 71 / 50000) loss: 28.025682
(Iteration 81 / 50000) loss: 26.722375
(Iteration 91 / 50000) loss: 29.231495
(Iteration 101 / 50000) loss: 30.085486
(Iteration 111 / 50000) loss: 25.889995
(Iteration 121 / 50000) loss: 31.518549
(Iteration 131 / 50000) loss: 31.410766
(Iteration 141 / 50000) loss: 30.248458
(Iteration 151 / 50000) loss: 28.827061
(Iteration 161 / 50000) loss: 28.923514
(Iteration 171 / 50000) loss: 29.634317
(Iteration 181 / 50000) loss: 26.052727
(Iteration 191 / 50000) loss: 25.890425
(Iteration 201 / 50000) loss: 29.870180
(Iteration 211 / 50000) loss: 30.890177
(Iteration 221 / 50000) loss: 26.539171
(Iteration 231 / 50000) loss: 25.099992
(Iteration 241 / 50000) loss: 26.520576
(Iteration 251 / 50000) loss: 29.033133
(Iteration 261 / 50000) loss: 32.032609
(Iteration 271 / 50000) loss: 28.158833
(Iteration 281 / 50000) loss: 25.275156
(Iteration 291 / 50000) loss: 30.612113
(Iteration 301 / 50000) loss: 26.956531
(Iteration 311 / 50000) loss: 26.891888
(Iteration 321 / 50000) loss: 28.420599
(Iteration 331 / 50000) loss: 25.966178
(Iteration 341 / 50000) loss: 28.184189
(Iteration 351 / 50000) loss: 28.036208
(Iteration 361 / 50000) loss: 28.988733
(Iteration 371 / 50000) loss: 28.628232
(Iteration 381 / 50000) loss: 26.383788
(Iteration 391 / 50000) loss: 29.830352
(Iteration 401 / 50000) loss: 28.160849
(Iteration 411 / 50000) loss: 29.472551
(Iteration 421 / 50000) loss: 31.039493
(Iteration 431 / 50000) loss: 26.597537
(Iteration 441 / 50000) loss: 28.445439
(Iteration 451 / 50000) loss: 28.499469
(Iteration 461 / 50000) loss: 28.026059
(Iteration 471 / 50000) loss: 29.247035
(Iteration 481 / 50000) loss: 27.027281
(Iteration 491 / 50000) loss: 28.537066
(Iteration 501 / 50000) loss: 28.267088
(Iteration 511 / 50000) loss: 27.308803
(Iteration 521 / 50000) loss: 28.137453
(Iteration 531 / 50000) loss: 26.030754
(Iteration 541 / 50000) loss: 28.392395
(Iteration 551 / 50000) loss: 28.725845
(Iteration 561 / 50000) loss: 30.046796
(Iteration 571 / 50000) loss: 27.490267
(Iteration 581 / 50000) loss: 26.665169
(Iteration 591 / 50000) loss: 26.497946
(Iteration 601 / 50000) loss: 27.382327
(Iteration 611 / 50000) loss: 24.913609
(Iteration 621 / 50000) loss: 24.369487
(Iteration 631 / 50000) loss: 28.561450
(Iteration 641 / 50000) loss: 26.873792
(Iteration 651 / 50000) loss: 27.210855
(Iteration 661 / 50000) loss: 27.908226
(Iteration 671 / 50000) loss: 30.072569
(Iteration 681 / 50000) loss: 27.672405
(Iteration 691 / 50000) loss: 29.239565
(Iteration 701 / 50000) loss: 26.657840
(Iteration 711 / 50000) loss: 27.731789
(Iteration 721 / 50000) loss: 28.808451
(Iteration 731 / 50000) loss: 29.337360
(Iteration 741 / 50000) loss: 27.986443
(Iteration 751 / 50000) loss: 25.670206
(Iteration 761 / 50000) loss: 26.954642
(Iteration 771 / 50000) loss: 27.588657
(Iteration 781 / 50000) loss: 28.300060
(Iteration 791 / 50000) loss: 25.665245
(Iteration 801 / 50000) loss: 27.945385
(Iteration 811 / 50000) loss: 26.916274
(Iteration 821 / 50000) loss: 24.603606
(Iteration 831 / 50000) loss: 25.905304
(Iteration 841 / 50000) loss: 26.024455
(Iteration 851 / 50000) loss: 27.914536
(Iteration 861 / 50000) loss: 25.036607
(Iteration 871 / 50000) loss: 26.765515
(Iteration 881 / 50000) loss: 27.184556
(Iteration 891 / 50000) loss: 27.591900
(Iteration 901 / 50000) loss: 27.441229
(Iteration 911 / 50000) loss: 27.533784
(Iteration 921 / 50000) loss: 25.674196
(Iteration 931 / 50000) loss: 27.915254
(Iteration 941 / 50000) loss: 25.970476
(Iteration 951 / 50000) loss: 28.177879
(Iteration 961 / 50000) loss: 25.382486
(Iteration 971 / 50000) loss: 26.630886
(Iteration 981 / 50000) loss: 25.006498
(Iteration 991 / 50000) loss: 27.183955
(Iteration 1001 / 50000) loss: 26.024022
(Iteration 1011 / 50000) loss: 26.643973
(Iteration 1021 / 50000) loss: 27.984407
(Iteration 1031 / 50000) loss: 25.092191
(Iteration 1041 / 50000) loss: 28.229451
(Iteration 1051 / 50000) loss: 25.982922
(Iteration 1061 / 50000) loss: 25.856620
(Iteration 1071 / 50000) loss: 26.886829
(Iteration 1081 / 50000) loss: 28.565694
(Iteration 1091 / 50000) loss: 25.785249
(Iteration 1101 / 50000) loss: 26.577723
(Iteration 1111 / 50000) loss: 26.714301
(Iteration 1121 / 50000) loss: 26.308523
(Iteration 1131 / 50000) loss: 24.979574
(Iteration 1141 / 50000) loss: 26.756139
(Iteration 1151 / 50000) loss: 27.159430
(Iteration 1161 / 50000) loss: 27.680463
(Iteration 1171 / 50000) loss: 28.321832
(Iteration 1181 / 50000) loss: 28.386754
(Iteration 1191 / 50000) loss: 26.758713
(Iteration 1201 / 50000) loss: 24.822881
(Iteration 1211 / 50000) loss: 26.673262
(Iteration 1221 / 50000) loss: 27.468768
(Iteration 1231 / 50000) loss: 27.799777
(Iteration 1241 / 50000) loss: 24.179218
(Iteration 1251 / 50000) loss: 27.470215
(Iteration 1261 / 50000) loss: 25.755874
(Iteration 1271 / 50000) loss: 26.982653
(Iteration 1281 / 50000) loss: 24.013282
(Iteration 1291 / 50000) loss: 26.606301
(Iteration 1301 / 50000) loss: 25.415308
(Iteration 1311 / 50000) loss: 26.289749
(Iteration 1321 / 50000) loss: 27.766659
(Iteration 1331 / 50000) loss: 27.038815
(Iteration 1341 / 50000) loss: 26.931581
(Iteration 1351 / 50000) loss: 25.527356
(Iteration 1361 / 50000) loss: 24.907970
(Iteration 1371 / 50000) loss: 24.678832
(Iteration 1381 / 50000) loss: 26.785432
(Iteration 1391 / 50000) loss: 25.016625
(Iteration 1401 / 50000) loss: 27.027078
(Iteration 1411 / 50000) loss: 25.867241
(Iteration 1421 / 50000) loss: 25.458545
(Iteration 1431 / 50000) loss: 24.838994
(Iteration 1441 / 50000) loss: 24.477856
(Iteration 1451 / 50000) loss: 25.537385
(Iteration 1461 / 50000) loss: 27.251884
(Iteration 1471 / 50000) loss: 24.997199
(Iteration 1481 / 50000) loss: 27.669329
(Iteration 1491 / 50000) loss: 26.619078
(Iteration 1501 / 50000) loss: 25.738802
(Iteration 1511 / 50000) loss: 24.673971
(Iteration 1521 / 50000) loss: 24.857616
(Iteration 1531 / 50000) loss: 24.361794
(Iteration 1541 / 50000) loss: 25.411605
(Iteration 1551 / 50000) loss: 26.095347
(Iteration 1561 / 50000) loss: 23.375312
(Iteration 1571 / 50000) loss: 24.075270
(Iteration 1581 / 50000) loss: 24.527163
(Iteration 1591 / 50000) loss: 24.456808
(Iteration 1601 / 50000) loss: 24.247041
(Iteration 1611 / 50000) loss: 28.687336
(Iteration 1621 / 50000) loss: 24.767607
(Iteration 1631 / 50000) loss: 28.151576
(Iteration 1641 / 50000) loss: 27.305339
(Iteration 1651 / 50000) loss: 22.927375
(Iteration 1661 / 50000) loss: 27.530474
(Iteration 1671 / 50000) loss: 24.480313
(Iteration 1681 / 50000) loss: 26.066825
(Iteration 1691 / 50000) loss: 23.923833
(Iteration 1701 / 50000) loss: 24.993270
(Iteration 1711 / 50000) loss: 24.477478
(Iteration 1721 / 50000) loss: 25.712409
(Iteration 1731 / 50000) loss: 24.306951
(Iteration 1741 / 50000) loss: 24.279497
(Iteration 1751 / 50000) loss: 24.842832
(Iteration 1761 / 50000) loss: 25.763138
(Iteration 1771 / 50000) loss: 26.093069
(Iteration 1781 / 50000) loss: 23.821324
(Iteration 1791 / 50000) loss: 25.322239
(Iteration 1801 / 50000) loss: 25.254542
(Iteration 1811 / 50000) loss: 22.329390
(Iteration 1821 / 50000) loss: 24.645844
(Iteration 1831 / 50000) loss: 24.832529
(Iteration 1841 / 50000) loss: 27.098495
(Iteration 1851 / 50000) loss: 24.578657
(Iteration 1861 / 50000) loss: 27.554647
(Iteration 1871 / 50000) loss: 25.025150
(Iteration 1881 / 50000) loss: 26.359313
(Iteration 1891 / 50000) loss: 27.151641
(Iteration 1901 / 50000) loss: 25.834320
(Iteration 1911 / 50000) loss: 24.473711
(Iteration 1921 / 50000) loss: 26.830604
(Iteration 1931 / 50000) loss: 25.845421
(Iteration 1941 / 50000) loss: 23.343691
(Iteration 1951 / 50000) loss: 24.279959
(Iteration 1961 / 50000) loss: 25.305406
(Iteration 1971 / 50000) loss: 27.338967
(Iteration 1981 / 50000) loss: 24.652411
(Iteration 1991 / 50000) loss: 26.458899
(Iteration 2001 / 50000) loss: 25.585148
(Iteration 2011 / 50000) loss: 25.224274
(Iteration 2021 / 50000) loss: 24.797197
(Iteration 2031 / 50000) loss: 24.398065
(Iteration 2041 / 50000) loss: 26.874232
(Iteration 2051 / 50000) loss: 26.659299
(Iteration 2061 / 50000) loss: 23.651646
(Iteration 2071 / 50000) loss: 26.124512
(Iteration 2081 / 50000) loss: 24.804420
(Iteration 2091 / 50000) loss: 25.319467
(Iteration 2101 / 50000) loss: 23.849383
(Iteration 2111 / 50000) loss: 26.783550
(Iteration 2121 / 50000) loss: 26.316597
(Iteration 2131 / 50000) loss: 22.223642
(Iteration 2141 / 50000) loss: 24.126290
(Iteration 2151 / 50000) loss: 25.360958
(Iteration 2161 / 50000) loss: 23.360438
(Iteration 2171 / 50000) loss: 24.314161
(Iteration 2181 / 50000) loss: 25.381773
(Iteration 2191 / 50000) loss: 25.409461
(Iteration 2201 / 50000) loss: 26.419749
(Iteration 2211 / 50000) loss: 23.301151
(Iteration 2221 / 50000) loss: 22.845281
(Iteration 2231 / 50000) loss: 26.860357
(Iteration 2241 / 50000) loss: 22.708815
(Iteration 2251 / 50000) loss: 22.656833
(Iteration 2261 / 50000) loss: 25.359872
(Iteration 2271 / 50000) loss: 27.078018
(Iteration 2281 / 50000) loss: 23.728591
(Iteration 2291 / 50000) loss: 25.745536
(Iteration 2301 / 50000) loss: 23.837006
(Iteration 2311 / 50000) loss: 25.076898
(Iteration 2321 / 50000) loss: 25.812691
(Iteration 2331 / 50000) loss: 23.390736
(Iteration 2341 / 50000) loss: 25.914371
(Iteration 2351 / 50000) loss: 23.761057
(Iteration 2361 / 50000) loss: 23.110267
(Iteration 2371 / 50000) loss: 23.646581
(Iteration 2381 / 50000) loss: 23.945801
(Iteration 2391 / 50000) loss: 22.543917
(Iteration 2401 / 50000) loss: 26.254549
(Iteration 2411 / 50000) loss: 22.362316
(Iteration 2421 / 50000) loss: 24.884465
(Iteration 2431 / 50000) loss: 24.279306
(Iteration 2441 / 50000) loss: 24.574530
(Iteration 2451 / 50000) loss: 23.463663
(Iteration 2461 / 50000) loss: 22.730254
(Iteration 2471 / 50000) loss: 23.302722
(Iteration 2481 / 50000) loss: 24.171579
(Iteration 2491 / 50000) loss: 24.601159
(Iteration 2501 / 50000) loss: 24.165054
(Iteration 2511 / 50000) loss: 23.695381
(Iteration 2521 / 50000) loss: 24.982344
(Iteration 2531 / 50000) loss: 24.425312
(Iteration 2541 / 50000) loss: 26.731578
(Iteration 2551 / 50000) loss: 23.954384
(Iteration 2561 / 50000) loss: 24.427643
(Iteration 2571 / 50000) loss: 24.588089
(Iteration 2581 / 50000) loss: 23.936148
(Iteration 2591 / 50000) loss: 22.354957
(Iteration 2601 / 50000) loss: 24.542692
(Iteration 2611 / 50000) loss: 25.578933
(Iteration 2621 / 50000) loss: 24.296249
(Iteration 2631 / 50000) loss: 23.703376
(Iteration 2641 / 50000) loss: 26.210923
(Iteration 2651 / 50000) loss: 22.685480
(Iteration 2661 / 50000) loss: 21.707539
(Iteration 2671 / 50000) loss: 24.479660
(Iteration 2681 / 50000) loss: 25.799216
(Iteration 2691 / 50000) loss: 23.508130
(Iteration 2701 / 50000) loss: 25.412508
(Iteration 2711 / 50000) loss: 24.762456
(Iteration 2721 / 50000) loss: 22.780127
(Iteration 2731 / 50000) loss: 23.651029
(Iteration 2741 / 50000) loss: 25.895968
(Iteration 2751 / 50000) loss: 22.138261
(Iteration 2761 / 50000) loss: 24.886435
(Iteration 2771 / 50000) loss: 23.528768
(Iteration 2781 / 50000) loss: 24.256855
(Iteration 2791 / 50000) loss: 23.348242
(Iteration 2801 / 50000) loss: 22.082031
(Iteration 2811 / 50000) loss: 24.194906
(Iteration 2821 / 50000) loss: 23.336834
(Iteration 2831 / 50000) loss: 23.891726
(Iteration 2841 / 50000) loss: 27.685216
(Iteration 2851 / 50000) loss: 27.186310
(Iteration 2861 / 50000) loss: 22.593117
(Iteration 2871 / 50000) loss: 25.153673
(Iteration 2881 / 50000) loss: 24.760632
(Iteration 2891 / 50000) loss: 23.506637
(Iteration 2901 / 50000) loss: 23.571829
(Iteration 2911 / 50000) loss: 22.805850
(Iteration 2921 / 50000) loss: 23.912635
(Iteration 2931 / 50000) loss: 21.466595
(Iteration 2941 / 50000) loss: 24.921710
(Iteration 2951 / 50000) loss: 23.675006
(Iteration 2961 / 50000) loss: 24.766163
(Iteration 2971 / 50000) loss: 24.936274
(Iteration 2981 / 50000) loss: 23.098093
(Iteration 2991 / 50000) loss: 24.959830
(Iteration 3001 / 50000) loss: 23.327401
(Iteration 3011 / 50000) loss: 22.307848
(Iteration 3021 / 50000) loss: 22.051121
(Iteration 3031 / 50000) loss: 22.449622
(Iteration 3041 / 50000) loss: 21.915756
(Iteration 3051 / 50000) loss: 24.541494
(Iteration 3061 / 50000) loss: 23.232721
(Iteration 3071 / 50000) loss: 25.892722
(Iteration 3081 / 50000) loss: 24.001198
(Iteration 3091 / 50000) loss: 25.303866
(Iteration 3101 / 50000) loss: 22.044903
(Iteration 3111 / 50000) loss: 26.075894
(Iteration 3121 / 50000) loss: 22.790129
(Iteration 3131 / 50000) loss: 21.991797
(Iteration 3141 / 50000) loss: 21.137556
(Iteration 3151 / 50000) loss: 23.080023
(Iteration 3161 / 50000) loss: 21.932215
(Iteration 3171 / 50000) loss: 25.339084
(Iteration 3181 / 50000) loss: 22.291852
(Iteration 3191 / 50000) loss: 23.349550
(Iteration 3201 / 50000) loss: 24.089402
(Iteration 3211 / 50000) loss: 22.934421
(Iteration 3221 / 50000) loss: 24.127132
(Iteration 3231 / 50000) loss: 23.999292
(Iteration 3241 / 50000) loss: 22.552732
(Iteration 3251 / 50000) loss: 25.116679
(Iteration 3261 / 50000) loss: 24.207387
(Iteration 3271 / 50000) loss: 23.830072
(Iteration 3281 / 50000) loss: 22.193077
(Iteration 3291 / 50000) loss: 23.385983
(Iteration 3301 / 50000) loss: 23.764899
(Iteration 3311 / 50000) loss: 23.861217
(Iteration 3321 / 50000) loss: 21.944560
(Iteration 3331 / 50000) loss: 23.885232
(Iteration 3341 / 50000) loss: 23.969034
(Iteration 3351 / 50000) loss: 23.810024
(Iteration 3361 / 50000) loss: 21.777030
(Iteration 3371 / 50000) loss: 22.824446
(Iteration 3381 / 50000) loss: 24.401014
(Iteration 3391 / 50000) loss: 23.145499
(Iteration 3401 / 50000) loss: 21.841794
(Iteration 3411 / 50000) loss: 26.163799
(Iteration 3421 / 50000) loss: 22.254940
(Iteration 3431 / 50000) loss: 21.864860
(Iteration 3441 / 50000) loss: 23.768853
(Iteration 3451 / 50000) loss: 24.968298
(Iteration 3461 / 50000) loss: 24.056934
(Iteration 3471 / 50000) loss: 22.874120
(Iteration 3481 / 50000) loss: 22.894880
(Iteration 3491 / 50000) loss: 23.559703
(Iteration 3501 / 50000) loss: 23.033933
(Iteration 3511 / 50000) loss: 23.657632
(Iteration 3521 / 50000) loss: 23.856937
(Iteration 3531 / 50000) loss: 23.384962
(Iteration 3541 / 50000) loss: 21.987550
(Iteration 3551 / 50000) loss: 22.710152
(Iteration 3561 / 50000) loss: 23.430265
(Iteration 3571 / 50000) loss: 23.663510
(Iteration 3581 / 50000) loss: 24.506699
(Iteration 3591 / 50000) loss: 23.738003
(Iteration 3601 / 50000) loss: 21.921448
(Iteration 3611 / 50000) loss: 24.530472
(Iteration 3621 / 50000) loss: 22.202798
(Iteration 3631 / 50000) loss: 23.368526
(Iteration 3641 / 50000) loss: 23.624008
(Iteration 3651 / 50000) loss: 24.721385
(Iteration 3661 / 50000) loss: 23.869856
(Iteration 3671 / 50000) loss: 23.239123
(Iteration 3681 / 50000) loss: 23.872734
(Iteration 3691 / 50000) loss: 23.989828
(Iteration 3701 / 50000) loss: 23.775158
(Iteration 3711 / 50000) loss: 22.803934
(Iteration 3721 / 50000) loss: 23.450899
(Iteration 3731 / 50000) loss: 23.090121
(Iteration 3741 / 50000) loss: 23.528472
(Iteration 3751 / 50000) loss: 22.913002
(Iteration 3761 / 50000) loss: 20.867402
(Iteration 3771 / 50000) loss: 23.589012
(Iteration 3781 / 50000) loss: 21.712804
(Iteration 3791 / 50000) loss: 24.651307
(Iteration 3801 / 50000) loss: 23.777181
(Iteration 3811 / 50000) loss: 21.290426
(Iteration 3821 / 50000) loss: 20.935243
(Iteration 3831 / 50000) loss: 23.536577
(Iteration 3841 / 50000) loss: 22.614619
(Iteration 3851 / 50000) loss: 23.278106
(Iteration 3861 / 50000) loss: 22.858303
(Iteration 3871 / 50000) loss: 22.575795
(Iteration 3881 / 50000) loss: 23.356342
(Iteration 3891 / 50000) loss: 24.219184
(Iteration 3901 / 50000) loss: 23.217734
(Iteration 3911 / 50000) loss: 22.528431
(Iteration 3921 / 50000) loss: 23.821671
(Iteration 3931 / 50000) loss: 24.241799
(Iteration 3941 / 50000) loss: 24.474328
(Iteration 3951 / 50000) loss: 21.721901
(Iteration 3961 / 50000) loss: 21.267580
(Iteration 3971 / 50000) loss: 22.757693
(Iteration 3981 / 50000) loss: 21.907911
(Iteration 3991 / 50000) loss: 23.688277
(Iteration 4001 / 50000) loss: 23.423222
(Iteration 4011 / 50000) loss: 21.877247
(Iteration 4021 / 50000) loss: 25.059861
(Iteration 4031 / 50000) loss: 22.185143
(Iteration 4041 / 50000) loss: 24.913744
(Iteration 4051 / 50000) loss: 23.939090
(Iteration 4061 / 50000) loss: 21.869466
(Iteration 4071 / 50000) loss: 22.905561
(Iteration 4081 / 50000) loss: 22.803717
(Iteration 4091 / 50000) loss: 23.871347
(Iteration 4101 / 50000) loss: 21.995222
(Iteration 4111 / 50000) loss: 22.746074
(Iteration 4121 / 50000) loss: 22.987963
(Iteration 4131 / 50000) loss: 20.781571
(Iteration 4141 / 50000) loss: 20.549921
(Iteration 4151 / 50000) loss: 21.667560
(Iteration 4161 / 50000) loss: 21.244800
(Iteration 4171 / 50000) loss: 20.716015
(Iteration 4181 / 50000) loss: 21.321830
(Iteration 4191 / 50000) loss: 24.955916
(Iteration 4201 / 50000) loss: 22.125517
(Iteration 4211 / 50000) loss: 23.026855
(Iteration 4221 / 50000) loss: 23.666508
(Iteration 4231 / 50000) loss: 23.258541
(Iteration 4241 / 50000) loss: 22.655983
(Iteration 4251 / 50000) loss: 23.579580
(Iteration 4261 / 50000) loss: 21.378548
(Iteration 4271 / 50000) loss: 23.572377
(Iteration 4281 / 50000) loss: 20.492207
(Iteration 4291 / 50000) loss: 22.133580
(Iteration 4301 / 50000) loss: 23.079507
(Iteration 4311 / 50000) loss: 21.813561
(Iteration 4321 / 50000) loss: 20.518320
(Iteration 4331 / 50000) loss: 22.883881
(Iteration 4341 / 50000) loss: 22.623029
(Iteration 4351 / 50000) loss: 22.362846
(Iteration 4361 / 50000) loss: 20.420900
(Iteration 4371 / 50000) loss: 22.954308
(Iteration 4381 / 50000) loss: 23.025488
(Iteration 4391 / 50000) loss: 21.448610
(Iteration 4401 / 50000) loss: 22.017987
(Iteration 4411 / 50000) loss: 22.531510
(Iteration 4421 / 50000) loss: 20.150642
(Iteration 4431 / 50000) loss: 23.603983
(Iteration 4441 / 50000) loss: 20.515706
(Iteration 4451 / 50000) loss: 24.454975
(Iteration 4461 / 50000) loss: 21.788702
(Iteration 4471 / 50000) loss: 20.693378
(Iteration 4481 / 50000) loss: 22.060716
(Iteration 4491 / 50000) loss: 21.888109
(Iteration 4501 / 50000) loss: 19.995999
(Iteration 4511 / 50000) loss: 24.263554
(Iteration 4521 / 50000) loss: 23.219617
(Iteration 4531 / 50000) loss: 23.395078
(Iteration 4541 / 50000) loss: 24.380893
(Iteration 4551 / 50000) loss: 22.017857
(Iteration 4561 / 50000) loss: 21.049995
(Iteration 4571 / 50000) loss: 22.772561
(Iteration 4581 / 50000) loss: 21.117909
(Iteration 4591 / 50000) loss: 19.933161
(Iteration 4601 / 50000) loss: 19.572107
(Iteration 4611 / 50000) loss: 22.303534
(Iteration 4621 / 50000) loss: 22.622785
(Iteration 4631 / 50000) loss: 22.511456
(Iteration 4641 / 50000) loss: 20.556505
(Iteration 4651 / 50000) loss: 21.558934
(Iteration 4661 / 50000) loss: 22.516404
(Iteration 4671 / 50000) loss: 21.034696
(Iteration 4681 / 50000) loss: 20.580145
(Iteration 4691 / 50000) loss: 20.135264
(Iteration 4701 / 50000) loss: 21.448252
(Iteration 4711 / 50000) loss: 20.945050
(Iteration 4721 / 50000) loss: 20.739266
(Iteration 4731 / 50000) loss: 20.322136
(Iteration 4741 / 50000) loss: 21.766342
(Iteration 4751 / 50000) loss: 23.080450
(Iteration 4761 / 50000) loss: 21.744860
(Iteration 4771 / 50000) loss: 23.390080
(Iteration 4781 / 50000) loss: 20.417548
(Iteration 4791 / 50000) loss: 22.983565
(Iteration 4801 / 50000) loss: 24.457967
(Iteration 4811 / 50000) loss: 24.129633
(Iteration 4821 / 50000) loss: 20.855473
(Iteration 4831 / 50000) loss: 21.967207
(Iteration 4841 / 50000) loss: 23.447107
(Iteration 4851 / 50000) loss: 22.908405
(Iteration 4861 / 50000) loss: 23.698209
(Iteration 4871 / 50000) loss: 21.725997
(Iteration 4881 / 50000) loss: 22.803424
(Iteration 4891 / 50000) loss: 21.340950
(Iteration 4901 / 50000) loss: 19.163235
(Iteration 4911 / 50000) loss: 21.648118
(Iteration 4921 / 50000) loss: 20.779833
(Iteration 4931 / 50000) loss: 19.850648
(Iteration 4941 / 50000) loss: 20.017689
(Iteration 4951 / 50000) loss: 23.025248
(Iteration 4961 / 50000) loss: 18.760954
(Iteration 4971 / 50000) loss: 20.558591
(Iteration 4981 / 50000) loss: 20.973140
(Iteration 4991 / 50000) loss: 23.179926
(Iteration 5001 / 50000) loss: 22.141270
(Iteration 5011 / 50000) loss: 22.088409
(Iteration 5021 / 50000) loss: 21.480088
(Iteration 5031 / 50000) loss: 22.493216
(Iteration 5041 / 50000) loss: 21.813057
(Iteration 5051 / 50000) loss: 21.558678
(Iteration 5061 / 50000) loss: 23.058744
(Iteration 5071 / 50000) loss: 20.503597
(Iteration 5081 / 50000) loss: 21.208037
(Iteration 5091 / 50000) loss: 20.081319
(Iteration 5101 / 50000) loss: 19.694733
(Iteration 5111 / 50000) loss: 21.458480
(Iteration 5121 / 50000) loss: 19.122882
(Iteration 5131 / 50000) loss: 20.435460
(Iteration 5141 / 50000) loss: 21.769694
(Iteration 5151 / 50000) loss: 20.719976
(Iteration 5161 / 50000) loss: 21.668327
(Iteration 5171 / 50000) loss: 22.763358
(Iteration 5181 / 50000) loss: 21.596523
(Iteration 5191 / 50000) loss: 21.937386
(Iteration 5201 / 50000) loss: 20.595539
(Iteration 5211 / 50000) loss: 22.730250
(Iteration 5221 / 50000) loss: 24.919865
(Iteration 5231 / 50000) loss: 20.519277
(Iteration 5241 / 50000) loss: 19.540004
(Iteration 5251 / 50000) loss: 24.039228
(Iteration 5261 / 50000) loss: 19.218905
(Iteration 5271 / 50000) loss: 19.091521
(Iteration 5281 / 50000) loss: 19.688470
(Iteration 5291 / 50000) loss: 22.045766
(Iteration 5301 / 50000) loss: 19.029959
(Iteration 5311 / 50000) loss: 20.702372
(Iteration 5321 / 50000) loss: 21.596280
(Iteration 5331 / 50000) loss: 20.278536
(Iteration 5341 / 50000) loss: 19.349478
(Iteration 5351 / 50000) loss: 21.524768
(Iteration 5361 / 50000) loss: 22.438924
(Iteration 5371 / 50000) loss: 21.619846
(Iteration 5381 / 50000) loss: 22.978886
(Iteration 5391 / 50000) loss: 20.310543
(Iteration 5401 / 50000) loss: 21.671787
(Iteration 5411 / 50000) loss: 20.772511
(Iteration 5421 / 50000) loss: 22.205400
(Iteration 5431 / 50000) loss: 18.629825
(Iteration 5441 / 50000) loss: 21.192941
(Iteration 5451 / 50000) loss: 20.808159
(Iteration 5461 / 50000) loss: 20.586499
(Iteration 5471 / 50000) loss: 21.543697
(Iteration 5481 / 50000) loss: 20.760599
(Iteration 5491 / 50000) loss: 19.343566
(Iteration 5501 / 50000) loss: 19.398574
(Iteration 5511 / 50000) loss: 23.755869
(Iteration 5521 / 50000) loss: 21.682787
(Iteration 5531 / 50000) loss: 20.033302
(Iteration 5541 / 50000) loss: 20.916389
(Iteration 5551 / 50000) loss: 19.878234
(Iteration 5561 / 50000) loss: 20.440956
(Iteration 5571 / 50000) loss: 19.826960
(Iteration 5581 / 50000) loss: 18.158928
(Iteration 5591 / 50000) loss: 22.680598
(Iteration 5601 / 50000) loss: 21.702390
(Iteration 5611 / 50000) loss: 22.075435
(Iteration 5621 / 50000) loss: 19.761415
(Iteration 5631 / 50000) loss: 21.261737
(Iteration 5641 / 50000) loss: 23.292195
(Iteration 5651 / 50000) loss: 20.432820
(Iteration 5661 / 50000) loss: 20.586842
(Iteration 5671 / 50000) loss: 21.023676
(Iteration 5681 / 50000) loss: 20.935793
(Iteration 5691 / 50000) loss: 22.283069
(Iteration 5701 / 50000) loss: 20.519489
(Iteration 5711 / 50000) loss: 20.835395
(Iteration 5721 / 50000) loss: 20.373061
(Iteration 5731 / 50000) loss: 19.190660
(Iteration 5741 / 50000) loss: 19.410828
(Iteration 5751 / 50000) loss: 21.803060
(Iteration 5761 / 50000) loss: 21.324499
(Iteration 5771 / 50000) loss: 19.076110
(Iteration 5781 / 50000) loss: 20.613418
(Iteration 5791 / 50000) loss: 20.666250
(Iteration 5801 / 50000) loss: 21.124283
(Iteration 5811 / 50000) loss: 22.079847
(Iteration 5821 / 50000) loss: 18.405150
(Iteration 5831 / 50000) loss: 20.393583
(Iteration 5841 / 50000) loss: 21.792274
(Iteration 5851 / 50000) loss: 21.431253
(Iteration 5861 / 50000) loss: 21.189426
(Iteration 5871 / 50000) loss: 22.840624
(Iteration 5881 / 50000) loss: 21.984174
(Iteration 5891 / 50000) loss: 22.366338
(Iteration 5901 / 50000) loss: 18.913741
(Iteration 5911 / 50000) loss: 22.056459
(Iteration 5921 / 50000) loss: 19.296436
(Iteration 5931 / 50000) loss: 18.917200
(Iteration 5941 / 50000) loss: 19.105390
(Iteration 5951 / 50000) loss: 20.565251
(Iteration 5961 / 50000) loss: 20.369270
(Iteration 5971 / 50000) loss: 18.911205
(Iteration 5981 / 50000) loss: 20.799106
(Iteration 5991 / 50000) loss: 20.070229
(Iteration 6001 / 50000) loss: 21.124290
(Iteration 6011 / 50000) loss: 20.218269
(Iteration 6021 / 50000) loss: 20.200367
(Iteration 6031 / 50000) loss: 20.333937
(Iteration 6041 / 50000) loss: 20.970247
(Iteration 6051 / 50000) loss: 21.110310
(Iteration 6061 / 50000) loss: 23.306282
(Iteration 6071 / 50000) loss: 20.827773
(Iteration 6081 / 50000) loss: 20.329168
(Iteration 6091 / 50000) loss: 20.148261
(Iteration 6101 / 50000) loss: 20.354062
(Iteration 6111 / 50000) loss: 20.523364
(Iteration 6121 / 50000) loss: 21.570652
(Iteration 6131 / 50000) loss: 17.314943
(Iteration 6141 / 50000) loss: 17.422210
(Iteration 6151 / 50000) loss: 20.898131
(Iteration 6161 / 50000) loss: 20.846413
(Iteration 6171 / 50000) loss: 19.690817
(Iteration 6181 / 50000) loss: 19.164835
(Iteration 6191 / 50000) loss: 20.303410
(Iteration 6201 / 50000) loss: 19.081390
(Iteration 6211 / 50000) loss: 19.751259
(Iteration 6221 / 50000) loss: 20.038525
(Iteration 6231 / 50000) loss: 18.270613
(Iteration 6241 / 50000) loss: 18.278064
(Iteration 6251 / 50000) loss: 18.499522
(Iteration 6261 / 50000) loss: 19.733064
(Iteration 6271 / 50000) loss: 18.414712
(Iteration 6281 / 50000) loss: 19.180443
(Iteration 6291 / 50000) loss: 19.458727
(Iteration 6301 / 50000) loss: 19.637927
(Iteration 6311 / 50000) loss: 19.623687
(Iteration 6321 / 50000) loss: 19.233474
(Iteration 6331 / 50000) loss: 18.398342
(Iteration 6341 / 50000) loss: 20.706862
(Iteration 6351 / 50000) loss: 21.328848
(Iteration 6361 / 50000) loss: 17.883774
(Iteration 6371 / 50000) loss: 18.835936
(Iteration 6381 / 50000) loss: 19.785980
(Iteration 6391 / 50000) loss: 21.251250
(Iteration 6401 / 50000) loss: 20.523335
(Iteration 6411 / 50000) loss: 20.116770
(Iteration 6421 / 50000) loss: 20.002601
(Iteration 6431 / 50000) loss: 20.165428
(Iteration 6441 / 50000) loss: 20.511777
(Iteration 6451 / 50000) loss: 18.339521
(Iteration 6461 / 50000) loss: 20.356371
(Iteration 6471 / 50000) loss: 19.870485
(Iteration 6481 / 50000) loss: 20.009747
(Iteration 6491 / 50000) loss: 18.328134
(Iteration 6501 / 50000) loss: 19.456700
(Iteration 6511 / 50000) loss: 21.337666
(Iteration 6521 / 50000) loss: 19.458789
(Iteration 6531 / 50000) loss: 18.385450
(Iteration 6541 / 50000) loss: 20.027317
(Iteration 6551 / 50000) loss: 18.606620
(Iteration 6561 / 50000) loss: 17.697981
(Iteration 6571 / 50000) loss: 18.570554
(Iteration 6581 / 50000) loss: 19.005362
(Iteration 6591 / 50000) loss: 19.332094
(Iteration 6601 / 50000) loss: 21.319113
(Iteration 6611 / 50000) loss: 20.539046
(Iteration 6621 / 50000) loss: 19.211269
(Iteration 6631 / 50000) loss: 19.568886
(Iteration 6641 / 50000) loss: 19.442566
(Iteration 6651 / 50000) loss: 21.482987
(Iteration 6661 / 50000) loss: 19.846344
(Iteration 6671 / 50000) loss: 18.707803
(Iteration 6681 / 50000) loss: 20.110428
(Iteration 6691 / 50000) loss: 19.791817
(Iteration 6701 / 50000) loss: 17.631434
(Iteration 6711 / 50000) loss: 18.500482
(Iteration 6721 / 50000) loss: 18.915803
(Iteration 6731 / 50000) loss: 17.382782
(Iteration 6741 / 50000) loss: 19.619677
(Iteration 6751 / 50000) loss: 18.308075
(Iteration 6761 / 50000) loss: 19.669702
(Iteration 6771 / 50000) loss: 19.782305
(Iteration 6781 / 50000) loss: 19.449764
(Iteration 6791 / 50000) loss: 18.028271
(Iteration 6801 / 50000) loss: 18.950048
(Iteration 6811 / 50000) loss: 20.656126
(Iteration 6821 / 50000) loss: 18.984901
(Iteration 6831 / 50000) loss: 18.945427
(Iteration 6841 / 50000) loss: 18.366352
(Iteration 6851 / 50000) loss: 19.100345
(Iteration 6861 / 50000) loss: 19.338871
(Iteration 6871 / 50000) loss: 19.592446
(Iteration 6881 / 50000) loss: 19.448646
(Iteration 6891 / 50000) loss: 19.620853
(Iteration 6901 / 50000) loss: 18.676299
(Iteration 6911 / 50000) loss: 20.952203
(Iteration 6921 / 50000) loss: 22.775919
(Iteration 6931 / 50000) loss: 18.405753
(Iteration 6941 / 50000) loss: 18.079386
(Iteration 6951 / 50000) loss: 19.684543
(Iteration 6961 / 50000) loss: 19.829524
(Iteration 6971 / 50000) loss: 21.881164
(Iteration 6981 / 50000) loss: 17.532832
(Iteration 6991 / 50000) loss: 17.459927
(Iteration 7001 / 50000) loss: 17.958956
(Iteration 7011 / 50000) loss: 18.919425
(Iteration 7021 / 50000) loss: 18.679336
(Iteration 7031 / 50000) loss: 17.643844
(Iteration 7041 / 50000) loss: 20.107942
(Iteration 7051 / 50000) loss: 17.663355
(Iteration 7061 / 50000) loss: 20.203215
(Iteration 7071 / 50000) loss: 19.315467
(Iteration 7081 / 50000) loss: 17.935547
(Iteration 7091 / 50000) loss: 19.505259
(Iteration 7101 / 50000) loss: 19.591113
(Iteration 7111 / 50000) loss: 18.313607
(Iteration 7121 / 50000) loss: 19.502872
(Iteration 7131 / 50000) loss: 21.825251
(Iteration 7141 / 50000) loss: 21.121488
(Iteration 7151 / 50000) loss: 19.273408
(Iteration 7161 / 50000) loss: 18.573612
(Iteration 7171 / 50000) loss: 19.776891
(Iteration 7181 / 50000) loss: 20.941724
(Iteration 7191 / 50000) loss: 20.451837
(Iteration 7201 / 50000) loss: 19.499344
(Iteration 7211 / 50000) loss: 18.207035
(Iteration 7221 / 50000) loss: 17.530173
(Iteration 7231 / 50000) loss: 18.280858
(Iteration 7241 / 50000) loss: 18.946820
(Iteration 7251 / 50000) loss: 18.655773
(Iteration 7261 / 50000) loss: 20.152376
(Iteration 7271 / 50000) loss: 20.156445
(Iteration 7281 / 50000) loss: 19.400799
(Iteration 7291 / 50000) loss: 19.720933
(Iteration 7301 / 50000) loss: 19.642111
(Iteration 7311 / 50000) loss: 17.764900
(Iteration 7321 / 50000) loss: 17.939227
(Iteration 7331 / 50000) loss: 19.567266
(Iteration 7341 / 50000) loss: 21.254086
(Iteration 7351 / 50000) loss: 17.944472
(Iteration 7361 / 50000) loss: 17.912508
(Iteration 7371 / 50000) loss: 19.432748
(Iteration 7381 / 50000) loss: 19.820085
(Iteration 7391 / 50000) loss: 18.794662
(Iteration 7401 / 50000) loss: 19.292708
(Iteration 7411 / 50000) loss: 17.743046
(Iteration 7421 / 50000) loss: 18.779725
(Iteration 7431 / 50000) loss: 17.285560
(Iteration 7441 / 50000) loss: 19.763444
(Iteration 7451 / 50000) loss: 18.147894
(Iteration 7461 / 50000) loss: 20.235662
(Iteration 7471 / 50000) loss: 18.181342
(Iteration 7481 / 50000) loss: 18.766445
(Iteration 7491 / 50000) loss: 17.565340
(Iteration 7501 / 50000) loss: 16.960034
(Iteration 7511 / 50000) loss: 16.101397
(Iteration 7521 / 50000) loss: 16.729498
(Iteration 7531 / 50000) loss: 18.678913
(Iteration 7541 / 50000) loss: 17.928990
(Iteration 7551 / 50000) loss: 20.151434
(Iteration 7561 / 50000) loss: 17.705622
(Iteration 7571 / 50000) loss: 16.902790
(Iteration 7581 / 50000) loss: 18.067885
(Iteration 7591 / 50000) loss: 16.952416
(Iteration 7601 / 50000) loss: 18.492926
(Iteration 7611 / 50000) loss: 18.398710
(Iteration 7621 / 50000) loss: 19.033628
(Iteration 7631 / 50000) loss: 17.947896
(Iteration 7641 / 50000) loss: 19.537541
(Iteration 7651 / 50000) loss: 17.358078
(Iteration 7661 / 50000) loss: 18.947505
(Iteration 7671 / 50000) loss: 19.214473
(Iteration 7681 / 50000) loss: 18.417779
(Iteration 7691 / 50000) loss: 18.607622
(Iteration 7701 / 50000) loss: 19.507673
(Iteration 7711 / 50000) loss: 20.172791
(Iteration 7721 / 50000) loss: 18.599648
(Iteration 7731 / 50000) loss: 17.628087
(Iteration 7741 / 50000) loss: 18.244452
(Iteration 7751 / 50000) loss: 16.956513
(Iteration 7761 / 50000) loss: 17.161630
(Iteration 7771 / 50000) loss: 20.129473
(Iteration 7781 / 50000) loss: 18.816792
(Iteration 7791 / 50000) loss: 17.741674
(Iteration 7801 / 50000) loss: 17.977319
(Iteration 7811 / 50000) loss: 17.240084
(Iteration 7821 / 50000) loss: 17.525019
(Iteration 7831 / 50000) loss: 18.446755
(Iteration 7841 / 50000) loss: 19.122728
(Iteration 7851 / 50000) loss: 17.108321
(Iteration 7861 / 50000) loss: 18.727433
(Iteration 7871 / 50000) loss: 18.703713
(Iteration 7881 / 50000) loss: 18.070126
(Iteration 7891 / 50000) loss: 16.807122
(Iteration 7901 / 50000) loss: 17.695822
(Iteration 7911 / 50000) loss: 17.462358
(Iteration 7921 / 50000) loss: 18.063869
(Iteration 7931 / 50000) loss: 20.164772
(Iteration 7941 / 50000) loss: 17.588070
(Iteration 7951 / 50000) loss: 17.855903
(Iteration 7961 / 50000) loss: 19.258295
(Iteration 7971 / 50000) loss: 19.280157
(Iteration 7981 / 50000) loss: 17.134296
(Iteration 7991 / 50000) loss: 17.752265
(Iteration 8001 / 50000) loss: 17.073337
(Iteration 8011 / 50000) loss: 18.257042
(Iteration 8021 / 50000) loss: 16.541223
(Iteration 8031 / 50000) loss: 17.487458
(Iteration 8041 / 50000) loss: 17.221950
(Iteration 8051 / 50000) loss: 17.582704
(Iteration 8061 / 50000) loss: 17.643004
(Iteration 8071 / 50000) loss: 18.519407
(Iteration 8081 / 50000) loss: 18.897081
(Iteration 8091 / 50000) loss: 18.807134
(Iteration 8101 / 50000) loss: 18.431072
(Iteration 8111 / 50000) loss: 16.821774
(Iteration 8121 / 50000) loss: 18.024948
(Iteration 8131 / 50000) loss: 20.367276
(Iteration 8141 / 50000) loss: 16.537830
(Iteration 8151 / 50000) loss: 19.572197
(Iteration 8161 / 50000) loss: 18.282196
(Iteration 8171 / 50000) loss: 17.764949
(Iteration 8181 / 50000) loss: 18.778030
(Iteration 8191 / 50000) loss: 17.846059
(Iteration 8201 / 50000) loss: 19.745659
(Iteration 8211 / 50000) loss: 18.265191
(Iteration 8221 / 50000) loss: 16.120530
(Iteration 8231 / 50000) loss: 18.011215
(Iteration 8241 / 50000) loss: 16.415331
(Iteration 8251 / 50000) loss: 18.234777
(Iteration 8261 / 50000) loss: 17.668741
(Iteration 8271 / 50000) loss: 18.099680
(Iteration 8281 / 50000) loss: 17.807742
(Iteration 8291 / 50000) loss: 18.656129
(Iteration 8301 / 50000) loss: 17.626587
(Iteration 8311 / 50000) loss: 18.404431
(Iteration 8321 / 50000) loss: 17.964798
(Iteration 8331 / 50000) loss: 18.289032
(Iteration 8341 / 50000) loss: 16.652279
(Iteration 8351 / 50000) loss: 18.262767
(Iteration 8361 / 50000) loss: 16.647480
(Iteration 8371 / 50000) loss: 18.246995
(Iteration 8381 / 50000) loss: 17.697705
(Iteration 8391 / 50000) loss: 17.177185
(Iteration 8401 / 50000) loss: 19.622441
(Iteration 8411 / 50000) loss: 18.281019
(Iteration 8421 / 50000) loss: 18.643746
(Iteration 8431 / 50000) loss: 16.699289
(Iteration 8441 / 50000) loss: 18.197147
(Iteration 8451 / 50000) loss: 17.164112
(Iteration 8461 / 50000) loss: 17.428307
(Iteration 8471 / 50000) loss: 19.780781
(Iteration 8481 / 50000) loss: 18.901711
(Iteration 8491 / 50000) loss: 15.845885
(Iteration 8501 / 50000) loss: 16.470592
(Iteration 8511 / 50000) loss: 16.991765
(Iteration 8521 / 50000) loss: 16.707910
(Iteration 8531 / 50000) loss: 15.673759
(Iteration 8541 / 50000) loss: 17.710295
(Iteration 8551 / 50000) loss: 16.368827
(Iteration 8561 / 50000) loss: 18.000798
(Iteration 8571 / 50000) loss: 17.910926
(Iteration 8581 / 50000) loss: 17.567604
(Iteration 8591 / 50000) loss: 18.788112
(Iteration 8601 / 50000) loss: 17.168847
(Iteration 8611 / 50000) loss: 15.332856
(Iteration 8621 / 50000) loss: 16.727200
(Iteration 8631 / 50000) loss: 18.131948
(Iteration 8641 / 50000) loss: 18.054578
(Iteration 8651 / 50000) loss: 17.522945
(Iteration 8661 / 50000) loss: 16.682996
(Iteration 8671 / 50000) loss: 16.815948
(Iteration 8681 / 50000) loss: 19.062542
(Iteration 8691 / 50000) loss: 16.807473
(Iteration 8701 / 50000) loss: 18.524285
(Iteration 8711 / 50000) loss: 17.118205
(Iteration 8721 / 50000) loss: 18.898090
(Iteration 8731 / 50000) loss: 16.681473
(Iteration 8741 / 50000) loss: 19.239302
(Iteration 8751 / 50000) loss: 16.805546
(Iteration 8761 / 50000) loss: 16.782028
(Iteration 8771 / 50000) loss: 17.302671
(Iteration 8781 / 50000) loss: 18.520628
(Iteration 8791 / 50000) loss: 16.597888
(Iteration 8801 / 50000) loss: 15.966570
(Iteration 8811 / 50000) loss: 18.676601
(Iteration 8821 / 50000) loss: 18.165537
(Iteration 8831 / 50000) loss: 19.003924
(Iteration 8841 / 50000) loss: 17.915025
(Iteration 8851 / 50000) loss: 18.186735
(Iteration 8861 / 50000) loss: 17.317318
(Iteration 8871 / 50000) loss: 16.307261
(Iteration 8881 / 50000) loss: 16.256647
(Iteration 8891 / 50000) loss: 16.212746
(Iteration 8901 / 50000) loss: 17.557186
(Iteration 8911 / 50000) loss: 18.376219
(Iteration 8921 / 50000) loss: 18.213310
(Iteration 8931 / 50000) loss: 17.545227
(Iteration 8941 / 50000) loss: 19.426384
(Iteration 8951 / 50000) loss: 18.714962
(Iteration 8961 / 50000) loss: 17.114706
(Iteration 8971 / 50000) loss: 17.509216
(Iteration 8981 / 50000) loss: 16.215570
(Iteration 8991 / 50000) loss: 17.375526
(Iteration 9001 / 50000) loss: 17.193062
(Iteration 9011 / 50000) loss: 17.680504
(Iteration 9021 / 50000) loss: 16.667845
(Iteration 9031 / 50000) loss: 17.521360
(Iteration 9041 / 50000) loss: 16.159589
(Iteration 9051 / 50000) loss: 16.249272
(Iteration 9061 / 50000) loss: 16.347158
(Iteration 9071 / 50000) loss: 16.803838
(Iteration 9081 / 50000) loss: 17.742362
(Iteration 9091 / 50000) loss: 16.693335
(Iteration 9101 / 50000) loss: 16.581633
(Iteration 9111 / 50000) loss: 16.434157
(Iteration 9121 / 50000) loss: 15.957248
(Iteration 9131 / 50000) loss: 19.755144
(Iteration 9141 / 50000) loss: 16.581782
(Iteration 9151 / 50000) loss: 17.155394
(Iteration 9161 / 50000) loss: 16.861450
(Iteration 9171 / 50000) loss: 17.448032
(Iteration 9181 / 50000) loss: 18.669689
(Iteration 9191 / 50000) loss: 15.555337
(Iteration 9201 / 50000) loss: 17.814621
(Iteration 9211 / 50000) loss: 15.551170
(Iteration 9221 / 50000) loss: 17.196796
(Iteration 9231 / 50000) loss: 16.880089
(Iteration 9241 / 50000) loss: 17.402382
(Iteration 9251 / 50000) loss: 16.506217
(Iteration 9261 / 50000) loss: 17.058196
(Iteration 9271 / 50000) loss: 16.353933
(Iteration 9281 / 50000) loss: 16.676791
(Iteration 9291 / 50000) loss: 15.673351
(Iteration 9301 / 50000) loss: 16.455683
(Iteration 9311 / 50000) loss: 15.978762
(Iteration 9321 / 50000) loss: 15.406118
(Iteration 9331 / 50000) loss: 19.031682
(Iteration 9341 / 50000) loss: 15.470888
(Iteration 9351 / 50000) loss: 15.188175
(Iteration 9361 / 50000) loss: 15.666013
(Iteration 9371 / 50000) loss: 15.762611
(Iteration 9381 / 50000) loss: 15.480892
(Iteration 9391 / 50000) loss: 18.041210
(Iteration 9401 / 50000) loss: 16.653803
(Iteration 9411 / 50000) loss: 17.230485
(Iteration 9421 / 50000) loss: 16.706672
(Iteration 9431 / 50000) loss: 16.488942
(Iteration 9441 / 50000) loss: 17.110530
(Iteration 9451 / 50000) loss: 14.856356
(Iteration 9461 / 50000) loss: 17.685101
(Iteration 9471 / 50000) loss: 17.540546
(Iteration 9481 / 50000) loss: 15.068992
(Iteration 9491 / 50000) loss: 15.609837
(Iteration 9501 / 50000) loss: 17.711725
(Iteration 9511 / 50000) loss: 16.795534
(Iteration 9521 / 50000) loss: 15.188682
(Iteration 9531 / 50000) loss: 17.370098
(Iteration 9541 / 50000) loss: 17.118295
(Iteration 9551 / 50000) loss: 17.052448
(Iteration 9561 / 50000) loss: 16.586125
(Iteration 9571 / 50000) loss: 18.598129
(Iteration 9581 / 50000) loss: 17.077304
(Iteration 9591 / 50000) loss: 17.526231
(Iteration 9601 / 50000) loss: 16.165526
(Iteration 9611 / 50000) loss: 17.217744
(Iteration 9621 / 50000) loss: 16.699322
(Iteration 9631 / 50000) loss: 16.937043
(Iteration 9641 / 50000) loss: 15.696933
(Iteration 9651 / 50000) loss: 15.594095
(Iteration 9661 / 50000) loss: 17.344318
(Iteration 9671 / 50000) loss: 16.198189
(Iteration 9681 / 50000) loss: 17.908091
(Iteration 9691 / 50000) loss: 17.400884
(Iteration 9701 / 50000) loss: 15.060644
(Iteration 9711 / 50000) loss: 15.586394
(Iteration 9721 / 50000) loss: 16.516205
(Iteration 9731 / 50000) loss: 17.056010
(Iteration 9741 / 50000) loss: 16.078699
(Iteration 9751 / 50000) loss: 18.157994
(Iteration 9761 / 50000) loss: 16.467600
(Iteration 9771 / 50000) loss: 17.049044
(Iteration 9781 / 50000) loss: 16.618214
(Iteration 9791 / 50000) loss: 18.521259
(Iteration 9801 / 50000) loss: 16.404452
(Iteration 9811 / 50000) loss: 15.168632
(Iteration 9821 / 50000) loss: 14.881513
(Iteration 9831 / 50000) loss: 18.009426
(Iteration 9841 / 50000) loss: 17.957841
(Iteration 9851 / 50000) loss: 16.313223
(Iteration 9861 / 50000) loss: 15.852134
(Iteration 9871 / 50000) loss: 16.870043
(Iteration 9881 / 50000) loss: 15.732664
(Iteration 9891 / 50000) loss: 17.831941
(Iteration 9901 / 50000) loss: 15.538980
(Iteration 9911 / 50000) loss: 16.379067
(Iteration 9921 / 50000) loss: 17.689081
(Iteration 9931 / 50000) loss: 17.150266
(Iteration 9941 / 50000) loss: 16.538504
(Iteration 9951 / 50000) loss: 14.868312
(Iteration 9961 / 50000) loss: 16.902443
(Iteration 9971 / 50000) loss: 15.325823
(Iteration 9981 / 50000) loss: 16.625908
(Iteration 9991 / 50000) loss: 15.614556
(Iteration 10001 / 50000) loss: 17.236104
(Iteration 10011 / 50000) loss: 17.549364
(Iteration 10021 / 50000) loss: 14.630394
(Iteration 10031 / 50000) loss: 16.469599
(Iteration 10041 / 50000) loss: 14.485995
(Iteration 10051 / 50000) loss: 16.944418
(Iteration 10061 / 50000) loss: 14.875517
(Iteration 10071 / 50000) loss: 16.319228
(Iteration 10081 / 50000) loss: 15.411546
(Iteration 10091 / 50000) loss: 17.135337
(Iteration 10101 / 50000) loss: 15.598701
(Iteration 10111 / 50000) loss: 17.581875
(Iteration 10121 / 50000) loss: 15.563860
(Iteration 10131 / 50000) loss: 15.966700
(Iteration 10141 / 50000) loss: 14.733457
(Iteration 10151 / 50000) loss: 14.110269
(Iteration 10161 / 50000) loss: 15.934618
(Iteration 10171 / 50000) loss: 15.086526
(Iteration 10181 / 50000) loss: 13.838178
(Iteration 10191 / 50000) loss: 17.008396
(Iteration 10201 / 50000) loss: 16.767724
(Iteration 10211 / 50000) loss: 18.026935
(Iteration 10221 / 50000) loss: 17.186840
(Iteration 10231 / 50000) loss: 16.662008
(Iteration 10241 / 50000) loss: 15.460845
(Iteration 10251 / 50000) loss: 14.272533
(Iteration 10261 / 50000) loss: 15.478552
(Iteration 10271 / 50000) loss: 15.593291
(Iteration 10281 / 50000) loss: 14.137836
(Iteration 10291 / 50000) loss: 16.641457
(Iteration 10301 / 50000) loss: 13.862440
(Iteration 10311 / 50000) loss: 16.420695
(Iteration 10321 / 50000) loss: 16.151679
(Iteration 10331 / 50000) loss: 15.369138
(Iteration 10341 / 50000) loss: 15.024248
(Iteration 10351 / 50000) loss: 13.723435
(Iteration 10361 / 50000) loss: 16.858463
(Iteration 10371 / 50000) loss: 14.083611
(Iteration 10381 / 50000) loss: 16.737886
(Iteration 10391 / 50000) loss: 16.345412
(Iteration 10401 / 50000) loss: 15.978486
(Iteration 10411 / 50000) loss: 15.786510
(Iteration 10421 / 50000) loss: 16.295959
(Iteration 10431 / 50000) loss: 18.019885
(Iteration 10441 / 50000) loss: 14.678747
(Iteration 10451 / 50000) loss: 15.033670
(Iteration 10461 / 50000) loss: 16.961459
(Iteration 10471 / 50000) loss: 16.227661
(Iteration 10481 / 50000) loss: 15.606098
(Iteration 10491 / 50000) loss: 16.282136
(Iteration 10501 / 50000) loss: 14.492209
(Iteration 10511 / 50000) loss: 13.798812
(Iteration 10521 / 50000) loss: 14.586155
(Iteration 10531 / 50000) loss: 15.055263
(Iteration 10541 / 50000) loss: 14.595962
(Iteration 10551 / 50000) loss: 17.024849
(Iteration 10561 / 50000) loss: 15.485250
(Iteration 10571 / 50000) loss: 15.409110
(Iteration 10581 / 50000) loss: 15.350903
(Iteration 10591 / 50000) loss: 14.970876
(Iteration 10601 / 50000) loss: 17.158490
(Iteration 10611 / 50000) loss: 14.442553
(Iteration 10621 / 50000) loss: 15.646996
(Iteration 10631 / 50000) loss: 16.005507
(Iteration 10641 / 50000) loss: 16.157485
(Iteration 10651 / 50000) loss: 15.761939
(Iteration 10661 / 50000) loss: 15.616799
(Iteration 10671 / 50000) loss: 15.367503
(Iteration 10681 / 50000) loss: 14.678549
(Iteration 10691 / 50000) loss: 15.845859
(Iteration 10701 / 50000) loss: 16.210914
(Iteration 10711 / 50000) loss: 16.420200
(Iteration 10721 / 50000) loss: 15.234358
(Iteration 10731 / 50000) loss: 14.572684
(Iteration 10741 / 50000) loss: 16.480985
(Iteration 10751 / 50000) loss: 15.535439
(Iteration 10761 / 50000) loss: 15.016944
(Iteration 10771 / 50000) loss: 15.872384
(Iteration 10781 / 50000) loss: 15.698025
(Iteration 10791 / 50000) loss: 15.437026
(Iteration 10801 / 50000) loss: 15.682360
(Iteration 10811 / 50000) loss: 14.874140
(Iteration 10821 / 50000) loss: 14.844195
(Iteration 10831 / 50000) loss: 14.850108
(Iteration 10841 / 50000) loss: 16.153551
(Iteration 10851 / 50000) loss: 15.347772
(Iteration 10861 / 50000) loss: 16.355364
(Iteration 10871 / 50000) loss: 15.664333
(Iteration 10881 / 50000) loss: 14.422206
(Iteration 10891 / 50000) loss: 15.844061
(Iteration 10901 / 50000) loss: 16.394600
(Iteration 10911 / 50000) loss: 14.173170
(Iteration 10921 / 50000) loss: 14.674755
(Iteration 10931 / 50000) loss: 16.237118
(Iteration 10941 / 50000) loss: 13.446815
(Iteration 10951 / 50000) loss: 14.760999
(Iteration 10961 / 50000) loss: 16.052395
(Iteration 10971 / 50000) loss: 13.315091
(Iteration 10981 / 50000) loss: 15.641616
(Iteration 10991 / 50000) loss: 14.798618
(Iteration 11001 / 50000) loss: 15.182279
(Iteration 11011 / 50000) loss: 15.610517
(Iteration 11021 / 50000) loss: 14.978175
(Iteration 11031 / 50000) loss: 15.507785
(Iteration 11041 / 50000) loss: 14.801220
(Iteration 11051 / 50000) loss: 15.603042
(Iteration 11061 / 50000) loss: 14.437139
(Iteration 11071 / 50000) loss: 14.813405
(Iteration 11081 / 50000) loss: 16.710091
(Iteration 11091 / 50000) loss: 15.573694
(Iteration 11101 / 50000) loss: 15.531077
(Iteration 11111 / 50000) loss: 17.278361
(Iteration 11121 / 50000) loss: 13.684040
(Iteration 11131 / 50000) loss: 13.844369
(Iteration 11141 / 50000) loss: 14.629847
(Iteration 11151 / 50000) loss: 14.246002
(Iteration 11161 / 50000) loss: 14.374112
(Iteration 11171 / 50000) loss: 13.978127
(Iteration 11181 / 50000) loss: 14.658940
(Iteration 11191 / 50000) loss: 15.330353
(Iteration 11201 / 50000) loss: 15.101047
(Iteration 11211 / 50000) loss: 14.947263
(Iteration 11221 / 50000) loss: 15.923002
(Iteration 11231 / 50000) loss: 14.095825
(Iteration 11241 / 50000) loss: 15.484590
(Iteration 11251 / 50000) loss: 14.141835
(Iteration 11261 / 50000) loss: 14.040744
(Iteration 11271 / 50000) loss: 16.343761
(Iteration 11281 / 50000) loss: 13.464931
(Iteration 11291 / 50000) loss: 15.564428
(Iteration 11301 / 50000) loss: 14.949871
(Iteration 11311 / 50000) loss: 14.428300
(Iteration 11321 / 50000) loss: 14.967848
(Iteration 11331 / 50000) loss: 16.112642
(Iteration 11341 / 50000) loss: 14.685092
(Iteration 11351 / 50000) loss: 15.511693
(Iteration 11361 / 50000) loss: 14.167699
(Iteration 11371 / 50000) loss: 15.223804
(Iteration 11381 / 50000) loss: 17.617971
(Iteration 11391 / 50000) loss: 15.127659
(Iteration 11401 / 50000) loss: 15.242616
(Iteration 11411 / 50000) loss: 15.752058
(Iteration 11421 / 50000) loss: 13.081453
(Iteration 11431 / 50000) loss: 14.677169
(Iteration 11441 / 50000) loss: 13.781921
(Iteration 11451 / 50000) loss: 15.433597
(Iteration 11461 / 50000) loss: 14.249955
(Iteration 11471 / 50000) loss: 15.388795
(Iteration 11481 / 50000) loss: 14.934442
(Iteration 11491 / 50000) loss: 15.541721
(Iteration 11501 / 50000) loss: 16.720043
(Iteration 11511 / 50000) loss: 15.003232
(Iteration 11521 / 50000) loss: 15.303973
(Iteration 11531 / 50000) loss: 14.728652
(Iteration 11541 / 50000) loss: 14.612109
(Iteration 11551 / 50000) loss: 13.953819
(Iteration 11561 / 50000) loss: 15.084764
(Iteration 11571 / 50000) loss: 14.017291
(Iteration 11581 / 50000) loss: 14.977150
(Iteration 11591 / 50000) loss: 16.553509
(Iteration 11601 / 50000) loss: 15.488044
(Iteration 11611 / 50000) loss: 14.092208
(Iteration 11621 / 50000) loss: 15.267354
(Iteration 11631 / 50000) loss: 15.036526
(Iteration 11641 / 50000) loss: 14.736187
(Iteration 11651 / 50000) loss: 13.155355
(Iteration 11661 / 50000) loss: 14.740794
(Iteration 11671 / 50000) loss: 14.001583
(Iteration 11681 / 50000) loss: 13.578831
(Iteration 11691 / 50000) loss: 14.802255
(Iteration 11701 / 50000) loss: 13.783968
(Iteration 11711 / 50000) loss: 14.955258
(Iteration 11721 / 50000) loss: 14.077916
(Iteration 11731 / 50000) loss: 15.801541
(Iteration 11741 / 50000) loss: 15.543674
(Iteration 11751 / 50000) loss: 14.923980
(Iteration 11761 / 50000) loss: 13.955608
(Iteration 11771 / 50000) loss: 16.656646
(Iteration 11781 / 50000) loss: 15.405755
(Iteration 11791 / 50000) loss: 13.784215
(Iteration 11801 / 50000) loss: 13.909164
(Iteration 11811 / 50000) loss: 15.599827
(Iteration 11821 / 50000) loss: 14.555124
(Iteration 11831 / 50000) loss: 14.557012
(Iteration 11841 / 50000) loss: 14.094957
(Iteration 11851 / 50000) loss: 15.190572
(Iteration 11861 / 50000) loss: 15.393720
(Iteration 11871 / 50000) loss: 14.522871
(Iteration 11881 / 50000) loss: 14.113457
(Iteration 11891 / 50000) loss: 14.023989
(Iteration 11901 / 50000) loss: 13.145218
(Iteration 11911 / 50000) loss: 15.796865
(Iteration 11921 / 50000) loss: 14.626319
(Iteration 11931 / 50000) loss: 14.665143
(Iteration 11941 / 50000) loss: 14.413752
(Iteration 11951 / 50000) loss: 14.278458
(Iteration 11961 / 50000) loss: 15.712148
(Iteration 11971 / 50000) loss: 14.916899
(Iteration 11981 / 50000) loss: 16.541285
(Iteration 11991 / 50000) loss: 15.929810
(Iteration 12001 / 50000) loss: 14.108012
(Iteration 12011 / 50000) loss: 14.743943
(Iteration 12021 / 50000) loss: 14.616023
(Iteration 12031 / 50000) loss: 14.018671
(Iteration 12041 / 50000) loss: 14.798096
(Iteration 12051 / 50000) loss: 15.279035
(Iteration 12061 / 50000) loss: 14.968731
(Iteration 12071 / 50000) loss: 15.220018
(Iteration 12081 / 50000) loss: 14.647342
(Iteration 12091 / 50000) loss: 14.656200
(Iteration 12101 / 50000) loss: 14.258652
(Iteration 12111 / 50000) loss: 15.312849
(Iteration 12121 / 50000) loss: 13.647113
(Iteration 12131 / 50000) loss: 14.121212
(Iteration 12141 / 50000) loss: 14.376576
(Iteration 12151 / 50000) loss: 12.464075
(Iteration 12161 / 50000) loss: 14.696920
(Iteration 12171 / 50000) loss: 15.696884
(Iteration 12181 / 50000) loss: 13.647912
(Iteration 12191 / 50000) loss: 16.797501
(Iteration 12201 / 50000) loss: 14.585297
(Iteration 12211 / 50000) loss: 14.309058
(Iteration 12221 / 50000) loss: 14.740819
(Iteration 12231 / 50000) loss: 14.021078
(Iteration 12241 / 50000) loss: 13.441066
(Iteration 12251 / 50000) loss: 12.998351
(Iteration 12261 / 50000) loss: 13.092440
(Iteration 12271 / 50000) loss: 14.122700
(Iteration 12281 / 50000) loss: 13.710731
(Iteration 12291 / 50000) loss: 14.120879
(Iteration 12301 / 50000) loss: 14.593388
(Iteration 12311 / 50000) loss: 13.713633
(Iteration 12321 / 50000) loss: 15.599760
(Iteration 12331 / 50000) loss: 14.361754
(Iteration 12341 / 50000) loss: 13.199930
(Iteration 12351 / 50000) loss: 14.190560
(Iteration 12361 / 50000) loss: 13.736939
(Iteration 12371 / 50000) loss: 14.540184
(Iteration 12381 / 50000) loss: 13.839320
(Iteration 12391 / 50000) loss: 14.597070
(Iteration 12401 / 50000) loss: 14.932490
(Iteration 12411 / 50000) loss: 14.595059
(Iteration 12421 / 50000) loss: 15.054561
(Iteration 12431 / 50000) loss: 13.465855
(Iteration 12441 / 50000) loss: 12.539556
(Iteration 12451 / 50000) loss: 13.814096
(Iteration 12461 / 50000) loss: 14.961584
(Iteration 12471 / 50000) loss: 13.536850
(Iteration 12481 / 50000) loss: 12.875490
(Iteration 12491 / 50000) loss: 12.199841
(Iteration 12501 / 50000) loss: 13.787848
(Iteration 12511 / 50000) loss: 15.434793
(Iteration 12521 / 50000) loss: 14.486527
(Iteration 12531 / 50000) loss: 13.440075
(Iteration 12541 / 50000) loss: 13.258948
(Iteration 12551 / 50000) loss: 14.683145
(Iteration 12561 / 50000) loss: 14.528728
(Iteration 12571 / 50000) loss: 13.944834
(Iteration 12581 / 50000) loss: 13.998560
(Iteration 12591 / 50000) loss: 14.712468
(Iteration 12601 / 50000) loss: 12.604679
(Iteration 12611 / 50000) loss: 13.642901
(Iteration 12621 / 50000) loss: 13.692284
(Iteration 12631 / 50000) loss: 14.856894
(Iteration 12641 / 50000) loss: 13.603528
(Iteration 12651 / 50000) loss: 14.889061
(Iteration 12661 / 50000) loss: 13.709754
(Iteration 12671 / 50000) loss: 13.964045
(Iteration 12681 / 50000) loss: 13.511804
(Iteration 12691 / 50000) loss: 15.764029
(Iteration 12701 / 50000) loss: 14.894100
(Iteration 12711 / 50000) loss: 14.313630
(Iteration 12721 / 50000) loss: 11.998873
(Iteration 12731 / 50000) loss: 13.293908
(Iteration 12741 / 50000) loss: 14.314851
(Iteration 12751 / 50000) loss: 12.307314
(Iteration 12761 / 50000) loss: 12.814419
(Iteration 12771 / 50000) loss: 15.048711
(Iteration 12781 / 50000) loss: 13.724506
(Iteration 12791 / 50000) loss: 12.824282
(Iteration 12801 / 50000) loss: 13.814577
(Iteration 12811 / 50000) loss: 14.249308
(Iteration 12821 / 50000) loss: 12.720390
(Iteration 12831 / 50000) loss: 14.117117
(Iteration 12841 / 50000) loss: 13.446039
(Iteration 12851 / 50000) loss: 14.224435
(Iteration 12861 / 50000) loss: 13.754643
(Iteration 12871 / 50000) loss: 14.277512
(Iteration 12881 / 50000) loss: 13.782169
(Iteration 12891 / 50000) loss: 13.128468
(Iteration 12901 / 50000) loss: 12.818985
(Iteration 12911 / 50000) loss: 13.924977
(Iteration 12921 / 50000) loss: 12.220798
(Iteration 12931 / 50000) loss: 12.105065
(Iteration 12941 / 50000) loss: 14.449642
(Iteration 12951 / 50000) loss: 16.470022
(Iteration 12961 / 50000) loss: 12.708723
(Iteration 12971 / 50000) loss: 13.762842
(Iteration 12981 / 50000) loss: 14.590795
(Iteration 12991 / 50000) loss: 14.735038
(Iteration 13001 / 50000) loss: 14.596201
(Iteration 13011 / 50000) loss: 12.048053
(Iteration 13021 / 50000) loss: 13.897778
(Iteration 13031 / 50000) loss: 14.413839
(Iteration 13041 / 50000) loss: 13.590829
(Iteration 13051 / 50000) loss: 14.558441
(Iteration 13061 / 50000) loss: 12.919196
(Iteration 13071 / 50000) loss: 13.285011
(Iteration 13081 / 50000) loss: 15.618654
(Iteration 13091 / 50000) loss: 14.005802
(Iteration 13101 / 50000) loss: 14.018620
(Iteration 13111 / 50000) loss: 13.290290
(Iteration 13121 / 50000) loss: 12.415906
(Iteration 13131 / 50000) loss: 15.117600
(Iteration 13141 / 50000) loss: 15.171156
(Iteration 13151 / 50000) loss: 13.993892
(Iteration 13161 / 50000) loss: 11.778832
(Iteration 13171 / 50000) loss: 12.965943
(Iteration 13181 / 50000) loss: 13.056946
(Iteration 13191 / 50000) loss: 13.685550
(Iteration 13201 / 50000) loss: 13.759870
(Iteration 13211 / 50000) loss: 12.928143
(Iteration 13221 / 50000) loss: 15.278584
(Iteration 13231 / 50000) loss: 11.866953
(Iteration 13241 / 50000) loss: 13.307105
(Iteration 13251 / 50000) loss: 14.290387
(Iteration 13261 / 50000) loss: 12.884797
(Iteration 13271 / 50000) loss: 13.986853
(Iteration 13281 / 50000) loss: 14.583555
(Iteration 13291 / 50000) loss: 13.250064
(Iteration 13301 / 50000) loss: 12.321050
(Iteration 13311 / 50000) loss: 14.220781
(Iteration 13321 / 50000) loss: 13.790051
(Iteration 13331 / 50000) loss: 13.770125
(Iteration 13341 / 50000) loss: 12.960272
(Iteration 13351 / 50000) loss: 13.347561
(Iteration 13361 / 50000) loss: 12.987386
(Iteration 13371 / 50000) loss: 12.625686
(Iteration 13381 / 50000) loss: 12.673961
(Iteration 13391 / 50000) loss: 13.340789
(Iteration 13401 / 50000) loss: 13.258921
(Iteration 13411 / 50000) loss: 12.177127
(Iteration 13421 / 50000) loss: 14.085374
(Iteration 13431 / 50000) loss: 13.496601
(Iteration 13441 / 50000) loss: 13.255104
(Iteration 13451 / 50000) loss: 15.147194
(Iteration 13461 / 50000) loss: 13.381636
(Iteration 13471 / 50000) loss: 14.105485
(Iteration 13481 / 50000) loss: 12.554404
(Iteration 13491 / 50000) loss: 12.337243
(Iteration 13501 / 50000) loss: 12.538850
(Iteration 13511 / 50000) loss: 12.863847
(Iteration 13521 / 50000) loss: 12.718885
(Iteration 13531 / 50000) loss: 15.019425
(Iteration 13541 / 50000) loss: 13.371602
(Iteration 13551 / 50000) loss: 14.032678
(Iteration 13561 / 50000) loss: 13.669990
(Iteration 13571 / 50000) loss: 13.260386
(Iteration 13581 / 50000) loss: 12.937115
(Iteration 13591 / 50000) loss: 12.952862
(Iteration 13601 / 50000) loss: 12.862986
(Iteration 13611 / 50000) loss: 12.388838
(Iteration 13621 / 50000) loss: 14.562004
(Iteration 13631 / 50000) loss: 13.263308
(Iteration 13641 / 50000) loss: 14.131367
(Iteration 13651 / 50000) loss: 13.187985
(Iteration 13661 / 50000) loss: 12.685281
(Iteration 13671 / 50000) loss: 13.504192
(Iteration 13681 / 50000) loss: 13.487027
(Iteration 13691 / 50000) loss: 13.532350
(Iteration 13701 / 50000) loss: 13.937110
(Iteration 13711 / 50000) loss: 12.144014
(Iteration 13721 / 50000) loss: 12.071364
(Iteration 13731 / 50000) loss: 13.204165
(Iteration 13741 / 50000) loss: 13.628241
(Iteration 13751 / 50000) loss: 12.256045
(Iteration 13761 / 50000) loss: 13.384479
(Iteration 13771 / 50000) loss: 13.255942
(Iteration 13781 / 50000) loss: 12.982652
(Iteration 13791 / 50000) loss: 12.195056
(Iteration 13801 / 50000) loss: 13.281510
(Iteration 13811 / 50000) loss: 12.968755
(Iteration 13821 / 50000) loss: 14.400437
(Iteration 13831 / 50000) loss: 13.087384
(Iteration 13841 / 50000) loss: 13.113876
(Iteration 13851 / 50000) loss: 14.054735
(Iteration 13861 / 50000) loss: 12.696951
(Iteration 13871 / 50000) loss: 13.678173
(Iteration 13881 / 50000) loss: 13.636691
(Iteration 13891 / 50000) loss: 12.032802
(Iteration 13901 / 50000) loss: 13.009553
(Iteration 13911 / 50000) loss: 13.529807
(Iteration 13921 / 50000) loss: 13.773904
(Iteration 13931 / 50000) loss: 13.155205
(Iteration 13941 / 50000) loss: 14.053726
(Iteration 13951 / 50000) loss: 13.700384
(Iteration 13961 / 50000) loss: 12.318819
(Iteration 13971 / 50000) loss: 12.466796
(Iteration 13981 / 50000) loss: 13.404581
(Iteration 13991 / 50000) loss: 13.869467
(Iteration 14001 / 50000) loss: 12.998578
(Iteration 14011 / 50000) loss: 13.852586
(Iteration 14021 / 50000) loss: 13.816236
(Iteration 14031 / 50000) loss: 13.432375
(Iteration 14041 / 50000) loss: 12.646685
(Iteration 14051 / 50000) loss: 13.174180
(Iteration 14061 / 50000) loss: 12.434935
(Iteration 14071 / 50000) loss: 13.960615
(Iteration 14081 / 50000) loss: 11.837880
(Iteration 14091 / 50000) loss: 13.622893
(Iteration 14101 / 50000) loss: 12.027315
(Iteration 14111 / 50000) loss: 12.986002
(Iteration 14121 / 50000) loss: 12.523142
(Iteration 14131 / 50000) loss: 13.433128
(Iteration 14141 / 50000) loss: 13.959704
(Iteration 14151 / 50000) loss: 13.027380
(Iteration 14161 / 50000) loss: 13.316555
(Iteration 14171 / 50000) loss: 13.123387
(Iteration 14181 / 50000) loss: 12.809161
(Iteration 14191 / 50000) loss: 10.875267
(Iteration 14201 / 50000) loss: 13.682431
(Iteration 14211 / 50000) loss: 12.215289
(Iteration 14221 / 50000) loss: 11.647410
(Iteration 14231 / 50000) loss: 12.175926
(Iteration 14241 / 50000) loss: 13.361347
(Iteration 14251 / 50000) loss: 12.885480
(Iteration 14261 / 50000) loss: 12.341097
(Iteration 14271 / 50000) loss: 13.548989
(Iteration 14281 / 50000) loss: 13.684475
(Iteration 14291 / 50000) loss: 13.833680
(Iteration 14301 / 50000) loss: 12.594080
(Iteration 14311 / 50000) loss: 12.168182
(Iteration 14321 / 50000) loss: 11.849971
(Iteration 14331 / 50000) loss: 11.814404
(Iteration 14341 / 50000) loss: 12.929611
(Iteration 14351 / 50000) loss: 13.385174
(Iteration 14361 / 50000) loss: 13.068138
(Iteration 14371 / 50000) loss: 12.703210
(Iteration 14381 / 50000) loss: 12.937831
(Iteration 14391 / 50000) loss: 11.701643
(Iteration 14401 / 50000) loss: 12.592990
(Iteration 14411 / 50000) loss: 12.899636
(Iteration 14421 / 50000) loss: 13.007304
(Iteration 14431 / 50000) loss: 13.157535
(Iteration 14441 / 50000) loss: 11.862817
(Iteration 14451 / 50000) loss: 14.304760
(Iteration 14461 / 50000) loss: 12.371818
(Iteration 14471 / 50000) loss: 13.473682
(Iteration 14481 / 50000) loss: 12.229868
(Iteration 14491 / 50000) loss: 12.546769
(Iteration 14501 / 50000) loss: 13.931292
(Iteration 14511 / 50000) loss: 13.253244
(Iteration 14521 / 50000) loss: 13.121725
(Iteration 14531 / 50000) loss: 11.094180
(Iteration 14541 / 50000) loss: 11.887710
(Iteration 14551 / 50000) loss: 12.681082
(Iteration 14561 / 50000) loss: 11.838241
(Iteration 14571 / 50000) loss: 12.941057
(Iteration 14581 / 50000) loss: 13.451517
(Iteration 14591 / 50000) loss: 13.021491
(Iteration 14601 / 50000) loss: 12.895560
(Iteration 14611 / 50000) loss: 12.665048
(Iteration 14621 / 50000) loss: 12.232843
(Iteration 14631 / 50000) loss: 12.038119
(Iteration 14641 / 50000) loss: 12.484437
(Iteration 14651 / 50000) loss: 12.433969
(Iteration 14661 / 50000) loss: 13.028412
(Iteration 14671 / 50000) loss: 12.080096
(Iteration 14681 / 50000) loss: 12.311698
(Iteration 14691 / 50000) loss: 12.546569
(Iteration 14701 / 50000) loss: 14.113309
(Iteration 14711 / 50000) loss: 12.488484
(Iteration 14721 / 50000) loss: 12.186787
(Iteration 14731 / 50000) loss: 12.642658
(Iteration 14741 / 50000) loss: 12.404406
(Iteration 14751 / 50000) loss: 12.347173
(Iteration 14761 / 50000) loss: 12.444105
(Iteration 14771 / 50000) loss: 13.302134
(Iteration 14781 / 50000) loss: 13.010895
(Iteration 14791 / 50000) loss: 12.974823
(Iteration 14801 / 50000) loss: 11.829310
(Iteration 14811 / 50000) loss: 11.260037
(Iteration 14821 / 50000) loss: 12.596599
(Iteration 14831 / 50000) loss: 13.065502
(Iteration 14841 / 50000) loss: 13.409545
(Iteration 14851 / 50000) loss: 12.718419
(Iteration 14861 / 50000) loss: 14.324716
(Iteration 14871 / 50000) loss: 11.707780
(Iteration 14881 / 50000) loss: 12.027935
(Iteration 14891 / 50000) loss: 12.431318
(Iteration 14901 / 50000) loss: 12.291580
(Iteration 14911 / 50000) loss: 11.960058
(Iteration 14921 / 50000) loss: 12.088619
(Iteration 14931 / 50000) loss: 12.701819
(Iteration 14941 / 50000) loss: 12.096389
(Iteration 14951 / 50000) loss: 12.935023
(Iteration 14961 / 50000) loss: 12.455396
(Iteration 14971 / 50000) loss: 12.118695
(Iteration 14981 / 50000) loss: 12.263365
(Iteration 14991 / 50000) loss: 11.766314
(Iteration 15001 / 50000) loss: 13.555945
(Iteration 15011 / 50000) loss: 11.187251
(Iteration 15021 / 50000) loss: 12.524558
(Iteration 15031 / 50000) loss: 11.636717
(Iteration 15041 / 50000) loss: 12.953315
(Iteration 15051 / 50000) loss: 11.751568
(Iteration 15061 / 50000) loss: 13.889372
(Iteration 15071 / 50000) loss: 12.193191
(Iteration 15081 / 50000) loss: 10.477939
(Iteration 15091 / 50000) loss: 13.249164
(Iteration 15101 / 50000) loss: 12.576220
(Iteration 15111 / 50000) loss: 12.131267
(Iteration 15121 / 50000) loss: 12.450651
(Iteration 15131 / 50000) loss: 12.219051
(Iteration 15141 / 50000) loss: 11.082464
(Iteration 15151 / 50000) loss: 11.831227
(Iteration 15161 / 50000) loss: 12.946210
(Iteration 15171 / 50000) loss: 11.917077
(Iteration 15181 / 50000) loss: 12.222676
(Iteration 15191 / 50000) loss: 12.109309
(Iteration 15201 / 50000) loss: 11.651755
(Iteration 15211 / 50000) loss: 13.374680
(Iteration 15221 / 50000) loss: 11.293789
(Iteration 15231 / 50000) loss: 12.240737
(Iteration 15241 / 50000) loss: 11.724727
(Iteration 15251 / 50000) loss: 11.687255
(Iteration 15261 / 50000) loss: 12.555184
(Iteration 15271 / 50000) loss: 11.954406
(Iteration 15281 / 50000) loss: 11.278813
(Iteration 15291 / 50000) loss: 12.563500
(Iteration 15301 / 50000) loss: 11.653630
(Iteration 15311 / 50000) loss: 12.146879
(Iteration 15321 / 50000) loss: 12.798332
(Iteration 15331 / 50000) loss: 12.056937
(Iteration 15341 / 50000) loss: 12.759643
(Iteration 15351 / 50000) loss: 10.857654
(Iteration 15361 / 50000) loss: 11.936877
(Iteration 15371 / 50000) loss: 11.605721
(Iteration 15381 / 50000) loss: 10.577835
(Iteration 15391 / 50000) loss: 13.866457
(Iteration 15401 / 50000) loss: 11.324263
(Iteration 15411 / 50000) loss: 11.341328
(Iteration 15421 / 50000) loss: 12.195272
(Iteration 15431 / 50000) loss: 11.740646
(Iteration 15441 / 50000) loss: 11.837903
(Iteration 15451 / 50000) loss: 11.403308
(Iteration 15461 / 50000) loss: 11.611504
(Iteration 15471 / 50000) loss: 11.599004
(Iteration 15481 / 50000) loss: 13.251967
(Iteration 15491 / 50000) loss: 12.087936
(Iteration 15501 / 50000) loss: 12.477865
(Iteration 15511 / 50000) loss: 12.897188
(Iteration 15521 / 50000) loss: 11.616325
(Iteration 15531 / 50000) loss: 11.970239
(Iteration 15541 / 50000) loss: 13.110265
(Iteration 15551 / 50000) loss: 10.032874
(Iteration 15561 / 50000) loss: 12.048217
(Iteration 15571 / 50000) loss: 13.661204
(Iteration 15581 / 50000) loss: 12.378044
(Iteration 15591 / 50000) loss: 13.463606
(Iteration 15601 / 50000) loss: 11.567825
(Iteration 15611 / 50000) loss: 13.315365
(Iteration 15621 / 50000) loss: 11.886029
(Iteration 15631 / 50000) loss: 12.203477
(Iteration 15641 / 50000) loss: 12.675750
(Iteration 15651 / 50000) loss: 11.473643
(Iteration 15661 / 50000) loss: 11.899258
(Iteration 15671 / 50000) loss: 11.358312
(Iteration 15681 / 50000) loss: 11.700593
(Iteration 15691 / 50000) loss: 12.089305
(Iteration 15701 / 50000) loss: 12.311487
(Iteration 15711 / 50000) loss: 11.769994
(Iteration 15721 / 50000) loss: 11.822524
(Iteration 15731 / 50000) loss: 11.825413
(Iteration 15741 / 50000) loss: 12.134475
(Iteration 15751 / 50000) loss: 11.153241
(Iteration 15761 / 50000) loss: 11.764351
(Iteration 15771 / 50000) loss: 10.833557
(Iteration 15781 / 50000) loss: 11.776622
(Iteration 15791 / 50000) loss: 12.315434
(Iteration 15801 / 50000) loss: 11.474076
(Iteration 15811 / 50000) loss: 12.225371
(Iteration 15821 / 50000) loss: 12.024598
(Iteration 15831 / 50000) loss: 11.527045
(Iteration 15841 / 50000) loss: 11.369645
(Iteration 15851 / 50000) loss: 10.890027
(Iteration 15861 / 50000) loss: 11.307582
(Iteration 15871 / 50000) loss: 11.548539
(Iteration 15881 / 50000) loss: 12.477740
(Iteration 15891 / 50000) loss: 12.732750
(Iteration 15901 / 50000) loss: 11.482215
(Iteration 15911 / 50000) loss: 11.608670
(Iteration 15921 / 50000) loss: 12.782837
(Iteration 15931 / 50000) loss: 10.484980
(Iteration 15941 / 50000) loss: 12.283507
(Iteration 15951 / 50000) loss: 11.891543
(Iteration 15961 / 50000) loss: 12.033310
(Iteration 15971 / 50000) loss: 11.688822
(Iteration 15981 / 50000) loss: 11.388120
(Iteration 15991 / 50000) loss: 11.198258
(Iteration 16001 / 50000) loss: 12.843046
(Iteration 16011 / 50000) loss: 11.681976
(Iteration 16021 / 50000) loss: 11.622078
(Iteration 16031 / 50000) loss: 12.847421
(Iteration 16041 / 50000) loss: 10.802940
(Iteration 16051 / 50000) loss: 11.186993
(Iteration 16061 / 50000) loss: 12.435582
(Iteration 16071 / 50000) loss: 12.531185
(Iteration 16081 / 50000) loss: 11.205331
(Iteration 16091 / 50000) loss: 11.685692
(Iteration 16101 / 50000) loss: 10.587287
(Iteration 16111 / 50000) loss: 14.092516
(Iteration 16121 / 50000) loss: 12.023104
(Iteration 16131 / 50000) loss: 11.396548
(Iteration 16141 / 50000) loss: 11.315796
(Iteration 16151 / 50000) loss: 11.551324
(Iteration 16161 / 50000) loss: 11.917056
(Iteration 16171 / 50000) loss: 12.343707
(Iteration 16181 / 50000) loss: 11.319594
(Iteration 16191 / 50000) loss: 12.700071
(Iteration 16201 / 50000) loss: 11.871254
(Iteration 16211 / 50000) loss: 11.686509
(Iteration 16221 / 50000) loss: 10.915664
(Iteration 16231 / 50000) loss: 10.433999
(Iteration 16241 / 50000) loss: 11.929084
(Iteration 16251 / 50000) loss: 11.677173
(Iteration 16261 / 50000) loss: 12.464382
(Iteration 16271 / 50000) loss: 13.125372
(Iteration 16281 / 50000) loss: 11.932524
(Iteration 16291 / 50000) loss: 13.353013
(Iteration 16301 / 50000) loss: 11.513099
(Iteration 16311 / 50000) loss: 12.218660
(Iteration 16321 / 50000) loss: 12.441169
(Iteration 16331 / 50000) loss: 10.516090
(Iteration 16341 / 50000) loss: 11.981027
(Iteration 16351 / 50000) loss: 11.492979
(Iteration 16361 / 50000) loss: 12.092228
(Iteration 16371 / 50000) loss: 10.927425
(Iteration 16381 / 50000) loss: 12.121218
(Iteration 16391 / 50000) loss: 11.012550
(Iteration 16401 / 50000) loss: 10.793390
(Iteration 16411 / 50000) loss: 11.771472
(Iteration 16421 / 50000) loss: 10.854056
(Iteration 16431 / 50000) loss: 11.578832
(Iteration 16441 / 50000) loss: 10.967962
(Iteration 16451 / 50000) loss: 12.149971
(Iteration 16461 / 50000) loss: 11.593949
(Iteration 16471 / 50000) loss: 11.297883
(Iteration 16481 / 50000) loss: 12.196899
(Iteration 16491 / 50000) loss: 11.850568
(Iteration 16501 / 50000) loss: 12.209374
(Iteration 16511 / 50000) loss: 12.935850
(Iteration 16521 / 50000) loss: 10.612351
(Iteration 16531 / 50000) loss: 11.352786
(Iteration 16541 / 50000) loss: 12.435870
(Iteration 16551 / 50000) loss: 12.735647
(Iteration 16561 / 50000) loss: 11.330062
(Iteration 16571 / 50000) loss: 11.587207
(Iteration 16581 / 50000) loss: 11.373011
(Iteration 16591 / 50000) loss: 10.828225
(Iteration 16601 / 50000) loss: 11.283717
(Iteration 16611 / 50000) loss: 11.089741
(Iteration 16621 / 50000) loss: 12.170773
(Iteration 16631 / 50000) loss: 11.021169
(Iteration 16641 / 50000) loss: 10.549046
(Iteration 16651 / 50000) loss: 11.584072
(Iteration 16661 / 50000) loss: 13.338707
(Iteration 16671 / 50000) loss: 10.833974
(Iteration 16681 / 50000) loss: 11.009647
(Iteration 16691 / 50000) loss: 12.757693
(Iteration 16701 / 50000) loss: 11.770811
(Iteration 16711 / 50000) loss: 10.964763
(Iteration 16721 / 50000) loss: 11.098016
(Iteration 16731 / 50000) loss: 11.175961
(Iteration 16741 / 50000) loss: 11.562343
(Iteration 16751 / 50000) loss: 12.115902
(Iteration 16761 / 50000) loss: 11.619846
(Iteration 16771 / 50000) loss: 10.494006
(Iteration 16781 / 50000) loss: 10.918501
(Iteration 16791 / 50000) loss: 11.057513
(Iteration 16801 / 50000) loss: 11.732716
(Iteration 16811 / 50000) loss: 10.938954
(Iteration 16821 / 50000) loss: 10.678232
(Iteration 16831 / 50000) loss: 10.431115
(Iteration 16841 / 50000) loss: 10.760226
(Iteration 16851 / 50000) loss: 11.543540
(Iteration 16861 / 50000) loss: 10.608817
(Iteration 16871 / 50000) loss: 11.512707
(Iteration 16881 / 50000) loss: 12.005934
(Iteration 16891 / 50000) loss: 11.629787
(Iteration 16901 / 50000) loss: 11.787999
(Iteration 16911 / 50000) loss: 11.506127
(Iteration 16921 / 50000) loss: 10.906245
(Iteration 16931 / 50000) loss: 11.023993
(Iteration 16941 / 50000) loss: 11.247785
(Iteration 16951 / 50000) loss: 10.408766
(Iteration 16961 / 50000) loss: 12.097917
(Iteration 16971 / 50000) loss: 10.269312
(Iteration 16981 / 50000) loss: 11.849755
(Iteration 16991 / 50000) loss: 11.785420
(Iteration 17001 / 50000) loss: 12.046339
(Iteration 17011 / 50000) loss: 12.082921
(Iteration 17021 / 50000) loss: 10.585973
(Iteration 17031 / 50000) loss: 11.696827
(Iteration 17041 / 50000) loss: 11.125047
(Iteration 17051 / 50000) loss: 11.230368
(Iteration 17061 / 50000) loss: 11.752436
(Iteration 17071 / 50000) loss: 11.603684
(Iteration 17081 / 50000) loss: 11.221140
(Iteration 17091 / 50000) loss: 9.957755
(Iteration 17101 / 50000) loss: 9.507976
(Iteration 17111 / 50000) loss: 10.676257
(Iteration 17121 / 50000) loss: 10.120954
(Iteration 17131 / 50000) loss: 10.824641
(Iteration 17141 / 50000) loss: 10.648922
(Iteration 17151 / 50000) loss: 10.235326
(Iteration 17161 / 50000) loss: 11.643248
(Iteration 17171 / 50000) loss: 11.120327
(Iteration 17181 / 50000) loss: 11.835344
(Iteration 17191 / 50000) loss: 10.316847
(Iteration 17201 / 50000) loss: 9.736040
(Iteration 17211 / 50000) loss: 10.664222
(Iteration 17221 / 50000) loss: 11.476979
(Iteration 17231 / 50000) loss: 10.947952
(Iteration 17241 / 50000) loss: 10.867407
(Iteration 17251 / 50000) loss: 11.185871
(Iteration 17261 / 50000) loss: 12.049559
(Iteration 17271 / 50000) loss: 11.798857
(Iteration 17281 / 50000) loss: 10.427165
(Iteration 17291 / 50000) loss: 10.785813
(Iteration 17301 / 50000) loss: 10.760485
(Iteration 17311 / 50000) loss: 11.213517
(Iteration 17321 / 50000) loss: 11.020853
(Iteration 17331 / 50000) loss: 10.823077
(Iteration 17341 / 50000) loss: 12.008430
(Iteration 17351 / 50000) loss: 10.311567
(Iteration 17361 / 50000) loss: 11.777992
(Iteration 17371 / 50000) loss: 10.473614
(Iteration 17381 / 50000) loss: 10.306853
(Iteration 17391 / 50000) loss: 11.249818
(Iteration 17401 / 50000) loss: 10.349057
(Iteration 17411 / 50000) loss: 10.322825
(Iteration 17421 / 50000) loss: 11.641846
(Iteration 17431 / 50000) loss: 12.312911
(Iteration 17441 / 50000) loss: 11.472314
(Iteration 17451 / 50000) loss: 11.130931
(Iteration 17461 / 50000) loss: 10.737630
(Iteration 17471 / 50000) loss: 11.932442
(Iteration 17481 / 50000) loss: 12.409054
(Iteration 17491 / 50000) loss: 11.304795
(Iteration 17501 / 50000) loss: 11.216029
(Iteration 17511 / 50000) loss: 10.326899
(Iteration 17521 / 50000) loss: 10.617266
(Iteration 17531 / 50000) loss: 10.728381
(Iteration 17541 / 50000) loss: 11.106885
(Iteration 17551 / 50000) loss: 10.296625
(Iteration 17561 / 50000) loss: 12.090609
(Iteration 17571 / 50000) loss: 10.873164
(Iteration 17581 / 50000) loss: 9.877620
(Iteration 17591 / 50000) loss: 9.917652
(Iteration 17601 / 50000) loss: 11.370737
(Iteration 17611 / 50000) loss: 9.848011
(Iteration 17621 / 50000) loss: 9.777739
(Iteration 17631 / 50000) loss: 10.145837
(Iteration 17641 / 50000) loss: 10.970479
(Iteration 17651 / 50000) loss: 11.010025
(Iteration 17661 / 50000) loss: 12.643598
(Iteration 17671 / 50000) loss: 9.629636
(Iteration 17681 / 50000) loss: 10.267055
(Iteration 17691 / 50000) loss: 10.583009
(Iteration 17701 / 50000) loss: 10.403852
(Iteration 17711 / 50000) loss: 9.886601
(Iteration 17721 / 50000) loss: 11.245544
(Iteration 17731 / 50000) loss: 10.642115
(Iteration 17741 / 50000) loss: 11.500731
(Iteration 17751 / 50000) loss: 11.172897
(Iteration 17761 / 50000) loss: 11.567817
(Iteration 17771 / 50000) loss: 10.821569
(Iteration 17781 / 50000) loss: 10.110706
(Iteration 17791 / 50000) loss: 9.712391
(Iteration 17801 / 50000) loss: 10.905462
(Iteration 17811 / 50000) loss: 9.676707
(Iteration 17821 / 50000) loss: 11.434267
(Iteration 17831 / 50000) loss: 10.852395
(Iteration 17841 / 50000) loss: 10.972991
(Iteration 17851 / 50000) loss: 11.338946
(Iteration 17861 / 50000) loss: 9.860258
(Iteration 17871 / 50000) loss: 11.278821
(Iteration 17881 / 50000) loss: 9.505521
(Iteration 17891 / 50000) loss: 10.680456
(Iteration 17901 / 50000) loss: 10.085313
(Iteration 17911 / 50000) loss: 11.611902
(Iteration 17921 / 50000) loss: 11.106735
(Iteration 17931 / 50000) loss: 11.018422
(Iteration 17941 / 50000) loss: 11.317119
(Iteration 17951 / 50000) loss: 9.553220
(Iteration 17961 / 50000) loss: 10.679138
(Iteration 17971 / 50000) loss: 10.587583
(Iteration 17981 / 50000) loss: 10.373363
(Iteration 17991 / 50000) loss: 10.385629
(Iteration 18001 / 50000) loss: 10.685374
(Iteration 18011 / 50000) loss: 11.459960
(Iteration 18021 / 50000) loss: 10.998341
(Iteration 18031 / 50000) loss: 9.240413
(Iteration 18041 / 50000) loss: 11.474645
(Iteration 18051 / 50000) loss: 10.360581
(Iteration 18061 / 50000) loss: 11.386265
(Iteration 18071 / 50000) loss: 10.007146
(Iteration 18081 / 50000) loss: 11.924049
(Iteration 18091 / 50000) loss: 10.734305
(Iteration 18101 / 50000) loss: 9.896938
(Iteration 18111 / 50000) loss: 10.662252
(Iteration 18121 / 50000) loss: 10.517017
(Iteration 18131 / 50000) loss: 9.652554
(Iteration 18141 / 50000) loss: 10.277673
(Iteration 18151 / 50000) loss: 10.729264
(Iteration 18161 / 50000) loss: 10.731922
(Iteration 18171 / 50000) loss: 10.735942
(Iteration 18181 / 50000) loss: 10.696041
(Iteration 18191 / 50000) loss: 9.986263
(Iteration 18201 / 50000) loss: 10.222195
(Iteration 18211 / 50000) loss: 10.956273
(Iteration 18221 / 50000) loss: 10.665185
(Iteration 18231 / 50000) loss: 11.205534
(Iteration 18241 / 50000) loss: 10.476006
(Iteration 18251 / 50000) loss: 9.865654
(Iteration 18261 / 50000) loss: 10.725330
(Iteration 18271 / 50000) loss: 10.061578
(Iteration 18281 / 50000) loss: 10.301649
(Iteration 18291 / 50000) loss: 11.359639
(Iteration 18301 / 50000) loss: 10.817521
(Iteration 18311 / 50000) loss: 10.169991
(Iteration 18321 / 50000) loss: 10.172515
(Iteration 18331 / 50000) loss: 11.201472
(Iteration 18341 / 50000) loss: 10.159015
(Iteration 18351 / 50000) loss: 10.291038
(Iteration 18361 / 50000) loss: 9.938997
(Iteration 18371 / 50000) loss: 10.181904
(Iteration 18381 / 50000) loss: 10.849457
(Iteration 18391 / 50000) loss: 10.672127
(Iteration 18401 / 50000) loss: 9.263496
(Iteration 18411 / 50000) loss: 10.307031
(Iteration 18421 / 50000) loss: 10.510911
(Iteration 18431 / 50000) loss: 11.038968
(Iteration 18441 / 50000) loss: 11.020901
(Iteration 18451 / 50000) loss: 10.104586
(Iteration 18461 / 50000) loss: 10.301331
(Iteration 18471 / 50000) loss: 10.139678
(Iteration 18481 / 50000) loss: 10.231183
(Iteration 18491 / 50000) loss: 10.070831
(Iteration 18501 / 50000) loss: 9.574512
(Iteration 18511 / 50000) loss: 10.463215
(Iteration 18521 / 50000) loss: 9.898771
(Iteration 18531 / 50000) loss: 10.484448
(Iteration 18541 / 50000) loss: 9.649048
(Iteration 18551 / 50000) loss: 9.938678
(Iteration 18561 / 50000) loss: 10.572805
(Iteration 18571 / 50000) loss: 9.577142
(Iteration 18581 / 50000) loss: 10.687667
(Iteration 18591 / 50000) loss: 11.200777
(Iteration 18601 / 50000) loss: 10.763388
(Iteration 18611 / 50000) loss: 10.272336
(Iteration 18621 / 50000) loss: 10.176299
(Iteration 18631 / 50000) loss: 9.983980
(Iteration 18641 / 50000) loss: 9.509213
(Iteration 18651 / 50000) loss: 11.280973
(Iteration 18661 / 50000) loss: 9.929589
(Iteration 18671 / 50000) loss: 10.660050
(Iteration 18681 / 50000) loss: 10.918643
(Iteration 18691 / 50000) loss: 11.279820
(Iteration 18701 / 50000) loss: 8.200775
(Iteration 18711 / 50000) loss: 10.145697
(Iteration 18721 / 50000) loss: 10.719754
(Iteration 18731 / 50000) loss: 10.373231
(Iteration 18741 / 50000) loss: 10.692183
(Iteration 18751 / 50000) loss: 9.868389
(Iteration 18761 / 50000) loss: 10.468767
(Iteration 18771 / 50000) loss: 10.386947
(Iteration 18781 / 50000) loss: 10.767704
(Iteration 18791 / 50000) loss: 10.599739
(Iteration 18801 / 50000) loss: 9.907078
(Iteration 18811 / 50000) loss: 10.541379
(Iteration 18821 / 50000) loss: 10.252145
(Iteration 18831 / 50000) loss: 9.752723
(Iteration 18841 / 50000) loss: 10.353258
(Iteration 18851 / 50000) loss: 10.646919
(Iteration 18861 / 50000) loss: 9.628460
(Iteration 18871 / 50000) loss: 10.072590
(Iteration 18881 / 50000) loss: 10.163578
(Iteration 18891 / 50000) loss: 10.620025
(Iteration 18901 / 50000) loss: 9.836694
(Iteration 18911 / 50000) loss: 10.603083
(Iteration 18921 / 50000) loss: 10.103118
(Iteration 18931 / 50000) loss: 9.801279
(Iteration 18941 / 50000) loss: 10.848746
(Iteration 18951 / 50000) loss: 9.895478
(Iteration 18961 / 50000) loss: 10.297117
(Iteration 18971 / 50000) loss: 10.411863
(Iteration 18981 / 50000) loss: 10.340045
(Iteration 18991 / 50000) loss: 9.191691
(Iteration 19001 / 50000) loss: 9.966189
(Iteration 19011 / 50000) loss: 9.533653
(Iteration 19021 / 50000) loss: 10.247374
(Iteration 19031 / 50000) loss: 10.588440
(Iteration 19041 / 50000) loss: 9.284956
(Iteration 19051 / 50000) loss: 10.201916
(Iteration 19061 / 50000) loss: 11.133341
(Iteration 19071 / 50000) loss: 10.075460
(Iteration 19081 / 50000) loss: 10.525139
(Iteration 19091 / 50000) loss: 10.104523
(Iteration 19101 / 50000) loss: 10.791582
(Iteration 19111 / 50000) loss: 10.012606
(Iteration 19121 / 50000) loss: 9.797189
(Iteration 19131 / 50000) loss: 10.262831
(Iteration 19141 / 50000) loss: 9.913860
(Iteration 19151 / 50000) loss: 10.924628
(Iteration 19161 / 50000) loss: 10.677093
(Iteration 19171 / 50000) loss: 9.242387
(Iteration 19181 / 50000) loss: 10.529967
(Iteration 19191 / 50000) loss: 9.869770
(Iteration 19201 / 50000) loss: 10.424482
(Iteration 19211 / 50000) loss: 10.601426
(Iteration 19221 / 50000) loss: 9.736970
(Iteration 19231 / 50000) loss: 9.840945
(Iteration 19241 / 50000) loss: 9.897951
(Iteration 19251 / 50000) loss: 8.938630
(Iteration 19261 / 50000) loss: 9.785966
(Iteration 19271 / 50000) loss: 9.573255
(Iteration 19281 / 50000) loss: 9.921889
(Iteration 19291 / 50000) loss: 9.937131
(Iteration 19301 / 50000) loss: 10.048968
(Iteration 19311 / 50000) loss: 9.730276
(Iteration 19321 / 50000) loss: 9.615229
(Iteration 19331 / 50000) loss: 9.434570
(Iteration 19341 / 50000) loss: 9.338075
(Iteration 19351 / 50000) loss: 9.876075
(Iteration 19361 / 50000) loss: 9.102637
(Iteration 19371 / 50000) loss: 10.249874
(Iteration 19381 / 50000) loss: 9.649609
(Iteration 19391 / 50000) loss: 10.295938
(Iteration 19401 / 50000) loss: 9.671122
(Iteration 19411 / 50000) loss: 10.637351
(Iteration 19421 / 50000) loss: 9.807225
(Iteration 19431 / 50000) loss: 10.222573
(Iteration 19441 / 50000) loss: 9.664794
(Iteration 19451 / 50000) loss: 9.749264
(Iteration 19461 / 50000) loss: 9.057043
(Iteration 19471 / 50000) loss: 9.994290
(Iteration 19481 / 50000) loss: 9.820397
(Iteration 19491 / 50000) loss: 10.867925
(Iteration 19501 / 50000) loss: 9.602223
(Iteration 19511 / 50000) loss: 10.487108
(Iteration 19521 / 50000) loss: 9.598407
(Iteration 19531 / 50000) loss: 9.899903
(Iteration 19541 / 50000) loss: 8.303294
(Iteration 19551 / 50000) loss: 10.037828
(Iteration 19561 / 50000) loss: 8.951615
(Iteration 19571 / 50000) loss: 9.983304
(Iteration 19581 / 50000) loss: 10.997965
(Iteration 19591 / 50000) loss: 10.271451
(Iteration 19601 / 50000) loss: 9.928385
(Iteration 19611 / 50000) loss: 9.769983
(Iteration 19621 / 50000) loss: 9.310971
(Iteration 19631 / 50000) loss: 10.151406
(Iteration 19641 / 50000) loss: 9.182326
(Iteration 19651 / 50000) loss: 8.722151
(Iteration 19661 / 50000) loss: 10.252161
(Iteration 19671 / 50000) loss: 9.494374
(Iteration 19681 / 50000) loss: 10.425398
(Iteration 19691 / 50000) loss: 10.691721
(Iteration 19701 / 50000) loss: 9.584975
(Iteration 19711 / 50000) loss: 10.115330
(Iteration 19721 / 50000) loss: 10.530268
(Iteration 19731 / 50000) loss: 9.572986
(Iteration 19741 / 50000) loss: 9.873109
(Iteration 19751 / 50000) loss: 9.773654
(Iteration 19761 / 50000) loss: 9.651002
(Iteration 19771 / 50000) loss: 8.892369
(Iteration 19781 / 50000) loss: 10.503395
(Iteration 19791 / 50000) loss: 9.250310
(Iteration 19801 / 50000) loss: 9.693540
(Iteration 19811 / 50000) loss: 9.217221
(Iteration 19821 / 50000) loss: 9.780228
(Iteration 19831 / 50000) loss: 10.229153
(Iteration 19841 / 50000) loss: 9.836088
(Iteration 19851 / 50000) loss: 9.843092
(Iteration 19861 / 50000) loss: 9.606407
(Iteration 19871 / 50000) loss: 9.682701
(Iteration 19881 / 50000) loss: 9.707394
(Iteration 19891 / 50000) loss: 9.679084
(Iteration 19901 / 50000) loss: 10.472096
(Iteration 19911 / 50000) loss: 9.238102
(Iteration 19921 / 50000) loss: 9.477805
(Iteration 19931 / 50000) loss: 10.821126
(Iteration 19941 / 50000) loss: 9.760634
(Iteration 19951 / 50000) loss: 9.071098
(Iteration 19961 / 50000) loss: 9.553391
(Iteration 19971 / 50000) loss: 10.082787
(Iteration 19981 / 50000) loss: 10.569919
(Iteration 19991 / 50000) loss: 8.854854
(Iteration 20001 / 50000) loss: 9.777065
(Iteration 20011 / 50000) loss: 10.200590
(Iteration 20021 / 50000) loss: 9.396489
(Iteration 20031 / 50000) loss: 9.583538
(Iteration 20041 / 50000) loss: 8.413725
(Iteration 20051 / 50000) loss: 9.448106
(Iteration 20061 / 50000) loss: 9.316168
(Iteration 20071 / 50000) loss: 9.510789
(Iteration 20081 / 50000) loss: 9.023071
(Iteration 20091 / 50000) loss: 9.622688
(Iteration 20101 / 50000) loss: 9.261488
(Iteration 20111 / 50000) loss: 9.793390
(Iteration 20121 / 50000) loss: 9.341188
(Iteration 20131 / 50000) loss: 8.952203
(Iteration 20141 / 50000) loss: 9.070444
(Iteration 20151 / 50000) loss: 9.737986
(Iteration 20161 / 50000) loss: 10.170590
(Iteration 20171 / 50000) loss: 10.037182
(Iteration 20181 / 50000) loss: 9.335745
(Iteration 20191 / 50000) loss: 10.751371
(Iteration 20201 / 50000) loss: 10.224881
(Iteration 20211 / 50000) loss: 9.202674
(Iteration 20221 / 50000) loss: 9.904796
(Iteration 20231 / 50000) loss: 10.076535
(Iteration 20241 / 50000) loss: 8.995797
(Iteration 20251 / 50000) loss: 9.548426
(Iteration 20261 / 50000) loss: 9.621029
(Iteration 20271 / 50000) loss: 9.837357
(Iteration 20281 / 50000) loss: 8.902033
(Iteration 20291 / 50000) loss: 9.588188
(Iteration 20301 / 50000) loss: 10.330018
(Iteration 20311 / 50000) loss: 10.168080
(Iteration 20321 / 50000) loss: 9.471613
(Iteration 20331 / 50000) loss: 10.226764
(Iteration 20341 / 50000) loss: 10.020547
(Iteration 20351 / 50000) loss: 9.528696
(Iteration 20361 / 50000) loss: 10.326015
(Iteration 20371 / 50000) loss: 9.383863
(Iteration 20381 / 50000) loss: 8.677970
(Iteration 20391 / 50000) loss: 8.657609
(Iteration 20401 / 50000) loss: 9.958138
(Iteration 20411 / 50000) loss: 8.617082
(Iteration 20421 / 50000) loss: 10.035840
(Iteration 20431 / 50000) loss: 9.860879
(Iteration 20441 / 50000) loss: 9.382918
(Iteration 20451 / 50000) loss: 9.989788
(Iteration 20461 / 50000) loss: 8.830731
(Iteration 20471 / 50000) loss: 9.138534
(Iteration 20481 / 50000) loss: 9.457747
(Iteration 20491 / 50000) loss: 8.984228
(Iteration 20501 / 50000) loss: 8.943382
(Iteration 20511 / 50000) loss: 9.694217
(Iteration 20521 / 50000) loss: 10.155543
(Iteration 20531 / 50000) loss: 10.752902
(Iteration 20541 / 50000) loss: 9.399323
(Iteration 20551 / 50000) loss: 9.098574
(Iteration 20561 / 50000) loss: 10.179966
(Iteration 20571 / 50000) loss: 9.516083
(Iteration 20581 / 50000) loss: 9.801933
(Iteration 20591 / 50000) loss: 9.043855
(Iteration 20601 / 50000) loss: 8.547657
(Iteration 20611 / 50000) loss: 8.841669
(Iteration 20621 / 50000) loss: 9.509892
(Iteration 20631 / 50000) loss: 10.198972
(Iteration 20641 / 50000) loss: 9.692019
(Iteration 20651 / 50000) loss: 8.545354
(Iteration 20661 / 50000) loss: 9.414965
(Iteration 20671 / 50000) loss: 9.749872
(Iteration 20681 / 50000) loss: 9.240202
(Iteration 20691 / 50000) loss: 9.157465
(Iteration 20701 / 50000) loss: 9.732455
(Iteration 20711 / 50000) loss: 9.925468
(Iteration 20721 / 50000) loss: 10.204095
(Iteration 20731 / 50000) loss: 9.842916
(Iteration 20741 / 50000) loss: 9.267500
(Iteration 20751 / 50000) loss: 9.040181
(Iteration 20761 / 50000) loss: 8.803595
(Iteration 20771 / 50000) loss: 9.419081
(Iteration 20781 / 50000) loss: 10.202384
(Iteration 20791 / 50000) loss: 9.370142
(Iteration 20801 / 50000) loss: 9.515065
(Iteration 20811 / 50000) loss: 9.025579
(Iteration 20821 / 50000) loss: 8.877830
(Iteration 20831 / 50000) loss: 9.769890
(Iteration 20841 / 50000) loss: 9.744019
(Iteration 20851 / 50000) loss: 8.884290
(Iteration 20861 / 50000) loss: 8.885941
(Iteration 20871 / 50000) loss: 8.867761
(Iteration 20881 / 50000) loss: 8.693495
(Iteration 20891 / 50000) loss: 9.639863
(Iteration 20901 / 50000) loss: 8.946441
(Iteration 20911 / 50000) loss: 8.844769
(Iteration 20921 / 50000) loss: 9.994848
(Iteration 20931 / 50000) loss: 10.598706
(Iteration 20941 / 50000) loss: 8.976510
(Iteration 20951 / 50000) loss: 10.003313
(Iteration 20961 / 50000) loss: 9.055531
(Iteration 20971 / 50000) loss: 10.032047
(Iteration 20981 / 50000) loss: 9.168206
(Iteration 20991 / 50000) loss: 9.432503
(Iteration 21001 / 50000) loss: 9.185262
(Iteration 21011 / 50000) loss: 8.727891
(Iteration 21021 / 50000) loss: 9.688075
(Iteration 21031 / 50000) loss: 8.799508
(Iteration 21041 / 50000) loss: 8.574701
(Iteration 21051 / 50000) loss: 8.981119
(Iteration 21061 / 50000) loss: 9.300615
(Iteration 21071 / 50000) loss: 9.969541
(Iteration 21081 / 50000) loss: 9.327891
(Iteration 21091 / 50000) loss: 9.362092
(Iteration 21101 / 50000) loss: 8.547433
(Iteration 21111 / 50000) loss: 9.681332
(Iteration 21121 / 50000) loss: 9.236007
(Iteration 21131 / 50000) loss: 9.154927
(Iteration 21141 / 50000) loss: 9.045409
(Iteration 21151 / 50000) loss: 9.718047
(Iteration 21161 / 50000) loss: 9.917458
(Iteration 21171 / 50000) loss: 9.318614
(Iteration 21181 / 50000) loss: 8.639803
(Iteration 21191 / 50000) loss: 8.709849
(Iteration 21201 / 50000) loss: 10.063288
(Iteration 21211 / 50000) loss: 9.283163
(Iteration 21221 / 50000) loss: 8.717003
(Iteration 21231 / 50000) loss: 9.008152
(Iteration 21241 / 50000) loss: 10.693272
(Iteration 21251 / 50000) loss: 10.002656
(Iteration 21261 / 50000) loss: 8.647387
(Iteration 21271 / 50000) loss: 9.753452
(Iteration 21281 / 50000) loss: 9.912834
(Iteration 21291 / 50000) loss: 8.051475
(Iteration 21301 / 50000) loss: 8.876884
(Iteration 21311 / 50000) loss: 8.996730
(Iteration 21321 / 50000) loss: 9.845915
(Iteration 21331 / 50000) loss: 9.271611
(Iteration 21341 / 50000) loss: 9.116343
(Iteration 21351 / 50000) loss: 9.011598
(Iteration 21361 / 50000) loss: 9.384234
(Iteration 21371 / 50000) loss: 10.536725
(Iteration 21381 / 50000) loss: 9.130913
(Iteration 21391 / 50000) loss: 9.502020
(Iteration 21401 / 50000) loss: 8.835864
(Iteration 21411 / 50000) loss: 7.527517
(Iteration 21421 / 50000) loss: 8.748132
(Iteration 21431 / 50000) loss: 8.999865
(Iteration 21441 / 50000) loss: 8.804122
(Iteration 21451 / 50000) loss: 9.144144
(Iteration 21461 / 50000) loss: 8.223530
(Iteration 21471 / 50000) loss: 9.212712
(Iteration 21481 / 50000) loss: 9.410245
(Iteration 21491 / 50000) loss: 8.623558
(Iteration 21501 / 50000) loss: 9.024998
(Iteration 21511 / 50000) loss: 8.838616
(Iteration 21521 / 50000) loss: 9.390242
(Iteration 21531 / 50000) loss: 8.581557
(Iteration 21541 / 50000) loss: 10.073692
(Iteration 21551 / 50000) loss: 9.883387
(Iteration 21561 / 50000) loss: 8.301261
(Iteration 21571 / 50000) loss: 9.365018
(Iteration 21581 / 50000) loss: 9.746145
(Iteration 21591 / 50000) loss: 9.443406
(Iteration 21601 / 50000) loss: 10.043802
(Iteration 21611 / 50000) loss: 9.325408
(Iteration 21621 / 50000) loss: 8.563474
(Iteration 21631 / 50000) loss: 8.476268
(Iteration 21641 / 50000) loss: 10.124056
(Iteration 21651 / 50000) loss: 8.884519
(Iteration 21661 / 50000) loss: 9.600210
(Iteration 21671 / 50000) loss: 9.382423
(Iteration 21681 / 50000) loss: 9.592311
(Iteration 21691 / 50000) loss: 9.480448
(Iteration 21701 / 50000) loss: 8.704604
(Iteration 21711 / 50000) loss: 8.802332
(Iteration 21721 / 50000) loss: 8.843689
(Iteration 21731 / 50000) loss: 8.633809
(Iteration 21741 / 50000) loss: 8.557644
(Iteration 21751 / 50000) loss: 8.825295
(Iteration 21761 / 50000) loss: 8.968644
(Iteration 21771 / 50000) loss: 8.731195
(Iteration 21781 / 50000) loss: 8.702158
(Iteration 21791 / 50000) loss: 9.121968
(Iteration 21801 / 50000) loss: 9.452478
(Iteration 21811 / 50000) loss: 8.864856
(Iteration 21821 / 50000) loss: 8.169686
(Iteration 21831 / 50000) loss: 9.910982
(Iteration 21841 / 50000) loss: 9.550921
(Iteration 21851 / 50000) loss: 8.846084
(Iteration 21861 / 50000) loss: 8.145724
(Iteration 21871 / 50000) loss: 8.793788
(Iteration 21881 / 50000) loss: 8.620936
(Iteration 21891 / 50000) loss: 8.402709
(Iteration 21901 / 50000) loss: 8.930903
(Iteration 21911 / 50000) loss: 8.369387
(Iteration 21921 / 50000) loss: 8.115340
(Iteration 21931 / 50000) loss: 8.747023
(Iteration 21941 / 50000) loss: 9.139087
(Iteration 21951 / 50000) loss: 9.168682
(Iteration 21961 / 50000) loss: 8.836778
(Iteration 21971 / 50000) loss: 9.257224
(Iteration 21981 / 50000) loss: 8.109490
(Iteration 21991 / 50000) loss: 8.826971
(Iteration 22001 / 50000) loss: 9.488993
(Iteration 22011 / 50000) loss: 9.281360
(Iteration 22021 / 50000) loss: 9.463757
(Iteration 22031 / 50000) loss: 8.616042
(Iteration 22041 / 50000) loss: 8.397938
(Iteration 22051 / 50000) loss: 8.591973
(Iteration 22061 / 50000) loss: 9.027519
(Iteration 22071 / 50000) loss: 8.862233
(Iteration 22081 / 50000) loss: 8.952689
(Iteration 22091 / 50000) loss: 8.316417
(Iteration 22101 / 50000) loss: 8.740930
(Iteration 22111 / 50000) loss: 8.955272
(Iteration 22121 / 50000) loss: 9.104111
(Iteration 22131 / 50000) loss: 7.603028
(Iteration 22141 / 50000) loss: 8.388932
(Iteration 22151 / 50000) loss: 8.752087
(Iteration 22161 / 50000) loss: 9.070115
(Iteration 22171 / 50000) loss: 9.666885
(Iteration 22181 / 50000) loss: 8.505788
(Iteration 22191 / 50000) loss: 8.380467
(Iteration 22201 / 50000) loss: 9.444410
(Iteration 22211 / 50000) loss: 9.082031
(Iteration 22221 / 50000) loss: 7.490505
(Iteration 22231 / 50000) loss: 9.172272
(Iteration 22241 / 50000) loss: 9.704550
(Iteration 22251 / 50000) loss: 8.047768
(Iteration 22261 / 50000) loss: 9.306776
(Iteration 22271 / 50000) loss: 9.163417
(Iteration 22281 / 50000) loss: 8.743850
(Iteration 22291 / 50000) loss: 8.199921
(Iteration 22301 / 50000) loss: 9.866930
(Iteration 22311 / 50000) loss: 8.266673
(Iteration 22321 / 50000) loss: 8.783620
(Iteration 22331 / 50000) loss: 8.300991
(Iteration 22341 / 50000) loss: 9.829550
(Iteration 22351 / 50000) loss: 8.922139
(Iteration 22361 / 50000) loss: 8.441351
(Iteration 22371 / 50000) loss: 9.250854
(Iteration 22381 / 50000) loss: 8.777077
(Iteration 22391 / 50000) loss: 8.910025
(Iteration 22401 / 50000) loss: 9.050283
(Iteration 22411 / 50000) loss: 8.840054
(Iteration 22421 / 50000) loss: 8.376865
(Iteration 22431 / 50000) loss: 8.494549
(Iteration 22441 / 50000) loss: 8.323891
(Iteration 22451 / 50000) loss: 9.479876
(Iteration 22461 / 50000) loss: 8.667582
(Iteration 22471 / 50000) loss: 9.262490
(Iteration 22481 / 50000) loss: 8.838422
(Iteration 22491 / 50000) loss: 8.785201
(Iteration 22501 / 50000) loss: 9.347206
(Iteration 22511 / 50000) loss: 7.886751
(Iteration 22521 / 50000) loss: 7.675965
(Iteration 22531 / 50000) loss: 9.018352
(Iteration 22541 / 50000) loss: 9.260224
(Iteration 22551 / 50000) loss: 8.779277
(Iteration 22561 / 50000) loss: 7.209487
(Iteration 22571 / 50000) loss: 8.139168
(Iteration 22581 / 50000) loss: 8.578427
(Iteration 22591 / 50000) loss: 8.246694
(Iteration 22601 / 50000) loss: 8.022479
(Iteration 22611 / 50000) loss: 9.889952
(Iteration 22621 / 50000) loss: 9.042249
(Iteration 22631 / 50000) loss: 8.448622
(Iteration 22641 / 50000) loss: 8.268767
(Iteration 22651 / 50000) loss: 8.465953
(Iteration 22661 / 50000) loss: 9.279467
(Iteration 22671 / 50000) loss: 8.243144
(Iteration 22681 / 50000) loss: 8.792189
(Iteration 22691 / 50000) loss: 8.563996
(Iteration 22701 / 50000) loss: 8.899828
(Iteration 22711 / 50000) loss: 9.509592
(Iteration 22721 / 50000) loss: 9.018317
(Iteration 22731 / 50000) loss: 9.069303
(Iteration 22741 / 50000) loss: 8.406179
(Iteration 22751 / 50000) loss: 9.420952
(Iteration 22761 / 50000) loss: 8.667871
(Iteration 22771 / 50000) loss: 8.875749
(Iteration 22781 / 50000) loss: 9.087675
(Iteration 22791 / 50000) loss: 8.692611
(Iteration 22801 / 50000) loss: 9.469008
(Iteration 22811 / 50000) loss: 8.858448
(Iteration 22821 / 50000) loss: 9.606613
(Iteration 22831 / 50000) loss: 9.007707
(Iteration 22841 / 50000) loss: 9.006035
(Iteration 22851 / 50000) loss: 8.809841
(Iteration 22861 / 50000) loss: 8.129689
(Iteration 22871 / 50000) loss: 8.345020
(Iteration 22881 / 50000) loss: 8.810358
(Iteration 22891 / 50000) loss: 8.673817
(Iteration 22901 / 50000) loss: 10.265531
(Iteration 22911 / 50000) loss: 8.755801
(Iteration 22921 / 50000) loss: 8.616327
(Iteration 22931 / 50000) loss: 8.033333
(Iteration 22941 / 50000) loss: 8.553583
(Iteration 22951 / 50000) loss: 8.059711
(Iteration 22961 / 50000) loss: 8.632774
(Iteration 22971 / 50000) loss: 8.469203
(Iteration 22981 / 50000) loss: 8.463877
(Iteration 22991 / 50000) loss: 8.759397
(Iteration 23001 / 50000) loss: 8.789989
(Iteration 23011 / 50000) loss: 8.325233
(Iteration 23021 / 50000) loss: 8.364010
(Iteration 23031 / 50000) loss: 9.122428
(Iteration 23041 / 50000) loss: 8.448828
(Iteration 23051 / 50000) loss: 8.058612
(Iteration 23061 / 50000) loss: 8.018756
(Iteration 23071 / 50000) loss: 9.447347
(Iteration 23081 / 50000) loss: 8.921310
(Iteration 23091 / 50000) loss: 8.287170
(Iteration 23101 / 50000) loss: 8.274203
(Iteration 23111 / 50000) loss: 8.655310
(Iteration 23121 / 50000) loss: 8.759463
(Iteration 23131 / 50000) loss: 9.191884
(Iteration 23141 / 50000) loss: 8.695010
(Iteration 23151 / 50000) loss: 8.713944
(Iteration 23161 / 50000) loss: 8.477332
(Iteration 23171 / 50000) loss: 7.783958
(Iteration 23181 / 50000) loss: 8.290457
(Iteration 23191 / 50000) loss: 8.249695
(Iteration 23201 / 50000) loss: 8.433629
(Iteration 23211 / 50000) loss: 8.944840
(Iteration 23221 / 50000) loss: 9.550397
(Iteration 23231 / 50000) loss: 7.781013
(Iteration 23241 / 50000) loss: 8.920095
(Iteration 23251 / 50000) loss: 8.869288
(Iteration 23261 / 50000) loss: 8.089275
(Iteration 23271 / 50000) loss: 8.002697
(Iteration 23281 / 50000) loss: 8.728881
(Iteration 23291 / 50000) loss: 8.451796
(Iteration 23301 / 50000) loss: 8.604311
(Iteration 23311 / 50000) loss: 9.144431
(Iteration 23321 / 50000) loss: 9.035018
(Iteration 23331 / 50000) loss: 9.300387
(Iteration 23341 / 50000) loss: 7.841499
(Iteration 23351 / 50000) loss: 8.853027
(Iteration 23361 / 50000) loss: 9.031871
(Iteration 23371 / 50000) loss: 8.130820
(Iteration 23381 / 50000) loss: 8.731788
(Iteration 23391 / 50000) loss: 8.281506
(Iteration 23401 / 50000) loss: 8.284627
(Iteration 23411 / 50000) loss: 9.420734
(Iteration 23421 / 50000) loss: 9.041279
(Iteration 23431 / 50000) loss: 8.751678
(Iteration 23441 / 50000) loss: 8.397959
(Iteration 23451 / 50000) loss: 8.562526
(Iteration 23461 / 50000) loss: 8.305688
(Iteration 23471 / 50000) loss: 8.180821
(Iteration 23481 / 50000) loss: 7.679165
(Iteration 23491 / 50000) loss: 8.929194
(Iteration 23501 / 50000) loss: 8.282184
(Iteration 23511 / 50000) loss: 8.767452
(Iteration 23521 / 50000) loss: 7.644497
(Iteration 23531 / 50000) loss: 9.093947
(Iteration 23541 / 50000) loss: 8.104666
(Iteration 23551 / 50000) loss: 8.971048
(Iteration 23561 / 50000) loss: 8.081642
(Iteration 23571 / 50000) loss: 8.018357
(Iteration 23581 / 50000) loss: 8.116469
(Iteration 23591 / 50000) loss: 8.152305
(Iteration 23601 / 50000) loss: 7.724375
(Iteration 23611 / 50000) loss: 8.362245
(Iteration 23621 / 50000) loss: 7.682929
(Iteration 23631 / 50000) loss: 8.088927
(Iteration 23641 / 50000) loss: 7.834337
(Iteration 23651 / 50000) loss: 8.248397
(Iteration 23661 / 50000) loss: 8.301711
(Iteration 23671 / 50000) loss: 8.345411
(Iteration 23681 / 50000) loss: 7.964479
(Iteration 23691 / 50000) loss: 7.893703
(Iteration 23701 / 50000) loss: 8.971995
(Iteration 23711 / 50000) loss: 8.072702
(Iteration 23721 / 50000) loss: 8.872845
(Iteration 23731 / 50000) loss: 8.117633
(Iteration 23741 / 50000) loss: 8.485692
(Iteration 23751 / 50000) loss: 8.560824
(Iteration 23761 / 50000) loss: 8.807125
(Iteration 23771 / 50000) loss: 8.117013
(Iteration 23781 / 50000) loss: 8.857990
(Iteration 23791 / 50000) loss: 8.425120
(Iteration 23801 / 50000) loss: 7.559764
(Iteration 23811 / 50000) loss: 7.653035
(Iteration 23821 / 50000) loss: 8.151645
(Iteration 23831 / 50000) loss: 9.409151
(Iteration 23841 / 50000) loss: 8.405934
(Iteration 23851 / 50000) loss: 8.624635
(Iteration 23861 / 50000) loss: 8.577616
(Iteration 23871 / 50000) loss: 7.351630
(Iteration 23881 / 50000) loss: 8.370439
(Iteration 23891 / 50000) loss: 7.379898
(Iteration 23901 / 50000) loss: 8.526125
(Iteration 23911 / 50000) loss: 7.331178
(Iteration 23921 / 50000) loss: 8.090018
(Iteration 23931 / 50000) loss: 8.496484
(Iteration 23941 / 50000) loss: 9.069689
(Iteration 23951 / 50000) loss: 8.188869
(Iteration 23961 / 50000) loss: 8.503840
(Iteration 23971 / 50000) loss: 7.411022
(Iteration 23981 / 50000) loss: 8.363702
(Iteration 23991 / 50000) loss: 8.837732
(Iteration 24001 / 50000) loss: 8.848079
(Iteration 24011 / 50000) loss: 8.781039
(Iteration 24021 / 50000) loss: 8.620619
(Iteration 24031 / 50000) loss: 8.314858
(Iteration 24041 / 50000) loss: 7.597805
(Iteration 24051 / 50000) loss: 7.944381
(Iteration 24061 / 50000) loss: 7.976215
(Iteration 24071 / 50000) loss: 7.988316
(Iteration 24081 / 50000) loss: 7.354346
(Iteration 24091 / 50000) loss: 8.162811
(Iteration 24101 / 50000) loss: 8.642128
(Iteration 24111 / 50000) loss: 8.010389
(Iteration 24121 / 50000) loss: 8.189435
(Iteration 24131 / 50000) loss: 8.775514
(Iteration 24141 / 50000) loss: 7.659099
(Iteration 24151 / 50000) loss: 8.141402
(Iteration 24161 / 50000) loss: 8.198994
(Iteration 24171 / 50000) loss: 7.737871
(Iteration 24181 / 50000) loss: 8.346815
(Iteration 24191 / 50000) loss: 8.681706
(Iteration 24201 / 50000) loss: 8.409661
(Iteration 24211 / 50000) loss: 7.981623
(Iteration 24221 / 50000) loss: 8.699900
(Iteration 24231 / 50000) loss: 8.281234
(Iteration 24241 / 50000) loss: 8.990044
(Iteration 24251 / 50000) loss: 8.215118
(Iteration 24261 / 50000) loss: 8.939156
(Iteration 24271 / 50000) loss: 8.791285
(Iteration 24281 / 50000) loss: 8.008293
(Iteration 24291 / 50000) loss: 8.638120
(Iteration 24301 / 50000) loss: 8.178704
(Iteration 24311 / 50000) loss: 8.577972
(Iteration 24321 / 50000) loss: 7.923296
(Iteration 24331 / 50000) loss: 8.648744
(Iteration 24341 / 50000) loss: 8.745779
(Iteration 24351 / 50000) loss: 7.840836
(Iteration 24361 / 50000) loss: 8.295351
(Iteration 24371 / 50000) loss: 8.472953
(Iteration 24381 / 50000) loss: 7.886631
(Iteration 24391 / 50000) loss: 8.047930
(Iteration 24401 / 50000) loss: 7.936747
(Iteration 24411 / 50000) loss: 8.050111
(Iteration 24421 / 50000) loss: 8.495460
(Iteration 24431 / 50000) loss: 7.408210
(Iteration 24441 / 50000) loss: 8.435351
(Iteration 24451 / 50000) loss: 8.432334
(Iteration 24461 / 50000) loss: 9.003516
(Iteration 24471 / 50000) loss: 7.811767
(Iteration 24481 / 50000) loss: 9.851634
(Iteration 24491 / 50000) loss: 7.145633
(Iteration 24501 / 50000) loss: 7.404977
(Iteration 24511 / 50000) loss: 8.464608
(Iteration 24521 / 50000) loss: 7.616299
(Iteration 24531 / 50000) loss: 7.902122
(Iteration 24541 / 50000) loss: 7.992880
(Iteration 24551 / 50000) loss: 7.681488
(Iteration 24561 / 50000) loss: 8.211501
(Iteration 24571 / 50000) loss: 7.505250
(Iteration 24581 / 50000) loss: 8.362596
(Iteration 24591 / 50000) loss: 8.432965
(Iteration 24601 / 50000) loss: 8.704875
(Iteration 24611 / 50000) loss: 8.600796
(Iteration 24621 / 50000) loss: 8.515012
(Iteration 24631 / 50000) loss: 9.034758
(Iteration 24641 / 50000) loss: 9.207626
(Iteration 24651 / 50000) loss: 6.819972
(Iteration 24661 / 50000) loss: 8.520651
(Iteration 24671 / 50000) loss: 8.440612
(Iteration 24681 / 50000) loss: 7.848160
(Iteration 24691 / 50000) loss: 8.072952
(Iteration 24701 / 50000) loss: 7.205146
(Iteration 24711 / 50000) loss: 8.833599
(Iteration 24721 / 50000) loss: 8.249089
(Iteration 24731 / 50000) loss: 8.135007
(Iteration 24741 / 50000) loss: 8.252429
(Iteration 24751 / 50000) loss: 8.306243
(Iteration 24761 / 50000) loss: 7.670366
(Iteration 24771 / 50000) loss: 8.376448
(Iteration 24781 / 50000) loss: 7.430469
(Iteration 24791 / 50000) loss: 7.553583
(Iteration 24801 / 50000) loss: 8.147921
(Iteration 24811 / 50000) loss: 8.379053
(Iteration 24821 / 50000) loss: 7.846014
(Iteration 24831 / 50000) loss: 8.092808
(Iteration 24841 / 50000) loss: 7.264117
(Iteration 24851 / 50000) loss: 7.802521
(Iteration 24861 / 50000) loss: 7.479374
(Iteration 24871 / 50000) loss: 8.474685
(Iteration 24881 / 50000) loss: 8.311934
(Iteration 24891 / 50000) loss: 8.424348
(Iteration 24901 / 50000) loss: 7.373428
(Iteration 24911 / 50000) loss: 8.849211
(Iteration 24921 / 50000) loss: 7.856967
(Iteration 24931 / 50000) loss: 7.748634
(Iteration 24941 / 50000) loss: 8.301973
(Iteration 24951 / 50000) loss: 8.079628
(Iteration 24961 / 50000) loss: 7.556756
(Iteration 24971 / 50000) loss: 8.267796
(Iteration 24981 / 50000) loss: 8.312485
(Iteration 24991 / 50000) loss: 7.555561
(Iteration 25001 / 50000) loss: 8.255676
(Iteration 25011 / 50000) loss: 7.737092
(Iteration 25021 / 50000) loss: 8.474481
(Iteration 25031 / 50000) loss: 7.769738
(Iteration 25041 / 50000) loss: 7.867768
(Iteration 25051 / 50000) loss: 7.445509
(Iteration 25061 / 50000) loss: 7.927852
(Iteration 25071 / 50000) loss: 8.203074
(Iteration 25081 / 50000) loss: 7.832866
(Iteration 25091 / 50000) loss: 7.497983
(Iteration 25101 / 50000) loss: 8.324136
(Iteration 25111 / 50000) loss: 7.586772
(Iteration 25121 / 50000) loss: 7.877896
(Iteration 25131 / 50000) loss: 7.200987
(Iteration 25141 / 50000) loss: 8.636975
(Iteration 25151 / 50000) loss: 7.743054
(Iteration 25161 / 50000) loss: 7.802484
(Iteration 25171 / 50000) loss: 8.541362
(Iteration 25181 / 50000) loss: 8.437155
(Iteration 25191 / 50000) loss: 8.459423
(Iteration 25201 / 50000) loss: 8.077019
(Iteration 25211 / 50000) loss: 7.311219
(Iteration 25221 / 50000) loss: 7.971913
(Iteration 25231 / 50000) loss: 8.413351
(Iteration 25241 / 50000) loss: 7.441964
(Iteration 25251 / 50000) loss: 8.375287
(Iteration 25261 / 50000) loss: 7.825249
(Iteration 25271 / 50000) loss: 8.515885
(Iteration 25281 / 50000) loss: 8.656430
(Iteration 25291 / 50000) loss: 7.531369
(Iteration 25301 / 50000) loss: 8.111380
(Iteration 25311 / 50000) loss: 7.571697
(Iteration 25321 / 50000) loss: 7.518395
(Iteration 25331 / 50000) loss: 7.293624
(Iteration 25341 / 50000) loss: 8.269860
(Iteration 25351 / 50000) loss: 7.759951
(Iteration 25361 / 50000) loss: 7.336838
(Iteration 25371 / 50000) loss: 7.722897
(Iteration 25381 / 50000) loss: 8.387203
(Iteration 25391 / 50000) loss: 7.867406
(Iteration 25401 / 50000) loss: 8.209749
(Iteration 25411 / 50000) loss: 8.485877
(Iteration 25421 / 50000) loss: 8.254173
(Iteration 25431 / 50000) loss: 7.808305
(Iteration 25441 / 50000) loss: 8.051764
(Iteration 25451 / 50000) loss: 7.581742
(Iteration 25461 / 50000) loss: 7.510599
(Iteration 25471 / 50000) loss: 8.668167
(Iteration 25481 / 50000) loss: 7.781871
(Iteration 25491 / 50000) loss: 7.889341
(Iteration 25501 / 50000) loss: 6.697689
(Iteration 25511 / 50000) loss: 8.401791
(Iteration 25521 / 50000) loss: 7.569456
(Iteration 25531 / 50000) loss: 7.893210
(Iteration 25541 / 50000) loss: 7.516563
(Iteration 25551 / 50000) loss: 7.389350
(Iteration 25561 / 50000) loss: 8.447339
(Iteration 25571 / 50000) loss: 8.108105
(Iteration 25581 / 50000) loss: 8.381425
(Iteration 25591 / 50000) loss: 6.703634
(Iteration 25601 / 50000) loss: 8.364971
(Iteration 25611 / 50000) loss: 8.277686
(Iteration 25621 / 50000) loss: 8.270017
(Iteration 25631 / 50000) loss: 7.537334
(Iteration 25641 / 50000) loss: 6.518239
(Iteration 25651 / 50000) loss: 8.112953
(Iteration 25661 / 50000) loss: 7.895146
(Iteration 25671 / 50000) loss: 7.455959
(Iteration 25681 / 50000) loss: 8.310107
(Iteration 25691 / 50000) loss: 7.508219
(Iteration 25701 / 50000) loss: 7.361572
(Iteration 25711 / 50000) loss: 7.554858
(Iteration 25721 / 50000) loss: 7.688189
(Iteration 25731 / 50000) loss: 7.181843
(Iteration 25741 / 50000) loss: 7.363787
(Iteration 25751 / 50000) loss: 7.756234
(Iteration 25761 / 50000) loss: 7.301623
(Iteration 25771 / 50000) loss: 8.230617
(Iteration 25781 / 50000) loss: 8.028083
(Iteration 25791 / 50000) loss: 7.586051
(Iteration 25801 / 50000) loss: 7.808671
(Iteration 25811 / 50000) loss: 8.572748
(Iteration 25821 / 50000) loss: 7.341016
(Iteration 25831 / 50000) loss: 7.101585
(Iteration 25841 / 50000) loss: 8.458892
(Iteration 25851 / 50000) loss: 7.676429
(Iteration 25861 / 50000) loss: 7.034109
(Iteration 25871 / 50000) loss: 7.540890
(Iteration 25881 / 50000) loss: 6.874981
(Iteration 25891 / 50000) loss: 7.558488
(Iteration 25901 / 50000) loss: 7.026745
(Iteration 25911 / 50000) loss: 7.807913
(Iteration 25921 / 50000) loss: 7.781466
(Iteration 25931 / 50000) loss: 8.119266
(Iteration 25941 / 50000) loss: 7.911963
(Iteration 25951 / 50000) loss: 7.865986
(Iteration 25961 / 50000) loss: 7.615242
(Iteration 25971 / 50000) loss: 7.524239
(Iteration 25981 / 50000) loss: 8.441215
(Iteration 25991 / 50000) loss: 7.928577
(Iteration 26001 / 50000) loss: 8.497976
(Iteration 26011 / 50000) loss: 7.661036
(Iteration 26021 / 50000) loss: 7.328275
(Iteration 26031 / 50000) loss: 7.293892
(Iteration 26041 / 50000) loss: 7.627136
(Iteration 26051 / 50000) loss: 7.529629
(Iteration 26061 / 50000) loss: 8.039648
(Iteration 26071 / 50000) loss: 7.434939
(Iteration 26081 / 50000) loss: 7.367186
(Iteration 26091 / 50000) loss: 7.909718
(Iteration 26101 / 50000) loss: 7.817381
(Iteration 26111 / 50000) loss: 7.411246
(Iteration 26121 / 50000) loss: 7.659663
(Iteration 26131 / 50000) loss: 8.081749
(Iteration 26141 / 50000) loss: 7.513522
(Iteration 26151 / 50000) loss: 8.180260
(Iteration 26161 / 50000) loss: 7.764392
(Iteration 26171 / 50000) loss: 8.316936
(Iteration 26181 / 50000) loss: 7.361645
(Iteration 26191 / 50000) loss: 8.103156
(Iteration 26201 / 50000) loss: 7.321040
(Iteration 26211 / 50000) loss: 8.542064
(Iteration 26221 / 50000) loss: 7.972256
(Iteration 26231 / 50000) loss: 7.562520
(Iteration 26241 / 50000) loss: 7.577855
(Iteration 26251 / 50000) loss: 7.800477
(Iteration 26261 / 50000) loss: 8.179725
(Iteration 26271 / 50000) loss: 7.441531
(Iteration 26281 / 50000) loss: 7.327582
(Iteration 26291 / 50000) loss: 6.903157
(Iteration 26301 / 50000) loss: 7.013829
(Iteration 26311 / 50000) loss: 8.223009
(Iteration 26321 / 50000) loss: 8.616586
(Iteration 26331 / 50000) loss: 8.025665
(Iteration 26341 / 50000) loss: 7.376438
(Iteration 26351 / 50000) loss: 8.549404
(Iteration 26361 / 50000) loss: 7.704421
(Iteration 26371 / 50000) loss: 7.690894
(Iteration 26381 / 50000) loss: 7.186907
(Iteration 26391 / 50000) loss: 7.656265
(Iteration 26401 / 50000) loss: 7.857482
(Iteration 26411 / 50000) loss: 7.824433
(Iteration 26421 / 50000) loss: 6.856884
(Iteration 26431 / 50000) loss: 8.013830
(Iteration 26441 / 50000) loss: 6.917641
(Iteration 26451 / 50000) loss: 7.342820
(Iteration 26461 / 50000) loss: 7.958167
(Iteration 26471 / 50000) loss: 7.575996
(Iteration 26481 / 50000) loss: 7.775510
(Iteration 26491 / 50000) loss: 7.326292
(Iteration 26501 / 50000) loss: 7.252481
(Iteration 26511 / 50000) loss: 7.703423
(Iteration 26521 / 50000) loss: 7.858870
(Iteration 26531 / 50000) loss: 7.991191
(Iteration 26541 / 50000) loss: 8.220461
(Iteration 26551 / 50000) loss: 7.722456
(Iteration 26561 / 50000) loss: 7.633142
(Iteration 26571 / 50000) loss: 6.973334
(Iteration 26581 / 50000) loss: 8.442999
(Iteration 26591 / 50000) loss: 7.419782
(Iteration 26601 / 50000) loss: 8.324310
(Iteration 26611 / 50000) loss: 7.009107
(Iteration 26621 / 50000) loss: 7.383329
(Iteration 26631 / 50000) loss: 7.496563
(Iteration 26641 / 50000) loss: 8.138605
(Iteration 26651 / 50000) loss: 6.790598
(Iteration 26661 / 50000) loss: 7.226410
(Iteration 26671 / 50000) loss: 8.210968
(Iteration 26681 / 50000) loss: 7.818577
(Iteration 26691 / 50000) loss: 7.221381
(Iteration 26701 / 50000) loss: 7.655045
(Iteration 26711 / 50000) loss: 8.765531
(Iteration 26721 / 50000) loss: 7.654484
(Iteration 26731 / 50000) loss: 7.621042
(Iteration 26741 / 50000) loss: 7.547407
(Iteration 26751 / 50000) loss: 8.011016
(Iteration 26761 / 50000) loss: 7.108625
(Iteration 26771 / 50000) loss: 7.800760
(Iteration 26781 / 50000) loss: 7.666537
(Iteration 26791 / 50000) loss: 7.504023
(Iteration 26801 / 50000) loss: 7.667867
(Iteration 26811 / 50000) loss: 8.497022
(Iteration 26821 / 50000) loss: 8.345657
(Iteration 26831 / 50000) loss: 7.234842
(Iteration 26841 / 50000) loss: 6.959331
(Iteration 26851 / 50000) loss: 7.187354
(Iteration 26861 / 50000) loss: 7.587562
(Iteration 26871 / 50000) loss: 7.070105
(Iteration 26881 / 50000) loss: 7.212628
(Iteration 26891 / 50000) loss: 7.830024
(Iteration 26901 / 50000) loss: 7.615251
(Iteration 26911 / 50000) loss: 7.147823
(Iteration 26921 / 50000) loss: 7.344125
(Iteration 26931 / 50000) loss: 7.979127
(Iteration 26941 / 50000) loss: 7.131867
(Iteration 26951 / 50000) loss: 7.967615
(Iteration 26961 / 50000) loss: 8.037566
(Iteration 26971 / 50000) loss: 7.577976
(Iteration 26981 / 50000) loss: 6.995808
(Iteration 26991 / 50000) loss: 7.264195
(Iteration 27001 / 50000) loss: 7.152865
(Iteration 27011 / 50000) loss: 7.456389
(Iteration 27021 / 50000) loss: 7.859306
(Iteration 27031 / 50000) loss: 7.521985
(Iteration 27041 / 50000) loss: 7.979931
(Iteration 27051 / 50000) loss: 7.715937
(Iteration 27061 / 50000) loss: 7.678891
(Iteration 27071 / 50000) loss: 7.589580
(Iteration 27081 / 50000) loss: 7.489795
(Iteration 27091 / 50000) loss: 8.394795
(Iteration 27101 / 50000) loss: 8.042896
(Iteration 27111 / 50000) loss: 7.308944
(Iteration 27121 / 50000) loss: 6.798898
(Iteration 27131 / 50000) loss: 7.032466
(Iteration 27141 / 50000) loss: 7.537439
(Iteration 27151 / 50000) loss: 6.996385
(Iteration 27161 / 50000) loss: 7.045507
(Iteration 27171 / 50000) loss: 7.907840
(Iteration 27181 / 50000) loss: 7.106988
(Iteration 27191 / 50000) loss: 7.495406
(Iteration 27201 / 50000) loss: 7.353376
(Iteration 27211 / 50000) loss: 7.080645
(Iteration 27221 / 50000) loss: 7.947496
(Iteration 27231 / 50000) loss: 7.750695
(Iteration 27241 / 50000) loss: 6.816972
(Iteration 27251 / 50000) loss: 7.381038
(Iteration 27261 / 50000) loss: 7.613413
(Iteration 27271 / 50000) loss: 7.689278
(Iteration 27281 / 50000) loss: 7.631551
(Iteration 27291 / 50000) loss: 8.787505
(Iteration 27301 / 50000) loss: 8.010296
(Iteration 27311 / 50000) loss: 7.515618
(Iteration 27321 / 50000) loss: 6.978251
(Iteration 27331 / 50000) loss: 7.176751
(Iteration 27341 / 50000) loss: 7.894900
(Iteration 27351 / 50000) loss: 8.022135
(Iteration 27361 / 50000) loss: 7.534135
(Iteration 27371 / 50000) loss: 7.858518
(Iteration 27381 / 50000) loss: 7.046296
(Iteration 27391 / 50000) loss: 7.580141
(Iteration 27401 / 50000) loss: 7.130490
(Iteration 27411 / 50000) loss: 7.211680
(Iteration 27421 / 50000) loss: 7.817950
(Iteration 27431 / 50000) loss: 8.364658
(Iteration 27441 / 50000) loss: 6.100786
(Iteration 27451 / 50000) loss: 6.867558
(Iteration 27461 / 50000) loss: 6.241940
(Iteration 27471 / 50000) loss: 7.924798
(Iteration 27481 / 50000) loss: 7.251300
(Iteration 27491 / 50000) loss: 7.380916
(Iteration 27501 / 50000) loss: 8.846993
(Iteration 27511 / 50000) loss: 7.269657
(Iteration 27521 / 50000) loss: 7.105157
(Iteration 27531 / 50000) loss: 7.534039
(Iteration 27541 / 50000) loss: 7.823092
(Iteration 27551 / 50000) loss: 8.698991
(Iteration 27561 / 50000) loss: 7.497857
(Iteration 27571 / 50000) loss: 7.477459
(Iteration 27581 / 50000) loss: 7.593883
(Iteration 27591 / 50000) loss: 7.089148
(Iteration 27601 / 50000) loss: 7.965036
(Iteration 27611 / 50000) loss: 7.050960
(Iteration 27621 / 50000) loss: 7.168500
(Iteration 27631 / 50000) loss: 6.675798
(Iteration 27641 / 50000) loss: 7.824516
(Iteration 27651 / 50000) loss: 6.979448
(Iteration 27661 / 50000) loss: 6.403974
(Iteration 27671 / 50000) loss: 7.402050
(Iteration 27681 / 50000) loss: 7.254982
(Iteration 27691 / 50000) loss: 7.067836
(Iteration 27701 / 50000) loss: 7.424843
(Iteration 27711 / 50000) loss: 7.436243
(Iteration 27721 / 50000) loss: 7.379275
(Iteration 27731 / 50000) loss: 7.130455
(Iteration 27741 / 50000) loss: 6.871210
(Iteration 27751 / 50000) loss: 7.316108
(Iteration 27761 / 50000) loss: 7.568695
(Iteration 27771 / 50000) loss: 7.289368
(Iteration 27781 / 50000) loss: 7.153935
(Iteration 27791 / 50000) loss: 7.167840
(Iteration 27801 / 50000) loss: 7.055030
(Iteration 27811 / 50000) loss: 6.731782
(Iteration 27821 / 50000) loss: 7.650488
(Iteration 27831 / 50000) loss: 7.065702
(Iteration 27841 / 50000) loss: 6.895704
(Iteration 27851 / 50000) loss: 6.838322
(Iteration 27861 / 50000) loss: 6.938224
(Iteration 27871 / 50000) loss: 7.248393
(Iteration 27881 / 50000) loss: 8.215472
(Iteration 27891 / 50000) loss: 8.041580
(Iteration 27901 / 50000) loss: 7.816905
(Iteration 27911 / 50000) loss: 7.238732
(Iteration 27921 / 50000) loss: 7.947225
(Iteration 27931 / 50000) loss: 6.308353
(Iteration 27941 / 50000) loss: 7.331069
(Iteration 27951 / 50000) loss: 7.586483
(Iteration 27961 / 50000) loss: 6.778589
(Iteration 27971 / 50000) loss: 7.192425
(Iteration 27981 / 50000) loss: 7.118912
(Iteration 27991 / 50000) loss: 8.223699
(Iteration 28001 / 50000) loss: 5.949817
(Iteration 28011 / 50000) loss: 7.368647
(Iteration 28021 / 50000) loss: 6.531797
(Iteration 28031 / 50000) loss: 6.681814
(Iteration 28041 / 50000) loss: 7.156660
(Iteration 28051 / 50000) loss: 6.751592
(Iteration 28061 / 50000) loss: 6.833828
(Iteration 28071 / 50000) loss: 6.595813
(Iteration 28081 / 50000) loss: 7.265611
(Iteration 28091 / 50000) loss: 8.186067
(Iteration 28101 / 50000) loss: 7.088625
(Iteration 28111 / 50000) loss: 7.864802
(Iteration 28121 / 50000) loss: 7.152707
(Iteration 28131 / 50000) loss: 6.732909
(Iteration 28141 / 50000) loss: 7.742754
(Iteration 28151 / 50000) loss: 6.836557
(Iteration 28161 / 50000) loss: 7.121028
(Iteration 28171 / 50000) loss: 6.888137
(Iteration 28181 / 50000) loss: 6.580404
(Iteration 28191 / 50000) loss: 6.454153
(Iteration 28201 / 50000) loss: 6.868467
(Iteration 28211 / 50000) loss: 6.768700
(Iteration 28221 / 50000) loss: 7.514250
(Iteration 28231 / 50000) loss: 6.433816
(Iteration 28241 / 50000) loss: 7.813155
(Iteration 28251 / 50000) loss: 7.391418
(Iteration 28261 / 50000) loss: 7.364510
(Iteration 28271 / 50000) loss: 7.334480
(Iteration 28281 / 50000) loss: 8.706440
(Iteration 28291 / 50000) loss: 7.582667
(Iteration 28301 / 50000) loss: 6.515620
(Iteration 28311 / 50000) loss: 7.271309
(Iteration 28321 / 50000) loss: 6.577398
(Iteration 28331 / 50000) loss: 7.458299
(Iteration 28341 / 50000) loss: 6.962780
(Iteration 28351 / 50000) loss: 6.213887
(Iteration 28361 / 50000) loss: 7.365142
(Iteration 28371 / 50000) loss: 7.626739
(Iteration 28381 / 50000) loss: 7.005687
(Iteration 28391 / 50000) loss: 7.466628
(Iteration 28401 / 50000) loss: 7.096127
(Iteration 28411 / 50000) loss: 7.456704
(Iteration 28421 / 50000) loss: 6.700184
(Iteration 28431 / 50000) loss: 8.668132
(Iteration 28441 / 50000) loss: 7.364075
(Iteration 28451 / 50000) loss: 6.576522
(Iteration 28461 / 50000) loss: 7.199333
(Iteration 28471 / 50000) loss: 7.934616
(Iteration 28481 / 50000) loss: 6.916126
(Iteration 28491 / 50000) loss: 7.075500
(Iteration 28501 / 50000) loss: 7.721235
(Iteration 28511 / 50000) loss: 7.831263
(Iteration 28521 / 50000) loss: 6.906903
(Iteration 28531 / 50000) loss: 7.190777
(Iteration 28541 / 50000) loss: 7.283574
(Iteration 28551 / 50000) loss: 6.845867
(Iteration 28561 / 50000) loss: 7.860993
(Iteration 28571 / 50000) loss: 7.503522
(Iteration 28581 / 50000) loss: 7.172870
(Iteration 28591 / 50000) loss: 7.626427
(Iteration 28601 / 50000) loss: 8.140659
(Iteration 28611 / 50000) loss: 6.625252
(Iteration 28621 / 50000) loss: 7.394071
(Iteration 28631 / 50000) loss: 7.760205
(Iteration 28641 / 50000) loss: 6.882310
(Iteration 28651 / 50000) loss: 7.567080
(Iteration 28661 / 50000) loss: 7.660043
(Iteration 28671 / 50000) loss: 6.886299
(Iteration 28681 / 50000) loss: 7.840143
(Iteration 28691 / 50000) loss: 6.388740
(Iteration 28701 / 50000) loss: 8.132313
(Iteration 28711 / 50000) loss: 7.047295
(Iteration 28721 / 50000) loss: 7.432205
(Iteration 28731 / 50000) loss: 6.617247
(Iteration 28741 / 50000) loss: 6.566612
(Iteration 28751 / 50000) loss: 6.920055
(Iteration 28761 / 50000) loss: 7.047909
(Iteration 28771 / 50000) loss: 6.782951
(Iteration 28781 / 50000) loss: 6.450817
(Iteration 28791 / 50000) loss: 7.539389
(Iteration 28801 / 50000) loss: 7.870274
(Iteration 28811 / 50000) loss: 7.113060
(Iteration 28821 / 50000) loss: 6.654270
(Iteration 28831 / 50000) loss: 6.705054
(Iteration 28841 / 50000) loss: 7.055719
(Iteration 28851 / 50000) loss: 7.256784
(Iteration 28861 / 50000) loss: 6.333742
(Iteration 28871 / 50000) loss: 6.900287
(Iteration 28881 / 50000) loss: 6.284855
(Iteration 28891 / 50000) loss: 7.566592
(Iteration 28901 / 50000) loss: 7.486326
(Iteration 28911 / 50000) loss: 7.248897
(Iteration 28921 / 50000) loss: 7.145208
(Iteration 28931 / 50000) loss: 7.106504
(Iteration 28941 / 50000) loss: 6.697749
(Iteration 28951 / 50000) loss: 6.889575
(Iteration 28961 / 50000) loss: 7.312684
(Iteration 28971 / 50000) loss: 7.231175
(Iteration 28981 / 50000) loss: 7.443804
(Iteration 28991 / 50000) loss: 7.295204
(Iteration 29001 / 50000) loss: 5.917845
(Iteration 29011 / 50000) loss: 7.251649
(Iteration 29021 / 50000) loss: 7.285198
(Iteration 29031 / 50000) loss: 7.191893
(Iteration 29041 / 50000) loss: 6.973921
(Iteration 29051 / 50000) loss: 7.290552
(Iteration 29061 / 50000) loss: 8.118273
(Iteration 29071 / 50000) loss: 6.489273
(Iteration 29081 / 50000) loss: 6.476597
(Iteration 29091 / 50000) loss: 7.923972
(Iteration 29101 / 50000) loss: 6.710901
(Iteration 29111 / 50000) loss: 7.553532
(Iteration 29121 / 50000) loss: 6.974629
(Iteration 29131 / 50000) loss: 7.030274
(Iteration 29141 / 50000) loss: 7.628365
(Iteration 29151 / 50000) loss: 6.718468
(Iteration 29161 / 50000) loss: 6.797282
(Iteration 29171 / 50000) loss: 6.825717
(Iteration 29181 / 50000) loss: 7.252330
(Iteration 29191 / 50000) loss: 6.924583
(Iteration 29201 / 50000) loss: 6.688187
(Iteration 29211 / 50000) loss: 6.896950
(Iteration 29221 / 50000) loss: 6.652292
(Iteration 29231 / 50000) loss: 7.632608
(Iteration 29241 / 50000) loss: 6.167642
(Iteration 29251 / 50000) loss: 7.656710
(Iteration 29261 / 50000) loss: 6.798780
(Iteration 29271 / 50000) loss: 6.325005
(Iteration 29281 / 50000) loss: 6.954420
(Iteration 29291 / 50000) loss: 7.152018
(Iteration 29301 / 50000) loss: 6.595236
(Iteration 29311 / 50000) loss: 6.918424
(Iteration 29321 / 50000) loss: 7.803064
(Iteration 29331 / 50000) loss: 6.792849
(Iteration 29341 / 50000) loss: 7.354359
(Iteration 29351 / 50000) loss: 7.209678
(Iteration 29361 / 50000) loss: 7.032438
(Iteration 29371 / 50000) loss: 7.523323
(Iteration 29381 / 50000) loss: 7.275890
(Iteration 29391 / 50000) loss: 7.108623
(Iteration 29401 / 50000) loss: 7.801648
(Iteration 29411 / 50000) loss: 5.917385
(Iteration 29421 / 50000) loss: 6.376636
(Iteration 29431 / 50000) loss: 6.414223
(Iteration 29441 / 50000) loss: 5.876540
(Iteration 29451 / 50000) loss: 6.784350
(Iteration 29461 / 50000) loss: 6.483991
(Iteration 29471 / 50000) loss: 6.249752
(Iteration 29481 / 50000) loss: 7.031054
(Iteration 29491 / 50000) loss: 6.406995
(Iteration 29501 / 50000) loss: 7.092308
(Iteration 29511 / 50000) loss: 6.600810
(Iteration 29521 / 50000) loss: 7.354079
(Iteration 29531 / 50000) loss: 7.072084
(Iteration 29541 / 50000) loss: 7.063308
(Iteration 29551 / 50000) loss: 6.495722
(Iteration 29561 / 50000) loss: 7.297350
(Iteration 29571 / 50000) loss: 7.462034
(Iteration 29581 / 50000) loss: 6.855714
(Iteration 29591 / 50000) loss: 7.089670
(Iteration 29601 / 50000) loss: 7.470248
(Iteration 29611 / 50000) loss: 7.010058
(Iteration 29621 / 50000) loss: 7.332512
(Iteration 29631 / 50000) loss: 6.584206
(Iteration 29641 / 50000) loss: 6.784315
(Iteration 29651 / 50000) loss: 6.795872
(Iteration 29661 / 50000) loss: 6.499430
(Iteration 29671 / 50000) loss: 6.809458
(Iteration 29681 / 50000) loss: 6.459322
(Iteration 29691 / 50000) loss: 7.392109
(Iteration 29701 / 50000) loss: 6.981882
(Iteration 29711 / 50000) loss: 6.372262
(Iteration 29721 / 50000) loss: 6.613456
(Iteration 29731 / 50000) loss: 7.197454
(Iteration 29741 / 50000) loss: 6.392886
(Iteration 29751 / 50000) loss: 6.755135
(Iteration 29761 / 50000) loss: 6.613220
(Iteration 29771 / 50000) loss: 6.429121
(Iteration 29781 / 50000) loss: 6.109212
(Iteration 29791 / 50000) loss: 6.442539
(Iteration 29801 / 50000) loss: 6.804757
(Iteration 29811 / 50000) loss: 7.170039
(Iteration 29821 / 50000) loss: 6.610002
(Iteration 29831 / 50000) loss: 7.060719
(Iteration 29841 / 50000) loss: 7.028062
(Iteration 29851 / 50000) loss: 6.547413
(Iteration 29861 / 50000) loss: 6.441208
(Iteration 29871 / 50000) loss: 7.123979
(Iteration 29881 / 50000) loss: 7.209429
(Iteration 29891 / 50000) loss: 7.437257
(Iteration 29901 / 50000) loss: 7.614864
(Iteration 29911 / 50000) loss: 5.871320
(Iteration 29921 / 50000) loss: 7.085354
(Iteration 29931 / 50000) loss: 6.839132
(Iteration 29941 / 50000) loss: 7.903743
(Iteration 29951 / 50000) loss: 6.883146
(Iteration 29961 / 50000) loss: 7.672978
(Iteration 29971 / 50000) loss: 7.369830
(Iteration 29981 / 50000) loss: 6.442009
(Iteration 29991 / 50000) loss: 7.110330
(Iteration 30001 / 50000) loss: 6.511206
(Iteration 30011 / 50000) loss: 6.797487
(Iteration 30021 / 50000) loss: 7.016727
(Iteration 30031 / 50000) loss: 6.547969
(Iteration 30041 / 50000) loss: 7.139266
(Iteration 30051 / 50000) loss: 7.466479
(Iteration 30061 / 50000) loss: 7.307373
(Iteration 30071 / 50000) loss: 6.903296
(Iteration 30081 / 50000) loss: 7.021793
(Iteration 30091 / 50000) loss: 7.270808
(Iteration 30101 / 50000) loss: 6.810637
(Iteration 30111 / 50000) loss: 7.597209
(Iteration 30121 / 50000) loss: 6.392886
(Iteration 30131 / 50000) loss: 7.301366
(Iteration 30141 / 50000) loss: 7.178529
(Iteration 30151 / 50000) loss: 6.666624
(Iteration 30161 / 50000) loss: 6.171742
(Iteration 30171 / 50000) loss: 7.214293
(Iteration 30181 / 50000) loss: 6.916416
(Iteration 30191 / 50000) loss: 7.226059
(Iteration 30201 / 50000) loss: 7.196351
(Iteration 30211 / 50000) loss: 6.679601
(Iteration 30221 / 50000) loss: 6.952634
(Iteration 30231 / 50000) loss: 6.347738
(Iteration 30241 / 50000) loss: 7.587157
(Iteration 30251 / 50000) loss: 7.118362
(Iteration 30261 / 50000) loss: 7.280914
(Iteration 30271 / 50000) loss: 6.871604
(Iteration 30281 / 50000) loss: 6.580851
(Iteration 30291 / 50000) loss: 6.442397
(Iteration 30301 / 50000) loss: 7.289938
(Iteration 30311 / 50000) loss: 6.657729
(Iteration 30321 / 50000) loss: 6.870208
(Iteration 30331 / 50000) loss: 6.010552
(Iteration 30341 / 50000) loss: 6.041015
(Iteration 30351 / 50000) loss: 7.451844
(Iteration 30361 / 50000) loss: 7.289750
(Iteration 30371 / 50000) loss: 6.684414
(Iteration 30381 / 50000) loss: 6.470913
(Iteration 30391 / 50000) loss: 6.922692
(Iteration 30401 / 50000) loss: 6.270634
(Iteration 30411 / 50000) loss: 6.262552
(Iteration 30421 / 50000) loss: 7.193902
(Iteration 30431 / 50000) loss: 7.431029
(Iteration 30441 / 50000) loss: 6.685455
(Iteration 30451 / 50000) loss: 7.048635
(Iteration 30461 / 50000) loss: 6.502102
(Iteration 30471 / 50000) loss: 6.984587
(Iteration 30481 / 50000) loss: 7.147220
(Iteration 30491 / 50000) loss: 6.509155
(Iteration 30501 / 50000) loss: 6.931756
(Iteration 30511 / 50000) loss: 6.628910
(Iteration 30521 / 50000) loss: 6.340525
(Iteration 30531 / 50000) loss: 7.150487
(Iteration 30541 / 50000) loss: 6.334132
(Iteration 30551 / 50000) loss: 6.828914
(Iteration 30561 / 50000) loss: 7.047994
(Iteration 30571 / 50000) loss: 6.454392
(Iteration 30581 / 50000) loss: 6.561184
(Iteration 30591 / 50000) loss: 6.405987
(Iteration 30601 / 50000) loss: 6.777956
(Iteration 30611 / 50000) loss: 7.298763
(Iteration 30621 / 50000) loss: 6.741918
(Iteration 30631 / 50000) loss: 6.305472
(Iteration 30641 / 50000) loss: 6.249074
(Iteration 30651 / 50000) loss: 6.205657
(Iteration 30661 / 50000) loss: 7.384310
(Iteration 30671 / 50000) loss: 6.440476
(Iteration 30681 / 50000) loss: 6.507559
(Iteration 30691 / 50000) loss: 6.958042
(Iteration 30701 / 50000) loss: 7.114684
(Iteration 30711 / 50000) loss: 7.140297
(Iteration 30721 / 50000) loss: 6.739398
(Iteration 30731 / 50000) loss: 7.342091
(Iteration 30741 / 50000) loss: 6.544926
(Iteration 30751 / 50000) loss: 7.117671
(Iteration 30761 / 50000) loss: 7.330226
(Iteration 30771 / 50000) loss: 6.660837
(Iteration 30781 / 50000) loss: 7.190175
(Iteration 30791 / 50000) loss: 6.587666
(Iteration 30801 / 50000) loss: 7.030423
(Iteration 30811 / 50000) loss: 6.511036
(Iteration 30821 / 50000) loss: 6.555304
(Iteration 30831 / 50000) loss: 6.933194
(Iteration 30841 / 50000) loss: 6.556946
(Iteration 30851 / 50000) loss: 6.748255
(Iteration 30861 / 50000) loss: 7.364789
(Iteration 30871 / 50000) loss: 6.663785
(Iteration 30881 / 50000) loss: 7.416833
(Iteration 30891 / 50000) loss: 6.362935
(Iteration 30901 / 50000) loss: 6.935542
(Iteration 30911 / 50000) loss: 6.518869
(Iteration 30921 / 50000) loss: 7.158775
(Iteration 30931 / 50000) loss: 6.844549
(Iteration 30941 / 50000) loss: 6.743436
(Iteration 30951 / 50000) loss: 7.097135
(Iteration 30961 / 50000) loss: 6.512828
(Iteration 30971 / 50000) loss: 6.317278
(Iteration 30981 / 50000) loss: 6.528240
(Iteration 30991 / 50000) loss: 6.866795
(Iteration 31001 / 50000) loss: 6.835971
(Iteration 31011 / 50000) loss: 6.918521
(Iteration 31021 / 50000) loss: 6.449663
(Iteration 31031 / 50000) loss: 6.256006
(Iteration 31041 / 50000) loss: 7.239800
(Iteration 31051 / 50000) loss: 6.210949
(Iteration 31061 / 50000) loss: 6.168396
(Iteration 31071 / 50000) loss: 6.582840
(Iteration 31081 / 50000) loss: 6.619564
(Iteration 31091 / 50000) loss: 7.260625
(Iteration 31101 / 50000) loss: 6.371779
(Iteration 31111 / 50000) loss: 7.074984
(Iteration 31121 / 50000) loss: 7.506067
(Iteration 31131 / 50000) loss: 6.557107
(Iteration 31141 / 50000) loss: 6.985355
(Iteration 31151 / 50000) loss: 6.672951
(Iteration 31161 / 50000) loss: 6.275035
(Iteration 31171 / 50000) loss: 7.135570
(Iteration 31181 / 50000) loss: 6.644361
(Iteration 31191 / 50000) loss: 6.656940
(Iteration 31201 / 50000) loss: 5.897045
(Iteration 31211 / 50000) loss: 7.159460
(Iteration 31221 / 50000) loss: 6.422430
(Iteration 31231 / 50000) loss: 6.408073
(Iteration 31241 / 50000) loss: 6.308722
(Iteration 31251 / 50000) loss: 6.664698
(Iteration 31261 / 50000) loss: 6.202233
(Iteration 31271 / 50000) loss: 6.766394
(Iteration 31281 / 50000) loss: 6.649322
(Iteration 31291 / 50000) loss: 6.267230
(Iteration 31301 / 50000) loss: 7.360839
(Iteration 31311 / 50000) loss: 6.570270
(Iteration 31321 / 50000) loss: 5.898422
(Iteration 31331 / 50000) loss: 6.677997
(Iteration 31341 / 50000) loss: 6.103157
(Iteration 31351 / 50000) loss: 7.485938
(Iteration 31361 / 50000) loss: 6.365999
(Iteration 31371 / 50000) loss: 7.293494
(Iteration 31381 / 50000) loss: 6.916647
(Iteration 31391 / 50000) loss: 6.946386
(Iteration 31401 / 50000) loss: 6.354800
(Iteration 31411 / 50000) loss: 6.634653
(Iteration 31421 / 50000) loss: 6.160396
(Iteration 31431 / 50000) loss: 6.552868
(Iteration 31441 / 50000) loss: 6.782326
(Iteration 31451 / 50000) loss: 6.911940
(Iteration 31461 / 50000) loss: 6.072836
(Iteration 31471 / 50000) loss: 6.603909
(Iteration 31481 / 50000) loss: 7.167071
(Iteration 31491 / 50000) loss: 6.124860
(Iteration 31501 / 50000) loss: 7.252236
(Iteration 31511 / 50000) loss: 6.185292
(Iteration 31521 / 50000) loss: 6.165784
(Iteration 31531 / 50000) loss: 6.575437
(Iteration 31541 / 50000) loss: 7.057423
(Iteration 31551 / 50000) loss: 6.860652
(Iteration 31561 / 50000) loss: 6.833319
(Iteration 31571 / 50000) loss: 6.228390
(Iteration 31581 / 50000) loss: 6.282843
(Iteration 31591 / 50000) loss: 6.399900
(Iteration 31601 / 50000) loss: 5.728668
(Iteration 31611 / 50000) loss: 6.962562
(Iteration 31621 / 50000) loss: 6.234606
(Iteration 31631 / 50000) loss: 7.013979
(Iteration 31641 / 50000) loss: 6.185366
(Iteration 31651 / 50000) loss: 6.495507
(Iteration 31661 / 50000) loss: 6.026712
(Iteration 31671 / 50000) loss: 6.598672
(Iteration 31681 / 50000) loss: 6.678318
(Iteration 31691 / 50000) loss: 6.524831
(Iteration 31701 / 50000) loss: 7.116942
(Iteration 31711 / 50000) loss: 7.635857
(Iteration 31721 / 50000) loss: 6.806810
(Iteration 31731 / 50000) loss: 5.682414
(Iteration 31741 / 50000) loss: 6.584495
(Iteration 31751 / 50000) loss: 6.469948
(Iteration 31761 / 50000) loss: 5.597311
(Iteration 31771 / 50000) loss: 7.132330
(Iteration 31781 / 50000) loss: 5.992571
(Iteration 31791 / 50000) loss: 6.344848
(Iteration 31801 / 50000) loss: 6.531434
(Iteration 31811 / 50000) loss: 6.904576
(Iteration 31821 / 50000) loss: 7.142070
(Iteration 31831 / 50000) loss: 6.194258
(Iteration 31841 / 50000) loss: 6.623099
(Iteration 31851 / 50000) loss: 6.656893
(Iteration 31861 / 50000) loss: 6.950193
(Iteration 31871 / 50000) loss: 6.395789
(Iteration 31881 / 50000) loss: 6.064066
(Iteration 31891 / 50000) loss: 6.730887
(Iteration 31901 / 50000) loss: 6.727625
(Iteration 31911 / 50000) loss: 6.917735
(Iteration 31921 / 50000) loss: 6.352053
(Iteration 31931 / 50000) loss: 6.649856
(Iteration 31941 / 50000) loss: 7.264018
(Iteration 31951 / 50000) loss: 6.929481
(Iteration 31961 / 50000) loss: 5.822005
(Iteration 31971 / 50000) loss: 6.864611
(Iteration 31981 / 50000) loss: 6.882140
(Iteration 31991 / 50000) loss: 6.571188
(Iteration 32001 / 50000) loss: 6.503912
(Iteration 32011 / 50000) loss: 6.088359
(Iteration 32021 / 50000) loss: 6.087791
(Iteration 32031 / 50000) loss: 6.633901
(Iteration 32041 / 50000) loss: 6.081381
(Iteration 32051 / 50000) loss: 6.353980
(Iteration 32061 / 50000) loss: 6.862988
(Iteration 32071 / 50000) loss: 6.706196
(Iteration 32081 / 50000) loss: 6.040417
(Iteration 32091 / 50000) loss: 7.402182
(Iteration 32101 / 50000) loss: 6.679390
(Iteration 32111 / 50000) loss: 6.316299
(Iteration 32121 / 50000) loss: 7.189917
(Iteration 32131 / 50000) loss: 6.170017
(Iteration 32141 / 50000) loss: 6.802485
(Iteration 32151 / 50000) loss: 7.461129
(Iteration 32161 / 50000) loss: 5.888992
(Iteration 32171 / 50000) loss: 6.514722
(Iteration 32181 / 50000) loss: 6.298405
(Iteration 32191 / 50000) loss: 6.914166
(Iteration 32201 / 50000) loss: 6.258140
(Iteration 32211 / 50000) loss: 6.694226
(Iteration 32221 / 50000) loss: 6.505503
(Iteration 32231 / 50000) loss: 6.919879
(Iteration 32241 / 50000) loss: 6.198838
(Iteration 32251 / 50000) loss: 5.957718
(Iteration 32261 / 50000) loss: 7.014023
(Iteration 32271 / 50000) loss: 6.564854
(Iteration 32281 / 50000) loss: 6.971103
(Iteration 32291 / 50000) loss: 5.875910
(Iteration 32301 / 50000) loss: 6.450813
(Iteration 32311 / 50000) loss: 6.589084
(Iteration 32321 / 50000) loss: 6.694832
(Iteration 32331 / 50000) loss: 6.360074
(Iteration 32341 / 50000) loss: 6.165702
(Iteration 32351 / 50000) loss: 6.630485
(Iteration 32361 / 50000) loss: 6.470185
(Iteration 32371 / 50000) loss: 6.871434
(Iteration 32381 / 50000) loss: 6.514093
(Iteration 32391 / 50000) loss: 5.648915
(Iteration 32401 / 50000) loss: 6.413232
(Iteration 32411 / 50000) loss: 7.132882
(Iteration 32421 / 50000) loss: 6.398582
(Iteration 32431 / 50000) loss: 7.336711
(Iteration 32441 / 50000) loss: 6.400874
(Iteration 32451 / 50000) loss: 6.745983
(Iteration 32461 / 50000) loss: 7.368495
(Iteration 32471 / 50000) loss: 6.770562
(Iteration 32481 / 50000) loss: 6.446168
(Iteration 32491 / 50000) loss: 7.153647
(Iteration 32501 / 50000) loss: 6.121638
(Iteration 32511 / 50000) loss: 7.129923
(Iteration 32521 / 50000) loss: 7.062949
(Iteration 32531 / 50000) loss: 6.118818
(Iteration 32541 / 50000) loss: 6.120230
(Iteration 32551 / 50000) loss: 6.228640
(Iteration 32561 / 50000) loss: 6.579809
(Iteration 32571 / 50000) loss: 5.929590
(Iteration 32581 / 50000) loss: 6.287616
(Iteration 32591 / 50000) loss: 6.106757
(Iteration 32601 / 50000) loss: 6.197225
(Iteration 32611 / 50000) loss: 5.736285
(Iteration 32621 / 50000) loss: 6.658434
(Iteration 32631 / 50000) loss: 6.554923
(Iteration 32641 / 50000) loss: 6.476481
(Iteration 32651 / 50000) loss: 6.562245
(Iteration 32661 / 50000) loss: 6.066814
(Iteration 32671 / 50000) loss: 6.594385
(Iteration 32681 / 50000) loss: 5.970149
(Iteration 32691 / 50000) loss: 6.563892
(Iteration 32701 / 50000) loss: 6.503281
(Iteration 32711 / 50000) loss: 6.061750
(Iteration 32721 / 50000) loss: 6.219471
(Iteration 32731 / 50000) loss: 5.959925
(Iteration 32741 / 50000) loss: 6.001663
(Iteration 32751 / 50000) loss: 5.909231
(Iteration 32761 / 50000) loss: 6.885128
(Iteration 32771 / 50000) loss: 6.364004
(Iteration 32781 / 50000) loss: 6.624065
(Iteration 32791 / 50000) loss: 6.111480
(Iteration 32801 / 50000) loss: 7.202744
(Iteration 32811 / 50000) loss: 6.226539
(Iteration 32821 / 50000) loss: 6.577796
(Iteration 32831 / 50000) loss: 6.729563
(Iteration 32841 / 50000) loss: 6.482417
(Iteration 32851 / 50000) loss: 6.516869
(Iteration 32861 / 50000) loss: 5.916538
(Iteration 32871 / 50000) loss: 6.775447
(Iteration 32881 / 50000) loss: 6.508531
(Iteration 32891 / 50000) loss: 6.639704
(Iteration 32901 / 50000) loss: 6.107874
(Iteration 32911 / 50000) loss: 5.847914
(Iteration 32921 / 50000) loss: 7.073089
(Iteration 32931 / 50000) loss: 6.020523
(Iteration 32941 / 50000) loss: 5.867306
(Iteration 32951 / 50000) loss: 6.397351
(Iteration 32961 / 50000) loss: 6.559034
(Iteration 32971 / 50000) loss: 6.465102
(Iteration 32981 / 50000) loss: 6.275652
(Iteration 32991 / 50000) loss: 6.115503
(Iteration 33001 / 50000) loss: 6.772652
(Iteration 33011 / 50000) loss: 6.798830
(Iteration 33021 / 50000) loss: 6.170609
(Iteration 33031 / 50000) loss: 7.128184
(Iteration 33041 / 50000) loss: 6.084116
(Iteration 33051 / 50000) loss: 5.475770
(Iteration 33061 / 50000) loss: 6.842925
(Iteration 33071 / 50000) loss: 6.793855
(Iteration 33081 / 50000) loss: 6.113295
(Iteration 33091 / 50000) loss: 7.060971
(Iteration 33101 / 50000) loss: 6.453397
(Iteration 33111 / 50000) loss: 6.051717
(Iteration 33121 / 50000) loss: 6.802320
(Iteration 33131 / 50000) loss: 6.272760
(Iteration 33141 / 50000) loss: 5.477293
(Iteration 33151 / 50000) loss: 6.957388
(Iteration 33161 / 50000) loss: 6.737566
(Iteration 33171 / 50000) loss: 6.199370
(Iteration 33181 / 50000) loss: 6.382353
(Iteration 33191 / 50000) loss: 6.089738
(Iteration 33201 / 50000) loss: 6.061399
(Iteration 33211 / 50000) loss: 6.317818
(Iteration 33221 / 50000) loss: 6.409702
(Iteration 33231 / 50000) loss: 6.193429
(Iteration 33241 / 50000) loss: 6.608880
(Iteration 33251 / 50000) loss: 6.006261
(Iteration 33261 / 50000) loss: 6.636309
(Iteration 33271 / 50000) loss: 6.478284
(Iteration 33281 / 50000) loss: 6.186926
(Iteration 33291 / 50000) loss: 6.117874
(Iteration 33301 / 50000) loss: 7.377424
(Iteration 33311 / 50000) loss: 6.500027
(Iteration 33321 / 50000) loss: 6.468626
(Iteration 33331 / 50000) loss: 6.903806
(Iteration 33341 / 50000) loss: 6.419773
(Iteration 33351 / 50000) loss: 6.582051
(Iteration 33361 / 50000) loss: 5.953413
(Iteration 33371 / 50000) loss: 6.712571
(Iteration 33381 / 50000) loss: 5.727542
(Iteration 33391 / 50000) loss: 7.186527
(Iteration 33401 / 50000) loss: 6.534514
(Iteration 33411 / 50000) loss: 5.832942
(Iteration 33421 / 50000) loss: 5.950591
(Iteration 33431 / 50000) loss: 6.075368
(Iteration 33441 / 50000) loss: 5.949391
(Iteration 33451 / 50000) loss: 6.525452
(Iteration 33461 / 50000) loss: 6.722688
(Iteration 33471 / 50000) loss: 6.972256
(Iteration 33481 / 50000) loss: 5.715440
(Iteration 33491 / 50000) loss: 6.921008
(Iteration 33501 / 50000) loss: 7.143597
(Iteration 33511 / 50000) loss: 6.426470
(Iteration 33521 / 50000) loss: 6.610262
(Iteration 33531 / 50000) loss: 6.150328
(Iteration 33541 / 50000) loss: 6.095324
(Iteration 33551 / 50000) loss: 6.705926
(Iteration 33561 / 50000) loss: 5.756299
(Iteration 33571 / 50000) loss: 5.158575
(Iteration 33581 / 50000) loss: 5.635501
(Iteration 33591 / 50000) loss: 5.419860
(Iteration 33601 / 50000) loss: 6.097139
(Iteration 33611 / 50000) loss: 6.319955
(Iteration 33621 / 50000) loss: 6.460305
(Iteration 33631 / 50000) loss: 6.821310
(Iteration 33641 / 50000) loss: 6.208306
(Iteration 33651 / 50000) loss: 6.041909
(Iteration 33661 / 50000) loss: 6.176602
(Iteration 33671 / 50000) loss: 6.217665
(Iteration 33681 / 50000) loss: 5.632294
(Iteration 33691 / 50000) loss: 5.821758
(Iteration 33701 / 50000) loss: 6.120054
(Iteration 33711 / 50000) loss: 6.232434
(Iteration 33721 / 50000) loss: 6.575611
(Iteration 33731 / 50000) loss: 6.484099
(Iteration 33741 / 50000) loss: 6.607393
(Iteration 33751 / 50000) loss: 5.625647
(Iteration 33761 / 50000) loss: 7.011692
(Iteration 33771 / 50000) loss: 6.018433
(Iteration 33781 / 50000) loss: 6.138358
(Iteration 33791 / 50000) loss: 6.408087
(Iteration 33801 / 50000) loss: 5.877224
(Iteration 33811 / 50000) loss: 6.084623
(Iteration 33821 / 50000) loss: 6.179813
(Iteration 33831 / 50000) loss: 6.551014
(Iteration 33841 / 50000) loss: 5.901298
(Iteration 33851 / 50000) loss: 6.417708
(Iteration 33861 / 50000) loss: 5.939698
(Iteration 33871 / 50000) loss: 6.184933
(Iteration 33881 / 50000) loss: 5.931549
(Iteration 33891 / 50000) loss: 6.722104
(Iteration 33901 / 50000) loss: 6.124042
(Iteration 33911 / 50000) loss: 6.580227
(Iteration 33921 / 50000) loss: 6.615595
(Iteration 33931 / 50000) loss: 5.465714
(Iteration 33941 / 50000) loss: 5.814246
(Iteration 33951 / 50000) loss: 6.640102
(Iteration 33961 / 50000) loss: 6.369732
(Iteration 33971 / 50000) loss: 6.654973
(Iteration 33981 / 50000) loss: 5.922307
(Iteration 33991 / 50000) loss: 6.056777
(Iteration 34001 / 50000) loss: 5.867270
(Iteration 34011 / 50000) loss: 6.391993
(Iteration 34021 / 50000) loss: 6.202279
(Iteration 34031 / 50000) loss: 5.790286
(Iteration 34041 / 50000) loss: 6.703792
(Iteration 34051 / 50000) loss: 6.080486
(Iteration 34061 / 50000) loss: 6.486394
(Iteration 34071 / 50000) loss: 6.563314
(Iteration 34081 / 50000) loss: 5.719879
(Iteration 34091 / 50000) loss: 6.153733
(Iteration 34101 / 50000) loss: 6.230984
(Iteration 34111 / 50000) loss: 7.452268
(Iteration 34121 / 50000) loss: 6.211860
(Iteration 34131 / 50000) loss: 6.504547
(Iteration 34141 / 50000) loss: 5.935830
(Iteration 34151 / 50000) loss: 6.001194
(Iteration 34161 / 50000) loss: 5.892824
(Iteration 34171 / 50000) loss: 5.446536
(Iteration 34181 / 50000) loss: 5.688439
(Iteration 34191 / 50000) loss: 5.732302
(Iteration 34201 / 50000) loss: 5.745043
(Iteration 34211 / 50000) loss: 6.219178
(Iteration 34221 / 50000) loss: 6.234943
(Iteration 34231 / 50000) loss: 5.948501
(Iteration 34241 / 50000) loss: 6.308642
(Iteration 34251 / 50000) loss: 6.210519
(Iteration 34261 / 50000) loss: 6.547621
(Iteration 34271 / 50000) loss: 6.109004
(Iteration 34281 / 50000) loss: 6.310601
(Iteration 34291 / 50000) loss: 5.916971
(Iteration 34301 / 50000) loss: 6.076842
(Iteration 34311 / 50000) loss: 6.396098
(Iteration 34321 / 50000) loss: 5.964274
(Iteration 34331 / 50000) loss: 6.288417
(Iteration 34341 / 50000) loss: 5.619789
(Iteration 34351 / 50000) loss: 6.473602
(Iteration 34361 / 50000) loss: 6.245280
(Iteration 34371 / 50000) loss: 6.345651
(Iteration 34381 / 50000) loss: 6.012103
(Iteration 34391 / 50000) loss: 6.077355
(Iteration 34401 / 50000) loss: 6.581474
(Iteration 34411 / 50000) loss: 6.707255
(Iteration 34421 / 50000) loss: 6.492517
(Iteration 34431 / 50000) loss: 6.341260
(Iteration 34441 / 50000) loss: 6.645651
(Iteration 34451 / 50000) loss: 6.078804
(Iteration 34461 / 50000) loss: 6.982769
(Iteration 34471 / 50000) loss: 6.327593
(Iteration 34481 / 50000) loss: 6.852869
(Iteration 34491 / 50000) loss: 6.353560
(Iteration 34501 / 50000) loss: 6.103495
(Iteration 34511 / 50000) loss: 6.699489
(Iteration 34521 / 50000) loss: 6.114716
(Iteration 34531 / 50000) loss: 5.528866
(Iteration 34541 / 50000) loss: 6.243149
(Iteration 34551 / 50000) loss: 6.495166
(Iteration 34561 / 50000) loss: 5.942905
(Iteration 34571 / 50000) loss: 5.183140
(Iteration 34581 / 50000) loss: 6.180903
(Iteration 34591 / 50000) loss: 5.963105
(Iteration 34601 / 50000) loss: 5.794618
(Iteration 34611 / 50000) loss: 5.709130
(Iteration 34621 / 50000) loss: 6.213696
(Iteration 34631 / 50000) loss: 5.908803
(Iteration 34641 / 50000) loss: 5.579823
(Iteration 34651 / 50000) loss: 6.539295
(Iteration 34661 / 50000) loss: 6.772353
(Iteration 34671 / 50000) loss: 6.456310
(Iteration 34681 / 50000) loss: 6.438612
(Iteration 34691 / 50000) loss: 6.116370
(Iteration 34701 / 50000) loss: 6.354158
(Iteration 34711 / 50000) loss: 6.202469
(Iteration 34721 / 50000) loss: 6.567620
(Iteration 34731 / 50000) loss: 6.564256
(Iteration 34741 / 50000) loss: 6.698931
(Iteration 34751 / 50000) loss: 6.491751
(Iteration 34761 / 50000) loss: 5.440385
(Iteration 34771 / 50000) loss: 5.842072
(Iteration 34781 / 50000) loss: 5.659965
(Iteration 34791 / 50000) loss: 6.294562
(Iteration 34801 / 50000) loss: 6.846735
(Iteration 34811 / 50000) loss: 6.912852
(Iteration 34821 / 50000) loss: 5.708560
(Iteration 34831 / 50000) loss: 6.504806
(Iteration 34841 / 50000) loss: 6.750461
(Iteration 34851 / 50000) loss: 5.888388
(Iteration 34861 / 50000) loss: 5.629185
(Iteration 34871 / 50000) loss: 5.527789
(Iteration 34881 / 50000) loss: 5.995371
(Iteration 34891 / 50000) loss: 6.053866
(Iteration 34901 / 50000) loss: 6.411787
(Iteration 34911 / 50000) loss: 5.905584
(Iteration 34921 / 50000) loss: 6.426443
(Iteration 34931 / 50000) loss: 5.839538
(Iteration 34941 / 50000) loss: 6.094960
(Iteration 34951 / 50000) loss: 6.055451
(Iteration 34961 / 50000) loss: 5.881227
(Iteration 34971 / 50000) loss: 5.879697
(Iteration 34981 / 50000) loss: 6.260236
(Iteration 34991 / 50000) loss: 6.223913
(Iteration 35001 / 50000) loss: 6.204674
(Iteration 35011 / 50000) loss: 5.972530
(Iteration 35021 / 50000) loss: 5.949280
(Iteration 35031 / 50000) loss: 5.799331
(Iteration 35041 / 50000) loss: 6.151718
(Iteration 35051 / 50000) loss: 5.747576
(Iteration 35061 / 50000) loss: 5.827331
(Iteration 35071 / 50000) loss: 6.906164
(Iteration 35081 / 50000) loss: 5.362863
(Iteration 35091 / 50000) loss: 5.934331
(Iteration 35101 / 50000) loss: 5.990564
(Iteration 35111 / 50000) loss: 5.278743
(Iteration 35121 / 50000) loss: 5.737533
(Iteration 35131 / 50000) loss: 5.902167
(Iteration 35141 / 50000) loss: 6.197280
(Iteration 35151 / 50000) loss: 6.460800
(Iteration 35161 / 50000) loss: 5.992568
(Iteration 35171 / 50000) loss: 6.838581
(Iteration 35181 / 50000) loss: 6.224810
(Iteration 35191 / 50000) loss: 6.075601
(Iteration 35201 / 50000) loss: 5.392961
(Iteration 35211 / 50000) loss: 6.184629
(Iteration 35221 / 50000) loss: 5.777076
(Iteration 35231 / 50000) loss: 6.553021
(Iteration 35241 / 50000) loss: 6.025959
(Iteration 35251 / 50000) loss: 5.792927
(Iteration 35261 / 50000) loss: 5.356376
(Iteration 35271 / 50000) loss: 5.806860
(Iteration 35281 / 50000) loss: 6.307287
(Iteration 35291 / 50000) loss: 5.579786
(Iteration 35301 / 50000) loss: 6.131964
(Iteration 35311 / 50000) loss: 6.263736
(Iteration 35321 / 50000) loss: 6.141942
(Iteration 35331 / 50000) loss: 5.610599
(Iteration 35341 / 50000) loss: 5.796452
(Iteration 35351 / 50000) loss: 5.655140
(Iteration 35361 / 50000) loss: 6.235911
(Iteration 35371 / 50000) loss: 6.123786
(Iteration 35381 / 50000) loss: 5.965900
(Iteration 35391 / 50000) loss: 6.142193
(Iteration 35401 / 50000) loss: 5.726459
(Iteration 35411 / 50000) loss: 6.336547
(Iteration 35421 / 50000) loss: 6.048051
(Iteration 35431 / 50000) loss: 6.855697
(Iteration 35441 / 50000) loss: 5.779220
(Iteration 35451 / 50000) loss: 5.888858
(Iteration 35461 / 50000) loss: 6.149855
(Iteration 35471 / 50000) loss: 5.551447
(Iteration 35481 / 50000) loss: 6.971084
(Iteration 35491 / 50000) loss: 6.260471
(Iteration 35501 / 50000) loss: 6.359081
(Iteration 35511 / 50000) loss: 5.476464
(Iteration 35521 / 50000) loss: 6.608962
(Iteration 35531 / 50000) loss: 6.042954
(Iteration 35541 / 50000) loss: 5.980465
(Iteration 35551 / 50000) loss: 5.738495
(Iteration 35561 / 50000) loss: 5.806937
(Iteration 35571 / 50000) loss: 6.203372
(Iteration 35581 / 50000) loss: 5.691847
(Iteration 35591 / 50000) loss: 6.286928
(Iteration 35601 / 50000) loss: 6.081672
(Iteration 35611 / 50000) loss: 5.833375
(Iteration 35621 / 50000) loss: 5.181817
(Iteration 35631 / 50000) loss: 6.511720
(Iteration 35641 / 50000) loss: 5.961867
(Iteration 35651 / 50000) loss: 5.900588
(Iteration 35661 / 50000) loss: 6.183792
(Iteration 35671 / 50000) loss: 5.962975
(Iteration 35681 / 50000) loss: 5.293322
(Iteration 35691 / 50000) loss: 6.573689
(Iteration 35701 / 50000) loss: 6.046567
(Iteration 35711 / 50000) loss: 5.536121
(Iteration 35721 / 50000) loss: 5.340922
(Iteration 35731 / 50000) loss: 5.577808
(Iteration 35741 / 50000) loss: 6.259829
(Iteration 35751 / 50000) loss: 6.788752
(Iteration 35761 / 50000) loss: 5.812124
(Iteration 35771 / 50000) loss: 5.141500
(Iteration 35781 / 50000) loss: 5.890078
(Iteration 35791 / 50000) loss: 6.436054
(Iteration 35801 / 50000) loss: 6.335947
(Iteration 35811 / 50000) loss: 5.637908
(Iteration 35821 / 50000) loss: 5.570034
(Iteration 35831 / 50000) loss: 5.946718
(Iteration 35841 / 50000) loss: 5.963629
(Iteration 35851 / 50000) loss: 5.977232
(Iteration 35861 / 50000) loss: 5.558967
(Iteration 35871 / 50000) loss: 6.020352
(Iteration 35881 / 50000) loss: 5.496058
(Iteration 35891 / 50000) loss: 5.307707
(Iteration 35901 / 50000) loss: 5.772600
(Iteration 35911 / 50000) loss: 5.850235
(Iteration 35921 / 50000) loss: 5.878691
(Iteration 35931 / 50000) loss: 6.143219
(Iteration 35941 / 50000) loss: 5.808921
(Iteration 35951 / 50000) loss: 6.156000
(Iteration 35961 / 50000) loss: 5.300365
(Iteration 35971 / 50000) loss: 5.321035
(Iteration 35981 / 50000) loss: 5.980645
(Iteration 35991 / 50000) loss: 5.786727
(Iteration 36001 / 50000) loss: 5.798022
(Iteration 36011 / 50000) loss: 5.649387
(Iteration 36021 / 50000) loss: 6.292190
(Iteration 36031 / 50000) loss: 6.616163
(Iteration 36041 / 50000) loss: 6.275842
(Iteration 36051 / 50000) loss: 5.290043
(Iteration 36061 / 50000) loss: 5.553420
(Iteration 36071 / 50000) loss: 5.259310
(Iteration 36081 / 50000) loss: 6.280230
(Iteration 36091 / 50000) loss: 6.552514
(Iteration 36101 / 50000) loss: 5.458178
(Iteration 36111 / 50000) loss: 5.140199
(Iteration 36121 / 50000) loss: 5.312321
(Iteration 36131 / 50000) loss: 5.107644
(Iteration 36141 / 50000) loss: 6.172249
(Iteration 36151 / 50000) loss: 6.057468
(Iteration 36161 / 50000) loss: 5.576723
(Iteration 36171 / 50000) loss: 6.294926
(Iteration 36181 / 50000) loss: 6.251414
(Iteration 36191 / 50000) loss: 6.843019
(Iteration 36201 / 50000) loss: 6.187479
(Iteration 36211 / 50000) loss: 6.343057
(Iteration 36221 / 50000) loss: 6.223429
(Iteration 36231 / 50000) loss: 6.192047
(Iteration 36241 / 50000) loss: 5.179218
(Iteration 36251 / 50000) loss: 6.274055
(Iteration 36261 / 50000) loss: 5.865305
(Iteration 36271 / 50000) loss: 5.916608
(Iteration 36281 / 50000) loss: 6.720035
(Iteration 36291 / 50000) loss: 5.850962
(Iteration 36301 / 50000) loss: 6.465898
(Iteration 36311 / 50000) loss: 5.610715
(Iteration 36321 / 50000) loss: 6.273708
(Iteration 36331 / 50000) loss: 6.669589
(Iteration 36341 / 50000) loss: 5.562871
(Iteration 36351 / 50000) loss: 6.324941
(Iteration 36361 / 50000) loss: 6.099401
(Iteration 36371 / 50000) loss: 6.226664
(Iteration 36381 / 50000) loss: 5.820413
(Iteration 36391 / 50000) loss: 5.433854
(Iteration 36401 / 50000) loss: 5.980405
(Iteration 36411 / 50000) loss: 6.236756
(Iteration 36421 / 50000) loss: 6.477215
(Iteration 36431 / 50000) loss: 6.668716
(Iteration 36441 / 50000) loss: 6.384339
(Iteration 36451 / 50000) loss: 5.753316
(Iteration 36461 / 50000) loss: 5.427437
(Iteration 36471 / 50000) loss: 6.738424
(Iteration 36481 / 50000) loss: 6.632324
(Iteration 36491 / 50000) loss: 5.493319
(Iteration 36501 / 50000) loss: 5.341340
(Iteration 36511 / 50000) loss: 6.038316
(Iteration 36521 / 50000) loss: 5.428177
(Iteration 36531 / 50000) loss: 6.096582
(Iteration 36541 / 50000) loss: 6.372317
(Iteration 36551 / 50000) loss: 5.665790
(Iteration 36561 / 50000) loss: 6.002690
(Iteration 36571 / 50000) loss: 6.437604
(Iteration 36581 / 50000) loss: 6.103149
(Iteration 36591 / 50000) loss: 5.909182
(Iteration 36601 / 50000) loss: 5.822899
(Iteration 36611 / 50000) loss: 5.331941
(Iteration 36621 / 50000) loss: 5.863524
(Iteration 36631 / 50000) loss: 6.107216
(Iteration 36641 / 50000) loss: 5.384054
(Iteration 36651 / 50000) loss: 6.366263
(Iteration 36661 / 50000) loss: 5.997478
(Iteration 36671 / 50000) loss: 5.842438
(Iteration 36681 / 50000) loss: 5.040924
(Iteration 36691 / 50000) loss: 5.516166
(Iteration 36701 / 50000) loss: 6.492134
(Iteration 36711 / 50000) loss: 5.954701
(Iteration 36721 / 50000) loss: 6.131226
(Iteration 36731 / 50000) loss: 6.124965
(Iteration 36741 / 50000) loss: 5.449492
(Iteration 36751 / 50000) loss: 5.788028
(Iteration 36761 / 50000) loss: 6.571535
(Iteration 36771 / 50000) loss: 5.825789
(Iteration 36781 / 50000) loss: 6.092513
(Iteration 36791 / 50000) loss: 6.099758
(Iteration 36801 / 50000) loss: 5.617866
(Iteration 36811 / 50000) loss: 6.251101
(Iteration 36821 / 50000) loss: 5.928293
(Iteration 36831 / 50000) loss: 5.472995
(Iteration 36841 / 50000) loss: 5.966611
(Iteration 36851 / 50000) loss: 5.409573
(Iteration 36861 / 50000) loss: 5.934099
(Iteration 36871 / 50000) loss: 5.569891
(Iteration 36881 / 50000) loss: 5.900413
(Iteration 36891 / 50000) loss: 6.259734
(Iteration 36901 / 50000) loss: 5.332795
(Iteration 36911 / 50000) loss: 5.156120
(Iteration 36921 / 50000) loss: 6.411997
(Iteration 36931 / 50000) loss: 5.832558
(Iteration 36941 / 50000) loss: 5.856943
(Iteration 36951 / 50000) loss: 5.496552
(Iteration 36961 / 50000) loss: 5.797537
(Iteration 36971 / 50000) loss: 5.977353
(Iteration 36981 / 50000) loss: 5.672292
(Iteration 36991 / 50000) loss: 6.522848
(Iteration 37001 / 50000) loss: 5.713881
(Iteration 37011 / 50000) loss: 5.538757
(Iteration 37021 / 50000) loss: 6.061649
(Iteration 37031 / 50000) loss: 5.746229
(Iteration 37041 / 50000) loss: 5.794309
(Iteration 37051 / 50000) loss: 5.304345
(Iteration 37061 / 50000) loss: 5.380011
(Iteration 37071 / 50000) loss: 5.281807
(Iteration 37081 / 50000) loss: 5.954399
(Iteration 37091 / 50000) loss: 6.160084
(Iteration 37101 / 50000) loss: 6.008944
(Iteration 37111 / 50000) loss: 5.231755
(Iteration 37121 / 50000) loss: 6.136147
(Iteration 37131 / 50000) loss: 5.612661
(Iteration 37141 / 50000) loss: 6.061099
(Iteration 37151 / 50000) loss: 5.650736
(Iteration 37161 / 50000) loss: 5.297460
(Iteration 37171 / 50000) loss: 5.378065
(Iteration 37181 / 50000) loss: 6.749672
(Iteration 37191 / 50000) loss: 5.768361
(Iteration 37201 / 50000) loss: 6.220980
(Iteration 37211 / 50000) loss: 5.151656
(Iteration 37221 / 50000) loss: 5.133945
(Iteration 37231 / 50000) loss: 5.274505
(Iteration 37241 / 50000) loss: 5.847536
(Iteration 37251 / 50000) loss: 5.849229
(Iteration 37261 / 50000) loss: 5.519047
(Iteration 37271 / 50000) loss: 5.266188
(Iteration 37281 / 50000) loss: 6.220080
(Iteration 37291 / 50000) loss: 5.904487
(Iteration 37301 / 50000) loss: 5.705117
(Iteration 37311 / 50000) loss: 5.735493
(Iteration 37321 / 50000) loss: 5.500305
(Iteration 37331 / 50000) loss: 6.131462
(Iteration 37341 / 50000) loss: 5.441907
(Iteration 37351 / 50000) loss: 5.673763
(Iteration 37361 / 50000) loss: 5.723968
(Iteration 37371 / 50000) loss: 5.631703
(Iteration 37381 / 50000) loss: 5.742857
(Iteration 37391 / 50000) loss: 5.936949
(Iteration 37401 / 50000) loss: 5.800348
(Iteration 37411 / 50000) loss: 5.208215
(Iteration 37421 / 50000) loss: 6.607208
(Iteration 37431 / 50000) loss: 6.062491
(Iteration 37441 / 50000) loss: 5.973464
(Iteration 37451 / 50000) loss: 5.628569
(Iteration 37461 / 50000) loss: 5.994537
(Iteration 37471 / 50000) loss: 5.951440
(Iteration 37481 / 50000) loss: 6.103820
(Iteration 37491 / 50000) loss: 6.194556
(Iteration 37501 / 50000) loss: 4.757592
(Iteration 37511 / 50000) loss: 6.009471
(Iteration 37521 / 50000) loss: 5.655467
(Iteration 37531 / 50000) loss: 6.039258
(Iteration 37541 / 50000) loss: 5.476729
(Iteration 37551 / 50000) loss: 6.084130
(Iteration 37561 / 50000) loss: 6.418392
(Iteration 37571 / 50000) loss: 6.323510
(Iteration 37581 / 50000) loss: 6.017204
(Iteration 37591 / 50000) loss: 6.008333
(Iteration 37601 / 50000) loss: 5.748596
(Iteration 37611 / 50000) loss: 5.701552
(Iteration 37621 / 50000) loss: 5.351684
(Iteration 37631 / 50000) loss: 5.112128
(Iteration 37641 / 50000) loss: 6.078094
(Iteration 37651 / 50000) loss: 5.955578
(Iteration 37661 / 50000) loss: 5.952730
(Iteration 37671 / 50000) loss: 5.709934
(Iteration 37681 / 50000) loss: 6.655486
(Iteration 37691 / 50000) loss: 5.391398
(Iteration 37701 / 50000) loss: 6.133303
(Iteration 37711 / 50000) loss: 5.664954
(Iteration 37721 / 50000) loss: 5.495217
(Iteration 37731 / 50000) loss: 6.029771
(Iteration 37741 / 50000) loss: 5.820692
(Iteration 37751 / 50000) loss: 5.651679
(Iteration 37761 / 50000) loss: 6.171649
(Iteration 37771 / 50000) loss: 5.782313
(Iteration 37781 / 50000) loss: 5.837087
(Iteration 37791 / 50000) loss: 5.784001
(Iteration 37801 / 50000) loss: 5.318070
(Iteration 37811 / 50000) loss: 5.932943
(Iteration 37821 / 50000) loss: 6.296930
(Iteration 37831 / 50000) loss: 5.004304
(Iteration 37841 / 50000) loss: 6.036933
(Iteration 37851 / 50000) loss: 5.822437
(Iteration 37861 / 50000) loss: 5.996201
(Iteration 37871 / 50000) loss: 6.105077
(Iteration 37881 / 50000) loss: 5.402329
(Iteration 37891 / 50000) loss: 5.792655
(Iteration 37901 / 50000) loss: 5.463986
(Iteration 37911 / 50000) loss: 5.076759
(Iteration 37921 / 50000) loss: 5.606504
(Iteration 37931 / 50000) loss: 6.830954
(Iteration 37941 / 50000) loss: 5.807423
(Iteration 37951 / 50000) loss: 6.080746
(Iteration 37961 / 50000) loss: 6.209117
(Iteration 37971 / 50000) loss: 5.500764
(Iteration 37981 / 50000) loss: 5.702669
(Iteration 37991 / 50000) loss: 5.856465
(Iteration 38001 / 50000) loss: 5.975811
(Iteration 38011 / 50000) loss: 5.891304
(Iteration 38021 / 50000) loss: 6.564999
(Iteration 38031 / 50000) loss: 5.659295
(Iteration 38041 / 50000) loss: 5.637956
(Iteration 38051 / 50000) loss: 4.944180
(Iteration 38061 / 50000) loss: 4.997287
(Iteration 38071 / 50000) loss: 5.978300
(Iteration 38081 / 50000) loss: 5.882469
(Iteration 38091 / 50000) loss: 5.625664
(Iteration 38101 / 50000) loss: 5.361724
(Iteration 38111 / 50000) loss: 6.055125
(Iteration 38121 / 50000) loss: 4.861261
(Iteration 38131 / 50000) loss: 6.157290
(Iteration 38141 / 50000) loss: 5.629681
(Iteration 38151 / 50000) loss: 5.415113
(Iteration 38161 / 50000) loss: 6.104999
(Iteration 38171 / 50000) loss: 5.833349
(Iteration 38181 / 50000) loss: 6.111832
(Iteration 38191 / 50000) loss: 5.579053
(Iteration 38201 / 50000) loss: 6.075328
(Iteration 38211 / 50000) loss: 5.024038
(Iteration 38221 / 50000) loss: 5.510834
(Iteration 38231 / 50000) loss: 5.756928
(Iteration 38241 / 50000) loss: 5.698095
(Iteration 38251 / 50000) loss: 4.784123
(Iteration 38261 / 50000) loss: 5.427448
(Iteration 38271 / 50000) loss: 5.782633
(Iteration 38281 / 50000) loss: 5.987512
(Iteration 38291 / 50000) loss: 5.478483
(Iteration 38301 / 50000) loss: 5.639373
(Iteration 38311 / 50000) loss: 5.708324
(Iteration 38321 / 50000) loss: 5.164860
(Iteration 38331 / 50000) loss: 5.491224
(Iteration 38341 / 50000) loss: 5.438339
(Iteration 38351 / 50000) loss: 5.637159
(Iteration 38361 / 50000) loss: 6.148258
(Iteration 38371 / 50000) loss: 5.807697
(Iteration 38381 / 50000) loss: 5.584172
(Iteration 38391 / 50000) loss: 5.968767
(Iteration 38401 / 50000) loss: 4.996792
(Iteration 38411 / 50000) loss: 5.806882
(Iteration 38421 / 50000) loss: 4.645729
(Iteration 38431 / 50000) loss: 5.334190
(Iteration 38441 / 50000) loss: 5.091575
(Iteration 38451 / 50000) loss: 6.112029
(Iteration 38461 / 50000) loss: 6.071476
(Iteration 38471 / 50000) loss: 5.784314
(Iteration 38481 / 50000) loss: 5.369188
(Iteration 38491 / 50000) loss: 5.330884
(Iteration 38501 / 50000) loss: 5.292777
(Iteration 38511 / 50000) loss: 5.578560
(Iteration 38521 / 50000) loss: 5.530669
(Iteration 38531 / 50000) loss: 6.297039
(Iteration 38541 / 50000) loss: 5.705751
(Iteration 38551 / 50000) loss: 5.659738
(Iteration 38561 / 50000) loss: 6.075870
(Iteration 38571 / 50000) loss: 5.316460
(Iteration 38581 / 50000) loss: 5.566412
(Iteration 38591 / 50000) loss: 6.213158
(Iteration 38601 / 50000) loss: 5.902636
(Iteration 38611 / 50000) loss: 5.844347
(Iteration 38621 / 50000) loss: 5.077012
(Iteration 38631 / 50000) loss: 5.975278
(Iteration 38641 / 50000) loss: 5.764465
(Iteration 38651 / 50000) loss: 5.452162
(Iteration 38661 / 50000) loss: 5.611415
(Iteration 38671 / 50000) loss: 5.572511
(Iteration 38681 / 50000) loss: 6.690976
(Iteration 38691 / 50000) loss: 5.648686
(Iteration 38701 / 50000) loss: 5.487986
(Iteration 38711 / 50000) loss: 5.205107
(Iteration 38721 / 50000) loss: 6.487259
(Iteration 38731 / 50000) loss: 5.489800
(Iteration 38741 / 50000) loss: 5.669317
(Iteration 38751 / 50000) loss: 5.528761
(Iteration 38761 / 50000) loss: 5.729817
(Iteration 38771 / 50000) loss: 5.759655
(Iteration 38781 / 50000) loss: 5.328182
(Iteration 38791 / 50000) loss: 5.806673
(Iteration 38801 / 50000) loss: 5.280409
(Iteration 38811 / 50000) loss: 5.091848
(Iteration 38821 / 50000) loss: 5.260470
(Iteration 38831 / 50000) loss: 5.809851
(Iteration 38841 / 50000) loss: 6.064303
(Iteration 38851 / 50000) loss: 5.482559
(Iteration 38861 / 50000) loss: 6.205742
(Iteration 38871 / 50000) loss: 5.453959
(Iteration 38881 / 50000) loss: 4.948289
(Iteration 38891 / 50000) loss: 5.572090
(Iteration 38901 / 50000) loss: 6.101616
(Iteration 38911 / 50000) loss: 5.939502
(Iteration 38921 / 50000) loss: 5.369668
(Iteration 38931 / 50000) loss: 5.281850
(Iteration 38941 / 50000) loss: 5.759258
(Iteration 38951 / 50000) loss: 6.047507
(Iteration 38961 / 50000) loss: 5.254425
(Iteration 38971 / 50000) loss: 5.720087
(Iteration 38981 / 50000) loss: 5.678466
(Iteration 38991 / 50000) loss: 5.998131
(Iteration 39001 / 50000) loss: 5.617399
(Iteration 39011 / 50000) loss: 5.496122
(Iteration 39021 / 50000) loss: 5.938125
(Iteration 39031 / 50000) loss: 5.002325
(Iteration 39041 / 50000) loss: 5.582337
(Iteration 39051 / 50000) loss: 5.716881
(Iteration 39061 / 50000) loss: 5.632376
(Iteration 39071 / 50000) loss: 5.465307
(Iteration 39081 / 50000) loss: 6.011772
(Iteration 39091 / 50000) loss: 5.048175
(Iteration 39101 / 50000) loss: 6.481903
(Iteration 39111 / 50000) loss: 6.240449
(Iteration 39121 / 50000) loss: 5.201045
(Iteration 39131 / 50000) loss: 5.419336
(Iteration 39141 / 50000) loss: 5.954873
(Iteration 39151 / 50000) loss: 5.713175
(Iteration 39161 / 50000) loss: 5.583934
(Iteration 39171 / 50000) loss: 5.970361
(Iteration 39181 / 50000) loss: 5.706827
(Iteration 39191 / 50000) loss: 5.636537
(Iteration 39201 / 50000) loss: 6.425813
(Iteration 39211 / 50000) loss: 5.098257
(Iteration 39221 / 50000) loss: 5.794915
(Iteration 39231 / 50000) loss: 6.062106
(Iteration 39241 / 50000) loss: 5.576505
(Iteration 39251 / 50000) loss: 5.104898
(Iteration 39261 / 50000) loss: 6.105964
(Iteration 39271 / 50000) loss: 5.261367
(Iteration 39281 / 50000) loss: 5.399304
(Iteration 39291 / 50000) loss: 4.744284
(Iteration 39301 / 50000) loss: 5.662089
(Iteration 39311 / 50000) loss: 5.188102
(Iteration 39321 / 50000) loss: 5.625708
(Iteration 39331 / 50000) loss: 5.119836
(Iteration 39341 / 50000) loss: 5.973935
(Iteration 39351 / 50000) loss: 5.762032
(Iteration 39361 / 50000) loss: 5.558610
(Iteration 39371 / 50000) loss: 5.799994
(Iteration 39381 / 50000) loss: 5.305445
(Iteration 39391 / 50000) loss: 6.316374
(Iteration 39401 / 50000) loss: 5.115601
(Iteration 39411 / 50000) loss: 5.193814
(Iteration 39421 / 50000) loss: 5.506855
(Iteration 39431 / 50000) loss: 5.395265
(Iteration 39441 / 50000) loss: 5.544790
(Iteration 39451 / 50000) loss: 4.928855
(Iteration 39461 / 50000) loss: 5.722966
(Iteration 39471 / 50000) loss: 4.867957
(Iteration 39481 / 50000) loss: 6.148445
(Iteration 39491 / 50000) loss: 5.832144
(Iteration 39501 / 50000) loss: 5.663852
(Iteration 39511 / 50000) loss: 6.383483
(Iteration 39521 / 50000) loss: 5.597765
(Iteration 39531 / 50000) loss: 5.550085
(Iteration 39541 / 50000) loss: 5.143458
(Iteration 39551 / 50000) loss: 6.131054
(Iteration 39561 / 50000) loss: 5.770988
(Iteration 39571 / 50000) loss: 5.770063
(Iteration 39581 / 50000) loss: 5.984470
(Iteration 39591 / 50000) loss: 5.540641
(Iteration 39601 / 50000) loss: 5.507337
(Iteration 39611 / 50000) loss: 5.664418
(Iteration 39621 / 50000) loss: 4.842382
(Iteration 39631 / 50000) loss: 5.762154
(Iteration 39641 / 50000) loss: 6.186003
(Iteration 39651 / 50000) loss: 5.573048
(Iteration 39661 / 50000) loss: 6.032389
(Iteration 39671 / 50000) loss: 5.383506
(Iteration 39681 / 50000) loss: 6.101142
(Iteration 39691 / 50000) loss: 5.582388
(Iteration 39701 / 50000) loss: 5.967876
(Iteration 39711 / 50000) loss: 5.096036
(Iteration 39721 / 50000) loss: 6.222730
(Iteration 39731 / 50000) loss: 5.696388
(Iteration 39741 / 50000) loss: 5.482920
(Iteration 39751 / 50000) loss: 5.141216
(Iteration 39761 / 50000) loss: 5.696793
(Iteration 39771 / 50000) loss: 5.259194
(Iteration 39781 / 50000) loss: 5.060760
(Iteration 39791 / 50000) loss: 5.748764
(Iteration 39801 / 50000) loss: 5.909417
(Iteration 39811 / 50000) loss: 5.851249
(Iteration 39821 / 50000) loss: 5.197184
(Iteration 39831 / 50000) loss: 5.286833
(Iteration 39841 / 50000) loss: 5.271277
(Iteration 39851 / 50000) loss: 5.905715
(Iteration 39861 / 50000) loss: 5.173069
(Iteration 39871 / 50000) loss: 6.187474
(Iteration 39881 / 50000) loss: 4.937187
(Iteration 39891 / 50000) loss: 6.356470
(Iteration 39901 / 50000) loss: 5.909963
(Iteration 39911 / 50000) loss: 6.231213
(Iteration 39921 / 50000) loss: 5.692571
(Iteration 39931 / 50000) loss: 5.867777
(Iteration 39941 / 50000) loss: 5.640531
(Iteration 39951 / 50000) loss: 5.444580
(Iteration 39961 / 50000) loss: 5.590076
(Iteration 39971 / 50000) loss: 5.972519
(Iteration 39981 / 50000) loss: 5.630186
(Iteration 39991 / 50000) loss: 5.496582
(Iteration 40001 / 50000) loss: 6.149565
(Iteration 40011 / 50000) loss: 5.547661
(Iteration 40021 / 50000) loss: 5.612764
(Iteration 40031 / 50000) loss: 5.580825
(Iteration 40041 / 50000) loss: 6.050694
(Iteration 40051 / 50000) loss: 5.787826
(Iteration 40061 / 50000) loss: 5.178556
(Iteration 40071 / 50000) loss: 5.366058
(Iteration 40081 / 50000) loss: 5.519518
(Iteration 40091 / 50000) loss: 4.847500
(Iteration 40101 / 50000) loss: 5.630235
(Iteration 40111 / 50000) loss: 5.781289
(Iteration 40121 / 50000) loss: 5.316692
(Iteration 40131 / 50000) loss: 4.927374
(Iteration 40141 / 50000) loss: 5.735411
(Iteration 40151 / 50000) loss: 5.474232
(Iteration 40161 / 50000) loss: 5.052732
(Iteration 40171 / 50000) loss: 5.580441
(Iteration 40181 / 50000) loss: 5.851517
(Iteration 40191 / 50000) loss: 5.907941
(Iteration 40201 / 50000) loss: 5.634331
(Iteration 40211 / 50000) loss: 5.282181
(Iteration 40221 / 50000) loss: 5.239828
(Iteration 40231 / 50000) loss: 4.759852
(Iteration 40241 / 50000) loss: 5.529556
(Iteration 40251 / 50000) loss: 5.774253
(Iteration 40261 / 50000) loss: 5.864738
(Iteration 40271 / 50000) loss: 5.413490
(Iteration 40281 / 50000) loss: 5.434329
(Iteration 40291 / 50000) loss: 5.268287
(Iteration 40301 / 50000) loss: 6.131868
(Iteration 40311 / 50000) loss: 5.366401
(Iteration 40321 / 50000) loss: 5.979022
(Iteration 40331 / 50000) loss: 5.480539
(Iteration 40341 / 50000) loss: 4.876818
(Iteration 40351 / 50000) loss: 5.845873
(Iteration 40361 / 50000) loss: 5.919489
(Iteration 40371 / 50000) loss: 5.669002
(Iteration 40381 / 50000) loss: 5.779053
(Iteration 40391 / 50000) loss: 5.259434
(Iteration 40401 / 50000) loss: 5.834647
(Iteration 40411 / 50000) loss: 5.530699
(Iteration 40421 / 50000) loss: 5.849983
(Iteration 40431 / 50000) loss: 5.558140
(Iteration 40441 / 50000) loss: 5.425610
(Iteration 40451 / 50000) loss: 5.443937
(Iteration 40461 / 50000) loss: 5.328678
(Iteration 40471 / 50000) loss: 5.829209
(Iteration 40481 / 50000) loss: 5.436939
(Iteration 40491 / 50000) loss: 5.085971
(Iteration 40501 / 50000) loss: 5.102167
(Iteration 40511 / 50000) loss: 5.500458
(Iteration 40521 / 50000) loss: 4.603740
(Iteration 40531 / 50000) loss: 5.437447
(Iteration 40541 / 50000) loss: 4.834800
(Iteration 40551 / 50000) loss: 5.793722
(Iteration 40561 / 50000) loss: 5.028161
(Iteration 40571 / 50000) loss: 5.249849
(Iteration 40581 / 50000) loss: 6.238836
(Iteration 40591 / 50000) loss: 5.523617
(Iteration 40601 / 50000) loss: 5.343746
(Iteration 40611 / 50000) loss: 5.653337
(Iteration 40621 / 50000) loss: 5.837091
(Iteration 40631 / 50000) loss: 5.167337
(Iteration 40641 / 50000) loss: 4.996474
(Iteration 40651 / 50000) loss: 5.355239
(Iteration 40661 / 50000) loss: 5.804355
(Iteration 40671 / 50000) loss: 5.244627
(Iteration 40681 / 50000) loss: 5.857480
(Iteration 40691 / 50000) loss: 5.423480
(Iteration 40701 / 50000) loss: 5.437679
(Iteration 40711 / 50000) loss: 4.986427
(Iteration 40721 / 50000) loss: 5.552220
(Iteration 40731 / 50000) loss: 5.672102
(Iteration 40741 / 50000) loss: 5.322354
(Iteration 40751 / 50000) loss: 5.859338
(Iteration 40761 / 50000) loss: 5.582067
(Iteration 40771 / 50000) loss: 4.988993
(Iteration 40781 / 50000) loss: 5.153318
(Iteration 40791 / 50000) loss: 4.890143
(Iteration 40801 / 50000) loss: 4.978778
(Iteration 40811 / 50000) loss: 5.361416
(Iteration 40821 / 50000) loss: 5.590392
(Iteration 40831 / 50000) loss: 5.059602
(Iteration 40841 / 50000) loss: 5.559934
(Iteration 40851 / 50000) loss: 5.670769
(Iteration 40861 / 50000) loss: 5.902595
(Iteration 40871 / 50000) loss: 5.055137
(Iteration 40881 / 50000) loss: 5.759760
(Iteration 40891 / 50000) loss: 5.265455
(Iteration 40901 / 50000) loss: 5.437934
(Iteration 40911 / 50000) loss: 5.568351
(Iteration 40921 / 50000) loss: 5.563744
(Iteration 40931 / 50000) loss: 5.187555
(Iteration 40941 / 50000) loss: 5.763455
(Iteration 40951 / 50000) loss: 5.550783
(Iteration 40961 / 50000) loss: 5.404901
(Iteration 40971 / 50000) loss: 5.377658
(Iteration 40981 / 50000) loss: 5.080455
(Iteration 40991 / 50000) loss: 5.028849
(Iteration 41001 / 50000) loss: 5.647064
(Iteration 41011 / 50000) loss: 5.548365
(Iteration 41021 / 50000) loss: 5.213434
(Iteration 41031 / 50000) loss: 5.933386
(Iteration 41041 / 50000) loss: 4.710100
(Iteration 41051 / 50000) loss: 5.688946
(Iteration 41061 / 50000) loss: 5.217356
(Iteration 41071 / 50000) loss: 6.556028
(Iteration 41081 / 50000) loss: 5.523542
(Iteration 41091 / 50000) loss: 5.762629
(Iteration 41101 / 50000) loss: 5.575719
(Iteration 41111 / 50000) loss: 6.427012
(Iteration 41121 / 50000) loss: 5.392734
(Iteration 41131 / 50000) loss: 6.175937
(Iteration 41141 / 50000) loss: 4.964099
(Iteration 41151 / 50000) loss: 5.069688
(Iteration 41161 / 50000) loss: 5.505804
(Iteration 41171 / 50000) loss: 5.669126
(Iteration 41181 / 50000) loss: 5.256291
(Iteration 41191 / 50000) loss: 5.485484
(Iteration 41201 / 50000) loss: 4.934022
(Iteration 41211 / 50000) loss: 5.910222
(Iteration 41221 / 50000) loss: 5.686489
(Iteration 41231 / 50000) loss: 5.225760
(Iteration 41241 / 50000) loss: 5.278174
(Iteration 41251 / 50000) loss: 5.250526
(Iteration 41261 / 50000) loss: 5.641857
(Iteration 41271 / 50000) loss: 5.465962
(Iteration 41281 / 50000) loss: 5.396621
(Iteration 41291 / 50000) loss: 6.343080
(Iteration 41301 / 50000) loss: 5.185457
(Iteration 41311 / 50000) loss: 5.703909
(Iteration 41321 / 50000) loss: 4.899514
(Iteration 41331 / 50000) loss: 5.325303
(Iteration 41341 / 50000) loss: 5.498416
(Iteration 41351 / 50000) loss: 4.995251
(Iteration 41361 / 50000) loss: 5.121667
(Iteration 41371 / 50000) loss: 5.429056
(Iteration 41381 / 50000) loss: 6.008427
(Iteration 41391 / 50000) loss: 5.109831
(Iteration 41401 / 50000) loss: 5.015337
(Iteration 41411 / 50000) loss: 5.681370
(Iteration 41421 / 50000) loss: 5.787730
(Iteration 41431 / 50000) loss: 5.261144
(Iteration 41441 / 50000) loss: 5.234658
(Iteration 41451 / 50000) loss: 5.918540
(Iteration 41461 / 50000) loss: 5.502487
(Iteration 41471 / 50000) loss: 5.582723
(Iteration 41481 / 50000) loss: 5.203522
(Iteration 41491 / 50000) loss: 5.471304
(Iteration 41501 / 50000) loss: 5.442958
(Iteration 41511 / 50000) loss: 4.905884
(Iteration 41521 / 50000) loss: 5.394290
(Iteration 41531 / 50000) loss: 5.872674
(Iteration 41541 / 50000) loss: 6.421974
(Iteration 41551 / 50000) loss: 6.110113
(Iteration 41561 / 50000) loss: 5.176553
(Iteration 41571 / 50000) loss: 4.898505
(Iteration 41581 / 50000) loss: 5.626280
(Iteration 41591 / 50000) loss: 5.308990
(Iteration 41601 / 50000) loss: 5.314257
(Iteration 41611 / 50000) loss: 5.462137
(Iteration 41621 / 50000) loss: 4.984379
(Iteration 41631 / 50000) loss: 5.697966
(Iteration 41641 / 50000) loss: 5.314289
(Iteration 41651 / 50000) loss: 5.046948
(Iteration 41661 / 50000) loss: 5.545802
(Iteration 41671 / 50000) loss: 4.957982
(Iteration 41681 / 50000) loss: 5.675345
(Iteration 41691 / 50000) loss: 5.556287
(Iteration 41701 / 50000) loss: 5.751751
(Iteration 41711 / 50000) loss: 5.502299
(Iteration 41721 / 50000) loss: 4.837165
(Iteration 41731 / 50000) loss: 4.673269
(Iteration 41741 / 50000) loss: 5.997752
(Iteration 41751 / 50000) loss: 5.600921
(Iteration 41761 / 50000) loss: 5.012630
(Iteration 41771 / 50000) loss: 5.419868
(Iteration 41781 / 50000) loss: 4.929330
(Iteration 41791 / 50000) loss: 5.182435
(Iteration 41801 / 50000) loss: 5.928765
(Iteration 41811 / 50000) loss: 5.504768
(Iteration 41821 / 50000) loss: 5.673706
(Iteration 41831 / 50000) loss: 5.821131
(Iteration 41841 / 50000) loss: 5.613975
(Iteration 41851 / 50000) loss: 5.633980
(Iteration 41861 / 50000) loss: 6.179125
(Iteration 41871 / 50000) loss: 4.481323
(Iteration 41881 / 50000) loss: 4.895236
(Iteration 41891 / 50000) loss: 5.875729
(Iteration 41901 / 50000) loss: 5.405150
(Iteration 41911 / 50000) loss: 5.271995
(Iteration 41921 / 50000) loss: 6.060746
(Iteration 41931 / 50000) loss: 6.205720
(Iteration 41941 / 50000) loss: 5.067173
(Iteration 41951 / 50000) loss: 5.182430
(Iteration 41961 / 50000) loss: 5.953201
(Iteration 41971 / 50000) loss: 6.234339
(Iteration 41981 / 50000) loss: 5.481985
(Iteration 41991 / 50000) loss: 6.096986
(Iteration 42001 / 50000) loss: 5.193210
(Iteration 42011 / 50000) loss: 4.988473
(Iteration 42021 / 50000) loss: 5.835669
(Iteration 42031 / 50000) loss: 5.449788
(Iteration 42041 / 50000) loss: 5.380736
(Iteration 42051 / 50000) loss: 6.085981
(Iteration 42061 / 50000) loss: 5.140848
(Iteration 42071 / 50000) loss: 5.227142
(Iteration 42081 / 50000) loss: 5.878998
(Iteration 42091 / 50000) loss: 4.730460
(Iteration 42101 / 50000) loss: 4.832293
(Iteration 42111 / 50000) loss: 6.027740
(Iteration 42121 / 50000) loss: 5.337737
(Iteration 42131 / 50000) loss: 5.086346
(Iteration 42141 / 50000) loss: 4.341263
(Iteration 42151 / 50000) loss: 6.011622
(Iteration 42161 / 50000) loss: 6.182453
(Iteration 42171 / 50000) loss: 5.876332
(Iteration 42181 / 50000) loss: 5.189445
(Iteration 42191 / 50000) loss: 5.459310
(Iteration 42201 / 50000) loss: 4.939917
(Iteration 42211 / 50000) loss: 5.683856
(Iteration 42221 / 50000) loss: 4.981198
(Iteration 42231 / 50000) loss: 5.928541
(Iteration 42241 / 50000) loss: 6.026557
(Iteration 42251 / 50000) loss: 5.594046
(Iteration 42261 / 50000) loss: 5.405882
(Iteration 42271 / 50000) loss: 5.305505
(Iteration 42281 / 50000) loss: 5.358461
(Iteration 42291 / 50000) loss: 4.972814
(Iteration 42301 / 50000) loss: 5.512676
(Iteration 42311 / 50000) loss: 4.695846
(Iteration 42321 / 50000) loss: 5.011284
(Iteration 42331 / 50000) loss: 5.656192
(Iteration 42341 / 50000) loss: 4.777912
(Iteration 42351 / 50000) loss: 5.134959
(Iteration 42361 / 50000) loss: 4.943738
(Iteration 42371 / 50000) loss: 5.434310
(Iteration 42381 / 50000) loss: 5.375386
(Iteration 42391 / 50000) loss: 5.067312
(Iteration 42401 / 50000) loss: 4.750105
(Iteration 42411 / 50000) loss: 4.760344
(Iteration 42421 / 50000) loss: 5.825333
(Iteration 42431 / 50000) loss: 4.590497
(Iteration 42441 / 50000) loss: 5.549553
(Iteration 42451 / 50000) loss: 4.647309
(Iteration 42461 / 50000) loss: 5.358617
(Iteration 42471 / 50000) loss: 5.104711
(Iteration 42481 / 50000) loss: 4.808240
(Iteration 42491 / 50000) loss: 5.376156
(Iteration 42501 / 50000) loss: 5.668397
(Iteration 42511 / 50000) loss: 5.315168
(Iteration 42521 / 50000) loss: 5.257905
(Iteration 42531 / 50000) loss: 5.227761
(Iteration 42541 / 50000) loss: 5.193682
(Iteration 42551 / 50000) loss: 4.950701
(Iteration 42561 / 50000) loss: 5.138141
(Iteration 42571 / 50000) loss: 4.883028
(Iteration 42581 / 50000) loss: 5.257561
(Iteration 42591 / 50000) loss: 5.538761
(Iteration 42601 / 50000) loss: 5.380917
(Iteration 42611 / 50000) loss: 5.861836
(Iteration 42621 / 50000) loss: 5.432572
(Iteration 42631 / 50000) loss: 5.680879
(Iteration 42641 / 50000) loss: 5.291443
(Iteration 42651 / 50000) loss: 4.438146
(Iteration 42661 / 50000) loss: 5.502505
(Iteration 42671 / 50000) loss: 5.208844
(Iteration 42681 / 50000) loss: 5.949679
(Iteration 42691 / 50000) loss: 5.635554
(Iteration 42701 / 50000) loss: 5.356431
(Iteration 42711 / 50000) loss: 5.333552
(Iteration 42721 / 50000) loss: 4.740140
(Iteration 42731 / 50000) loss: 5.567846
(Iteration 42741 / 50000) loss: 5.989626
(Iteration 42751 / 50000) loss: 4.755439
(Iteration 42761 / 50000) loss: 4.843200
(Iteration 42771 / 50000) loss: 5.087350
(Iteration 42781 / 50000) loss: 5.101187
(Iteration 42791 / 50000) loss: 5.439024
(Iteration 42801 / 50000) loss: 5.650746
(Iteration 42811 / 50000) loss: 5.835640
(Iteration 42821 / 50000) loss: 5.402243
(Iteration 42831 / 50000) loss: 5.853441
(Iteration 42841 / 50000) loss: 5.228538
(Iteration 42851 / 50000) loss: 4.786377
(Iteration 42861 / 50000) loss: 4.588937
(Iteration 42871 / 50000) loss: 5.312340
(Iteration 42881 / 50000) loss: 5.402701
(Iteration 42891 / 50000) loss: 5.258197
(Iteration 42901 / 50000) loss: 5.789629
(Iteration 42911 / 50000) loss: 5.477028
(Iteration 42921 / 50000) loss: 5.026462
(Iteration 42931 / 50000) loss: 5.738497
(Iteration 42941 / 50000) loss: 5.354862
(Iteration 42951 / 50000) loss: 4.939656
(Iteration 42961 / 50000) loss: 5.010988
(Iteration 42971 / 50000) loss: 5.457916
(Iteration 42981 / 50000) loss: 5.491435
(Iteration 42991 / 50000) loss: 5.523230
(Iteration 43001 / 50000) loss: 5.484211
(Iteration 43011 / 50000) loss: 5.538059
(Iteration 43021 / 50000) loss: 5.104407
(Iteration 43031 / 50000) loss: 5.390814
(Iteration 43041 / 50000) loss: 4.811515
(Iteration 43051 / 50000) loss: 4.543898
(Iteration 43061 / 50000) loss: 4.750349
(Iteration 43071 / 50000) loss: 5.023239
(Iteration 43081 / 50000) loss: 5.313461
(Iteration 43091 / 50000) loss: 5.459360
(Iteration 43101 / 50000) loss: 5.138036
(Iteration 43111 / 50000) loss: 5.125939
(Iteration 43121 / 50000) loss: 5.220711
(Iteration 43131 / 50000) loss: 5.716977
(Iteration 43141 / 50000) loss: 5.411571
(Iteration 43151 / 50000) loss: 5.576255
(Iteration 43161 / 50000) loss: 4.976221
(Iteration 43171 / 50000) loss: 5.169958
(Iteration 43181 / 50000) loss: 5.335230
(Iteration 43191 / 50000) loss: 5.471429
(Iteration 43201 / 50000) loss: 5.833747
(Iteration 43211 / 50000) loss: 5.372652
(Iteration 43221 / 50000) loss: 5.433106
(Iteration 43231 / 50000) loss: 4.982352
(Iteration 43241 / 50000) loss: 5.456767
(Iteration 43251 / 50000) loss: 5.481292
(Iteration 43261 / 50000) loss: 4.794006
(Iteration 43271 / 50000) loss: 5.247368
(Iteration 43281 / 50000) loss: 5.442085
(Iteration 43291 / 50000) loss: 5.141669
(Iteration 43301 / 50000) loss: 5.484887
(Iteration 43311 / 50000) loss: 5.523241
(Iteration 43321 / 50000) loss: 5.006747
(Iteration 43331 / 50000) loss: 5.515036
(Iteration 43341 / 50000) loss: 5.596003
(Iteration 43351 / 50000) loss: 5.155901
(Iteration 43361 / 50000) loss: 4.882652
(Iteration 43371 / 50000) loss: 5.392405
(Iteration 43381 / 50000) loss: 5.464572
(Iteration 43391 / 50000) loss: 5.223067
(Iteration 43401 / 50000) loss: 5.046177
(Iteration 43411 / 50000) loss: 4.842057
(Iteration 43421 / 50000) loss: 5.789438
(Iteration 43431 / 50000) loss: 4.935156
(Iteration 43441 / 50000) loss: 4.902918
(Iteration 43451 / 50000) loss: 5.883281
(Iteration 43461 / 50000) loss: 4.960232
(Iteration 43471 / 50000) loss: 5.534606
(Iteration 43481 / 50000) loss: 4.922002
(Iteration 43491 / 50000) loss: 4.793893
(Iteration 43501 / 50000) loss: 4.763085
(Iteration 43511 / 50000) loss: 4.748884
(Iteration 43521 / 50000) loss: 5.338470
(Iteration 43531 / 50000) loss: 5.498346
(Iteration 43541 / 50000) loss: 5.450516
(Iteration 43551 / 50000) loss: 5.287506
(Iteration 43561 / 50000) loss: 5.384856
(Iteration 43571 / 50000) loss: 5.020273
(Iteration 43581 / 50000) loss: 5.385032
(Iteration 43591 / 50000) loss: 5.628804
(Iteration 43601 / 50000) loss: 5.410988
(Iteration 43611 / 50000) loss: 5.311620
(Iteration 43621 / 50000) loss: 4.735703
(Iteration 43631 / 50000) loss: 5.233298
(Iteration 43641 / 50000) loss: 4.669250
(Iteration 43651 / 50000) loss: 4.943696
(Iteration 43661 / 50000) loss: 5.331033
(Iteration 43671 / 50000) loss: 4.994873
(Iteration 43681 / 50000) loss: 5.227074
(Iteration 43691 / 50000) loss: 5.580692
(Iteration 43701 / 50000) loss: 5.230926
(Iteration 43711 / 50000) loss: 5.867423
(Iteration 43721 / 50000) loss: 4.648491
(Iteration 43731 / 50000) loss: 5.694596
(Iteration 43741 / 50000) loss: 4.956635
(Iteration 43751 / 50000) loss: 5.453378
(Iteration 43761 / 50000) loss: 4.681377
(Iteration 43771 / 50000) loss: 5.035872
(Iteration 43781 / 50000) loss: 5.485294
(Iteration 43791 / 50000) loss: 5.339291
(Iteration 43801 / 50000) loss: 5.685176
(Iteration 43811 / 50000) loss: 5.090594
(Iteration 43821 / 50000) loss: 4.527797
(Iteration 43831 / 50000) loss: 5.231988
(Iteration 43841 / 50000) loss: 5.311808
(Iteration 43851 / 50000) loss: 5.174561
(Iteration 43861 / 50000) loss: 5.191983
(Iteration 43871 / 50000) loss: 5.769613
(Iteration 43881 / 50000) loss: 5.278997
(Iteration 43891 / 50000) loss: 6.140237
(Iteration 43901 / 50000) loss: 4.828996
(Iteration 43911 / 50000) loss: 4.487235
(Iteration 43921 / 50000) loss: 4.613873
(Iteration 43931 / 50000) loss: 5.238370
(Iteration 43941 / 50000) loss: 5.201627
(Iteration 43951 / 50000) loss: 5.193782
(Iteration 43961 / 50000) loss: 4.230243
(Iteration 43971 / 50000) loss: 5.253839
(Iteration 43981 / 50000) loss: 4.983950
(Iteration 43991 / 50000) loss: 5.230571
(Iteration 44001 / 50000) loss: 4.698290
(Iteration 44011 / 50000) loss: 5.512025
(Iteration 44021 / 50000) loss: 5.463587
(Iteration 44031 / 50000) loss: 4.924236
(Iteration 44041 / 50000) loss: 5.111295
(Iteration 44051 / 50000) loss: 4.820228
(Iteration 44061 / 50000) loss: 5.705987
(Iteration 44071 / 50000) loss: 5.136033
(Iteration 44081 / 50000) loss: 5.354730
(Iteration 44091 / 50000) loss: 5.752659
(Iteration 44101 / 50000) loss: 5.905477
(Iteration 44111 / 50000) loss: 5.294932
(Iteration 44121 / 50000) loss: 4.766038
(Iteration 44131 / 50000) loss: 5.371398
(Iteration 44141 / 50000) loss: 4.978839
(Iteration 44151 / 50000) loss: 5.120658
(Iteration 44161 / 50000) loss: 5.636966
(Iteration 44171 / 50000) loss: 6.027669
(Iteration 44181 / 50000) loss: 5.119032
(Iteration 44191 / 50000) loss: 5.548153
(Iteration 44201 / 50000) loss: 5.181743
(Iteration 44211 / 50000) loss: 4.682301
(Iteration 44221 / 50000) loss: 4.806877
(Iteration 44231 / 50000) loss: 5.224691
(Iteration 44241 / 50000) loss: 4.918099
(Iteration 44251 / 50000) loss: 5.199224
(Iteration 44261 / 50000) loss: 4.394568
(Iteration 44271 / 50000) loss: 4.993088
(Iteration 44281 / 50000) loss: 5.385722
(Iteration 44291 / 50000) loss: 5.562110
(Iteration 44301 / 50000) loss: 5.359615
(Iteration 44311 / 50000) loss: 5.045357
(Iteration 44321 / 50000) loss: 5.174937
(Iteration 44331 / 50000) loss: 4.578071
(Iteration 44341 / 50000) loss: 4.266064
(Iteration 44351 / 50000) loss: 4.413052
(Iteration 44361 / 50000) loss: 5.149018
(Iteration 44371 / 50000) loss: 5.931766
(Iteration 44381 / 50000) loss: 5.460862
(Iteration 44391 / 50000) loss: 5.566345
(Iteration 44401 / 50000) loss: 5.610415
(Iteration 44411 / 50000) loss: 4.705345
(Iteration 44421 / 50000) loss: 5.176616
(Iteration 44431 / 50000) loss: 5.147513
(Iteration 44441 / 50000) loss: 4.821331
(Iteration 44451 / 50000) loss: 5.513298
(Iteration 44461 / 50000) loss: 5.112529
(Iteration 44471 / 50000) loss: 5.689818
(Iteration 44481 / 50000) loss: 5.175300
(Iteration 44491 / 50000) loss: 5.555002
(Iteration 44501 / 50000) loss: 5.582611
(Iteration 44511 / 50000) loss: 4.942736
(Iteration 44521 / 50000) loss: 5.065837
(Iteration 44531 / 50000) loss: 5.297311
(Iteration 44541 / 50000) loss: 5.231035
(Iteration 44551 / 50000) loss: 5.068218
(Iteration 44561 / 50000) loss: 5.401501
(Iteration 44571 / 50000) loss: 5.619292
(Iteration 44581 / 50000) loss: 5.325192
(Iteration 44591 / 50000) loss: 5.208709
(Iteration 44601 / 50000) loss: 5.036839
(Iteration 44611 / 50000) loss: 5.384250
(Iteration 44621 / 50000) loss: 4.977408
(Iteration 44631 / 50000) loss: 4.691292
(Iteration 44641 / 50000) loss: 5.268418
(Iteration 44651 / 50000) loss: 5.249679
(Iteration 44661 / 50000) loss: 5.164507
(Iteration 44671 / 50000) loss: 5.338846
(Iteration 44681 / 50000) loss: 5.718344
(Iteration 44691 / 50000) loss: 5.343474
(Iteration 44701 / 50000) loss: 5.027562
(Iteration 44711 / 50000) loss: 5.426543
(Iteration 44721 / 50000) loss: 4.744429
(Iteration 44731 / 50000) loss: 5.307176
(Iteration 44741 / 50000) loss: 5.617985
(Iteration 44751 / 50000) loss: 5.051803
(Iteration 44761 / 50000) loss: 5.269867
(Iteration 44771 / 50000) loss: 5.299982
(Iteration 44781 / 50000) loss: 5.104455
(Iteration 44791 / 50000) loss: 4.875040
(Iteration 44801 / 50000) loss: 5.958869
(Iteration 44811 / 50000) loss: 5.038013
(Iteration 44821 / 50000) loss: 4.889483
(Iteration 44831 / 50000) loss: 5.302302
(Iteration 44841 / 50000) loss: 4.793868
(Iteration 44851 / 50000) loss: 5.939053
(Iteration 44861 / 50000) loss: 4.959479
(Iteration 44871 / 50000) loss: 5.575453
(Iteration 44881 / 50000) loss: 4.756216
(Iteration 44891 / 50000) loss: 4.523675
(Iteration 44901 / 50000) loss: 5.244631
(Iteration 44911 / 50000) loss: 5.455683
(Iteration 44921 / 50000) loss: 5.736442
(Iteration 44931 / 50000) loss: 5.367849
(Iteration 44941 / 50000) loss: 4.963418
(Iteration 44951 / 50000) loss: 5.412180
(Iteration 44961 / 50000) loss: 4.936616
(Iteration 44971 / 50000) loss: 5.027520
(Iteration 44981 / 50000) loss: 5.371333
(Iteration 44991 / 50000) loss: 5.232096
(Iteration 45001 / 50000) loss: 5.130845
(Iteration 45011 / 50000) loss: 4.834021
(Iteration 45021 / 50000) loss: 5.415214
(Iteration 45031 / 50000) loss: 4.968727
(Iteration 45041 / 50000) loss: 4.773791
(Iteration 45051 / 50000) loss: 5.256672
(Iteration 45061 / 50000) loss: 5.200029
(Iteration 45071 / 50000) loss: 5.544351
(Iteration 45081 / 50000) loss: 5.565252
(Iteration 45091 / 50000) loss: 5.434778
(Iteration 45101 / 50000) loss: 5.388171
(Iteration 45111 / 50000) loss: 4.996230
(Iteration 45121 / 50000) loss: 5.315617
(Iteration 45131 / 50000) loss: 5.496687
(Iteration 45141 / 50000) loss: 5.141859
(Iteration 45151 / 50000) loss: 4.964534
(Iteration 45161 / 50000) loss: 5.348040
(Iteration 45171 / 50000) loss: 5.267598
(Iteration 45181 / 50000) loss: 5.188645
(Iteration 45191 / 50000) loss: 4.964868
(Iteration 45201 / 50000) loss: 5.160988
(Iteration 45211 / 50000) loss: 4.733652
(Iteration 45221 / 50000) loss: 5.165364
(Iteration 45231 / 50000) loss: 5.405534
(Iteration 45241 / 50000) loss: 5.025283
(Iteration 45251 / 50000) loss: 5.316397
(Iteration 45261 / 50000) loss: 4.798735
(Iteration 45271 / 50000) loss: 5.211899
(Iteration 45281 / 50000) loss: 5.138305
(Iteration 45291 / 50000) loss: 5.078218
(Iteration 45301 / 50000) loss: 4.631933
(Iteration 45311 / 50000) loss: 5.285652
(Iteration 45321 / 50000) loss: 5.000529
(Iteration 45331 / 50000) loss: 5.751277
(Iteration 45341 / 50000) loss: 5.601569
(Iteration 45351 / 50000) loss: 4.868462
(Iteration 45361 / 50000) loss: 4.736218
(Iteration 45371 / 50000) loss: 4.456814
(Iteration 45381 / 50000) loss: 4.498604
(Iteration 45391 / 50000) loss: 5.687433
(Iteration 45401 / 50000) loss: 5.435469
(Iteration 45411 / 50000) loss: 5.100668
(Iteration 45421 / 50000) loss: 5.675942
(Iteration 45431 / 50000) loss: 6.025237
(Iteration 45441 / 50000) loss: 5.087340
(Iteration 45451 / 50000) loss: 4.561016
(Iteration 45461 / 50000) loss: 4.759572
(Iteration 45471 / 50000) loss: 4.315975
(Iteration 45481 / 50000) loss: 5.531467
(Iteration 45491 / 50000) loss: 5.831375
(Iteration 45501 / 50000) loss: 5.382960
(Iteration 45511 / 50000) loss: 5.006442
(Iteration 45521 / 50000) loss: 4.958492
(Iteration 45531 / 50000) loss: 4.217898
(Iteration 45541 / 50000) loss: 4.847986
(Iteration 45551 / 50000) loss: 4.251575
(Iteration 45561 / 50000) loss: 5.563556
(Iteration 45571 / 50000) loss: 4.739258
(Iteration 45581 / 50000) loss: 4.948453
(Iteration 45591 / 50000) loss: 5.436950
(Iteration 45601 / 50000) loss: 4.964809
(Iteration 45611 / 50000) loss: 5.195756
(Iteration 45621 / 50000) loss: 5.530208
(Iteration 45631 / 50000) loss: 5.312521
(Iteration 45641 / 50000) loss: 4.959986
(Iteration 45651 / 50000) loss: 5.949300
(Iteration 45661 / 50000) loss: 5.222128
(Iteration 45671 / 50000) loss: 5.213927
(Iteration 45681 / 50000) loss: 5.343274
(Iteration 45691 / 50000) loss: 5.697163
(Iteration 45701 / 50000) loss: 5.512528
(Iteration 45711 / 50000) loss: 4.786240
(Iteration 45721 / 50000) loss: 5.186468
(Iteration 45731 / 50000) loss: 5.210442
(Iteration 45741 / 50000) loss: 5.625512
(Iteration 45751 / 50000) loss: 5.258360
(Iteration 45761 / 50000) loss: 5.724288
(Iteration 45771 / 50000) loss: 5.248978
(Iteration 45781 / 50000) loss: 4.578710
(Iteration 45791 / 50000) loss: 5.172560
(Iteration 45801 / 50000) loss: 5.546560
(Iteration 45811 / 50000) loss: 4.937810
(Iteration 45821 / 50000) loss: 4.974560
(Iteration 45831 / 50000) loss: 5.575851
(Iteration 45841 / 50000) loss: 4.770966
(Iteration 45851 / 50000) loss: 5.007031
(Iteration 45861 / 50000) loss: 5.425419
(Iteration 45871 / 50000) loss: 4.536939
(Iteration 45881 / 50000) loss: 4.702260
(Iteration 45891 / 50000) loss: 4.645369
(Iteration 45901 / 50000) loss: 5.574531
(Iteration 45911 / 50000) loss: 5.236790
(Iteration 45921 / 50000) loss: 5.686111
(Iteration 45931 / 50000) loss: 4.360619
(Iteration 45941 / 50000) loss: 5.529153
(Iteration 45951 / 50000) loss: 4.822834
(Iteration 45961 / 50000) loss: 5.873787
(Iteration 45971 / 50000) loss: 4.990044
(Iteration 45981 / 50000) loss: 5.308492
(Iteration 45991 / 50000) loss: 5.056962
(Iteration 46001 / 50000) loss: 5.099375
(Iteration 46011 / 50000) loss: 4.356855
(Iteration 46021 / 50000) loss: 4.883913
(Iteration 46031 / 50000) loss: 5.435831
(Iteration 46041 / 50000) loss: 5.424710
(Iteration 46051 / 50000) loss: 5.459373
(Iteration 46061 / 50000) loss: 4.612882
(Iteration 46071 / 50000) loss: 4.720281
(Iteration 46081 / 50000) loss: 4.951684
(Iteration 46091 / 50000) loss: 5.106454
(Iteration 46101 / 50000) loss: 4.931180
(Iteration 46111 / 50000) loss: 4.669753
(Iteration 46121 / 50000) loss: 5.118020
(Iteration 46131 / 50000) loss: 5.067141
(Iteration 46141 / 50000) loss: 5.022894
(Iteration 46151 / 50000) loss: 4.658503
(Iteration 46161 / 50000) loss: 4.813952
(Iteration 46171 / 50000) loss: 4.637677
(Iteration 46181 / 50000) loss: 3.992839
(Iteration 46191 / 50000) loss: 4.689022
(Iteration 46201 / 50000) loss: 4.674007
(Iteration 46211 / 50000) loss: 4.654600
(Iteration 46221 / 50000) loss: 5.102054
(Iteration 46231 / 50000) loss: 4.603053
(Iteration 46241 / 50000) loss: 5.335529
(Iteration 46251 / 50000) loss: 4.851182
(Iteration 46261 / 50000) loss: 5.168598
(Iteration 46271 / 50000) loss: 5.771190
(Iteration 46281 / 50000) loss: 5.326949
(Iteration 46291 / 50000) loss: 5.530299
(Iteration 46301 / 50000) loss: 5.640052
(Iteration 46311 / 50000) loss: 5.387009
(Iteration 46321 / 50000) loss: 5.300174
(Iteration 46331 / 50000) loss: 5.678271
(Iteration 46341 / 50000) loss: 5.319687
(Iteration 46351 / 50000) loss: 4.691865
(Iteration 46361 / 50000) loss: 5.248442
(Iteration 46371 / 50000) loss: 5.082333
(Iteration 46381 / 50000) loss: 5.729870
(Iteration 46391 / 50000) loss: 5.800074
(Iteration 46401 / 50000) loss: 4.942971
(Iteration 46411 / 50000) loss: 5.280018
(Iteration 46421 / 50000) loss: 4.559754
(Iteration 46431 / 50000) loss: 5.256118
(Iteration 46441 / 50000) loss: 5.333559
(Iteration 46451 / 50000) loss: 4.943512
(Iteration 46461 / 50000) loss: 5.313025
(Iteration 46471 / 50000) loss: 5.716622
(Iteration 46481 / 50000) loss: 4.915994
(Iteration 46491 / 50000) loss: 5.281662
(Iteration 46501 / 50000) loss: 5.350490
(Iteration 46511 / 50000) loss: 4.996656
(Iteration 46521 / 50000) loss: 5.383670
(Iteration 46531 / 50000) loss: 4.557714
(Iteration 46541 / 50000) loss: 5.025109
(Iteration 46551 / 50000) loss: 4.726755
(Iteration 46561 / 50000) loss: 5.165017
(Iteration 46571 / 50000) loss: 4.331202
(Iteration 46581 / 50000) loss: 5.194563
(Iteration 46591 / 50000) loss: 5.240401
(Iteration 46601 / 50000) loss: 4.801658
(Iteration 46611 / 50000) loss: 4.946284
(Iteration 46621 / 50000) loss: 5.231586
(Iteration 46631 / 50000) loss: 5.052099
(Iteration 46641 / 50000) loss: 5.388507
(Iteration 46651 / 50000) loss: 4.008849
(Iteration 46661 / 50000) loss: 4.951026
(Iteration 46671 / 50000) loss: 5.288510
(Iteration 46681 / 50000) loss: 5.134641
(Iteration 46691 / 50000) loss: 4.776896
(Iteration 46701 / 50000) loss: 5.047859
(Iteration 46711 / 50000) loss: 5.971784
(Iteration 46721 / 50000) loss: 5.277661
(Iteration 46731 / 50000) loss: 5.053136
(Iteration 46741 / 50000) loss: 5.326352
(Iteration 46751 / 50000) loss: 4.993297
(Iteration 46761 / 50000) loss: 4.900790
(Iteration 46771 / 50000) loss: 5.412417
(Iteration 46781 / 50000) loss: 4.971479
(Iteration 46791 / 50000) loss: 4.808398
(Iteration 46801 / 50000) loss: 5.362919
(Iteration 46811 / 50000) loss: 5.014742
(Iteration 46821 / 50000) loss: 5.177652
(Iteration 46831 / 50000) loss: 5.417285
(Iteration 46841 / 50000) loss: 4.614275
(Iteration 46851 / 50000) loss: 5.082099
(Iteration 46861 / 50000) loss: 5.336719
(Iteration 46871 / 50000) loss: 5.161899
(Iteration 46881 / 50000) loss: 4.994279
(Iteration 46891 / 50000) loss: 4.350177
(Iteration 46901 / 50000) loss: 5.161782
(Iteration 46911 / 50000) loss: 5.055948
(Iteration 46921 / 50000) loss: 4.157732
(Iteration 46931 / 50000) loss: 5.368811
(Iteration 46941 / 50000) loss: 4.929541
(Iteration 46951 / 50000) loss: 4.935134
(Iteration 46961 / 50000) loss: 5.766431
(Iteration 46971 / 50000) loss: 5.145928
(Iteration 46981 / 50000) loss: 4.754225
(Iteration 46991 / 50000) loss: 5.282622
(Iteration 47001 / 50000) loss: 5.157298
(Iteration 47011 / 50000) loss: 4.993598
(Iteration 47021 / 50000) loss: 5.589919
(Iteration 47031 / 50000) loss: 5.083948
(Iteration 47041 / 50000) loss: 5.687728
(Iteration 47051 / 50000) loss: 5.283178
(Iteration 47061 / 50000) loss: 4.763415
(Iteration 47071 / 50000) loss: 5.577854
(Iteration 47081 / 50000) loss: 4.745455
(Iteration 47091 / 50000) loss: 4.842993
(Iteration 47101 / 50000) loss: 5.094577
(Iteration 47111 / 50000) loss: 5.404017
(Iteration 47121 / 50000) loss: 4.995842
(Iteration 47131 / 50000) loss: 5.173024
(Iteration 47141 / 50000) loss: 4.860224
(Iteration 47151 / 50000) loss: 4.682947
(Iteration 47161 / 50000) loss: 5.652501
(Iteration 47171 / 50000) loss: 4.692696
(Iteration 47181 / 50000) loss: 4.548441
(Iteration 47191 / 50000) loss: 5.489570
(Iteration 47201 / 50000) loss: 4.684776
(Iteration 47211 / 50000) loss: 4.865422
(Iteration 47221 / 50000) loss: 4.697394
(Iteration 47231 / 50000) loss: 5.324905
(Iteration 47241 / 50000) loss: 4.911788
(Iteration 47251 / 50000) loss: 5.707105
(Iteration 47261 / 50000) loss: 5.331719
(Iteration 47271 / 50000) loss: 4.527254
(Iteration 47281 / 50000) loss: 4.585569
(Iteration 47291 / 50000) loss: 5.330248
(Iteration 47301 / 50000) loss: 5.119588
(Iteration 47311 / 50000) loss: 4.912568
(Iteration 47321 / 50000) loss: 4.993590
(Iteration 47331 / 50000) loss: 5.147046
(Iteration 47341 / 50000) loss: 4.789664
(Iteration 47351 / 50000) loss: 4.786418
(Iteration 47361 / 50000) loss: 5.198579
(Iteration 47371 / 50000) loss: 5.127116
(Iteration 47381 / 50000) loss: 5.852070
(Iteration 47391 / 50000) loss: 4.925934
(Iteration 47401 / 50000) loss: 4.708528
(Iteration 47411 / 50000) loss: 4.897190
(Iteration 47421 / 50000) loss: 4.745486
(Iteration 47431 / 50000) loss: 5.910188
(Iteration 47441 / 50000) loss: 5.611912
(Iteration 47451 / 50000) loss: 4.770853
(Iteration 47461 / 50000) loss: 4.452225
(Iteration 47471 / 50000) loss: 4.489676
(Iteration 47481 / 50000) loss: 4.784930
(Iteration 47491 / 50000) loss: 5.148650
(Iteration 47501 / 50000) loss: 5.144456
(Iteration 47511 / 50000) loss: 4.868283
(Iteration 47521 / 50000) loss: 5.317827
(Iteration 47531 / 50000) loss: 5.036849
(Iteration 47541 / 50000) loss: 4.934437
(Iteration 47551 / 50000) loss: 4.745998
(Iteration 47561 / 50000) loss: 5.637244
(Iteration 47571 / 50000) loss: 5.016647
(Iteration 47581 / 50000) loss: 4.857729
(Iteration 47591 / 50000) loss: 5.156177
(Iteration 47601 / 50000) loss: 4.976842
(Iteration 47611 / 50000) loss: 4.302502
(Iteration 47621 / 50000) loss: 5.024882
(Iteration 47631 / 50000) loss: 4.956566
(Iteration 47641 / 50000) loss: 4.656335
(Iteration 47651 / 50000) loss: 5.045079
(Iteration 47661 / 50000) loss: 4.869484
(Iteration 47671 / 50000) loss: 4.858334
(Iteration 47681 / 50000) loss: 5.172522
(Iteration 47691 / 50000) loss: 4.730553
(Iteration 47701 / 50000) loss: 5.344105
(Iteration 47711 / 50000) loss: 4.761704
(Iteration 47721 / 50000) loss: 4.274773
(Iteration 47731 / 50000) loss: 5.347785
(Iteration 47741 / 50000) loss: 5.154635
(Iteration 47751 / 50000) loss: 4.821087
(Iteration 47761 / 50000) loss: 5.046308
(Iteration 47771 / 50000) loss: 5.055055
(Iteration 47781 / 50000) loss: 4.725170
(Iteration 47791 / 50000) loss: 4.527396
(Iteration 47801 / 50000) loss: 4.487073
(Iteration 47811 / 50000) loss: 4.639754
(Iteration 47821 / 50000) loss: 5.402688
(Iteration 47831 / 50000) loss: 5.358788
(Iteration 47841 / 50000) loss: 5.388831
(Iteration 47851 / 50000) loss: 5.539782
(Iteration 47861 / 50000) loss: 4.977380
(Iteration 47871 / 50000) loss: 4.743683
(Iteration 47881 / 50000) loss: 5.744875
(Iteration 47891 / 50000) loss: 4.760842
(Iteration 47901 / 50000) loss: 4.671541
(Iteration 47911 / 50000) loss: 5.298233
(Iteration 47921 / 50000) loss: 5.325019
(Iteration 47931 / 50000) loss: 5.079854
(Iteration 47941 / 50000) loss: 4.849681
(Iteration 47951 / 50000) loss: 5.136692
(Iteration 47961 / 50000) loss: 4.513616
(Iteration 47971 / 50000) loss: 5.502441
(Iteration 47981 / 50000) loss: 4.783224
(Iteration 47991 / 50000) loss: 4.970872
(Iteration 48001 / 50000) loss: 5.214979
(Iteration 48011 / 50000) loss: 5.029698
(Iteration 48021 / 50000) loss: 5.374228
(Iteration 48031 / 50000) loss: 4.867151
(Iteration 48041 / 50000) loss: 4.729684
(Iteration 48051 / 50000) loss: 5.083028
(Iteration 48061 / 50000) loss: 5.130319
(Iteration 48071 / 50000) loss: 5.140748
(Iteration 48081 / 50000) loss: 4.854952
(Iteration 48091 / 50000) loss: 4.345643
(Iteration 48101 / 50000) loss: 5.858382
(Iteration 48111 / 50000) loss: 4.692747
(Iteration 48121 / 50000) loss: 4.796843
(Iteration 48131 / 50000) loss: 5.054175
(Iteration 48141 / 50000) loss: 4.408984
(Iteration 48151 / 50000) loss: 5.586374
(Iteration 48161 / 50000) loss: 4.729210
(Iteration 48171 / 50000) loss: 4.956761
(Iteration 48181 / 50000) loss: 4.486281
(Iteration 48191 / 50000) loss: 5.410888
(Iteration 48201 / 50000) loss: 4.483176
(Iteration 48211 / 50000) loss: 5.062662
(Iteration 48221 / 50000) loss: 4.866526
(Iteration 48231 / 50000) loss: 5.485916
(Iteration 48241 / 50000) loss: 4.921504
(Iteration 48251 / 50000) loss: 4.959250
(Iteration 48261 / 50000) loss: 4.725866
(Iteration 48271 / 50000) loss: 5.965253
(Iteration 48281 / 50000) loss: 4.680338
(Iteration 48291 / 50000) loss: 5.103705
(Iteration 48301 / 50000) loss: 4.808215
(Iteration 48311 / 50000) loss: 4.575053
(Iteration 48321 / 50000) loss: 4.947476
(Iteration 48331 / 50000) loss: 4.887461
(Iteration 48341 / 50000) loss: 4.466528
(Iteration 48351 / 50000) loss: 5.151607
(Iteration 48361 / 50000) loss: 4.767912
(Iteration 48371 / 50000) loss: 5.488635
(Iteration 48381 / 50000) loss: 5.102046
(Iteration 48391 / 50000) loss: 4.570899
(Iteration 48401 / 50000) loss: 4.521377
(Iteration 48411 / 50000) loss: 5.012564
(Iteration 48421 / 50000) loss: 4.801443
(Iteration 48431 / 50000) loss: 5.175388
(Iteration 48441 / 50000) loss: 4.981082
(Iteration 48451 / 50000) loss: 4.862914
(Iteration 48461 / 50000) loss: 4.733837
(Iteration 48471 / 50000) loss: 4.818900
(Iteration 48481 / 50000) loss: 5.991330
(Iteration 48491 / 50000) loss: 5.208624
(Iteration 48501 / 50000) loss: 4.944359
(Iteration 48511 / 50000) loss: 4.814369
(Iteration 48521 / 50000) loss: 4.469385
(Iteration 48531 / 50000) loss: 4.430485
(Iteration 48541 / 50000) loss: 5.097204
(Iteration 48551 / 50000) loss: 4.905378
(Iteration 48561 / 50000) loss: 5.210978
(Iteration 48571 / 50000) loss: 4.874699
(Iteration 48581 / 50000) loss: 4.572843
(Iteration 48591 / 50000) loss: 4.395446
(Iteration 48601 / 50000) loss: 5.059456
(Iteration 48611 / 50000) loss: 4.589462
(Iteration 48621 / 50000) loss: 4.658811
(Iteration 48631 / 50000) loss: 4.764175
(Iteration 48641 / 50000) loss: 5.207649
(Iteration 48651 / 50000) loss: 4.526176
(Iteration 48661 / 50000) loss: 5.004376
(Iteration 48671 / 50000) loss: 4.787407
(Iteration 48681 / 50000) loss: 5.527824
(Iteration 48691 / 50000) loss: 4.157642
(Iteration 48701 / 50000) loss: 4.825883
(Iteration 48711 / 50000) loss: 4.675910
(Iteration 48721 / 50000) loss: 4.928158
(Iteration 48731 / 50000) loss: 4.308931
(Iteration 48741 / 50000) loss: 5.254661
(Iteration 48751 / 50000) loss: 5.354953
(Iteration 48761 / 50000) loss: 5.408144
(Iteration 48771 / 50000) loss: 4.914572
(Iteration 48781 / 50000) loss: 4.659526
(Iteration 48791 / 50000) loss: 5.600999
(Iteration 48801 / 50000) loss: 4.432065
(Iteration 48811 / 50000) loss: 4.790988
(Iteration 48821 / 50000) loss: 5.317098
(Iteration 48831 / 50000) loss: 5.207159
(Iteration 48841 / 50000) loss: 5.578248
(Iteration 48851 / 50000) loss: 4.997896
(Iteration 48861 / 50000) loss: 5.171075
(Iteration 48871 / 50000) loss: 5.497323
(Iteration 48881 / 50000) loss: 5.232023
(Iteration 48891 / 50000) loss: 5.425415
(Iteration 48901 / 50000) loss: 4.654369
(Iteration 48911 / 50000) loss: 4.663005
(Iteration 48921 / 50000) loss: 5.192826
(Iteration 48931 / 50000) loss: 5.197409
(Iteration 48941 / 50000) loss: 5.317922
(Iteration 48951 / 50000) loss: 4.355244
(Iteration 48961 / 50000) loss: 4.449171
(Iteration 48971 / 50000) loss: 5.042314
(Iteration 48981 / 50000) loss: 4.645004
(Iteration 48991 / 50000) loss: 4.430694
(Iteration 49001 / 50000) loss: 5.003181
(Iteration 49011 / 50000) loss: 5.253469
(Iteration 49021 / 50000) loss: 4.921456
(Iteration 49031 / 50000) loss: 4.725998
(Iteration 49041 / 50000) loss: 4.830995
(Iteration 49051 / 50000) loss: 5.074775
(Iteration 49061 / 50000) loss: 5.043628
(Iteration 49071 / 50000) loss: 5.892237
(Iteration 49081 / 50000) loss: 4.516867
(Iteration 49091 / 50000) loss: 4.594828
(Iteration 49101 / 50000) loss: 5.265089
(Iteration 49111 / 50000) loss: 5.262130
(Iteration 49121 / 50000) loss: 4.883457
(Iteration 49131 / 50000) loss: 4.819940
(Iteration 49141 / 50000) loss: 5.205019
(Iteration 49151 / 50000) loss: 4.514503
(Iteration 49161 / 50000) loss: 4.811991
(Iteration 49171 / 50000) loss: 4.889469
(Iteration 49181 / 50000) loss: 5.090885
(Iteration 49191 / 50000) loss: 5.302499
(Iteration 49201 / 50000) loss: 5.148423
(Iteration 49211 / 50000) loss: 4.763197
(Iteration 49221 / 50000) loss: 5.171199
(Iteration 49231 / 50000) loss: 4.374295
(Iteration 49241 / 50000) loss: 4.803613
(Iteration 49251 / 50000) loss: 5.126912
(Iteration 49261 / 50000) loss: 4.804142
(Iteration 49271 / 50000) loss: 5.043730
(Iteration 49281 / 50000) loss: 5.190020
(Iteration 49291 / 50000) loss: 4.927842
(Iteration 49301 / 50000) loss: 4.728107
(Iteration 49311 / 50000) loss: 5.266110
(Iteration 49321 / 50000) loss: 4.995474
(Iteration 49331 / 50000) loss: 4.857098
(Iteration 49341 / 50000) loss: 4.656197
(Iteration 49351 / 50000) loss: 4.731958
(Iteration 49361 / 50000) loss: 5.098492
(Iteration 49371 / 50000) loss: 4.245257
(Iteration 49381 / 50000) loss: 5.184144
(Iteration 49391 / 50000) loss: 5.147379
(Iteration 49401 / 50000) loss: 4.893561
(Iteration 49411 / 50000) loss: 5.101745
(Iteration 49421 / 50000) loss: 4.933324
(Iteration 49431 / 50000) loss: 4.726953
(Iteration 49441 / 50000) loss: 5.169777
(Iteration 49451 / 50000) loss: 4.991100
(Iteration 49461 / 50000) loss: 5.293823
(Iteration 49471 / 50000) loss: 4.813910
(Iteration 49481 / 50000) loss: 4.551907
(Iteration 49491 / 50000) loss: 5.255053
(Iteration 49501 / 50000) loss: 4.211944
(Iteration 49511 / 50000) loss: 4.735895
(Iteration 49521 / 50000) loss: 4.344431
(Iteration 49531 / 50000) loss: 5.567289
(Iteration 49541 / 50000) loss: 5.114772
(Iteration 49551 / 50000) loss: 4.918598
(Iteration 49561 / 50000) loss: 5.111366
(Iteration 49571 / 50000) loss: 4.990022
(Iteration 49581 / 50000) loss: 4.785156
(Iteration 49591 / 50000) loss: 4.968936
(Iteration 49601 / 50000) loss: 4.839501
(Iteration 49611 / 50000) loss: 4.541588
(Iteration 49621 / 50000) loss: 5.169776
(Iteration 49631 / 50000) loss: 5.478606
(Iteration 49641 / 50000) loss: 4.937177
(Iteration 49651 / 50000) loss: 4.201233
(Iteration 49661 / 50000) loss: 4.551475
(Iteration 49671 / 50000) loss: 4.787985
(Iteration 49681 / 50000) loss: 4.883517
(Iteration 49691 / 50000) loss: 4.352181
(Iteration 49701 / 50000) loss: 5.384501
(Iteration 49711 / 50000) loss: 5.415583
(Iteration 49721 / 50000) loss: 5.563091
(Iteration 49731 / 50000) loss: 4.801522
(Iteration 49741 / 50000) loss: 4.629277
(Iteration 49751 / 50000) loss: 4.907603
(Iteration 49761 / 50000) loss: 4.563795
(Iteration 49771 / 50000) loss: 5.147919
(Iteration 49781 / 50000) loss: 4.927869
(Iteration 49791 / 50000) loss: 4.856292
(Iteration 49801 / 50000) loss: 4.921810
(Iteration 49811 / 50000) loss: 4.640391
(Iteration 49821 / 50000) loss: 5.008207
(Iteration 49831 / 50000) loss: 4.999980
(Iteration 49841 / 50000) loss: 4.700072
(Iteration 49851 / 50000) loss: 4.832501
(Iteration 49861 / 50000) loss: 5.002167
(Iteration 49871 / 50000) loss: 5.382934
(Iteration 49881 / 50000) loss: 4.727874
(Iteration 49891 / 50000) loss: 5.036825
(Iteration 49901 / 50000) loss: 4.670415
(Iteration 49911 / 50000) loss: 5.619590
(Iteration 49921 / 50000) loss: 5.009828
(Iteration 49931 / 50000) loss: 4.854103
(Iteration 49941 / 50000) loss: 4.611403
(Iteration 49951 / 50000) loss: 4.977466
(Iteration 49961 / 50000) loss: 4.946524
(Iteration 49971 / 50000) loss: 4.131866
(Iteration 49981 / 50000) loss: 4.165108
(Iteration 49991 / 50000) loss: 4.977619


In [48]:
# Plot the training losses
plt.plot(small_lstm_solver.loss_history)
plt.xlabel('Iteration')
plt.ylabel('Loss')
plt.title('Training loss history')
plt.show()

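# Sample a few images from each split and compare the model's captions
# against the ground truth
for split in ['train', 'val']:
  minibatch = sample_coco_minibatch(small_data, split=split, batch_size=3)
  gt_captions, features, urls = minibatch
  gt_captions = decode_captions(gt_captions, data['idx_to_word'])

  # Generate captions from the image features, then map word indices to strings
  sample_captions = small_lstm_model.sample(features)
  sample_captions = decode_captions(sample_captions, data['idx_to_word'])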

  for gt_caption, sample_caption, url in zip(gt_captions, sample_captions, urls):
    plt.imshow(image_from_url(url))
    plt.title('%s\n%s\nGT:%s' % (split, sample_caption, gt_caption))
    plt.axis('off')
    plt.show()
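
Smoothing the loss curve

The per-iteration minibatch loss is noisy: in the last few thousand iterations of the log it swings between roughly 4 and 6, which can hide the slow downward trend in a raw plot. A moving average is one common way to make that trend easier to read. The sketch below is only an illustration: `moving_average` is a hypothetical helper (it is not part of the cs231n code), and `demo_losses` is a made-up stand-in for `small_lstm_solver.loss_history`.

```python
import numpy as np

def moving_average(values, window=100):
    """Return the mean of `values` over each full sliding window.

    Hypothetical helper for illustration; not part of cs231n.
    """
    values = np.asarray(values, dtype=np.float64)
    if window < 1:
        raise ValueError("window must be >= 1")
    if values.size < window:
        # Not enough points for even one full window
        return np.empty(0)
    kernel = np.ones(window) / window
    # mode='valid' keeps only positions where the window fully overlaps the data
    return np.convolve(values, kernel, mode='valid')

# Made-up stand-in for small_lstm_solver.loss_history
demo_losses = [6.0, 5.0, 4.0, 5.0, 6.0, 4.0]
smoothed = moving_average(demo_losses, window=3)
```

In the notebook, something like `plt.plot(moving_average(small_lstm_solver.loss_history, window=100))` could be drawn on top of the raw curve. The window size of 100 is an arbitrary choice, not a value taken from the assignment.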



In [60]:
#import pickle

#f = open("/tmp/model.pck", "wb")
#pickle.dump(small_lstm_model.params, f)
#f.close()

#f = open("/tmp/trainer.pck", "wb")
#pickle.dump(small_lstm_solver.optim_configs, f)
#f.close()
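
Saving and restoring parameters

The commented-out cell above pickles the model parameters and the optimizer configs to `/tmp`. If you want to keep a trained model around, a slightly safer pattern is to use `with` blocks so the files are closed even when an error occurs. The sketch below assumes the parameters are a plain dict of NumPy arrays (as `small_lstm_model.params` appears to be); `save_params`, `load_params`, and `demo_params` are hypothetical names used only for illustration.

```python
import os
import pickle
import tempfile

import numpy as np

def save_params(params, path):
    """Pickle a dict of parameters to `path` (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(params, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_params(path):
    """Load a dict of parameters previously written by save_params."""
    with open(path, "rb") as f:
        return pickle.load(f)

# Made-up stand-in for small_lstm_model.params
demo_params = {
    "W_embed": np.arange(6, dtype=np.float32).reshape(2, 3),
    "b": np.zeros(3),
}

# Write to a fresh temporary directory rather than a fixed /tmp path
demo_path = os.path.join(tempfile.mkdtemp(), "model.pck")
save_params(demo_params, demo_path)
restored = load_params(demo_path)
```

Only unpickle files you created yourself, since `pickle.load` can execute arbitrary code from an untrusted file.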

In [61]:
# Draw a fresh set of samples from each split to inspect more captions
for split in ['train', 'val']:
  minibatch = sample_coco_minibatch(small_data, split=split, batch_size=3)
  gt_captions, features, urls = minibatch
  gt_captions = decode_captions(gt_captions, data['idx_to_word'])

  # Generate captions from the image features, then map word indices to strings
  sample_captions = small_lstm_model.sample(features)
  sample_captions = decode_captions(sample_captions, data['idx_to_word'])

  for gt_caption, sample_caption, url in zip(gt_captions, sample_captions, urls):
    plt.imshow(image_from_url(url))
    plt.title('%s\n%s\nGT:%s' % (split, sample_caption, gt_caption))
    plt.axis('off')
    plt.show()