MIPT, Advanced ML, Spring 2018

HW #7: CNN models

Sergey Kolesnikov, scitator@gmail.com

Organization Info

  • Дедлайн 27 апреля 2018 23:59 для всех групп.
  • В качестве решения задания нужно прислать ноутбук с подробными комментариями ( без присланного решения результат контеста не будет засчитан ).
  • Название команды в контесте должно соответствовать шаблону: НомерГруппы_Имя_Фамилия, например, 594_Ivan_Ivanov.

Оформление дз:

  • Присылайте выполненное задание на почту ml.course.mipt@gmail.com
  • Укажите тему письма в следующем формате ML2018_fall_<номер_группы>_<фамилия>, к примеру -- ML2018_fall_495_ivanov
  • Выполненное дз сохраните в файл <фамилия>_<группа>_task<номер>.ipnb, к примеру -- ivanov_401_task7.ipnb

Вопросы:

  • Присылайте вопросы на почту ml.course.mipt@gmail.com
  • Укажите тему письма в следующем формате ML2018_fall Question <Содержание вопроса>

  • PS1: Используются автоматические фильтры, и просто не найдем ваше дз, если вы неаккуратно его подпишите.
  • PS2: Просроченный дедлайн снижает максимальный вес задания по формуле, указнной на первом семинаре
  • PS3: Допустимы исправление кода предложенного кода ниже, если вы считаете

Check Questions

Ниже приводится список вопросов, с ответами на которые может быть полезно разобраться для понимания темы.

Вопрос 1: Чем отличаются современные сверточные сети от сетей 5 летней давности?

Они стали куда более глубокими и распиаренными. Вообще 5 лет назад не слышал о нейросетях, а сейчас каждый второй знакомый о них что-то знает.

Вопрос 2: Какие неприятности могут возникнуть во время обучения современных нейросетей?

Как показало выполнение этого задания, неприятности могут быть как физического характера (мигрень, нервные тики, бессонница, рвотные позывы), так и психологического (желание кпмнуться). Но это еще не все. На более сложных нейросетях во-первых, все это усиливается, а во-вторых, добавляются проблемы непосредственно самих нейросетей: например, возникает переобучение, то есть нейросетка будет запоминать данные, а не закономерности в них. А если делать дропаут, то можно забыть что-то важное. Еще очень редко, но иногда случается, что отмереть могут все нейроны.

Вопрос 3: У вас есть очень маленький датасет из 100 картинок, классификация, но вы очень хотите использовать нейросеть, какие неприятности вас ждут и как их решить? что делать если первый вариант решения не заработает?

Во-первых, нейрости должны обучаться на ооочень большом объеме данных, а у нас 100 картинок. Это сравнительно мало, так что ничего там не обучится. Чтобы решить это, надо сделать аугментацию, то есть надо все картинки как-то размножить, немного изменив (типа отразить, повернуть и все такое).

Но это только цветочки. Осторожно, впереди ягодки! Во-первых, вам надо будет поставить 100500 питоновских библиотек на вашу систему, во-вторых, скорее всего, вам придется писать код на Python (очевидные проблемы: нет статической типизации, все источники в интернете говнокодят аббревиатурами и непонятными именами, нет адекватной проверки написанного, но не выполненного кода), в-третьих, нужно будет заботать нейросети и прочитать тонны документации по выбранной либе, а это неинтересно.

Лично мое решение — вообще не использовать нейросети и машинное обучение. Камон, 100 картинок можно разметить вручную, выйдет куда точнее и быстрее.

Вопрос 4: Можно ли сделать стайл трансфер для музыки и как?

Да, в какой-то степени можно. Есть же стайл-трансфер для картинок. Поэтому можно представить музыку как картинку, сделать стайл-трансфер и получить аудио обратно. Проблема в том, что аудио - это одномерный массив, причем в одной секунде находится около 44000 точек, а песни обычно длятся от 3 минут. Поэтому работать с сырым аудио сложно, системы распознаваия речи обычно работают с низкоразмерными аудио.

Чтобы получить из аудио изображение, нужно применить оконное преобразование Фурье, которое берет музыку и получает два изображения: спектрограмму и фазовую картинку (на фазовую картинку забиваем). Для получения спектрограммы берется по 2000 точек со сдвигом примерно 500 точек. Для каждыйх таких 2000 точек применяется дискретное преобразование Фурье (по сути замена базиса), и новые координаты выстравиваются в столбик. Так получилась спектрограмма. По оси x - время, по оси y - частоты.

Обратное получают картинку с помощью преобразования Гриффина-Лима. Однако нужно будет решать проблемы: надо получить новую фазу, а также запретить сильное перемешивание по вертикали (так как поменять частоты с низких на высокие или наоборот сильно все испортит).

Решив эти проблемы, получим аудио обратно.


Theory Questions


Useful notebooks


CIFAR Quest

(please read it at least diagonally)

  • The ultimate quest is to create a network that has as high accuracy as you can push it.
  • There is a mini-report at the end that you will have to fill in. We recommend reading it first and filling it while you iterate.

Grading

  • starting at zero points
  • +2 for describing your iteration path in a report below.
  • +2 for correct check questions
  • +1 for beating each of these milestones on TEST dataset:
    • 60% (5 total)
    • 65% (6 total)
    • 70% (7 total)
    • 75% (8 total)
    • 80% (9 total)
    • 82% (10 total)
  • +2 for really cool solution:
    • 84% (12 total)
    • 86% (14 total)
    • 88% (16 total)
    • 90% (18 total)
    • 92% (20 total)

Bonus points

Common ways to get bonus points are:

  • Get higher score, obviously.
  • Anything special about your NN. For example "A super-small/fast NN that gets 80%" gets a bonus.
  • Any detailed analysis of the results. (saliency maps, whatever)

Restrictions

  • Please do NOT use pre-trained networks for this assignment.
    • In other words, milestones must be beaten without pre-trained nets (and such net must be present in the e-mail).
  • you can use validation data for training, but you can't' do anything with test data apart from running the evaluation procedure.


In [1]:
# Load data. It may work slow.
!mkdir cifar10 && curl -o cifar-10-python.tar.gz https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz && tar -xvzf cifar-10-python.tar.gz -C cifar10


mkdir: cannot create directory ‘cifar10’: File exists

In [0]:
import _pickle as pickle
import os
import time
import numpy as np
import tensorflow as tf

import matplotlib.pyplot as plt
import seaborn
%matplotlib inline

In [3]:
tf.__version__


Out[3]:
'1.7.0'

In [4]:
tf.test.is_gpu_available()


Out[4]:
True

In [0]:
tf.set_random_seed(42)

In [0]:
def load_CIFAR_batch(filename):
    """ load single batch of cifar """
    with open(filename, 'rb') as f:
        datadict = pickle.load(f, encoding='iso-8859-1')
        X = datadict['data']
        Y = datadict['labels']
        X = X.reshape(10000, 3, 32, 32).astype("float")
        Y = np.array(Y)
        return X, Y

def load_CIFAR10(ROOT):
    """ load all of cifar """
    xs = []
    ys = []
    for b in range(1,6):
        f = os.path.join(ROOT, 'data_batch_%d' % (b, ))
        X, Y = load_CIFAR_batch(f)
        xs.append(X)
        ys.append(Y)    
    Xtr = np.concatenate(xs)
    Ytr = np.concatenate(ys)
    del X, Y
    Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
    return Xtr, Ytr, Xte, Yte

In [0]:
plt.rcParams['figure.figsize'] = (10.0, 8.0) 

cifar10_dir = './cifar10/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

In [8]:
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8').transpose(1, 2, 0))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()



In [0]:
input_X = tf.placeholder('float32', shape=(None, 32, 32, 3), name='X')
input_y = tf.placeholder('int64', shape=(None, 1), name='y')
is_training = tf.placeholder('bool', name='train')

layer = tf.layers.conv2d(
    inputs=input_X,
    filters=64,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)
layer = tf.layers.max_pooling2d(
    inputs=layer, 
    pool_size=[3, 3], 
    strides=2)
layer = tf.nn.local_response_normalization(
    layer,
    depth_radius=4,
    bias=1.0,
    alpha=0.001 / 9.0,
    beta=0.75)
layer = tf.layers.conv2d(
    inputs=layer,
    filters=64,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)
layer = tf.nn.local_response_normalization(
    layer,
    depth_radius=4,
    bias=1.0,
    alpha=0.001 / 9.0,
    beta=0.75)
layer = tf.layers.max_pooling2d(
    inputs=layer, 
    pool_size=[2, 2], 
    strides=2)
layer = tf.reshape(
    layer,
    [-1, 7 * 7 * 64])
layer = tf.layers.dense(
    inputs=layer, 
    units=384, 
    activation=tf.nn.relu)
layer = tf.layers.dense(
    inputs=layer, 
    units=192, 
    activation=tf.nn.relu)
dropout = tf.layers.dropout(
    inputs=layer, 
    rate=0.4,
    training=is_training)
logits = tf.layers.dense(
    inputs=layer,
    units=10)

In [0]:
loss = tf.losses.sparse_softmax_cross_entropy(labels=input_y, logits=logits)
optimizer_step = (tf.train.GradientDescentOptimizer(0.001, use_locking=True).minimize(loss))

predictions = tf.argmax(input=logits, axis=1)
accuracy = tf.metrics.accuracy(labels=input_y, predictions=predictions)

In [0]:
def train_fn(X, y, sess):
    '''
    returns tuple (loss, accuracy) for model train phase
    '''
    X = np.transpose(X, (0, 2, 3, 1))
    y = np.reshape(y, (-1, 1))
    sess.run(optimizer_step, {input_X: X, input_y: y, is_training: True})
    current_loss = sess.run(loss, {input_X: X, input_y: y, is_training: True})
    current_accuracy = sess.run(accuracy, {input_X: X, input_y: y, is_training: True})[0]
    return (current_loss, current_accuracy)

def eval_fn(X, y, sess):
    '''
    returns tuple (loss, accuracy) for model evaluation phase
    '''
    X = np.transpose(X, (0, 2, 3, 1))
    y = np.reshape(y, (-1, 1))
    current_loss = sess.run(loss, {input_X: X, input_y: y, is_training: False})
    current_accuracy = sess.run(accuracy, {input_X: X, input_y: y, is_training: False})[0]
    return (current_loss, current_accuracy)

def predict_fn(X, sess):
    '''
    returns y_pred for model predict phase
    '''
    '''
    Функция не используется.
    '''
    assert(False)

In [0]:
def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    assert len(inputs) == len(targets)
    if shuffle:
        indices = np.arange(len(inputs))
        np.random.shuffle(indices)
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]

In [0]:
num_epochs = 200 
batch_size = 16

In [0]:
X_val = X_test
y_val = y_test

In [0]:
train_losses = []
train_accs = []
valid_losses = []
valid_accs = []

In [16]:
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    tf.local_variables_initializer().run()
    
    for epoch in range(num_epochs):
        # In each epoch, we do a full pass over the training data:
        train_loss = 0
        train_acc = 0
        train_batches = 0
        start_time = time.time()
        for batch in iterate_minibatches(X_train, y_train, batch_size):
            inputs, targets = batch
            train_loss_batch, train_acc_batch = train_fn(inputs, targets, sess)
            train_loss += train_loss_batch
            train_acc += train_acc_batch
            train_batches += 1
    
        # And a full pass over the validation data:
        valid_loss = 0
        valid_acc = 0
        valid_batches = 0
        for batch in iterate_minibatches(X_val, y_val, batch_size):
            inputs, targets = batch
            valid_loss_batch, valid_acc_batch = eval_fn(inputs, targets, sess)
            valid_loss += valid_loss_batch
            valid_acc += valid_acc_batch
            valid_batches += 1
            
        train_losses += [train_loss / train_batches]
        train_accs += [train_acc / train_batches * 100]
        valid_losses += [valid_loss / valid_batches]
        valid_accs += [valid_acc / valid_batches * 100]
        
        # Then we print the results for this epoch:
        print("Epoch {} of {} took {:.3f}s".format(epoch + 1, num_epochs, time.time() - start_time))
        print("  train loss:\t\t{:.6f}".format(train_losses[-1]))
        print("  train accuracy:\t\t{:.2f} %".format(train_accs[-1]))
        print("  valid loss:\t\t{:.6f}".format(valid_losses[-1]))
        print("  valid accuracy:\t\t{:.2f} %".format(valid_accs[-1]))
    
    plt.figure(figsize=(15, 6))
    plt.subplot(1, 2, 1)
    plt.plot(np.arange(num_epochs) + 1, train_losses, label='train loss')
    plt.plot(np.arange(num_epochs) + 1, valid_losses, label='valid loss')
    plt.legend()
    plt.subplot(1, 2, 2)
    plt.plot(np.arange(num_epochs) + 1, train_accs, label='train accuracy')
    plt.plot(np.arange(num_epochs) + 1, valid_accs, label='valid accuracy')
    plt.legend()
    plt.show()
    
    test_acc = 0
    test_batches = 0
    for batch in iterate_minibatches(X_test, y_test, 500):
        inputs, targets = batch
        _, acc = eval_fn(inputs, targets, sess)
        test_acc += acc
        test_batches += 1
    print("Final results:")
    print("  test accuracy:\t\t{:.2f} %".format(
        test_acc / test_batches * 100))

    if test_acc / test_batches * 100 > 92.5:
        print("Achievement unlocked: mage 80 lvl")
    else:
        print("Feed more!")


Epoch 1 of 200 took 47.147s
  train loss:		1.943301
  train accuracy:		22.38 %
  valid loss:		1.815259
  valid accuracy:		32.99 %
Epoch 2 of 200 took 46.400s
  train loss:		1.338417
  train accuracy:		38.20 %
  valid loss:		1.608154
  valid accuracy:		42.56 %
Epoch 3 of 200 took 46.395s
  train loss:		1.068154
  train accuracy:		45.93 %
  valid loss:		1.352683
  valid accuracy:		49.14 %
Epoch 4 of 200 took 46.374s
  train loss:		0.893678
  train accuracy:		51.75 %
  valid loss:		1.258822
  valid accuracy:		54.13 %
Epoch 5 of 200 took 46.148s
  train loss:		0.771206
  train accuracy:		56.13 %
  valid loss:		1.151356
  valid accuracy:		58.01 %
Epoch 6 of 200 took 46.457s
  train loss:		0.680364
  train accuracy:		59.64 %
  valid loss:		1.110259
  valid accuracy:		61.15 %
Epoch 7 of 200 took 46.332s
  train loss:		0.602894
  train accuracy:		62.51 %
  valid loss:		1.091778
  valid accuracy:		63.78 %
Epoch 8 of 200 took 46.599s
  train loss:		0.534664
  train accuracy:		64.95 %
  valid loss:		1.065765
  valid accuracy:		66.05 %
Epoch 9 of 200 took 46.091s
  train loss:		0.477308
  train accuracy:		67.06 %
  valid loss:		1.040173
  valid accuracy:		68.02 %
Epoch 10 of 200 took 46.606s
  train loss:		0.423727
  train accuracy:		68.92 %
  valid loss:		1.050352
  valid accuracy:		69.77 %
Epoch 11 of 200 took 46.707s
  train loss:		0.374541
  train accuracy:		70.57 %
  valid loss:		1.072593
  valid accuracy:		71.34 %
Epoch 12 of 200 took 46.360s
  train loss:		0.327554
  train accuracy:		72.06 %
  valid loss:		1.091392
  valid accuracy:		72.76 %
Epoch 13 of 200 took 46.313s
  train loss:		0.286052
  train accuracy:		73.42 %
  valid loss:		1.082572
  valid accuracy:		74.05 %
Epoch 14 of 200 took 46.496s
  train loss:		0.247611
  train accuracy:		74.65 %
  valid loss:		1.114746
  valid accuracy:		75.24 %
Epoch 15 of 200 took 46.568s
  train loss:		0.210651
  train accuracy:		75.78 %
  valid loss:		1.134214
  valid accuracy:		76.31 %
Epoch 16 of 200 took 46.459s
  train loss:		0.179015
  train accuracy:		76.81 %
  valid loss:		1.163387
  valid accuracy:		77.29 %
Epoch 17 of 200 took 45.950s
  train loss:		0.148533
  train accuracy:		77.75 %
  valid loss:		1.173163
  valid accuracy:		78.19 %
Epoch 18 of 200 took 46.505s
  train loss:		0.123248
  train accuracy:		78.61 %
  valid loss:		1.219606
  valid accuracy:		79.02 %
Epoch 19 of 200 took 46.465s
  train loss:		0.099769
  train accuracy:		79.40 %
  valid loss:		1.262747
  valid accuracy:		79.77 %
Epoch 20 of 200 took 46.732s
  train loss:		0.081701
  train accuracy:		80.12 %
  valid loss:		1.340159
  valid accuracy:		80.47 %
Epoch 21 of 200 took 46.618s
  train loss:		0.066287
  train accuracy:		80.79 %
  valid loss:		1.414106
  valid accuracy:		81.10 %
Epoch 22 of 200 took 45.820s
  train loss:		0.054112
  train accuracy:		81.39 %
  valid loss:		1.444525
  valid accuracy:		81.67 %
Epoch 23 of 200 took 46.901s
  train loss:		0.045065
  train accuracy:		81.94 %
  valid loss:		1.564520
  valid accuracy:		82.20 %
Epoch 24 of 200 took 46.535s
  train loss:		0.036326
  train accuracy:		82.45 %
  valid loss:		1.573308
  valid accuracy:		82.69 %
Epoch 25 of 200 took 45.905s
  train loss:		0.030696
  train accuracy:		82.92 %
  valid loss:		1.636710
  valid accuracy:		83.14 %
Epoch 26 of 200 took 45.908s
  train loss:		0.025550
  train accuracy:		83.35 %
  valid loss:		1.697554
  valid accuracy:		83.55 %
Epoch 27 of 200 took 46.612s
  train loss:		0.021817
  train accuracy:		83.75 %
  valid loss:		1.769364
  valid accuracy:		83.94 %
Epoch 28 of 200 took 46.688s
  train loss:		0.019135
  train accuracy:		84.12 %
  valid loss:		1.892416
  valid accuracy:		84.29 %
Epoch 29 of 200 took 45.988s
  train loss:		0.016494
  train accuracy:		84.46 %
  valid loss:		1.921870
  valid accuracy:		84.62 %
Epoch 30 of 200 took 46.388s
  train loss:		0.014368
  train accuracy:		84.78 %
  valid loss:		1.955706
  valid accuracy:		84.93 %
Epoch 31 of 200 took 46.082s
  train loss:		0.012535
  train accuracy:		85.08 %
  valid loss:		1.986176
  valid accuracy:		85.22 %
Epoch 32 of 200 took 46.351s
  train loss:		0.011464
  train accuracy:		85.36 %
  valid loss:		2.004706
  valid accuracy:		85.50 %
Epoch 33 of 200 took 46.559s
  train loss:		0.010114
  train accuracy:		85.63 %
  valid loss:		2.025486
  valid accuracy:		85.76 %
Epoch 34 of 200 took 46.419s
  train loss:		0.009387
  train accuracy:		85.88 %
  valid loss:		2.057479
  valid accuracy:		86.00 %
Epoch 35 of 200 took 46.527s
  train loss:		0.008349
  train accuracy:		86.12 %
  valid loss:		2.094346
  valid accuracy:		86.24 %
Epoch 36 of 200 took 46.079s
  train loss:		0.007898
  train accuracy:		86.35 %
  valid loss:		2.157089
  valid accuracy:		86.46 %
Epoch 37 of 200 took 45.943s
  train loss:		0.007285
  train accuracy:		86.56 %
  valid loss:		2.116151
  valid accuracy:		86.67 %
Epoch 38 of 200 took 46.402s
  train loss:		0.006827
  train accuracy:		86.77 %
  valid loss:		2.193483
  valid accuracy:		86.87 %
Epoch 39 of 200 took 46.498s
  train loss:		0.006194
  train accuracy:		86.96 %
  valid loss:		2.164363
  valid accuracy:		87.06 %
Epoch 40 of 200 took 46.130s
  train loss:		0.005655
  train accuracy:		87.15 %
  valid loss:		2.203264
  valid accuracy:		87.24 %
Epoch 41 of 200 took 46.175s
  train loss:		0.005042
  train accuracy:		87.32 %
  valid loss:		2.190999
  valid accuracy:		87.41 %
Epoch 42 of 200 took 46.337s
  train loss:		0.004655
  train accuracy:		87.49 %
  valid loss:		2.248990
  valid accuracy:		87.57 %
Epoch 43 of 200 took 46.519s
  train loss:		0.004281
  train accuracy:		87.65 %
  valid loss:		2.283950
  valid accuracy:		87.73 %
Epoch 44 of 200 took 46.541s
  train loss:		0.003906
  train accuracy:		87.80 %
  valid loss:		2.288338
  valid accuracy:		87.88 %
Epoch 45 of 200 took 46.383s
  train loss:		0.003709
  train accuracy:		87.95 %
  valid loss:		2.321118
  valid accuracy:		88.02 %
Epoch 46 of 200 took 46.379s
  train loss:		0.003367
  train accuracy:		88.09 %
  valid loss:		2.318637
  valid accuracy:		88.16 %
Epoch 47 of 200 took 46.321s
  train loss:		0.003210
  train accuracy:		88.22 %
  valid loss:		2.336938
  valid accuracy:		88.29 %
Epoch 48 of 200 took 46.616s
  train loss:		0.003047
  train accuracy:		88.35 %
  valid loss:		2.365868
  valid accuracy:		88.41 %
Epoch 49 of 200 took 46.546s
  train loss:		0.002932
  train accuracy:		88.47 %
  valid loss:		2.384340
  valid accuracy:		88.54 %
Epoch 50 of 200 took 46.343s
  train loss:		0.002827
  train accuracy:		88.59 %
  valid loss:		2.404286
  valid accuracy:		88.65 %
Epoch 51 of 200 took 46.492s
  train loss:		0.002726
  train accuracy:		88.71 %
  valid loss:		2.427995
  valid accuracy:		88.76 %
Epoch 52 of 200 took 46.435s
  train loss:		0.002632
  train accuracy:		88.82 %
  valid loss:		2.446465
  valid accuracy:		88.87 %
Epoch 53 of 200 took 46.396s
  train loss:		0.002544
  train accuracy:		88.92 %
  valid loss:		2.464981
  valid accuracy:		88.98 %
Epoch 54 of 200 took 46.494s
  train loss:		0.002457
  train accuracy:		89.03 %
  valid loss:		2.480806
  valid accuracy:		89.08 %
Epoch 55 of 200 took 46.477s
  train loss:		0.002376
  train accuracy:		89.12 %
  valid loss:		2.495381
  valid accuracy:		89.17 %
Epoch 56 of 200 took 46.497s
  train loss:		0.002300
  train accuracy:		89.22 %
  valid loss:		2.510440
  valid accuracy:		89.27 %
Epoch 57 of 200 took 46.525s
  train loss:		0.002228
  train accuracy:		89.31 %
  valid loss:		2.524876
  valid accuracy:		89.36 %
Epoch 58 of 200 took 46.617s
  train loss:		0.002159
  train accuracy:		89.40 %
  valid loss:		2.539337
  valid accuracy:		89.44 %
Epoch 59 of 200 took 46.616s
  train loss:		0.002094
  train accuracy:		89.49 %
  valid loss:		2.552020
  valid accuracy:		89.53 %
Epoch 60 of 200 took 46.595s
  train loss:		0.002032
  train accuracy:		89.57 %
  valid loss:		2.563518
  valid accuracy:		89.61 %
Epoch 61 of 200 took 46.348s
  train loss:		0.001975
  train accuracy:		89.65 %
  valid loss:		2.575621
  valid accuracy:		89.69 %
Epoch 62 of 200 took 46.420s
  train loss:		0.001919
  train accuracy:		89.73 %
  valid loss:		2.587177
  valid accuracy:		89.77 %
Epoch 63 of 200 took 46.505s
  train loss:		0.001866
  train accuracy:		89.80 %
  valid loss:		2.598944
  valid accuracy:		89.84 %
Epoch 64 of 200 took 46.657s
  train loss:		0.001815
  train accuracy:		89.88 %
  valid loss:		2.609802
  valid accuracy:		89.91 %
Epoch 65 of 200 took 46.427s
  train loss:		0.001766
  train accuracy:		89.95 %
  valid loss:		2.619031
  valid accuracy:		89.98 %
Epoch 66 of 200 took 46.613s
  train loss:		0.001719
  train accuracy:		90.02 %
  valid loss:		2.628502
  valid accuracy:		90.05 %
Epoch 67 of 200 took 46.457s
  train loss:		0.001673
  train accuracy:		90.09 %
  valid loss:		2.637787
  valid accuracy:		90.12 %
Epoch 68 of 200 took 46.454s
  train loss:		0.001630
  train accuracy:		90.15 %
  valid loss:		2.647378
  valid accuracy:		90.18 %
Epoch 69 of 200 took 46.439s
  train loss:		0.001589
  train accuracy:		90.21 %
  valid loss:		2.656107
  valid accuracy:		90.25 %
Epoch 70 of 200 took 46.429s
  train loss:		0.001550
  train accuracy:		90.28 %
  valid loss:		2.664920
  valid accuracy:		90.31 %
Epoch 71 of 200 took 45.955s
  train loss:		0.001511
  train accuracy:		90.34 %
  valid loss:		2.673335
  valid accuracy:		90.37 %
Epoch 72 of 200 took 46.507s
  train loss:		0.001474
  train accuracy:		90.39 %
  valid loss:		2.681837
  valid accuracy:		90.42 %
Epoch 73 of 200 took 46.201s
  train loss:		0.001439
  train accuracy:		90.45 %
  valid loss:		2.690053
  valid accuracy:		90.48 %
Epoch 74 of 200 took 46.289s
  train loss:		0.001406
  train accuracy:		90.51 %
  valid loss:		2.697976
  valid accuracy:		90.53 %
Epoch 75 of 200 took 46.571s
  train loss:		0.001373
  train accuracy:		90.56 %
  valid loss:		2.706229
  valid accuracy:		90.59 %
Epoch 76 of 200 took 46.479s
  train loss:		0.001342
  train accuracy:		90.61 %
  valid loss:		2.713327
  valid accuracy:		90.64 %
Epoch 77 of 200 took 46.555s
  train loss:		0.001311
  train accuracy:		90.66 %
  valid loss:		2.720302
  valid accuracy:		90.69 %
Epoch 78 of 200 took 46.543s
  train loss:		0.001282
  train accuracy:		90.71 %
  valid loss:		2.727682
  valid accuracy:		90.74 %
Epoch 79 of 200 took 46.350s
  train loss:		0.001254
  train accuracy:		90.76 %
  valid loss:		2.735109
  valid accuracy:		90.79 %
Epoch 80 of 200 took 45.870s
  train loss:		0.001226
  train accuracy:		90.81 %
  valid loss:		2.741984
  valid accuracy:		90.83 %
Epoch 81 of 200 took 46.510s
  train loss:		0.001200
  train accuracy:		90.86 %
  valid loss:		2.749295
  valid accuracy:		90.88 %
Epoch 82 of 200 took 46.544s
  train loss:		0.001174
  train accuracy:		90.90 %
  valid loss:		2.756260
  valid accuracy:		90.92 %
Epoch 83 of 200 took 46.394s
  train loss:		0.001150
  train accuracy:		90.95 %
  valid loss:		2.762520
  valid accuracy:		90.97 %
Epoch 84 of 200 took 45.946s
  train loss:		0.001127
  train accuracy:		90.99 %
  valid loss:		2.768561
  valid accuracy:		91.01 %
Epoch 85 of 200 took 46.485s
  train loss:		0.001104
  train accuracy:		91.03 %
  valid loss:		2.774508
  valid accuracy:		91.05 %
Epoch 86 of 200 took 46.469s
  train loss:		0.001082
  train accuracy:		91.07 %
  valid loss:		2.780773
  valid accuracy:		91.09 %
Epoch 87 of 200 took 46.504s
  train loss:		0.001061
  train accuracy:		91.11 %
  valid loss:		2.786291
  valid accuracy:		91.13 %
Epoch 88 of 200 took 46.522s
  train loss:		0.001040
  train accuracy:		91.15 %
  valid loss:		2.792589
  valid accuracy:		91.17 %
Epoch 89 of 200 took 46.634s
  train loss:		0.001020
  train accuracy:		91.19 %
  valid loss:		2.798093
  valid accuracy:		91.21 %
Epoch 90 of 200 took 46.256s
  train loss:		0.001001
  train accuracy:		91.23 %
  valid loss:		2.804122
  valid accuracy:		91.25 %
Epoch 91 of 200 took 46.198s
  train loss:		0.000982
  train accuracy:		91.27 %
  valid loss:		2.809269
  valid accuracy:		91.28 %
Epoch 92 of 200 took 46.467s
  train loss:		0.000964
  train accuracy:		91.30 %
  valid loss:		2.814523
  valid accuracy:		91.32 %
Epoch 93 of 200 took 46.448s
  train loss:		0.000947
  train accuracy:		91.34 %
  valid loss:		2.819872
  valid accuracy:		91.35 %
Epoch 94 of 200 took 46.604s
  train loss:		0.000930
  train accuracy:		91.37 %
  valid loss:		2.824866
  valid accuracy:		91.39 %
Epoch 95 of 200 took 46.513s
  train loss:		0.000913
  train accuracy:		91.40 %
  valid loss:		2.829702
  valid accuracy:		91.42 %
Epoch 96 of 200 took 46.547s
  train loss:		0.000898
  train accuracy:		91.44 %
  valid loss:		2.834263
  valid accuracy:		91.45 %
Epoch 97 of 200 took 46.600s
  train loss:		0.000882
  train accuracy:		91.47 %
  valid loss:		2.839120
  valid accuracy:		91.49 %
Epoch 98 of 200 took 46.521s
  train loss:		0.000867
  train accuracy:		91.50 %
  valid loss:		2.843705
  valid accuracy:		91.52 %
Epoch 99 of 200 took 46.621s
  train loss:		0.000852
  train accuracy:		91.53 %
  valid loss:		2.848448
  valid accuracy:		91.55 %
Epoch 100 of 200 took 46.681s
  train loss:		0.000838
  train accuracy:		91.56 %
  valid loss:		2.852825
  valid accuracy:		91.58 %
Epoch 101 of 200 took 46.469s
  train loss:		0.000825
  train accuracy:		91.59 %
  valid loss:		2.857436
  valid accuracy:		91.61 %
Epoch 102 of 200 took 46.538s
  train loss:		0.000811
  train accuracy:		91.62 %
  valid loss:		2.861573
  valid accuracy:		91.64 %
Epoch 103 of 200 took 46.472s
  train loss:		0.000798
  train accuracy:		91.65 %
  valid loss:		2.866065
  valid accuracy:		91.67 %
Epoch 104 of 200 took 46.361s
  train loss:		0.000786
  train accuracy:		91.68 %
  valid loss:		2.870290
  valid accuracy:		91.69 %
Epoch 105 of 200 took 46.531s
  train loss:		0.000773
  train accuracy:		91.71 %
  valid loss:		2.874091
  valid accuracy:		91.72 %
Epoch 106 of 200 took 46.651s
  train loss:		0.000762
  train accuracy:		91.73 %
  valid loss:		2.877990
  valid accuracy:		91.75 %
Epoch 107 of 200 took 46.222s
  train loss:		0.000750
  train accuracy:		91.76 %
  valid loss:		2.882220
  valid accuracy:		91.77 %
Epoch 108 of 200 took 46.262s
  train loss:		0.000739
  train accuracy:		91.79 %
  valid loss:		2.886039
  valid accuracy:		91.80 %
Epoch 109 of 200 took 46.573s
  train loss:		0.000728
  train accuracy:		91.81 %
  valid loss:		2.889899
  valid accuracy:		91.83 %
Epoch 110 of 200 took 45.720s
  train loss:		0.000717
  train accuracy:		91.84 %
  valid loss:		2.893905
  valid accuracy:		91.85 %
Epoch 111 of 200 took 46.558s
  train loss:		0.000706
  train accuracy:		91.86 %
  valid loss:		2.897564
  valid accuracy:		91.88 %
Epoch 112 of 200 took 46.200s
  train loss:		0.000696
  train accuracy:		91.89 %
  valid loss:		2.901390
  valid accuracy:		91.90 %
Epoch 113 of 200 took 45.975s
  train loss:		0.000686
  train accuracy:		91.91 %
  valid loss:		2.905069
  valid accuracy:		91.92 %
Epoch 114 of 200 took 46.334s
  train loss:		0.000676
  train accuracy:		91.93 %
  valid loss:		2.908682
  valid accuracy:		91.95 %
Epoch 115 of 200 took 46.592s
  train loss:		0.000667
  train accuracy:		91.96 %
  valid loss:		2.912394
  valid accuracy:		91.97 %
Epoch 116 of 200 took 46.179s
  train loss:		0.000657
  train accuracy:		91.98 %
  valid loss:		2.916185
  valid accuracy:		91.99 %
Epoch 117 of 200 took 46.291s
  train loss:		0.000648
  train accuracy:		92.00 %
  valid loss:		2.919654
  valid accuracy:		92.01 %
Epoch 118 of 200 took 46.483s
  train loss:		0.000640
  train accuracy:		92.02 %
  valid loss:		2.923025
  valid accuracy:		92.04 %
Epoch 119 of 200 took 46.523s
  train loss:		0.000631
  train accuracy:		92.05 %
  valid loss:		2.926257
  valid accuracy:		92.06 %
Epoch 120 of 200 took 46.503s
  train loss:		0.000622
  train accuracy:		92.07 %
  valid loss:		2.929554
  valid accuracy:		92.08 %
Epoch 121 of 200 took 46.556s
  train loss:		0.000614
  train accuracy:		92.09 %
  valid loss:		2.932675
  valid accuracy:		92.10 %
Epoch 122 of 200 took 46.536s
  train loss:		0.000606
  train accuracy:		92.11 %
  valid loss:		2.935990
  valid accuracy:		92.12 %
Epoch 123 of 200 took 46.666s
  train loss:		0.000598
  train accuracy:		92.13 %
  valid loss:		2.939233
  valid accuracy:		92.14 %
Epoch 124 of 200 took 46.497s
  train loss:		0.000591
  train accuracy:		92.15 %
  valid loss:		2.942128
  valid accuracy:		92.16 %
Epoch 125 of 200 took 46.468s
  train loss:		0.000583
  train accuracy:		92.17 %
  valid loss:		2.945064
  valid accuracy:		92.18 %
Epoch 126 of 200 took 46.373s
  train loss:		0.000576
  train accuracy:		92.19 %
  valid loss:		2.948091
  valid accuracy:		92.20 %
Epoch 127 of 200 took 46.387s
  train loss:		0.000569
  train accuracy:		92.21 %
  valid loss:		2.951046
  valid accuracy:		92.22 %
Epoch 128 of 200 took 46.240s
  train loss:		0.000562
  train accuracy:		92.23 %
  valid loss:		2.953994
  valid accuracy:		92.24 %
Epoch 129 of 200 took 46.213s
  train loss:		0.000555
  train accuracy:		92.24 %
  valid loss:		2.956806
  valid accuracy:		92.25 %
Epoch 130 of 200 took 46.360s
  train loss:		0.000548
  train accuracy:		92.26 %
  valid loss:		2.959685
  valid accuracy:		92.27 %
Epoch 131 of 200 took 46.391s
  train loss:		0.000542
  train accuracy:		92.28 %
  valid loss:		2.962424
  valid accuracy:		92.29 %
Epoch 132 of 200 took 46.366s
  train loss:		0.000535
  train accuracy:		92.30 %
  valid loss:		2.965257
  valid accuracy:		92.31 %
Epoch 133 of 200 took 46.277s
  train loss:		0.000529
  train accuracy:		92.31 %
  valid loss:		2.968083
  valid accuracy:		92.32 %
Epoch 134 of 200 took 46.451s
  train loss:		0.000523
  train accuracy:		92.33 %
  valid loss:		2.970856
  valid accuracy:		92.34 %
Epoch 135 of 200 took 46.354s
  train loss:		0.000517
  train accuracy:		92.35 %
  valid loss:		2.973526
  valid accuracy:		92.36 %
Epoch 136 of 200 took 46.534s
  train loss:		0.000511
  train accuracy:		92.37 %
  valid loss:		2.976242
  valid accuracy:		92.37 %
Epoch 137 of 200 took 46.477s
  train loss:		0.000505
  train accuracy:		92.38 %
  valid loss:		2.978915
  valid accuracy:		92.39 %
Epoch 138 of 200 took 46.277s
  train loss:		0.000500
  train accuracy:		92.40 %
  valid loss:		2.981456
  valid accuracy:		92.41 %
Epoch 139 of 200 took 46.469s
  train loss:		0.000494
  train accuracy:		92.41 %
  valid loss:		2.983781
  valid accuracy:		92.42 %
Epoch 140 of 200 took 46.464s
  train loss:		0.000489
  train accuracy:		92.43 %
  valid loss:		2.986396
  valid accuracy:		92.44 %
Epoch 141 of 200 took 46.460s
  train loss:		0.000483
  train accuracy:		92.44 %
  valid loss:		2.988986
  valid accuracy:		92.45 %
Epoch 142 of 200 took 46.395s
  train loss:		0.000478
  train accuracy:		92.46 %
  valid loss:		2.991418
  valid accuracy:		92.47 %
Epoch 143 of 200 took 46.246s
  train loss:		0.000473
  train accuracy:		92.47 %
  valid loss:		2.993982
  valid accuracy:		92.48 %
Epoch 144 of 200 took 46.399s
  train loss:		0.000468
  train accuracy:		92.49 %
  valid loss:		2.996550
  valid accuracy:		92.50 %
Epoch 145 of 200 took 46.467s
  train loss:		0.000463
  train accuracy:		92.50 %
  valid loss:		2.998733
  valid accuracy:		92.51 %
Epoch 146 of 200 took 46.283s
  train loss:		0.000458
  train accuracy:		92.52 %
  valid loss:		3.000988
  valid accuracy:		92.52 %
Epoch 147 of 200 took 46.224s
  train loss:		0.000453
  train accuracy:		92.53 %
  valid loss:		3.003448
  valid accuracy:		92.54 %
Epoch 148 of 200 took 46.431s
  train loss:		0.000448
  train accuracy:		92.55 %
  valid loss:		3.005664
  valid accuracy:		92.55 %
Epoch 149 of 200 took 46.428s
  train loss:		0.000444
  train accuracy:		92.56 %
  valid loss:		3.007871
  valid accuracy:		92.57 %
Epoch 150 of 200 took 46.367s
  train loss:		0.000439
  train accuracy:		92.57 %
  valid loss:		3.010298
  valid accuracy:		92.58 %
Epoch 151 of 200 took 46.313s
  train loss:		0.000435
  train accuracy:		92.59 %
  valid loss:		3.012379
  valid accuracy:		92.59 %
Epoch 152 of 200 took 46.469s
  train loss:		0.000431
  train accuracy:		92.60 %
  valid loss:		3.014639
  valid accuracy:		92.61 %
Epoch 153 of 200 took 46.418s
  train loss:		0.000427
  train accuracy:		92.61 %
  valid loss:		3.016791
  valid accuracy:		92.62 %
Epoch 154 of 200 took 46.382s
  train loss:		0.000422
  train accuracy:		92.62 %
  valid loss:		3.018965
  valid accuracy:		92.63 %
Epoch 155 of 200 took 46.470s
  train loss:		0.000418
  train accuracy:		92.64 %
  valid loss:		3.021318
  valid accuracy:		92.64 %
Epoch 156 of 200 took 46.441s
  train loss:		0.000414
  train accuracy:		92.65 %
  valid loss:		3.023346
  valid accuracy:		92.66 %
Epoch 157 of 200 took 46.394s
  train loss:		0.000410
  train accuracy:		92.66 %
  valid loss:		3.025412
  valid accuracy:		92.67 %
Epoch 158 of 200 took 46.609s
  train loss:		0.000406
  train accuracy:		92.67 %
  valid loss:		3.027516
  valid accuracy:		92.68 %
Epoch 159 of 200 took 46.354s
  train loss:		0.000403
  train accuracy:		92.69 %
  valid loss:		3.029643
  valid accuracy:		92.69 %
Epoch 160 of 200 took 46.589s
  train loss:		0.000399
  train accuracy:		92.70 %
  valid loss:		3.031629
  valid accuracy:		92.70 %
Epoch 161 of 200 took 46.345s
  train loss:		0.000395
  train accuracy:		92.71 %
  valid loss:		3.033420
  valid accuracy:		92.72 %
Epoch 162 of 200 took 46.354s
  train loss:		0.000392
  train accuracy:		92.72 %
  valid loss:		3.035542
  valid accuracy:		92.73 %
Epoch 163 of 200 took 46.508s
  train loss:		0.000388
  train accuracy:		92.73 %
  valid loss:		3.037533
  valid accuracy:		92.74 %
Epoch 164 of 200 took 46.575s
  train loss:		0.000384
  train accuracy:		92.74 %
  valid loss:		3.039452
  valid accuracy:		92.75 %
Epoch 165 of 200 took 46.285s
  train loss:		0.000381
  train accuracy:		92.75 %
  valid loss:		3.041474
  valid accuracy:		92.76 %
Epoch 166 of 200 took 46.527s
  train loss:		0.000377
  train accuracy:		92.77 %
  valid loss:		3.043281
  valid accuracy:		92.77 %
Epoch 167 of 200 took 46.543s
  train loss:		0.000374
  train accuracy:		92.78 %
  valid loss:		3.045260
  valid accuracy:		92.78 %
Epoch 168 of 200 took 46.560s
  train loss:		0.000371
  train accuracy:		92.79 %
  valid loss:		3.047157
  valid accuracy:		92.79 %
Epoch 169 of 200 took 46.624s
  train loss:		0.000367
  train accuracy:		92.80 %
  valid loss:		3.049025
  valid accuracy:		92.80 %
Epoch 170 of 200 took 46.428s
  train loss:		0.000364
  train accuracy:		92.81 %
  valid loss:		3.050921
  valid accuracy:		92.81 %
Epoch 171 of 200 took 45.933s
  train loss:		0.000361
  train accuracy:		92.82 %
  valid loss:		3.052793
  valid accuracy:		92.82 %
Epoch 172 of 200 took 46.128s
  train loss:		0.000358
  train accuracy:		92.83 %
  valid loss:		3.054616
  valid accuracy:		92.83 %
Epoch 173 of 200 took 45.938s
  train loss:		0.000355
  train accuracy:		92.84 %
  valid loss:		3.056455
  valid accuracy:		92.84 %
Epoch 174 of 200 took 46.514s
  train loss:		0.000352
  train accuracy:		92.85 %
  valid loss:		3.058264
  valid accuracy:		92.85 %
Epoch 175 of 200 took 46.500s
  train loss:		0.000349
  train accuracy:		92.86 %
  valid loss:		3.060015
  valid accuracy:		92.86 %
Epoch 176 of 200 took 46.427s
  train loss:		0.000346
  train accuracy:		92.87 %
  valid loss:		3.061745
  valid accuracy:		92.87 %
Epoch 177 of 200 took 46.304s
  train loss:		0.000343
  train accuracy:		92.88 %
  valid loss:		3.063515
  valid accuracy:		92.88 %
Epoch 178 of 200 took 46.478s
  train loss:		0.000340
  train accuracy:		92.89 %
  valid loss:		3.065171
  valid accuracy:		92.89 %
Epoch 179 of 200 took 46.473s
  train loss:		0.000337
  train accuracy:		92.90 %
  valid loss:		3.066887
  valid accuracy:		92.90 %
Epoch 180 of 200 took 46.119s
  train loss:		0.000334
  train accuracy:		92.91 %
  valid loss:		3.068513
  valid accuracy:		92.91 %
Epoch 181 of 200 took 46.334s
  train loss:		0.000332
  train accuracy:		92.92 %
  valid loss:		3.070245
  valid accuracy:		92.92 %
Epoch 182 of 200 took 46.502s
  train loss:		0.000329
  train accuracy:		92.93 %
  valid loss:		3.072019
  valid accuracy:		92.93 %
Epoch 183 of 200 took 46.527s
  train loss:		0.000326
  train accuracy:		92.94 %
  valid loss:		3.073730
  valid accuracy:		92.94 %
Epoch 184 of 200 took 46.126s
  train loss:		0.000324
  train accuracy:		92.94 %
  valid loss:		3.075331
  valid accuracy:		92.95 %
Epoch 185 of 200 took 46.444s
  train loss:		0.000321
  train accuracy:		92.95 %
  valid loss:		3.077019
  valid accuracy:		92.96 %
Epoch 186 of 200 took 46.425s
  train loss:		0.000318
  train accuracy:		92.96 %
  valid loss:		3.078645
  valid accuracy:		92.97 %
Epoch 187 of 200 took 46.450s
  train loss:		0.000316
  train accuracy:		92.97 %
  valid loss:		3.080335
  valid accuracy:		92.98 %
Epoch 188 of 200 took 46.502s
  train loss:		0.000313
  train accuracy:		92.98 %
  valid loss:		3.081947
  valid accuracy:		92.98 %
Epoch 189 of 200 took 46.448s
  train loss:		0.000311
  train accuracy:		92.99 %
  valid loss:		3.083502
  valid accuracy:		92.99 %
Epoch 190 of 200 took 46.562s
  train loss:		0.000309
  train accuracy:		93.00 %
  valid loss:		3.085212
  valid accuracy:		93.00 %
Epoch 191 of 200 took 46.489s
  train loss:		0.000306
  train accuracy:		93.01 %
  valid loss:		3.086848
  valid accuracy:		93.01 %
Epoch 192 of 200 took 46.523s
  train loss:		0.000304
  train accuracy:		93.01 %
  valid loss:		3.088500
  valid accuracy:		93.02 %
Epoch 193 of 200 took 46.215s
  train loss:		0.000302
  train accuracy:		93.02 %
  valid loss:		3.090042
  valid accuracy:		93.03 %
Epoch 194 of 200 took 46.526s
  train loss:		0.000299
  train accuracy:		93.03 %
  valid loss:		3.091591
  valid accuracy:		93.03 %
Epoch 195 of 200 took 46.300s
  train loss:		0.000297
  train accuracy:		93.04 %
  valid loss:		3.093240
  valid accuracy:		93.04 %
Epoch 196 of 200 took 46.190s
  train loss:		0.000295
  train accuracy:		93.05 %
  valid loss:		3.094857
  valid accuracy:		93.05 %
Epoch 197 of 200 took 46.272s
  train loss:		0.000292
  train accuracy:		93.05 %
  valid loss:		3.096389
  valid accuracy:		93.06 %
Epoch 198 of 200 took 46.263s
  train loss:		0.000290
  train accuracy:		93.06 %
  valid loss:		3.098027
  valid accuracy:		93.07 %
Epoch 199 of 200 took 46.528s
  train loss:		0.000288
  train accuracy:		93.07 %
  valid loss:		3.099596
  valid accuracy:		93.07 %
Epoch 200 of 200 took 46.471s
  train loss:		0.000286
  train accuracy:		93.08 %
  valid loss:		3.101094
  valid accuracy:		93.08 %
Final results:
  test accuracy:		93.06 %
Achievement unlocked: mage 80 lvl

Hi, my name is Евгений Шлыков, and here's my story

A long ago in a galaxy far far away, when it was still more than an hour before deadline, i got an idea:

I gonna build a neural network, that

совпадает с какой-из нейросетей, нагугленных в инете. К счастью, на лекции так официально разрешили делать. Я использовал вот этот источник: http://www.aimechanic.com/2016/10/13/d242-tensorflow-cifar-10-tutorial-detailed-step-by-step-review-part-1/. Единственное, я убрал какие-то непонятные вещи из него (где считают всякие биасы, веса, локали и все такое).

How could i be so naive?!

Да запросто. Я нагугливал и другие архитектуры, но там такой говнокод, что понять, какие слои используются и с какими параметрами, вообще нереально.

One day, with no signs of warning,

я начал подгонять параметры. Так как у слоев нейросети параметров примерно 100500, то я даже не рискнул их трогать. Есть куда более разумные способы получить высший скор. Во-первых, батчсайз. Я попробовал его увеличить — скор упал. Попробовал уменьшить — скор возрос. Попробовал еще уменьшить — скор опять возрос. И так далее. Потом стало безумно бредово, и после первой эпохи скор был порядка 10-11%. Я расстроился и откатил все назад. В итоге выбрал батчсайз 16.

После 10 эпох мой скор был какой-то нормальный (порядка 70%), так что я увеличил число эпох до 25, и выбил около 80%. Затем я сделал 50, и там было уже 87%. Так как уже пора было спать, я не стал долго думать, поставил 200 эпох и уснул. Утром обнаружил 93%.

Finally, after ~30 iterations, ~100 mugs of успокоительного

наконец-то это кончилось.

У меня есть еще тру стори, как колаб дисконнектился в середине вычислений, или по 10 минут не мор инициализировался. И еще много всякой параши происходило во время этого задания.

Конец.


In [0]: