Image features exercise

We have seen that we can achieve reasonable performance on an image classification task by training a linear classifier on the pixels of the input image. In this exercise we will show that we can improve our classification performance by training linear classifiers not on raw pixels but on features that are computed from the raw pixels.

All of your work for this exercise will be done in this notebook.


In [1]:
import os
os.chdir('..')  # move up to the project root so that the utils and classifiers packages are importable

# Run some setup code for this notebook
import random
import numpy as np
import matplotlib.pyplot as plt

from utils.data_utils import load_CIFAR10

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

Load data

Similar to previous exercises, we will load CIFAR-10 data from disk.


In [2]:
from utils.data_utils import get_CIFAR10_data

cifar10_dir = 'datasets/cifar-10-batches-py'
X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data(cifar10_dir, num_training=49000, num_validation=1000, num_test=1000)

print (X_train.shape, y_train.shape, X_val.shape, y_val.shape, X_test.shape, y_test.shape)


((49000, 32, 32, 3), (49000,), (1000, 32, 32, 3), (1000,), (1000, 32, 32, 3), (1000,))

Extract Features

For each image we will compute a Histogram of Oriented Gradients (HOG) as well as a color histogram using the hue channel in HSV color space. We form our final feature vector for each image by concatenating the HOG and color histogram feature vectors.
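
To make the color-histogram half concrete, here is a minimal sketch of what a hue-channel histogram feature could look like. It is only an illustration: the actual color_histogram_hsv in utils.features_utils may differ in its binning and normalization details.

import numpy as np
import matplotlib.colors

def hue_histogram_sketch(im, nbin=10):
    # im: (H, W, 3) RGB image with values in [0, 255] (assumed).
    hsv = matplotlib.colors.rgb_to_hsv(im / 255.0)  # hue channel lands in [0, 1]
    hist, _ = np.histogram(hsv[:, :, 0], bins=nbin, range=(0.0, 1.0), density=True)
    return hist  # length-nbin feature vector for this image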

Roughly speaking, HOG should capture the texture of the image while ignoring color information, and the color histogram represents the color of the input image while ignoring texture. We therefore expect that using both together ought to work better than using either alone. Verifying this assumption would be a good experiment for the bonus section; a rough sketch of such a comparison appears below.

The hog_feature and color_histogram_hsv functions both operate on a single image and return a feature vector for that image. The extract_features function takes a set of images and a list of feature functions, evaluates each feature function on each image, and stores the results in a matrix where each row is the concatenation of all feature vectors for a single image.
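
For the bonus comparison mentioned above, a minimal sketch could reuse extract_features with one feature function at a time; it assumes the helpers imported in the next cell and the X_train / X_val arrays loaded earlier, and the feature-set names are just illustrative.

# Sketch: build HOG-only and hue-histogram-only feature matrices, then train
# the same LinearSVM on each (as in the tuning loop further below) and
# compare validation accuracies against the combined features.
for name, fns in [('hog only', [hog_feature]),
                  ('color only', [lambda img: color_histogram_hsv(img, nbin=10)])]:
    feats_train = extract_features(X_train, fns)
    feats_val = extract_features(X_val, fns)
    print('%s features: train %s, val %s' % (name, feats_train.shape, feats_val.shape))
    # ...apply the same mean/std normalization and bias trick as in the cell
    # below, then train and evaluate a LinearSVM on these features.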


In [3]:
from utils.features_utils import extract_features, hog_feature, color_histogram_hsv

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation.
# This ensures that each feature has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones([X_train_feats.shape[0], 1])])
X_val_feats = np.hstack([X_val_feats, np.ones([X_val_feats.shape[0], 1])])
X_test_feats = np.hstack([X_test_feats, np.ones([X_test_feats.shape[0], 1])])


Done extracting features for 1000 / 49000 images
Done extracting features for 2000 / 49000 images
Done extracting features for 3000 / 49000 images
Done extracting features for 4000 / 49000 images
Done extracting features for 5000 / 49000 images
Done extracting features for 6000 / 49000 images
Done extracting features for 7000 / 49000 images
Done extracting features for 8000 / 49000 images
Done extracting features for 9000 / 49000 images
Done extracting features for 10000 / 49000 images
Done extracting features for 11000 / 49000 images
Done extracting features for 12000 / 49000 images
Done extracting features for 13000 / 49000 images
Done extracting features for 14000 / 49000 images
Done extracting features for 15000 / 49000 images
Done extracting features for 16000 / 49000 images
Done extracting features for 17000 / 49000 images
Done extracting features for 18000 / 49000 images
Done extracting features for 19000 / 49000 images
Done extracting features for 20000 / 49000 images
Done extracting features for 21000 / 49000 images
Done extracting features for 22000 / 49000 images
Done extracting features for 23000 / 49000 images
Done extracting features for 24000 / 49000 images
Done extracting features for 25000 / 49000 images
Done extracting features for 26000 / 49000 images
Done extracting features for 27000 / 49000 images
Done extracting features for 28000 / 49000 images
Done extracting features for 29000 / 49000 images
Done extracting features for 30000 / 49000 images
Done extracting features for 31000 / 49000 images
Done extracting features for 32000 / 49000 images
Done extracting features for 33000 / 49000 images
Done extracting features for 34000 / 49000 images
Done extracting features for 35000 / 49000 images
Done extracting features for 36000 / 49000 images
Done extracting features for 37000 / 49000 images
Done extracting features for 38000 / 49000 images
Done extracting features for 39000 / 49000 images
Done extracting features for 40000 / 49000 images
Done extracting features for 41000 / 49000 images
Done extracting features for 42000 / 49000 images
Done extracting features for 43000 / 49000 images
Done extracting features for 44000 / 49000 images
Done extracting features for 45000 / 49000 images
Done extracting features for 46000 / 49000 images
Done extracting features for 47000 / 49000 images
Done extracting features for 48000 / 49000 images

Train SVM on features

Using the multiclass SVM code developed earlier in the assignment, train SVMs on top of the features extracted above; this should give better results than training SVMs directly on raw pixels.


In [4]:
# Use the validation set to tune the learning rate and regularization strength
# The best validation accuracy should be around 0.44 or higher

from classifiers.linear_classifier import LinearSVM

learning_rates = [1e-4, 3e-4, 9e-4, 1e-3, 3e-3, 9e-3, 1e-2, 3e-2, 9e-2, 1e-1]
regularization_strengths = [1e-1, 3e-1, 9e-1, 1, 3, 9]

# results[(learning_rate, reg)] = (train_accuracy, val_accuracy)
results = {}
best_val = -1
best_svm = None

for learning_rate in learning_rates:
    for reg in regularization_strengths:
        model = LinearSVM()
        model.train(X_train_feats, y_train, learning_rate=learning_rate, reg=reg, num_iters=5000,
                   batch_size=300, verbose=True)
        y_train_pred = model.predict(X_train_feats)
        train_accuracy = np.mean(y_train == y_train_pred)
        y_val_pred = model.predict(X_val_feats)
        val_accuracy = np.mean(y_val == y_val_pred)
        
        results[(learning_rate, reg)] = (train_accuracy, val_accuracy)
        if val_accuracy > best_val:
            best_val = val_accuracy
            best_svm = model
            
        print('lr %e reg %e train_accuracy: %f val_accuracy: %f' % (learning_rate, reg, train_accuracy, val_accuracy))
    print
    
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train_accuracy: %f val_accuracy: %f' % (lr, reg, train_accuracy, val_accuracy))
    
print('best validation accuracy achieved during hyperparameter search: %f' % best_val)


iteration 0 / 5000: loss 9.011121
iteration 100 / 5000: loss 8.322861
iteration 200 / 5000: loss 7.660865
iteration 300 / 5000: loss 7.051973
iteration 400 / 5000: loss 6.400993
iteration 500 / 5000: loss 5.951562
iteration 600 / 5000: loss 5.664219
iteration 700 / 5000: loss 5.181060
iteration 800 / 5000: loss 4.867892
iteration 900 / 5000: loss 5.361546
iteration 1000 / 5000: loss 4.646933
iteration 1100 / 5000: loss 5.065915
iteration 1200 / 5000: loss 4.581701
iteration 1300 / 5000: loss 4.595005
iteration 1400 / 5000: loss 4.524878
iteration 1500 / 5000: loss 4.588103
iteration 1600 / 5000: loss 4.277639
iteration 1700 / 5000: loss 4.535939
iteration 1800 / 5000: loss 4.643322
iteration 1900 / 5000: loss 4.470205
iteration 2000 / 5000: loss 4.352496
iteration 2100 / 5000: loss 4.076629
iteration 2200 / 5000: loss 4.208459
iteration 2300 / 5000: loss 4.198189
iteration 2400 / 5000: loss 4.140494
iteration 2500 / 5000: loss 3.574724
iteration 2600 / 5000: loss 4.060319
iteration 2700 / 5000: loss 4.148051
iteration 2800 / 5000: loss 4.038340
iteration 2900 / 5000: loss 4.292838
iteration 3000 / 5000: loss 3.953596
iteration 3100 / 5000: loss 3.959393
iteration 3200 / 5000: loss 3.720044
iteration 3300 / 5000: loss 4.066413
iteration 3400 / 5000: loss 3.921174
iteration 3500 / 5000: loss 3.901255
iteration 3600 / 5000: loss 3.560189
iteration 3700 / 5000: loss 3.711812
iteration 3800 / 5000: loss 3.623100
iteration 3900 / 5000: loss 3.859224
iteration 4000 / 5000: loss 3.940789
iteration 4100 / 5000: loss 3.924741
iteration 4200 / 5000: loss 3.367966
iteration 4300 / 5000: loss 4.029428
iteration 4400 / 5000: loss 3.605307
iteration 4500 / 5000: loss 3.560490
iteration 4600 / 5000: loss 3.420784
iteration 4700 / 5000: loss 3.915441
iteration 4800 / 5000: loss 3.460084
iteration 4900 / 5000: loss 3.642395
lr 1.000000e-04 reg 1.000000e-01 train_accuracy: 0.482980 val_accuracy: 0.470000
iteration 0 / 5000: loss 9.008062
iteration 100 / 5000: loss 8.257114
iteration 200 / 5000: loss 7.602145
iteration 300 / 5000: loss 6.859505
iteration 400 / 5000: loss 6.373375
iteration 500 / 5000: loss 5.925503
iteration 600 / 5000: loss 5.723212
iteration 700 / 5000: loss 5.319204
iteration 800 / 5000: loss 5.208642
iteration 900 / 5000: loss 4.884725
iteration 1000 / 5000: loss 4.990400
iteration 1100 / 5000: loss 4.962428
iteration 1200 / 5000: loss 5.110925
iteration 1300 / 5000: loss 4.919371
iteration 1400 / 5000: loss 4.542552
iteration 1500 / 5000: loss 4.262947
iteration 1600 / 5000: loss 4.611890
iteration 1700 / 5000: loss 4.379958
iteration 1800 / 5000: loss 4.163098
iteration 1900 / 5000: loss 4.285366
iteration 2000 / 5000: loss 4.374109
iteration 2100 / 5000: loss 4.374829
iteration 2200 / 5000: loss 4.315067
iteration 2300 / 5000: loss 3.932951
iteration 2400 / 5000: loss 4.280133
iteration 2500 / 5000: loss 4.006917
iteration 2600 / 5000: loss 3.988390
iteration 2700 / 5000: loss 4.627602
iteration 2800 / 5000: loss 4.122997
iteration 2900 / 5000: loss 4.120711
iteration 3000 / 5000: loss 4.368613
iteration 3100 / 5000: loss 4.139330
iteration 3200 / 5000: loss 3.936631
iteration 3300 / 5000: loss 4.310317
iteration 3400 / 5000: loss 3.953139
iteration 3500 / 5000: loss 4.198423
iteration 3600 / 5000: loss 3.948047
iteration 3700 / 5000: loss 4.035859
iteration 3800 / 5000: loss 4.263131
iteration 3900 / 5000: loss 3.930642
iteration 4000 / 5000: loss 4.130572
iteration 4100 / 5000: loss 3.725656
iteration 4200 / 5000: loss 3.952224
iteration 4300 / 5000: loss 3.750781
iteration 4400 / 5000: loss 4.001009
iteration 4500 / 5000: loss 3.914409
iteration 4600 / 5000: loss 4.412507
iteration 4700 / 5000: loss 4.345660
iteration 4800 / 5000: loss 4.201724
iteration 4900 / 5000: loss 4.101892
lr 1.000000e-04 reg 3.000000e-01 train_accuracy: 0.480918 val_accuracy: 0.471000
iteration 0 / 5000: loss 9.006913
iteration 100 / 5000: loss 8.281629
iteration 200 / 5000: loss 7.580926
iteration 300 / 5000: loss 7.168099
iteration 400 / 5000: loss 6.394368
iteration 500 / 5000: loss 6.199123
iteration 600 / 5000: loss 5.731412
iteration 700 / 5000: loss 5.767068
iteration 800 / 5000: loss 5.579586
iteration 900 / 5000: loss 5.723146
iteration 1000 / 5000: loss 5.171749
iteration 1100 / 5000: loss 4.927776
iteration 1200 / 5000: loss 5.058825
iteration 1300 / 5000: loss 4.808329
iteration 1400 / 5000: loss 4.985149
iteration 1500 / 5000: loss 5.183081
iteration 1600 / 5000: loss 5.303075
iteration 1700 / 5000: loss 5.055458
iteration 1800 / 5000: loss 4.866942
iteration 1900 / 5000: loss 4.996401
iteration 2000 / 5000: loss 4.810507
iteration 2100 / 5000: loss 4.728088
iteration 2200 / 5000: loss 4.885414
iteration 2300 / 5000: loss 4.908816
iteration 2400 / 5000: loss 4.728537
iteration 2500 / 5000: loss 4.782700
iteration 2600 / 5000: loss 4.747870
iteration 2700 / 5000: loss 4.558491
iteration 2800 / 5000: loss 4.918322
iteration 2900 / 5000: loss 4.646101
iteration 3000 / 5000: loss 4.707029
iteration 3100 / 5000: loss 5.197823
iteration 3200 / 5000: loss 4.534728
iteration 3300 / 5000: loss 4.682108
iteration 3400 / 5000: loss 4.802523
iteration 3500 / 5000: loss 4.937351
iteration 3600 / 5000: loss 4.934286
iteration 3700 / 5000: loss 4.761371
iteration 3800 / 5000: loss 4.787477
iteration 3900 / 5000: loss 5.033294
iteration 4000 / 5000: loss 4.899330
iteration 4100 / 5000: loss 4.734486
iteration 4200 / 5000: loss 4.535734
iteration 4300 / 5000: loss 4.550977
iteration 4400 / 5000: loss 4.966996
iteration 4500 / 5000: loss 4.711624
iteration 4600 / 5000: loss 4.544864
iteration 4700 / 5000: loss 4.606895
iteration 4800 / 5000: loss 4.817227
iteration 4900 / 5000: loss 4.706253
lr 1.000000e-04 reg 9.000000e-01 train_accuracy: 0.476939 val_accuracy: 0.469000
iteration 0 / 5000: loss 9.002671
iteration 100 / 5000: loss 8.407975
iteration 200 / 5000: loss 7.667274
iteration 300 / 5000: loss 7.076175
iteration 400 / 5000: loss 6.642934
iteration 500 / 5000: loss 6.520902
iteration 600 / 5000: loss 6.097346
iteration 700 / 5000: loss 5.732721
iteration 800 / 5000: loss 5.711592
iteration 900 / 5000: loss 5.325815
iteration 1000 / 5000: loss 5.192896
iteration 1100 / 5000: loss 5.220947
iteration 1200 / 5000: loss 5.412432
iteration 1300 / 5000: loss 4.981358
iteration 1400 / 5000: loss 5.185169
iteration 1500 / 5000: loss 5.077620
iteration 1600 / 5000: loss 4.958803
iteration 1700 / 5000: loss 5.303598
iteration 1800 / 5000: loss 4.843970
iteration 1900 / 5000: loss 4.881217
iteration 2000 / 5000: loss 4.902324
iteration 2100 / 5000: loss 5.040259
iteration 2200 / 5000: loss 4.971828
iteration 2300 / 5000: loss 4.799078
iteration 2400 / 5000: loss 5.041038
iteration 2500 / 5000: loss 4.697200
iteration 2600 / 5000: loss 4.883943
iteration 2700 / 5000: loss 4.746022
iteration 2800 / 5000: loss 4.680255
iteration 2900 / 5000: loss 5.221343
iteration 3000 / 5000: loss 4.754926
iteration 3100 / 5000: loss 5.018869
iteration 3200 / 5000: loss 4.532600
iteration 3300 / 5000: loss 4.910681
iteration 3400 / 5000: loss 4.872589
iteration 3500 / 5000: loss 4.803135
iteration 3600 / 5000: loss 4.896631
iteration 3700 / 5000: loss 4.452104
iteration 3800 / 5000: loss 4.965696
iteration 3900 / 5000: loss 4.600231
iteration 4000 / 5000: loss 4.661466
iteration 4100 / 5000: loss 5.295606
iteration 4200 / 5000: loss 4.984881
iteration 4300 / 5000: loss 4.711014
iteration 4400 / 5000: loss 4.515455
iteration 4500 / 5000: loss 4.693066
iteration 4600 / 5000: loss 4.831590
iteration 4700 / 5000: loss 4.805518
iteration 4800 / 5000: loss 5.014491
iteration 4900 / 5000: loss 4.906529
lr 1.000000e-04 reg 1.000000e+00 train_accuracy: 0.476367 val_accuracy: 0.472000
iteration 0 / 5000: loss 9.012133
iteration 100 / 5000: loss 8.338550
iteration 200 / 5000: loss 7.880813
iteration 300 / 5000: loss 7.223454
iteration 400 / 5000: loss 6.865179
iteration 500 / 5000: loss 6.643329
iteration 600 / 5000: loss 6.209759
iteration 700 / 5000: loss 5.942405
iteration 800 / 5000: loss 6.076697
iteration 900 / 5000: loss 5.992205
iteration 1000 / 5000: loss 6.186149
iteration 1100 / 5000: loss 6.062720
iteration 1200 / 5000: loss 6.009157
iteration 1300 / 5000: loss 5.814459
iteration 1400 / 5000: loss 6.203763
iteration 1500 / 5000: loss 6.227564
iteration 1600 / 5000: loss 5.756563
iteration 1700 / 5000: loss 6.087474
iteration 1800 / 5000: loss 5.762805
iteration 1900 / 5000: loss 6.025159
iteration 2000 / 5000: loss 5.854575
iteration 2100 / 5000: loss 5.918142
iteration 2200 / 5000: loss 5.898154
iteration 2300 / 5000: loss 5.878481
iteration 2400 / 5000: loss 5.711482
iteration 2500 / 5000: loss 5.720599
iteration 2600 / 5000: loss 6.084430
iteration 2700 / 5000: loss 5.932264
iteration 2800 / 5000: loss 5.937327
iteration 2900 / 5000: loss 6.241900
iteration 3000 / 5000: loss 5.596333
iteration 3100 / 5000: loss 6.071472
iteration 3200 / 5000: loss 5.826016
iteration 3300 / 5000: loss 6.086967
iteration 3400 / 5000: loss 6.169968
iteration 3500 / 5000: loss 6.027554
iteration 3600 / 5000: loss 6.030776
iteration 3700 / 5000: loss 6.019872
iteration 3800 / 5000: loss 5.658924
iteration 3900 / 5000: loss 6.130779
iteration 4000 / 5000: loss 5.911752
iteration 4100 / 5000: loss 5.607180
iteration 4200 / 5000: loss 5.946064
iteration 4300 / 5000: loss 6.012735
iteration 4400 / 5000: loss 5.869213
iteration 4500 / 5000: loss 5.578682
iteration 4600 / 5000: loss 5.639235
iteration 4700 / 5000: loss 5.829208
iteration 4800 / 5000: loss 5.966284
iteration 4900 / 5000: loss 5.930718
lr 1.000000e-04 reg 3.000000e+00 train_accuracy: 0.463429 val_accuracy: 0.452000
iteration 0 / 5000: loss 9.007030
iteration 100 / 5000: loss 8.460200
iteration 200 / 5000: loss 7.968176
iteration 300 / 5000: loss 7.711774
iteration 400 / 5000: loss 7.511892
iteration 500 / 5000: loss 7.392604
iteration 600 / 5000: loss 7.444445
iteration 700 / 5000: loss 7.471065
iteration 800 / 5000: loss 7.200200
iteration 900 / 5000: loss 7.311069
iteration 1000 / 5000: loss 7.128407
iteration 1100 / 5000: loss 7.381830
iteration 1200 / 5000: loss 7.259279
iteration 1300 / 5000: loss 7.332434
iteration 1400 / 5000: loss 7.253150
iteration 1500 / 5000: loss 7.178465
iteration 1600 / 5000: loss 6.997716
iteration 1700 / 5000: loss 7.319644
iteration 1800 / 5000: loss 7.052120
iteration 1900 / 5000: loss 7.105978
iteration 2000 / 5000: loss 7.247064
iteration 2100 / 5000: loss 7.324638
iteration 2200 / 5000: loss 7.235180
iteration 2300 / 5000: loss 7.069485
iteration 2400 / 5000: loss 7.191152
iteration 2500 / 5000: loss 7.389645
iteration 2600 / 5000: loss 7.196881
iteration 2700 / 5000: loss 7.229962
iteration 2800 / 5000: loss 7.441918
iteration 2900 / 5000: loss 7.240986
iteration 3000 / 5000: loss 7.366584
iteration 3100 / 5000: loss 7.357161
iteration 3200 / 5000: loss 7.365338
iteration 3300 / 5000: loss 7.186570
iteration 3400 / 5000: loss 7.174792
iteration 3500 / 5000: loss 7.215821
iteration 3600 / 5000: loss 7.345234
iteration 3700 / 5000: loss 7.479987
iteration 3800 / 5000: loss 7.181720
iteration 3900 / 5000: loss 7.270108
iteration 4000 / 5000: loss 7.254691
iteration 4100 / 5000: loss 7.381982
iteration 4200 / 5000: loss 7.126797
iteration 4300 / 5000: loss 7.435358
iteration 4400 / 5000: loss 7.266508
iteration 4500 / 5000: loss 7.182316
iteration 4600 / 5000: loss 7.288028
iteration 4700 / 5000: loss 7.386532
iteration 4800 / 5000: loss 6.997930
iteration 4900 / 5000: loss 7.381073
lr 1.000000e-04 reg 9.000000e+00 train_accuracy: 0.432837 val_accuracy: 0.435000

iteration 0 / 5000: loss 9.001530
iteration 100 / 5000: loss 6.802867
iteration 200 / 5000: loss 5.308577
iteration 300 / 5000: loss 5.173270
iteration 400 / 5000: loss 4.679287
iteration 500 / 5000: loss 4.536472
iteration 600 / 5000: loss 4.165849
iteration 700 / 5000: loss 4.303058
iteration 800 / 5000: loss 4.154401
iteration 900 / 5000: loss 4.164500
iteration 1000 / 5000: loss 4.044511
iteration 1100 / 5000: loss 3.808058
iteration 1200 / 5000: loss 3.692022
iteration 1300 / 5000: loss 3.764127
iteration 1400 / 5000: loss 3.699774
iteration 1500 / 5000: loss 3.664952
iteration 1600 / 5000: loss 3.718658
iteration 1700 / 5000: loss 3.678614
iteration 1800 / 5000: loss 3.558100
iteration 1900 / 5000: loss 3.809526
iteration 2000 / 5000: loss 3.446541
iteration 2100 / 5000: loss 3.829258
iteration 2200 / 5000: loss 3.246094
iteration 2300 / 5000: loss 3.521057
iteration 2400 / 5000: loss 3.499995
iteration 2500 / 5000: loss 3.359670
iteration 2600 / 5000: loss 3.353909
iteration 2700 / 5000: loss 3.636665
iteration 2800 / 5000: loss 3.847947
iteration 2900 / 5000: loss 3.645178
iteration 3000 / 5000: loss 3.239010
iteration 3100 / 5000: loss 3.034572
iteration 3200 / 5000: loss 3.354849
iteration 3300 / 5000: loss 3.431327
iteration 3400 / 5000: loss 3.753212
iteration 3500 / 5000: loss 3.539538
iteration 3600 / 5000: loss 3.383447
iteration 3700 / 5000: loss 3.354109
iteration 3800 / 5000: loss 3.616208
iteration 3900 / 5000: loss 3.509512
iteration 4000 / 5000: loss 3.337065
iteration 4100 / 5000: loss 3.878410
iteration 4200 / 5000: loss 3.769304
iteration 4300 / 5000: loss 3.630432
iteration 4400 / 5000: loss 3.377122
iteration 4500 / 5000: loss 3.399310
iteration 4600 / 5000: loss 3.623147
iteration 4700 / 5000: loss 3.271751
iteration 4800 / 5000: loss 3.523998
iteration 4900 / 5000: loss 3.523878
lr 3.000000e-04 reg 1.000000e-01 train_accuracy: 0.500755 val_accuracy: 0.485000
iteration 0 / 5000: loss 9.010686
iteration 100 / 5000: loss 6.910009
iteration 200 / 5000: loss 5.666444
iteration 300 / 5000: loss 5.350084
iteration 400 / 5000: loss 4.977872
iteration 500 / 5000: loss 4.432248
iteration 600 / 5000: loss 4.516044
iteration 700 / 5000: loss 4.346262
iteration 800 / 5000: loss 4.194108
iteration 900 / 5000: loss 4.231003
iteration 1000 / 5000: loss 4.308087
iteration 1100 / 5000: loss 3.890618
iteration 1200 / 5000: loss 4.110279
iteration 1300 / 5000: loss 4.065779
iteration 1400 / 5000: loss 4.121892
iteration 1500 / 5000: loss 4.077883
iteration 1600 / 5000: loss 3.909958
iteration 1700 / 5000: loss 3.868617
iteration 1800 / 5000: loss 4.185413
iteration 1900 / 5000: loss 3.708853
iteration 2000 / 5000: loss 3.955669
iteration 2100 / 5000: loss 3.903215
iteration 2200 / 5000: loss 3.877410
iteration 2300 / 5000: loss 4.366438
iteration 2400 / 5000: loss 4.077712
iteration 2500 / 5000: loss 3.792103
iteration 2600 / 5000: loss 4.079186
iteration 2700 / 5000: loss 3.665970
iteration 2800 / 5000: loss 3.723151
iteration 2900 / 5000: loss 3.824384
iteration 3000 / 5000: loss 3.737298
iteration 3100 / 5000: loss 3.989093
iteration 3200 / 5000: loss 3.823445
iteration 3300 / 5000: loss 3.898737
iteration 3400 / 5000: loss 3.551977
iteration 3500 / 5000: loss 4.401653
iteration 3600 / 5000: loss 3.918771
iteration 3700 / 5000: loss 3.966763
iteration 3800 / 5000: loss 3.683273
iteration 3900 / 5000: loss 3.966898
iteration 4000 / 5000: loss 3.993982
iteration 4100 / 5000: loss 3.692123
iteration 4200 / 5000: loss 3.657507
iteration 4300 / 5000: loss 4.107392
iteration 4400 / 5000: loss 3.803797
iteration 4500 / 5000: loss 4.192694
iteration 4600 / 5000: loss 3.972329
iteration 4700 / 5000: loss 3.853325
iteration 4800 / 5000: loss 3.799280
iteration 4900 / 5000: loss 3.582009
lr 3.000000e-04 reg 3.000000e-01 train_accuracy: 0.495714 val_accuracy: 0.485000
iteration 0 / 5000: loss 9.004112
iteration 100 / 5000: loss 6.933397
iteration 200 / 5000: loss 6.036991
iteration 300 / 5000: loss 5.122916
iteration 400 / 5000: loss 5.244693
iteration 500 / 5000: loss 5.422022
iteration 600 / 5000: loss 5.133341
iteration 700 / 5000: loss 5.162409
iteration 800 / 5000: loss 4.709183
iteration 900 / 5000: loss 4.251600
iteration 1000 / 5000: loss 4.845076
iteration 1100 / 5000: loss 5.097551
iteration 1200 / 5000: loss 5.033492
iteration 1300 / 5000: loss 4.586179
iteration 1400 / 5000: loss 4.758203
iteration 1500 / 5000: loss 4.643774
iteration 1600 / 5000: loss 4.893128
iteration 1700 / 5000: loss 4.950951
iteration 1800 / 5000: loss 4.669821
iteration 1900 / 5000: loss 4.639115
iteration 2000 / 5000: loss 4.648262
iteration 2100 / 5000: loss 4.505578
iteration 2200 / 5000: loss 4.531009
iteration 2300 / 5000: loss 4.771813
iteration 2400 / 5000: loss 4.324162
iteration 2500 / 5000: loss 4.462768
iteration 2600 / 5000: loss 4.327351
iteration 2700 / 5000: loss 4.739615
iteration 2800 / 5000: loss 4.777578
iteration 2900 / 5000: loss 4.549056
iteration 3000 / 5000: loss 4.795587
iteration 3100 / 5000: loss 4.223460
iteration 3200 / 5000: loss 4.761673
iteration 3300 / 5000: loss 4.715286
iteration 3400 / 5000: loss 4.552484
iteration 3500 / 5000: loss 4.814415
iteration 3600 / 5000: loss 5.169885
iteration 3700 / 5000: loss 4.791775
iteration 3800 / 5000: loss 4.441898
iteration 3900 / 5000: loss 4.579004
iteration 4000 / 5000: loss 4.362920
iteration 4100 / 5000: loss 4.696713
iteration 4200 / 5000: loss 4.685221
iteration 4300 / 5000: loss 4.774104
iteration 4400 / 5000: loss 4.341464
iteration 4500 / 5000: loss 4.604356
iteration 4600 / 5000: loss 4.520933
iteration 4700 / 5000: loss 4.896592
iteration 4800 / 5000: loss 4.727781
iteration 4900 / 5000: loss 4.867384
lr 3.000000e-04 reg 9.000000e-01 train_accuracy: 0.485837 val_accuracy: 0.476000
iteration 0 / 5000: loss 9.003882
iteration 100 / 5000: loss 6.966711
iteration 200 / 5000: loss 5.855982
iteration 300 / 5000: loss 5.294660
iteration 400 / 5000: loss 5.070448
iteration 500 / 5000: loss 5.175304
iteration 600 / 5000: loss 5.008480
iteration 700 / 5000: loss 4.927979
iteration 800 / 5000: loss 4.546883
iteration 900 / 5000: loss 4.822472
iteration 1000 / 5000: loss 4.881257
iteration 1100 / 5000: loss 4.548004
iteration 1200 / 5000: loss 4.941455
iteration 1300 / 5000: loss 4.714686
iteration 1400 / 5000: loss 4.812455
iteration 1500 / 5000: loss 5.006938
iteration 1600 / 5000: loss 4.642311
iteration 1700 / 5000: loss 4.731502
iteration 1800 / 5000: loss 4.775062
iteration 1900 / 5000: loss 4.443890
iteration 2000 / 5000: loss 4.821267
iteration 2100 / 5000: loss 4.800739
iteration 2200 / 5000: loss 4.834993
iteration 2300 / 5000: loss 4.724496
iteration 2400 / 5000: loss 5.029483
iteration 2500 / 5000: loss 4.819818
iteration 2600 / 5000: loss 5.169633
iteration 2700 / 5000: loss 4.659929
iteration 2800 / 5000: loss 4.846169
iteration 2900 / 5000: loss 4.568425
iteration 3000 / 5000: loss 4.900743
iteration 3100 / 5000: loss 4.471423
iteration 3200 / 5000: loss 4.785186
iteration 3300 / 5000: loss 4.860137
iteration 3400 / 5000: loss 4.608338
iteration 3500 / 5000: loss 4.890700
iteration 3600 / 5000: loss 4.772804
iteration 3700 / 5000: loss 4.687344
iteration 3800 / 5000: loss 4.795947
iteration 3900 / 5000: loss 4.739491
iteration 4000 / 5000: loss 4.636092
iteration 4100 / 5000: loss 4.923171
iteration 4200 / 5000: loss 5.114442
iteration 4300 / 5000: loss 4.929242
iteration 4400 / 5000: loss 4.577101
iteration 4500 / 5000: loss 4.870762
iteration 4600 / 5000: loss 4.781738
iteration 4700 / 5000: loss 5.113045
iteration 4800 / 5000: loss 4.874486
iteration 4900 / 5000: loss 4.848090
lr 3.000000e-04 reg 1.000000e+00 train_accuracy: 0.483571 val_accuracy: 0.477000
iteration 0 / 5000: loss 8.993859
iteration 100 / 5000: loss 7.255756
iteration 200 / 5000: loss 6.286021
iteration 300 / 5000: loss 6.183504
iteration 400 / 5000: loss 5.814022
iteration 500 / 5000: loss 6.055566
iteration 600 / 5000: loss 6.035622
iteration 700 / 5000: loss 6.189532
iteration 800 / 5000: loss 5.922679
iteration 900 / 5000: loss 6.089134
iteration 1000 / 5000: loss 6.246639
iteration 1100 / 5000: loss 6.075492
iteration 1200 / 5000: loss 6.335456
iteration 1300 / 5000: loss 5.978153
iteration 1400 / 5000: loss 5.824932
iteration 1500 / 5000: loss 5.858482
iteration 1600 / 5000: loss 5.800207
iteration 1700 / 5000: loss 5.885925
iteration 1800 / 5000: loss 5.810783
iteration 1900 / 5000: loss 6.054265
iteration 2000 / 5000: loss 6.020472
iteration 2100 / 5000: loss 5.942410
iteration 2200 / 5000: loss 5.739938
iteration 2300 / 5000: loss 5.771111
iteration 2400 / 5000: loss 5.924191
iteration 2500 / 5000: loss 5.991678
iteration 2600 / 5000: loss 5.686000
iteration 2700 / 5000: loss 5.862965
iteration 2800 / 5000: loss 5.894663
iteration 2900 / 5000: loss 5.974972
iteration 3000 / 5000: loss 6.015528
iteration 3100 / 5000: loss 5.895905
iteration 3200 / 5000: loss 5.740080
iteration 3300 / 5000: loss 5.911518
iteration 3400 / 5000: loss 6.085203
iteration 3500 / 5000: loss 5.830187
iteration 3600 / 5000: loss 6.047077
iteration 3700 / 5000: loss 6.206254
iteration 3800 / 5000: loss 5.604407
iteration 3900 / 5000: loss 5.902337
iteration 4000 / 5000: loss 5.751415
iteration 4100 / 5000: loss 5.883404
iteration 4200 / 5000: loss 6.165182
iteration 4300 / 5000: loss 6.147478
iteration 4400 / 5000: loss 6.066890
iteration 4500 / 5000: loss 5.821415
iteration 4600 / 5000: loss 5.983919
iteration 4700 / 5000: loss 5.869827
iteration 4800 / 5000: loss 5.947831
iteration 4900 / 5000: loss 6.105490
lr 3.000000e-04 reg 3.000000e+00 train_accuracy: 0.463653 val_accuracy: 0.457000
iteration 0 / 5000: loss 9.013753
iteration 100 / 5000: loss 7.734710
iteration 200 / 5000: loss 7.215643
iteration 300 / 5000: loss 7.297517
iteration 400 / 5000: loss 7.079487
iteration 500 / 5000: loss 7.175254
iteration 600 / 5000: loss 7.299345
iteration 700 / 5000: loss 7.300006
iteration 800 / 5000: loss 7.360926
iteration 900 / 5000: loss 7.461970
iteration 1000 / 5000: loss 7.255247
iteration 1100 / 5000: loss 7.232743
iteration 1200 / 5000: loss 7.325865
iteration 1300 / 5000: loss 7.208862
iteration 1400 / 5000: loss 7.435928
iteration 1500 / 5000: loss 7.297607
iteration 1600 / 5000: loss 7.395829
iteration 1700 / 5000: loss 7.265931
iteration 1800 / 5000: loss 7.373708
iteration 1900 / 5000: loss 7.433094
iteration 2000 / 5000: loss 7.397351
iteration 2100 / 5000: loss 7.047597
iteration 2200 / 5000: loss 6.995927
iteration 2300 / 5000: loss 7.347552
iteration 2400 / 5000: loss 7.188284
iteration 2500 / 5000: loss 7.078042
iteration 2600 / 5000: loss 7.350635
iteration 2700 / 5000: loss 7.263474
iteration 2800 / 5000: loss 7.458040
iteration 2900 / 5000: loss 7.397123
iteration 3000 / 5000: loss 7.230869
iteration 3100 / 5000: loss 7.191593
iteration 3200 / 5000: loss 7.147508
iteration 3300 / 5000: loss 7.099929
iteration 3400 / 5000: loss 7.333906
iteration 3500 / 5000: loss 7.352728
iteration 3600 / 5000: loss 7.105784
iteration 3700 / 5000: loss 7.004345
iteration 3800 / 5000: loss 7.455221
iteration 3900 / 5000: loss 7.469847
iteration 4000 / 5000: loss 7.213932
iteration 4100 / 5000: loss 7.363261
iteration 4200 / 5000: loss 7.143243
iteration 4300 / 5000: loss 6.997458
iteration 4400 / 5000: loss 7.415551
iteration 4500 / 5000: loss 7.163470
iteration 4600 / 5000: loss 7.437285
iteration 4700 / 5000: loss 7.397166
iteration 4800 / 5000: loss 7.604950
iteration 4900 / 5000: loss 7.227410
lr 3.000000e-04 reg 9.000000e+00 train_accuracy: 0.433776 val_accuracy: 0.435000

iteration 0 / 5000: loss 8.982943
iteration 100 / 5000: loss 5.283418
iteration 200 / 5000: loss 4.492825
iteration 300 / 5000: loss 4.148766
iteration 400 / 5000: loss 3.777606
iteration 500 / 5000: loss 3.835951
iteration 600 / 5000: loss 3.617915
iteration 700 / 5000: loss 3.823073
iteration 800 / 5000: loss 3.259406
iteration 900 / 5000: loss 3.419839
iteration 1000 / 5000: loss 3.330701
iteration 1100 / 5000: loss 3.596134
iteration 1200 / 5000: loss 3.387942
iteration 1300 / 5000: loss 3.662270
iteration 1400 / 5000: loss 3.521761
iteration 1500 / 5000: loss 3.310246
iteration 1600 / 5000: loss 3.414042
iteration 1700 / 5000: loss 3.604558
iteration 1800 / 5000: loss 3.294482
iteration 1900 / 5000: loss 3.698763
iteration 2000 / 5000: loss 3.530443
iteration 2100 / 5000: loss 3.410218
iteration 2200 / 5000: loss 3.431010
iteration 2300 / 5000: loss 3.176397
iteration 2400 / 5000: loss 3.448981
iteration 2500 / 5000: loss 3.534855
iteration 2600 / 5000: loss 3.404318
iteration 2700 / 5000: loss 3.168434
iteration 2800 / 5000: loss 3.216473
iteration 2900 / 5000: loss 3.399825
iteration 3000 / 5000: loss 4.313775
iteration 3100 / 5000: loss 3.266096
iteration 3200 / 5000: loss 3.174880
iteration 3300 / 5000: loss 3.138287
iteration 3400 / 5000: loss 3.272146
iteration 3500 / 5000: loss 3.634309
iteration 3600 / 5000: loss 3.658314
iteration 3700 / 5000: loss 3.044552
iteration 3800 / 5000: loss 2.987190
iteration 3900 / 5000: loss 3.396378
iteration 4000 / 5000: loss 3.276171
iteration 4100 / 5000: loss 3.316656
iteration 4200 / 5000: loss 3.655331
iteration 4300 / 5000: loss 3.295843
iteration 4400 / 5000: loss 3.100392
iteration 4500 / 5000: loss 3.493620
iteration 4600 / 5000: loss 3.318138
iteration 4700 / 5000: loss 3.248826
iteration 4800 / 5000: loss 3.689555
iteration 4900 / 5000: loss 3.290015
lr 9.000000e-04 reg 1.000000e-01 train_accuracy: 0.507776 val_accuracy: 0.496000
iteration 0 / 5000: loss 8.995971
iteration 100 / 5000: loss 5.337219
iteration 200 / 5000: loss 4.336038
iteration 300 / 5000: loss 3.959948
iteration 400 / 5000: loss 4.219987
iteration 500 / 5000: loss 4.494342
iteration 600 / 5000: loss 4.055506
iteration 700 / 5000: loss 3.899504
iteration 800 / 5000: loss 3.883347
iteration 900 / 5000: loss 3.801306
iteration 1000 / 5000: loss 4.319853
iteration 1100 / 5000: loss 4.293054
iteration 1200 / 5000: loss 3.792687
iteration 1300 / 5000: loss 4.126374
iteration 1400 / 5000: loss 4.160672
iteration 1500 / 5000: loss 4.052981
iteration 1600 / 5000: loss 3.934496
iteration 1700 / 5000: loss 4.101318
iteration 1800 / 5000: loss 3.983665
iteration 1900 / 5000: loss 4.108541
iteration 2000 / 5000: loss 3.984265
iteration 2100 / 5000: loss 3.871380
iteration 2200 / 5000: loss 3.920818
iteration 2300 / 5000: loss 3.990288
iteration 2400 / 5000: loss 4.138441
iteration 2500 / 5000: loss 3.755502
iteration 2600 / 5000: loss 3.773861
iteration 2700 / 5000: loss 4.025285
iteration 2800 / 5000: loss 4.011859
iteration 2900 / 5000: loss 3.989367
iteration 3000 / 5000: loss 3.815759
iteration 3100 / 5000: loss 3.827168
iteration 3200 / 5000: loss 3.620530
iteration 3300 / 5000: loss 4.025962
iteration 3400 / 5000: loss 3.701210
iteration 3500 / 5000: loss 3.937467
iteration 3600 / 5000: loss 3.778427
iteration 3700 / 5000: loss 3.950450
iteration 3800 / 5000: loss 3.534999
iteration 3900 / 5000: loss 3.889293
iteration 4000 / 5000: loss 4.494117
iteration 4100 / 5000: loss 4.100527
iteration 4200 / 5000: loss 4.018529
iteration 4300 / 5000: loss 3.401654
iteration 4400 / 5000: loss 3.776039
iteration 4500 / 5000: loss 4.127721
iteration 4600 / 5000: loss 3.642135
iteration 4700 / 5000: loss 3.847248
iteration 4800 / 5000: loss 3.926020
iteration 4900 / 5000: loss 3.798690
lr 9.000000e-04 reg 3.000000e-01 train_accuracy: 0.499918 val_accuracy: 0.489000
iteration 0 / 5000: loss 9.011502
iteration 100 / 5000: loss 5.535606
iteration 200 / 5000: loss 4.999910
iteration 300 / 5000: loss 4.640235
iteration 400 / 5000: loss 4.715614
iteration 500 / 5000: loss 4.789768
iteration 600 / 5000: loss 4.575110
iteration 700 / 5000: loss 4.952478
iteration 800 / 5000: loss 4.616504
iteration 900 / 5000: loss 4.684435
iteration 1000 / 5000: loss 4.705714
iteration 1100 / 5000: loss 4.692858
iteration 1200 / 5000: loss 4.730745
iteration 1300 / 5000: loss 4.372179
iteration 1400 / 5000: loss 4.723524
iteration 1500 / 5000: loss 4.699062
iteration 1600 / 5000: loss 4.618188
iteration 1700 / 5000: loss 4.735581
iteration 1800 / 5000: loss 4.663297
iteration 1900 / 5000: loss 4.364772
iteration 2000 / 5000: loss 4.940052
iteration 2100 / 5000: loss 4.746039
iteration 2200 / 5000: loss 4.707392
iteration 2300 / 5000: loss 4.639838
iteration 2400 / 5000: loss 4.746254
iteration 2500 / 5000: loss 4.799092
iteration 2600 / 5000: loss 4.781200
iteration 2700 / 5000: loss 4.817152
iteration 2800 / 5000: loss 4.921944
iteration 2900 / 5000: loss 4.796558
iteration 3000 / 5000: loss 4.734236
iteration 3100 / 5000: loss 4.783195
iteration 3200 / 5000: loss 4.532552
iteration 3300 / 5000: loss 4.619603
iteration 3400 / 5000: loss 4.784881
iteration 3500 / 5000: loss 4.884726
iteration 3600 / 5000: loss 4.610081
iteration 3700 / 5000: loss 5.262959
iteration 3800 / 5000: loss 4.285129
iteration 3900 / 5000: loss 4.648068
iteration 4000 / 5000: loss 4.856122
iteration 4100 / 5000: loss 4.407242
iteration 4200 / 5000: loss 4.473232
iteration 4300 / 5000: loss 4.771678
iteration 4400 / 5000: loss 5.105510
iteration 4500 / 5000: loss 4.399891
iteration 4600 / 5000: loss 4.244798
iteration 4700 / 5000: loss 4.955644
iteration 4800 / 5000: loss 4.740414
iteration 4900 / 5000: loss 4.468873
lr 9.000000e-04 reg 9.000000e-01 train_accuracy: 0.485122 val_accuracy: 0.475000
iteration 0 / 5000: loss 9.016071
iteration 100 / 5000: loss 5.249223
iteration 200 / 5000: loss 5.215455
iteration 300 / 5000: loss 4.638456
iteration 400 / 5000: loss 5.332061
iteration 500 / 5000: loss 4.803471
iteration 600 / 5000: loss 4.590619
iteration 700 / 5000: loss 4.825728
iteration 800 / 5000: loss 4.512489
iteration 900 / 5000: loss 4.773024
iteration 1000 / 5000: loss 4.827949
iteration 1100 / 5000: loss 4.529617
iteration 1200 / 5000: loss 4.485752
iteration 1300 / 5000: loss 4.854490
iteration 1400 / 5000: loss 4.727455
iteration 1500 / 5000: loss 4.762934
iteration 1600 / 5000: loss 4.769823
iteration 1700 / 5000: loss 4.800915
iteration 1800 / 5000: loss 4.600965
iteration 1900 / 5000: loss 4.847466
iteration 2000 / 5000: loss 4.968712
iteration 2100 / 5000: loss 4.714321
iteration 2200 / 5000: loss 4.548804
iteration 2300 / 5000: loss 4.532073
iteration 2400 / 5000: loss 4.781296
iteration 2500 / 5000: loss 4.816672
iteration 2600 / 5000: loss 4.939205
iteration 2700 / 5000: loss 4.641197
iteration 2800 / 5000: loss 4.839898
iteration 2900 / 5000: loss 4.725420
iteration 3000 / 5000: loss 4.984570
iteration 3100 / 5000: loss 4.536114
iteration 3200 / 5000: loss 4.977365
iteration 3300 / 5000: loss 4.744647
iteration 3400 / 5000: loss 4.847269
iteration 3500 / 5000: loss 4.721274
iteration 3600 / 5000: loss 4.727851
iteration 3700 / 5000: loss 4.678910
iteration 3800 / 5000: loss 4.904265
iteration 3900 / 5000: loss 4.995918
iteration 4000 / 5000: loss 5.007054
iteration 4100 / 5000: loss 4.832375
iteration 4200 / 5000: loss 4.493800
iteration 4300 / 5000: loss 4.628524
iteration 4400 / 5000: loss 4.751690
iteration 4500 / 5000: loss 5.113462
iteration 4600 / 5000: loss 4.745994
iteration 4700 / 5000: loss 5.105001
iteration 4800 / 5000: loss 4.854896
iteration 4900 / 5000: loss 4.853554
lr 9.000000e-04 reg 1.000000e+00 train_accuracy: 0.484367 val_accuracy: 0.480000
iteration 0 / 5000: loss 9.011474
iteration 100 / 5000: loss 6.156308
iteration 200 / 5000: loss 6.087746
iteration 300 / 5000: loss 6.225676
iteration 400 / 5000: loss 5.709850
iteration 500 / 5000: loss 6.075728
iteration 600 / 5000: loss 5.959734
iteration 700 / 5000: loss 6.132115
iteration 800 / 5000: loss 6.139659
iteration 900 / 5000: loss 5.706766
iteration 1000 / 5000: loss 6.056897
iteration 1100 / 5000: loss 5.789426
iteration 1200 / 5000: loss 6.009466
iteration 1300 / 5000: loss 5.778071
iteration 1400 / 5000: loss 6.326448
iteration 1500 / 5000: loss 5.776719
iteration 1600 / 5000: loss 5.915532
iteration 1700 / 5000: loss 5.620150
iteration 1800 / 5000: loss 5.913342
iteration 1900 / 5000: loss 5.950729
iteration 2000 / 5000: loss 6.185558
iteration 2100 / 5000: loss 5.760242
iteration 2200 / 5000: loss 5.892972
iteration 2300 / 5000: loss 6.009873
iteration 2400 / 5000: loss 5.822083
iteration 2500 / 5000: loss 5.707517
iteration 2600 / 5000: loss 5.846403
iteration 2700 / 5000: loss 5.853371
iteration 2800 / 5000: loss 5.825951
iteration 2900 / 5000: loss 5.695647
iteration 3000 / 5000: loss 5.903419
iteration 3100 / 5000: loss 5.888748
iteration 3200 / 5000: loss 5.699707
iteration 3300 / 5000: loss 5.743983
iteration 3400 / 5000: loss 6.073477
iteration 3500 / 5000: loss 5.580436
iteration 3600 / 5000: loss 5.865799
iteration 3700 / 5000: loss 5.819109
iteration 3800 / 5000: loss 6.184346
iteration 3900 / 5000: loss 5.809622
iteration 4000 / 5000: loss 6.035030
iteration 4100 / 5000: loss 5.736679
iteration 4200 / 5000: loss 5.872247
iteration 4300 / 5000: loss 5.974199
iteration 4400 / 5000: loss 6.047040
iteration 4500 / 5000: loss 5.754904
iteration 4600 / 5000: loss 5.918655
iteration 4700 / 5000: loss 6.017762
iteration 4800 / 5000: loss 5.983073
iteration 4900 / 5000: loss 5.972787
lr 9.000000e-04 reg 3.000000e+00 train_accuracy: 0.464286 val_accuracy: 0.460000
iteration 0 / 5000: loss 9.024835
iteration 100 / 5000: loss 7.278838
iteration 200 / 5000: loss 7.305090
iteration 300 / 5000: loss 7.332941
iteration 400 / 5000: loss 7.235641
iteration 500 / 5000: loss 7.068605
iteration 600 / 5000: loss 7.090493
iteration 700 / 5000: loss 7.417437
iteration 800 / 5000: loss 7.427046
iteration 900 / 5000: loss 7.236273
iteration 1000 / 5000: loss 7.216414
iteration 1100 / 5000: loss 7.482114
iteration 1200 / 5000: loss 7.124256
iteration 1300 / 5000: loss 7.276430
iteration 1400 / 5000: loss 7.285581
iteration 1500 / 5000: loss 7.157154
iteration 1600 / 5000: loss 7.285261
iteration 1700 / 5000: loss 7.317175
iteration 1800 / 5000: loss 7.412383
iteration 1900 / 5000: loss 7.269028
iteration 2000 / 5000: loss 7.281183
iteration 2100 / 5000: loss 7.194891
iteration 2200 / 5000: loss 7.171257
iteration 2300 / 5000: loss 7.159145
iteration 2400 / 5000: loss 7.348496
iteration 2500 / 5000: loss 7.451948
iteration 2600 / 5000: loss 7.062116
iteration 2700 / 5000: loss 7.222397
iteration 2800 / 5000: loss 7.423821
iteration 2900 / 5000: loss 6.875897
iteration 3000 / 5000: loss 7.408164
iteration 3100 / 5000: loss 7.181808
iteration 3200 / 5000: loss 7.354317
iteration 3300 / 5000: loss 7.415129
iteration 3400 / 5000: loss 7.162390
iteration 3500 / 5000: loss 7.290106
iteration 3600 / 5000: loss 7.264554
iteration 3700 / 5000: loss 7.333960
iteration 3800 / 5000: loss 7.176182
iteration 3900 / 5000: loss 7.109390
iteration 4000 / 5000: loss 7.379279
iteration 4100 / 5000: loss 7.386717
iteration 4200 / 5000: loss 7.369770
iteration 4300 / 5000: loss 7.178538
iteration 4400 / 5000: loss 7.207089
iteration 4500 / 5000: loss 7.495102
iteration 4600 / 5000: loss 7.259093
iteration 4700 / 5000: loss 7.268352
iteration 4800 / 5000: loss 7.176944
iteration 4900 / 5000: loss 7.298682
lr 9.000000e-04 reg 9.000000e+00 train_accuracy: 0.433776 val_accuracy: 0.436000

iteration 0 / 5000: loss 8.992686
iteration 100 / 5000: loss 4.849981
iteration 200 / 5000: loss 4.041593
iteration 300 / 5000: loss 4.004287
iteration 400 / 5000: loss 3.825332
iteration 500 / 5000: loss 3.556036
iteration 600 / 5000: loss 3.553934
iteration 700 / 5000: loss 3.863782
iteration 800 / 5000: loss 3.384939
iteration 900 / 5000: loss 3.574625
iteration 1000 / 5000: loss 3.453232
iteration 1100 / 5000: loss 3.489262
iteration 1200 / 5000: loss 3.611470
iteration 1300 / 5000: loss 3.798229
iteration 1400 / 5000: loss 3.170947
iteration 1500 / 5000: loss 3.580376
iteration 1600 / 5000: loss 3.156719
iteration 1700 / 5000: loss 3.283313
iteration 1800 / 5000: loss 3.248209
iteration 1900 / 5000: loss 3.533318
iteration 2000 / 5000: loss 3.556883
iteration 2100 / 5000: loss 3.656687
iteration 2200 / 5000: loss 3.620316
iteration 2300 / 5000: loss 3.274635
iteration 2400 / 5000: loss 3.101564
iteration 2500 / 5000: loss 3.376377
iteration 2600 / 5000: loss 3.199268
iteration 2700 / 5000: loss 3.285049
iteration 2800 / 5000: loss 3.240141
iteration 2900 / 5000: loss 3.403663
iteration 3000 / 5000: loss 3.437697
iteration 3100 / 5000: loss 3.246741
iteration 3200 / 5000: loss 3.559419
iteration 3300 / 5000: loss 3.152634
iteration 3400 / 5000: loss 3.349065
iteration 3500 / 5000: loss 3.653547
iteration 3600 / 5000: loss 3.539300
iteration 3700 / 5000: loss 3.344229
iteration 3800 / 5000: loss 3.459817
iteration 3900 / 5000: loss 3.339881
iteration 4000 / 5000: loss 3.571686
iteration 4100 / 5000: loss 3.497885
iteration 4200 / 5000: loss 2.986426
iteration 4300 / 5000: loss 3.376126
iteration 4400 / 5000: loss 3.719380
iteration 4500 / 5000: loss 3.571280
iteration 4600 / 5000: loss 3.675990
iteration 4700 / 5000: loss 3.049677
iteration 4800 / 5000: loss 3.375443
iteration 4900 / 5000: loss 3.279644
lr 1.000000e-03 reg 1.000000e-01 train_accuracy: 0.508755 val_accuracy: 0.496000
iteration 0 / 5000: loss 9.017257
iteration 100 / 5000: loss 5.041676
iteration 200 / 5000: loss 4.326165
iteration 300 / 5000: loss 3.999510
iteration 400 / 5000: loss 4.241395
iteration 500 / 5000: loss 4.332428
iteration 600 / 5000: loss 4.422929
iteration 700 / 5000: loss 4.121897
iteration 800 / 5000: loss 3.980772
iteration 900 / 5000: loss 4.175956
iteration 1000 / 5000: loss 4.046578
iteration 1100 / 5000: loss 3.822513
iteration 1200 / 5000: loss 3.983523
iteration 1300 / 5000: loss 3.907774
iteration 1400 / 5000: loss 3.825638
iteration 1500 / 5000: loss 3.897005
iteration 1600 / 5000: loss 3.940814
iteration 1700 / 5000: loss 3.914063
iteration 1800 / 5000: loss 4.002890
iteration 1900 / 5000: loss 4.032257
iteration 2000 / 5000: loss 3.837165
iteration 2100 / 5000: loss 3.690379
iteration 2200 / 5000: loss 3.601831
iteration 2300 / 5000: loss 3.735256
iteration 2400 / 5000: loss 3.860678
iteration 2500 / 5000: loss 3.891935
iteration 2600 / 5000: loss 3.802613
iteration 2700 / 5000: loss 3.934678
iteration 2800 / 5000: loss 4.163417
iteration 2900 / 5000: loss 3.847776
iteration 3000 / 5000: loss 3.818980
iteration 3100 / 5000: loss 4.115169
iteration 3200 / 5000: loss 3.720907
iteration 3300 / 5000: loss 3.918015
iteration 3400 / 5000: loss 3.768097
iteration 3500 / 5000: loss 3.444560
iteration 3600 / 5000: loss 3.774876
iteration 3700 / 5000: loss 4.316459
iteration 3800 / 5000: loss 4.016332
iteration 3900 / 5000: loss 4.041114
iteration 4000 / 5000: loss 3.671846
iteration 4100 / 5000: loss 4.134650
iteration 4200 / 5000: loss 3.768341
iteration 4300 / 5000: loss 3.755929
iteration 4400 / 5000: loss 3.892435
iteration 4500 / 5000: loss 4.180594
iteration 4600 / 5000: loss 3.847744
iteration 4700 / 5000: loss 3.675013
iteration 4800 / 5000: loss 4.074078
iteration 4900 / 5000: loss 3.196106
lr 1.000000e-03 reg 3.000000e-01 train_accuracy: 0.500388 val_accuracy: 0.482000
iteration 0 / 5000: loss 9.004143
iteration 100 / 5000: loss 5.562463
iteration 200 / 5000: loss 5.205538
iteration 300 / 5000: loss 4.470902
iteration 400 / 5000: loss 4.851259
iteration 500 / 5000: loss 4.737036
iteration 600 / 5000: loss 4.908021
iteration 700 / 5000: loss 4.694204
iteration 800 / 5000: loss 4.554393
iteration 900 / 5000: loss 4.861211
iteration 1000 / 5000: loss 4.617075
iteration 1100 / 5000: loss 4.689092
iteration 1200 / 5000: loss 4.635346
iteration 1300 / 5000: loss 4.277077
iteration 1400 / 5000: loss 4.954767
iteration 1500 / 5000: loss 4.900643
iteration 1600 / 5000: loss 4.680341
iteration 1700 / 5000: loss 4.629965
iteration 1800 / 5000: loss 4.489026
iteration 1900 / 5000: loss 4.455859
iteration 2000 / 5000: loss 4.927052
iteration 2100 / 5000: loss 5.085522
iteration 2200 / 5000: loss 4.797649
iteration 2300 / 5000: loss 4.659013
iteration 2400 / 5000: loss 4.980374
iteration 2500 / 5000: loss 4.597401
iteration 2600 / 5000: loss 4.583338
iteration 2700 / 5000: loss 4.384852
iteration 2800 / 5000: loss 4.626649
iteration 2900 / 5000: loss 4.539631
iteration 3000 / 5000: loss 4.698335
iteration 3100 / 5000: loss 4.680091
iteration 3200 / 5000: loss 4.589397
iteration 3300 / 5000: loss 4.591379
iteration 3400 / 5000: loss 4.972946
iteration 3500 / 5000: loss 4.451445
iteration 3600 / 5000: loss 4.416458
iteration 3700 / 5000: loss 4.631989
iteration 3800 / 5000: loss 4.588365
iteration 3900 / 5000: loss 5.099420
iteration 4000 / 5000: loss 4.530006
iteration 4100 / 5000: loss 4.722660
iteration 4200 / 5000: loss 4.475723
iteration 4300 / 5000: loss 4.287107
iteration 4400 / 5000: loss 4.646872
iteration 4500 / 5000: loss 4.892148
iteration 4600 / 5000: loss 4.652171
iteration 4700 / 5000: loss 4.395890
iteration 4800 / 5000: loss 4.913671
iteration 4900 / 5000: loss 4.591936
lr 1.000000e-03 reg 9.000000e-01 train_accuracy: 0.486000 val_accuracy: 0.479000
iteration 0 / 5000: loss 8.986140
iteration 100 / 5000: loss 5.302994
iteration 200 / 5000: loss 4.959137
iteration 300 / 5000: loss 5.136392
iteration 400 / 5000: loss 4.939293
iteration 500 / 5000: loss 5.038469
iteration 600 / 5000: loss 4.770046
iteration 700 / 5000: loss 4.687595
iteration 800 / 5000: loss 4.670669
iteration 900 / 5000: loss 4.589962
iteration 1000 / 5000: loss 5.091857
iteration 1100 / 5000: loss 4.791553
iteration 1200 / 5000: loss 4.783631
iteration 1300 / 5000: loss 4.707839
iteration 1400 / 5000: loss 4.658593
iteration 1500 / 5000: loss 4.861506
iteration 1600 / 5000: loss 4.701618
iteration 1700 / 5000: loss 4.670404
iteration 1800 / 5000: loss 4.600604
iteration 1900 / 5000: loss 4.772823
iteration 2000 / 5000: loss 4.841285
iteration 2100 / 5000: loss 4.489588
iteration 2200 / 5000: loss 4.996695
iteration 2300 / 5000: loss 4.636411
iteration 2400 / 5000: loss 4.636521
iteration 2500 / 5000: loss 4.970593
iteration 2600 / 5000: loss 4.940586
iteration 2700 / 5000: loss 4.658838
iteration 2800 / 5000: loss 4.659173
iteration 2900 / 5000: loss 4.679209
iteration 3000 / 5000: loss 4.593786
iteration 3100 / 5000: loss 4.613389
iteration 3200 / 5000: loss 4.862857
iteration 3300 / 5000: loss 4.861347
iteration 3400 / 5000: loss 4.948390
iteration 3500 / 5000: loss 4.887241
iteration 3600 / 5000: loss 4.968177
iteration 3700 / 5000: loss 4.698326
iteration 3800 / 5000: loss 4.705191
iteration 3900 / 5000: loss 5.226396
iteration 4000 / 5000: loss 4.726874
iteration 4100 / 5000: loss 4.534896
iteration 4200 / 5000: loss 4.644005
iteration 4300 / 5000: loss 4.834404
iteration 4400 / 5000: loss 4.821037
iteration 4500 / 5000: loss 4.903962
iteration 4600 / 5000: loss 4.624451
iteration 4700 / 5000: loss 4.280641
iteration 4800 / 5000: loss 4.678200
iteration 4900 / 5000: loss 4.635208
lr 1.000000e-03 reg 1.000000e+00 train_accuracy: 0.482878 val_accuracy: 0.473000
iteration 0 / 5000: loss 9.017813
iteration 100 / 5000: loss 6.228706
iteration 200 / 5000: loss 5.778267
iteration 300 / 5000: loss 5.827308
iteration 400 / 5000: loss 5.952733
iteration 500 / 5000: loss 5.842409
iteration 600 / 5000: loss 5.844480
iteration 700 / 5000: loss 5.923912
iteration 800 / 5000: loss 5.826155
iteration 900 / 5000: loss 5.930268
iteration 1000 / 5000: loss 6.042721
iteration 1100 / 5000: loss 5.965232
iteration 1200 / 5000: loss 6.150538
iteration 1300 / 5000: loss 5.654621
iteration 1400 / 5000: loss 6.159278
iteration 1500 / 5000: loss 6.004379
iteration 1600 / 5000: loss 5.749448
iteration 1700 / 5000: loss 5.837377
iteration 1800 / 5000: loss 6.032467
iteration 1900 / 5000: loss 5.835108
iteration 2000 / 5000: loss 5.973061
iteration 2100 / 5000: loss 5.698584
iteration 2200 / 5000: loss 5.650160
iteration 2300 / 5000: loss 6.211155
iteration 2400 / 5000: loss 5.694353
iteration 2500 / 5000: loss 5.914565
iteration 2600 / 5000: loss 5.994465
iteration 2700 / 5000: loss 6.092071
iteration 2800 / 5000: loss 5.894791
iteration 2900 / 5000: loss 6.114530
iteration 3000 / 5000: loss 5.873771
iteration 3100 / 5000: loss 5.885868
iteration 3200 / 5000: loss 5.789780
iteration 3300 / 5000: loss 5.705847
iteration 3400 / 5000: loss 6.062514
iteration 3500 / 5000: loss 6.007804
iteration 3600 / 5000: loss 5.762809
iteration 3700 / 5000: loss 6.377874
iteration 3800 / 5000: loss 5.903624
iteration 3900 / 5000: loss 5.646450
iteration 4000 / 5000: loss 5.922924
iteration 4100 / 5000: loss 6.084884
iteration 4200 / 5000: loss 5.828749
iteration 4300 / 5000: loss 5.838338
iteration 4400 / 5000: loss 5.803508
iteration 4500 / 5000: loss 6.079819
iteration 4600 / 5000: loss 5.866769
iteration 4700 / 5000: loss 5.716137
iteration 4800 / 5000: loss 5.732743
iteration 4900 / 5000: loss 5.846434
lr 1.000000e-03 reg 3.000000e+00 train_accuracy: 0.462551 val_accuracy: 0.455000
iteration 0 / 5000: loss 9.017593
iteration 100 / 5000: loss 7.278168
iteration 200 / 5000: loss 7.120267
iteration 300 / 5000: loss 7.435892
iteration 400 / 5000: loss 7.148106
iteration 500 / 5000: loss 7.281306
iteration 600 / 5000: loss 7.425998
iteration 700 / 5000: loss 7.242494
iteration 800 / 5000: loss 7.405566
iteration 900 / 5000: loss 7.351175
iteration 1000 / 5000: loss 7.303417
iteration 1100 / 5000: loss 7.208500
iteration 1200 / 5000: loss 7.150963
iteration 1300 / 5000: loss 7.238850
iteration 1400 / 5000: loss 7.344014
iteration 1500 / 5000: loss 7.354543
iteration 1600 / 5000: loss 7.196430
iteration 1700 / 5000: loss 7.280288
iteration 1800 / 5000: loss 7.366095
iteration 1900 / 5000: loss 7.345238
iteration 2000 / 5000: loss 7.069840
iteration 2100 / 5000: loss 7.415041
iteration 2200 / 5000: loss 7.249721
iteration 2300 / 5000: loss 7.331670
iteration 2400 / 5000: loss 7.271335
iteration 2500 / 5000: loss 7.082086
iteration 2600 / 5000: loss 7.054815
iteration 2700 / 5000: loss 7.400436
iteration 2800 / 5000: loss 7.149763
iteration 2900 / 5000: loss 7.318828
iteration 3000 / 5000: loss 7.246667
iteration 3100 / 5000: loss 7.448453
iteration 3200 / 5000: loss 7.304238
iteration 3300 / 5000: loss 7.269694
iteration 3400 / 5000: loss 7.480752
iteration 3500 / 5000: loss 7.505915
iteration 3600 / 5000: loss 6.954790
iteration 3700 / 5000: loss 7.259455
iteration 3800 / 5000: loss 7.099920
iteration 3900 / 5000: loss 7.350530
iteration 4000 / 5000: loss 7.353918
iteration 4100 / 5000: loss 7.136195
iteration 4200 / 5000: loss 6.949374
iteration 4300 / 5000: loss 7.259147
iteration 4400 / 5000: loss 7.504834
iteration 4500 / 5000: loss 7.090403
iteration 4600 / 5000: loss 7.225015
iteration 4700 / 5000: loss 7.218855
iteration 4800 / 5000: loss 7.196233
iteration 4900 / 5000: loss 7.425907
lr 1.000000e-03 reg 9.000000e+00 train_accuracy: 0.433571 val_accuracy: 0.436000

iteration 0 / 5000: loss 8.998697
iteration 100 / 5000: loss 4.100012
iteration 200 / 5000: loss 3.705494
iteration 300 / 5000: loss 3.117422
iteration 400 / 5000: loss 3.495823
iteration 500 / 5000: loss 3.741742
iteration 600 / 5000: loss 3.768714
iteration 700 / 5000: loss 3.406223
iteration 800 / 5000: loss 3.243594
iteration 900 / 5000: loss 3.373861
iteration 1000 / 5000: loss 3.335172
iteration 1100 / 5000: loss 3.044267
iteration 1200 / 5000: loss 3.332883
iteration 1300 / 5000: loss 3.299683
iteration 1400 / 5000: loss 3.482045
iteration 1500 / 5000: loss 3.724521
iteration 1600 / 5000: loss 3.417143
iteration 1700 / 5000: loss 3.645075
iteration 1800 / 5000: loss 3.426477
iteration 1900 / 5000: loss 3.291517
iteration 2000 / 5000: loss 3.058570
iteration 2100 / 5000: loss 3.591058
iteration 2200 / 5000: loss 3.819759
iteration 2300 / 5000: loss 3.396520
iteration 2400 / 5000: loss 3.379577
iteration 2500 / 5000: loss 3.261831
iteration 2600 / 5000: loss 2.958490
iteration 2700 / 5000: loss 3.290901
iteration 2800 / 5000: loss 3.463167
iteration 2900 / 5000: loss 3.392271
iteration 3000 / 5000: loss 3.288081
iteration 3100 / 5000: loss 3.182479
iteration 3200 / 5000: loss 3.376173
iteration 3300 / 5000: loss 3.549471
iteration 3400 / 5000: loss 3.251894
iteration 3500 / 5000: loss 3.252839
iteration 3600 / 5000: loss 3.403083
iteration 3700 / 5000: loss 3.449381
iteration 3800 / 5000: loss 3.461615
iteration 3900 / 5000: loss 3.293033
iteration 4000 / 5000: loss 3.296881
iteration 4100 / 5000: loss 3.476460
iteration 4200 / 5000: loss 3.404636
iteration 4300 / 5000: loss 3.364647
iteration 4400 / 5000: loss 3.227580
iteration 4500 / 5000: loss 3.142296
iteration 4600 / 5000: loss 3.430904
iteration 4700 / 5000: loss 3.816550
iteration 4800 / 5000: loss 3.548812
iteration 4900 / 5000: loss 3.204088
lr 3.000000e-03 reg 1.000000e-01 train_accuracy: 0.509102 val_accuracy: 0.498000
iteration 0 / 5000: loss 9.012979
iteration 100 / 5000: loss 3.990054
iteration 200 / 5000: loss 3.855879
iteration 300 / 5000: loss 4.113040
iteration 400 / 5000: loss 3.992527
iteration 500 / 5000: loss 3.809679
iteration 600 / 5000: loss 4.110441
iteration 700 / 5000: loss 3.602152
iteration 800 / 5000: loss 4.224510
iteration 900 / 5000: loss 3.840969
iteration 1000 / 5000: loss 3.894152
iteration 1100 / 5000: loss 3.921851
iteration 1200 / 5000: loss 3.780120
iteration 1300 / 5000: loss 3.584745
iteration 1400 / 5000: loss 3.933115
iteration 1500 / 5000: loss 3.956934
iteration 1600 / 5000: loss 3.827716
iteration 1700 / 5000: loss 3.972084
iteration 1800 / 5000: loss 3.969353
iteration 1900 / 5000: loss 4.037436
iteration 2000 / 5000: loss 3.551836
iteration 2100 / 5000: loss 3.915499
iteration 2200 / 5000: loss 4.120408
iteration 2300 / 5000: loss 3.854567
iteration 2400 / 5000: loss 4.220907
iteration 2500 / 5000: loss 4.021625
iteration 2600 / 5000: loss 4.028863
iteration 2700 / 5000: loss 3.848696
iteration 2800 / 5000: loss 4.067331
iteration 2900 / 5000: loss 3.928933
iteration 3000 / 5000: loss 3.743866
iteration 3100 / 5000: loss 4.190518
iteration 3200 / 5000: loss 3.637456
iteration 3300 / 5000: loss 3.810270
iteration 3400 / 5000: loss 3.994298
iteration 3500 / 5000: loss 3.825897
iteration 3600 / 5000: loss 3.860765
iteration 3700 / 5000: loss 3.939996
iteration 3800 / 5000: loss 3.979099
iteration 3900 / 5000: loss 3.812642
iteration 4000 / 5000: loss 3.881916
iteration 4100 / 5000: loss 3.909914
iteration 4200 / 5000: loss 3.944609
iteration 4300 / 5000: loss 3.925904
iteration 4400 / 5000: loss 4.068157
iteration 4500 / 5000: loss 4.306296
iteration 4600 / 5000: loss 4.142323
iteration 4700 / 5000: loss 4.100973
iteration 4800 / 5000: loss 4.003787
iteration 4900 / 5000: loss 4.088924
lr 3.000000e-03 reg 3.000000e-01 train_accuracy: 0.499898 val_accuracy: 0.480000
iteration 0 / 5000: loss 9.013086
iteration 100 / 5000: loss 4.638160
iteration 200 / 5000: loss 4.802614
iteration 300 / 5000: loss 4.578182
iteration 400 / 5000: loss 4.673700
iteration 500 / 5000: loss 4.915970
iteration 600 / 5000: loss 4.715734
iteration 700 / 5000: loss 4.779964
iteration 800 / 5000: loss 4.660843
iteration 900 / 5000: loss 4.647183
iteration 1000 / 5000: loss 4.470984
iteration 1100 / 5000: loss 4.930100
iteration 1200 / 5000: loss 4.737154
iteration 1300 / 5000: loss 4.537754
iteration 1400 / 5000: loss 4.608332
iteration 1500 / 5000: loss 4.863897
iteration 1600 / 5000: loss 4.692012
iteration 1700 / 5000: loss 4.671620
iteration 1800 / 5000: loss 4.787306
iteration 1900 / 5000: loss 4.711505
iteration 2000 / 5000: loss 4.517047
iteration 2100 / 5000: loss 4.634097
iteration 2200 / 5000: loss 4.632342
iteration 2300 / 5000: loss 4.524606
iteration 2400 / 5000: loss 4.397150
iteration 2500 / 5000: loss 4.591168
iteration 2600 / 5000: loss 5.104235
iteration 2700 / 5000: loss 4.919387
iteration 2800 / 5000: loss 4.508477
iteration 2900 / 5000: loss 4.600351
iteration 3000 / 5000: loss 5.040510
iteration 3100 / 5000: loss 4.992269
iteration 3200 / 5000: loss 4.491447
iteration 3300 / 5000: loss 4.584938
iteration 3400 / 5000: loss 4.899923
iteration 3500 / 5000: loss 4.596921
iteration 3600 / 5000: loss 4.460049
iteration 3700 / 5000: loss 4.532867
iteration 3800 / 5000: loss 4.653620
iteration 3900 / 5000: loss 4.900404
iteration 4000 / 5000: loss 4.540948
iteration 4100 / 5000: loss 4.528217
iteration 4200 / 5000: loss 4.561108
iteration 4300 / 5000: loss 4.754876
iteration 4400 / 5000: loss 4.825432
iteration 4500 / 5000: loss 4.828772
iteration 4600 / 5000: loss 4.680832
iteration 4700 / 5000: loss 4.554962
iteration 4800 / 5000: loss 4.783739
iteration 4900 / 5000: loss 5.054334
lr 3.000000e-03 reg 9.000000e-01 train_accuracy: 0.486367 val_accuracy: 0.474000
iteration 0 / 5000: loss 8.991917
iteration 100 / 5000: loss 4.750044
iteration 200 / 5000: loss 4.833043
iteration 300 / 5000: loss 4.907590
iteration 400 / 5000: loss 4.715502
iteration 500 / 5000: loss 4.612156
iteration 600 / 5000: loss 4.661180
iteration 700 / 5000: loss 4.828432
iteration 800 / 5000: loss 4.580963
iteration 900 / 5000: loss 4.775987
iteration 1000 / 5000: loss 4.871596
iteration 1100 / 5000: loss 4.656938
iteration 1200 / 5000: loss 4.904256
iteration 1300 / 5000: loss 4.890523
iteration 1400 / 5000: loss 4.443491
iteration 1500 / 5000: loss 4.924470
iteration 1600 / 5000: loss 5.146599
iteration 1700 / 5000: loss 4.488330
iteration 1800 / 5000: loss 4.616853
iteration 1900 / 5000: loss 4.969304
iteration 2000 / 5000: loss 5.165386
iteration 2100 / 5000: loss 4.784187
iteration 2200 / 5000: loss 4.636955
iteration 2300 / 5000: loss 5.056632
iteration 2400 / 5000: loss 4.537183
iteration 2500 / 5000: loss 4.693444
iteration 2600 / 5000: loss 4.826939
iteration 2700 / 5000: loss 4.812870
iteration 2800 / 5000: loss 4.985569
iteration 2900 / 5000: loss 4.785201
iteration 3000 / 5000: loss 4.723944
iteration 3100 / 5000: loss 4.790155
iteration 3200 / 5000: loss 4.880886
iteration 3300 / 5000: loss 4.787889
iteration 3400 / 5000: loss 4.767660
iteration 3500 / 5000: loss 4.694860
iteration 3600 / 5000: loss 4.765594
iteration 3700 / 5000: loss 4.820640
iteration 3800 / 5000: loss 4.989978
iteration 3900 / 5000: loss 5.273707
iteration 4000 / 5000: loss 4.697598
iteration 4100 / 5000: loss 4.997748
iteration 4200 / 5000: loss 4.731689
iteration 4300 / 5000: loss 4.986296
iteration 4400 / 5000: loss 4.626707
iteration 4500 / 5000: loss 4.665072
iteration 4600 / 5000: loss 4.768169
iteration 4700 / 5000: loss 4.575487
iteration 4800 / 5000: loss 4.610763
iteration 4900 / 5000: loss 4.716756
lr 3.000000e-03 reg 1.000000e+00 train_accuracy: 0.481633 val_accuracy: 0.472000
iteration 0 / 5000: loss 9.006829
iteration 100 / 5000: loss 6.218047
iteration 200 / 5000: loss 5.843111
iteration 300 / 5000: loss 6.033828
iteration 400 / 5000: loss 5.583902
iteration 500 / 5000: loss 5.765844
iteration 600 / 5000: loss 5.922734
iteration 700 / 5000: loss 6.044463
iteration 800 / 5000: loss 5.788766
iteration 900 / 5000: loss 5.927363
iteration 1000 / 5000: loss 6.077712
iteration 1100 / 5000: loss 5.647404
iteration 1200 / 5000: loss 5.779649
iteration 1300 / 5000: loss 6.188250
iteration 1400 / 5000: loss 6.211847
iteration 1500 / 5000: loss 5.782879
iteration 1600 / 5000: loss 5.894844
iteration 1700 / 5000: loss 5.716508
iteration 1800 / 5000: loss 5.825690
iteration 1900 / 5000: loss 6.104457
iteration 2000 / 5000: loss 5.687027
iteration 2100 / 5000: loss 5.810180
iteration 2200 / 5000: loss 5.653487
iteration 2300 / 5000: loss 5.864894
iteration 2400 / 5000: loss 5.656294
iteration 2500 / 5000: loss 5.796789
iteration 2600 / 5000: loss 6.079064
iteration 2700 / 5000: loss 6.085285
iteration 2800 / 5000: loss 5.878402
iteration 2900 / 5000: loss 5.690079
iteration 3000 / 5000: loss 5.896991
iteration 3100 / 5000: loss 5.927437
iteration 3200 / 5000: loss 5.854007
iteration 3300 / 5000: loss 6.030766
iteration 3400 / 5000: loss 6.105755
iteration 3500 / 5000: loss 6.121425
iteration 3600 / 5000: loss 5.739092
iteration 3700 / 5000: loss 6.087685
iteration 3800 / 5000: loss 6.137044
iteration 3900 / 5000: loss 5.606633
iteration 4000 / 5000: loss 5.790806
iteration 4100 / 5000: loss 5.746897
iteration 4200 / 5000: loss 5.864677
iteration 4300 / 5000: loss 6.108961
iteration 4400 / 5000: loss 6.024774
iteration 4500 / 5000: loss 6.057019
iteration 4600 / 5000: loss 5.849692
iteration 4700 / 5000: loss 5.874280
iteration 4800 / 5000: loss 5.772142
iteration 4900 / 5000: loss 5.839295
lr 3.000000e-03 reg 3.000000e+00 train_accuracy: 0.460939 val_accuracy: 0.454000
iteration 0 / 5000: loss 9.026932
iteration 100 / 5000: loss 7.461518
iteration 200 / 5000: loss 7.457052
iteration 300 / 5000: loss 7.518952
iteration 400 / 5000: loss 7.297335
iteration 500 / 5000: loss 7.430047
iteration 600 / 5000: loss 7.268610
iteration 700 / 5000: loss 7.561062
iteration 800 / 5000: loss 7.263763
iteration 900 / 5000: loss 7.401934
iteration 1000 / 5000: loss 7.559940
iteration 1100 / 5000: loss 7.528845
iteration 1200 / 5000: loss 7.413636
iteration 1300 / 5000: loss 7.411010
iteration 1400 / 5000: loss 7.269214
iteration 1500 / 5000: loss 7.183534
iteration 1600 / 5000: loss 7.372436
iteration 1700 / 5000: loss 7.292567
iteration 1800 / 5000: loss 7.274848
iteration 1900 / 5000: loss 7.396608
iteration 2000 / 5000: loss 7.339145
iteration 2100 / 5000: loss 7.247316
iteration 2200 / 5000: loss 7.193802
iteration 2300 / 5000: loss 7.427634
iteration 2400 / 5000: loss 7.170054
iteration 2500 / 5000: loss 7.564724
iteration 2600 / 5000: loss 7.317119
iteration 2700 / 5000: loss 7.239553
iteration 2800 / 5000: loss 7.460492
iteration 2900 / 5000: loss 7.362631
iteration 3000 / 5000: loss 7.391489
iteration 3100 / 5000: loss 7.553461
iteration 3200 / 5000: loss 7.445643
iteration 3300 / 5000: loss 7.100724
iteration 3400 / 5000: loss 7.442854
iteration 3500 / 5000: loss 7.538606
iteration 3600 / 5000: loss 7.250456
iteration 3700 / 5000: loss 7.217068
iteration 3800 / 5000: loss 7.095448
iteration 3900 / 5000: loss 7.449043
iteration 4000 / 5000: loss 7.219236
iteration 4100 / 5000: loss 7.316683
iteration 4200 / 5000: loss 7.058803
iteration 4300 / 5000: loss 7.502503
iteration 4400 / 5000: loss 7.199226
iteration 4500 / 5000: loss 7.411256
iteration 4600 / 5000: loss 7.472776
iteration 4700 / 5000: loss 7.144692
iteration 4800 / 5000: loss 7.457710
iteration 4900 / 5000: loss 7.176090
lr 3.000000e-03 reg 9.000000e+00 train_accuracy: 0.437122 val_accuracy: 0.431000

iteration 0 / 5000: loss 8.984716
iteration 100 / 5000: loss 3.334674
iteration 200 / 5000: loss 3.483267
iteration 300 / 5000: loss 3.423185
iteration 400 / 5000: loss 3.610490
iteration 500 / 5000: loss 3.127435
iteration 600 / 5000: loss 3.362485
iteration 700 / 5000: loss 3.306634
iteration 800 / 5000: loss 3.738598
iteration 900 / 5000: loss 3.383301
iteration 1000 / 5000: loss 2.986826
iteration 1100 / 5000: loss 3.130989
iteration 1200 / 5000: loss 3.274088
iteration 1300 / 5000: loss 3.427303
iteration 1400 / 5000: loss 3.541098
iteration 1500 / 5000: loss 3.568226
iteration 1600 / 5000: loss 3.615773
iteration 1700 / 5000: loss 3.382652
iteration 1800 / 5000: loss 3.608825
iteration 1900 / 5000: loss 3.391117
iteration 2000 / 5000: loss 3.185779
iteration 2100 / 5000: loss 3.554307
iteration 2200 / 5000: loss 3.065793
iteration 2300 / 5000: loss 3.625690
iteration 2400 / 5000: loss 3.175530
iteration 2500 / 5000: loss 3.641832
iteration 2600 / 5000: loss 3.301569
iteration 2700 / 5000: loss 4.062162
iteration 2800 / 5000: loss 3.227648
iteration 2900 / 5000: loss 3.140973
iteration 3000 / 5000: loss 3.306540
iteration 3100 / 5000: loss 3.265288
iteration 3200 / 5000: loss 3.439004
iteration 3300 / 5000: loss 3.461639
iteration 3400 / 5000: loss 3.640456
iteration 3500 / 5000: loss 3.607445
iteration 3600 / 5000: loss 3.051184
iteration 3700 / 5000: loss 3.328352
iteration 3800 / 5000: loss 3.432220
iteration 3900 / 5000: loss 3.470961
iteration 4000 / 5000: loss 3.506964
iteration 4100 / 5000: loss 3.291385
iteration 4200 / 5000: loss 3.155244
iteration 4300 / 5000: loss 3.335861
iteration 4400 / 5000: loss 3.633458
iteration 4500 / 5000: loss 3.418231
iteration 4600 / 5000: loss 3.390337
iteration 4700 / 5000: loss 3.231440
iteration 4800 / 5000: loss 3.508734
iteration 4900 / 5000: loss 3.152421
lr 9.000000e-03 reg 1.000000e-01 train_accuracy: 0.505673 val_accuracy: 0.486000
iteration 0 / 5000: loss 8.996759
iteration 100 / 5000: loss 3.773162
iteration 200 / 5000: loss 4.022680
iteration 300 / 5000: loss 3.988994
iteration 400 / 5000: loss 4.134425
iteration 500 / 5000: loss 4.164354
iteration 600 / 5000: loss 3.908821
iteration 700 / 5000: loss 3.986374
iteration 800 / 5000: loss 3.695451
iteration 900 / 5000: loss 3.939625
iteration 1000 / 5000: loss 3.671025
iteration 1100 / 5000: loss 3.854139
iteration 1200 / 5000: loss 3.880300
iteration 1300 / 5000: loss 3.651760
iteration 1400 / 5000: loss 3.868702
iteration 1500 / 5000: loss 3.988567
iteration 1600 / 5000: loss 4.006362
iteration 1700 / 5000: loss 3.747438
iteration 1800 / 5000: loss 4.009531
iteration 1900 / 5000: loss 3.780692
iteration 2000 / 5000: loss 4.186237
iteration 2100 / 5000: loss 4.131511
iteration 2200 / 5000: loss 3.806930
iteration 2300 / 5000: loss 4.368171
iteration 2400 / 5000: loss 3.546551
iteration 2500 / 5000: loss 3.933108
iteration 2600 / 5000: loss 4.059995
iteration 2700 / 5000: loss 3.568148
iteration 2800 / 5000: loss 3.834903
iteration 2900 / 5000: loss 4.146120
iteration 3000 / 5000: loss 4.109359
iteration 3100 / 5000: loss 3.975992
iteration 3200 / 5000: loss 4.144932
iteration 3300 / 5000: loss 3.730685
iteration 3400 / 5000: loss 3.723834
iteration 3500 / 5000: loss 4.009655
iteration 3600 / 5000: loss 4.120012
iteration 3700 / 5000: loss 4.049844
iteration 3800 / 5000: loss 3.872518
iteration 3900 / 5000: loss 4.266584
iteration 4000 / 5000: loss 3.775907
iteration 4100 / 5000: loss 3.696454
iteration 4200 / 5000: loss 4.243620
iteration 4300 / 5000: loss 4.086038
iteration 4400 / 5000: loss 3.815707
iteration 4500 / 5000: loss 4.011758
iteration 4600 / 5000: loss 3.831693
iteration 4700 / 5000: loss 3.779136
iteration 4800 / 5000: loss 3.993504
iteration 4900 / 5000: loss 4.085376
lr 9.000000e-03 reg 3.000000e-01 train_accuracy: 0.497469 val_accuracy: 0.483000
iteration 0 / 5000: loss 9.008199
iteration 100 / 5000: loss 4.695384
iteration 200 / 5000: loss 4.796630
iteration 300 / 5000: loss 4.677446
iteration 400 / 5000: loss 4.775944
iteration 500 / 5000: loss 4.797669
iteration 600 / 5000: loss 4.731255
iteration 700 / 5000: loss 4.994088
iteration 800 / 5000: loss 4.721622
iteration 900 / 5000: loss 4.752960
iteration 1000 / 5000: loss 4.693706
iteration 1100 / 5000: loss 4.724051
iteration 1200 / 5000: loss 4.564057
iteration 1300 / 5000: loss 4.763649
iteration 1400 / 5000: loss 4.677898
iteration 1500 / 5000: loss 4.941732
iteration 1600 / 5000: loss 4.653538
iteration 1700 / 5000: loss 4.834483
iteration 1800 / 5000: loss 5.016694
iteration 1900 / 5000: loss 5.056944
iteration 2000 / 5000: loss 4.649248
iteration 2100 / 5000: loss 5.153167
iteration 2200 / 5000: loss 4.574302
iteration 2300 / 5000: loss 4.793213
iteration 2400 / 5000: loss 4.839072
iteration 2500 / 5000: loss 4.976803
iteration 2600 / 5000: loss 4.863339
iteration 2700 / 5000: loss 4.872250
iteration 2800 / 5000: loss 4.680161
iteration 2900 / 5000: loss 4.518371
iteration 3000 / 5000: loss 4.652588
iteration 3100 / 5000: loss 4.732362
iteration 3200 / 5000: loss 4.702774
iteration 3300 / 5000: loss 4.749666
iteration 3400 / 5000: loss 4.994353
iteration 3500 / 5000: loss 4.758665
iteration 3600 / 5000: loss 4.428190
iteration 3700 / 5000: loss 4.798437
iteration 3800 / 5000: loss 4.575233
iteration 3900 / 5000: loss 5.084306
iteration 4000 / 5000: loss 4.695965
iteration 4100 / 5000: loss 4.517107
iteration 4200 / 5000: loss 4.796656
iteration 4300 / 5000: loss 4.746200
iteration 4400 / 5000: loss 4.443758
iteration 4500 / 5000: loss 4.682429
iteration 4600 / 5000: loss 4.840508
iteration 4700 / 5000: loss 4.412134
iteration 4800 / 5000: loss 4.802553
iteration 4900 / 5000: loss 4.925744
lr 9.000000e-03 reg 9.000000e-01 train_accuracy: 0.482224 val_accuracy: 0.470000
iteration 0 / 5000: loss 8.989052
iteration 100 / 5000: loss 4.700965
iteration 200 / 5000: loss 4.944649
iteration 300 / 5000: loss 4.854761
iteration 400 / 5000: loss 4.997635
iteration 500 / 5000: loss 4.603227
iteration 600 / 5000: loss 4.817780
iteration 700 / 5000: loss 4.709168
iteration 800 / 5000: loss 4.921365
iteration 900 / 5000: loss 5.018564
iteration 1000 / 5000: loss 4.785480
iteration 1100 / 5000: loss 4.814628
iteration 1200 / 5000: loss 4.744852
iteration 1300 / 5000: loss 4.560787
iteration 1400 / 5000: loss 4.702625
iteration 1500 / 5000: loss 5.013163
iteration 1600 / 5000: loss 4.888148
iteration 1700 / 5000: loss 4.776632
iteration 1800 / 5000: loss 4.721399
iteration 1900 / 5000: loss 5.198690
iteration 2000 / 5000: loss 4.879849
iteration 2100 / 5000: loss 4.532061
iteration 2200 / 5000: loss 4.879430
iteration 2300 / 5000: loss 4.892786
iteration 2400 / 5000: loss 4.714645
iteration 2500 / 5000: loss 5.000189
iteration 2600 / 5000: loss 4.704178
iteration 2700 / 5000: loss 4.824296
iteration 2800 / 5000: loss 4.611751
iteration 2900 / 5000: loss 4.649686
iteration 3000 / 5000: loss 4.942817
iteration 3100 / 5000: loss 4.830212
iteration 3200 / 5000: loss 4.895941
iteration 3300 / 5000: loss 4.579266
iteration 3400 / 5000: loss 5.000155
iteration 3500 / 5000: loss 4.771816
iteration 3600 / 5000: loss 4.943767
iteration 3700 / 5000: loss 4.961086
iteration 3800 / 5000: loss 4.413020
iteration 3900 / 5000: loss 4.744845
iteration 4000 / 5000: loss 4.807304
iteration 4100 / 5000: loss 4.767942
iteration 4200 / 5000: loss 4.813331
iteration 4300 / 5000: loss 4.749406
iteration 4400 / 5000: loss 4.673811
iteration 4500 / 5000: loss 4.493039
iteration 4600 / 5000: loss 4.608103
iteration 4700 / 5000: loss 4.876858
iteration 4800 / 5000: loss 5.033584
iteration 4900 / 5000: loss 4.981163
lr 9.000000e-03 reg 1.000000e+00 train_accuracy: 0.481918 val_accuracy: 0.469000
iteration 0 / 5000: loss 8.976198
iteration 100 / 5000: loss 6.076721
iteration 200 / 5000: loss 6.070392
iteration 300 / 5000: loss 6.004449
iteration 400 / 5000: loss 6.093823
iteration 500 / 5000: loss 5.568528
iteration 600 / 5000: loss 5.713333
iteration 700 / 5000: loss 6.028721
iteration 800 / 5000: loss 5.804025
iteration 900 / 5000: loss 5.890334
iteration 1000 / 5000: loss 6.090786
iteration 1100 / 5000: loss 5.866357
iteration 1200 / 5000: loss 5.623958
iteration 1300 / 5000: loss 6.006825
iteration 1400 / 5000: loss 5.822462
iteration 1500 / 5000: loss 6.250363
iteration 1600 / 5000: loss 5.898995
iteration 1700 / 5000: loss 6.133689
iteration 1800 / 5000: loss 5.951735
iteration 1900 / 5000: loss 5.947778
iteration 2000 / 5000: loss 6.273517
iteration 2100 / 5000: loss 6.107167
iteration 2200 / 5000: loss 6.120640
iteration 2300 / 5000: loss 6.027133
iteration 2400 / 5000: loss 6.422902
iteration 2500 / 5000: loss 5.819259
iteration 2600 / 5000: loss 6.320004
iteration 2700 / 5000: loss 5.892902
iteration 2800 / 5000: loss 5.842457
iteration 2900 / 5000: loss 5.891278
iteration 3000 / 5000: loss 6.047475
iteration 3100 / 5000: loss 5.967645
iteration 3200 / 5000: loss 6.032148
iteration 3300 / 5000: loss 6.137564
iteration 3400 / 5000: loss 6.000397
iteration 3500 / 5000: loss 5.979616
iteration 3600 / 5000: loss 5.905196
iteration 3700 / 5000: loss 5.982443
iteration 3800 / 5000: loss 5.716428
iteration 3900 / 5000: loss 5.824481
iteration 4000 / 5000: loss 6.004507
iteration 4100 / 5000: loss 6.118313
iteration 4200 / 5000: loss 5.942336
iteration 4300 / 5000: loss 6.044541
iteration 4400 / 5000: loss 6.050129
iteration 4500 / 5000: loss 5.735284
iteration 4600 / 5000: loss 6.058324
iteration 4700 / 5000: loss 6.401032
iteration 4800 / 5000: loss 5.836698
iteration 4900 / 5000: loss 5.943696
lr 9.000000e-03 reg 3.000000e+00 train_accuracy: 0.458816 val_accuracy: 0.456000
iteration 0 / 5000: loss 9.016239
iteration 100 / 5000: loss 7.439558
iteration 200 / 5000: loss 7.155084
iteration 300 / 5000: loss 7.550033
iteration 400 / 5000: loss 7.312672
iteration 500 / 5000: loss 7.231526
iteration 600 / 5000: loss 7.501892
iteration 700 / 5000: loss 7.219823
iteration 800 / 5000: loss 7.490644
iteration 900 / 5000: loss 7.376652
iteration 1000 / 5000: loss 7.147310
iteration 1100 / 5000: loss 7.554918
iteration 1200 / 5000: loss 7.540031
iteration 1300 / 5000: loss 7.302874
iteration 1400 / 5000: loss 7.140160
iteration 1500 / 5000: loss 7.451826
iteration 1600 / 5000: loss 7.417618
iteration 1700 / 5000: loss 7.470007
iteration 1800 / 5000: loss 7.541980
iteration 1900 / 5000: loss 7.249245
iteration 2000 / 5000: loss 7.260155
iteration 2100 / 5000: loss 7.645624
iteration 2200 / 5000: loss 7.314969
iteration 2300 / 5000: loss 7.298185
iteration 2400 / 5000: loss 7.336711
iteration 2500 / 5000: loss 7.568265
iteration 2600 / 5000: loss 7.557245
iteration 2700 / 5000: loss 7.574946
iteration 2800 / 5000: loss 7.221084
iteration 2900 / 5000: loss 7.264100
iteration 3000 / 5000: loss 7.091453
iteration 3100 / 5000: loss 7.486601
iteration 3200 / 5000: loss 7.251692
iteration 3300 / 5000: loss 7.266600
iteration 3400 / 5000: loss 7.376325
iteration 3500 / 5000: loss 7.329524
iteration 3600 / 5000: loss 7.406730
iteration 3700 / 5000: loss 7.251617
iteration 3800 / 5000: loss 7.199844
iteration 3900 / 5000: loss 7.330123
iteration 4000 / 5000: loss 7.269234
iteration 4100 / 5000: loss 7.338653
iteration 4200 / 5000: loss 7.117865
iteration 4300 / 5000: loss 7.475045
iteration 4400 / 5000: loss 7.404935
iteration 4500 / 5000: loss 7.468732
iteration 4600 / 5000: loss 7.518911
iteration 4700 / 5000: loss 7.425318
iteration 4800 / 5000: loss 7.329830
iteration 4900 / 5000: loss 7.311644
lr 9.000000e-03 reg 9.000000e+00 train_accuracy: 0.422857 val_accuracy: 0.424000

iteration 0 / 5000: loss 9.026930
iteration 100 / 5000: loss 3.426388
iteration 200 / 5000: loss 3.191458
iteration 300 / 5000: loss 3.136437
iteration 400 / 5000: loss 3.589722
iteration 500 / 5000: loss 3.255798
iteration 600 / 5000: loss 3.524185
iteration 700 / 5000: loss 3.250999
iteration 800 / 5000: loss 3.411978
iteration 900 / 5000: loss 3.840268
iteration 1000 / 5000: loss 3.384328
iteration 1100 / 5000: loss 3.487213
iteration 1200 / 5000: loss 3.331341
iteration 1300 / 5000: loss 3.543276
iteration 1400 / 5000: loss 3.336113
iteration 1500 / 5000: loss 3.277373
iteration 1600 / 5000: loss 3.499312
iteration 1700 / 5000: loss 3.270287
iteration 1800 / 5000: loss 3.644129
iteration 1900 / 5000: loss 3.452336
iteration 2000 / 5000: loss 3.236099
iteration 2100 / 5000: loss 3.289910
iteration 2200 / 5000: loss 3.670541
iteration 2300 / 5000: loss 3.303787
iteration 2400 / 5000: loss 3.473508
iteration 2500 / 5000: loss 3.209384
iteration 2600 / 5000: loss 3.466431
iteration 2700 / 5000: loss 3.860656
iteration 2800 / 5000: loss 3.456889
iteration 2900 / 5000: loss 3.427049
iteration 3000 / 5000: loss 3.661403
iteration 3100 / 5000: loss 3.291952
iteration 3200 / 5000: loss 3.298313
iteration 3300 / 5000: loss 3.288082
iteration 3400 / 5000: loss 3.814649
iteration 3500 / 5000: loss 3.147375
iteration 3600 / 5000: loss 2.987141
iteration 3700 / 5000: loss 3.534306
iteration 3800 / 5000: loss 3.325372
iteration 3900 / 5000: loss 3.469265
iteration 4000 / 5000: loss 3.104989
iteration 4100 / 5000: loss 3.344094
iteration 4200 / 5000: loss 3.432921
iteration 4300 / 5000: loss 3.831655
iteration 4400 / 5000: loss 3.442675
iteration 4500 / 5000: loss 3.284784
iteration 4600 / 5000: loss 3.462677
iteration 4700 / 5000: loss 3.637769
iteration 4800 / 5000: loss 3.489840
iteration 4900 / 5000: loss 3.617561
lr 1.000000e-02 reg 1.000000e-01 train_accuracy: 0.504531 val_accuracy: 0.492000
iteration 0 / 5000: loss 9.018530
iteration 100 / 5000: loss 3.592359
iteration 200 / 5000: loss 3.919440
iteration 300 / 5000: loss 3.984610
iteration 400 / 5000: loss 3.871707
iteration 500 / 5000: loss 3.779794
iteration 600 / 5000: loss 3.916213
iteration 700 / 5000: loss 3.943069
iteration 800 / 5000: loss 3.828665
iteration 900 / 5000: loss 3.782993
iteration 1000 / 5000: loss 3.740951
iteration 1100 / 5000: loss 3.852941
iteration 1200 / 5000: loss 3.904499
iteration 1300 / 5000: loss 3.716884
iteration 1400 / 5000: loss 3.723842
iteration 1500 / 5000: loss 4.153696
iteration 1600 / 5000: loss 4.128903
iteration 1700 / 5000: loss 3.994321
iteration 1800 / 5000: loss 3.591316
iteration 1900 / 5000: loss 3.448666
iteration 2000 / 5000: loss 4.012628
iteration 2100 / 5000: loss 3.900877
iteration 2200 / 5000: loss 3.677671
iteration 2300 / 5000: loss 4.323150
iteration 2400 / 5000: loss 3.967333
iteration 2500 / 5000: loss 3.934780
iteration 2600 / 5000: loss 3.884066
iteration 2700 / 5000: loss 3.889871
iteration 2800 / 5000: loss 4.231455
iteration 2900 / 5000: loss 4.194393
iteration 3000 / 5000: loss 3.881235
iteration 3100 / 5000: loss 3.941641
iteration 3200 / 5000: loss 3.770885
iteration 3300 / 5000: loss 3.990211
iteration 3400 / 5000: loss 4.070882
iteration 3500 / 5000: loss 4.097463
iteration 3600 / 5000: loss 4.003016
iteration 3700 / 5000: loss 3.838489
iteration 3800 / 5000: loss 4.224869
iteration 3900 / 5000: loss 3.902403
iteration 4000 / 5000: loss 3.972206
iteration 4100 / 5000: loss 3.964670
iteration 4200 / 5000: loss 4.185352
iteration 4300 / 5000: loss 4.034355
iteration 4400 / 5000: loss 3.991047
iteration 4500 / 5000: loss 3.986716
iteration 4600 / 5000: loss 4.026209
iteration 4700 / 5000: loss 3.973472
iteration 4800 / 5000: loss 4.120059
iteration 4900 / 5000: loss 3.918655
lr 1.000000e-02 reg 3.000000e-01 train_accuracy: 0.494694 val_accuracy: 0.488000
iteration 0 / 5000: loss 9.015311
iteration 100 / 5000: loss 4.541415
iteration 200 / 5000: loss 4.678004
iteration 300 / 5000: loss 4.764209
iteration 400 / 5000: loss 4.868024
iteration 500 / 5000: loss 4.514108
iteration 600 / 5000: loss 4.624731
iteration 700 / 5000: loss 4.763835
iteration 800 / 5000: loss 4.573851
iteration 900 / 5000: loss 4.788728
iteration 1000 / 5000: loss 4.686814
iteration 1100 / 5000: loss 4.597175
iteration 1200 / 5000: loss 4.826300
iteration 1300 / 5000: loss 4.445137
iteration 1400 / 5000: loss 4.423713
iteration 1500 / 5000: loss 4.624067
iteration 1600 / 5000: loss 4.874775
iteration 1700 / 5000: loss 4.883271
iteration 1800 / 5000: loss 4.747659
iteration 1900 / 5000: loss 4.924466
iteration 2000 / 5000: loss 4.734803
iteration 2100 / 5000: loss 4.785756
iteration 2200 / 5000: loss 5.103619
iteration 2300 / 5000: loss 4.809778
iteration 2400 / 5000: loss 4.765144
iteration 2500 / 5000: loss 4.646264
iteration 2600 / 5000: loss 4.744567
iteration 2700 / 5000: loss 4.590653
iteration 2800 / 5000: loss 4.861631
iteration 2900 / 5000: loss 4.699412
iteration 3000 / 5000: loss 4.621236
iteration 3100 / 5000: loss 4.817581
iteration 3200 / 5000: loss 4.590081
iteration 3300 / 5000: loss 4.773332
iteration 3400 / 5000: loss 4.789445
iteration 3500 / 5000: loss 4.771616
iteration 3600 / 5000: loss 4.737478
iteration 3700 / 5000: loss 4.890333
iteration 3800 / 5000: loss 4.303153
iteration 3900 / 5000: loss 4.731569
iteration 4000 / 5000: loss 4.776529
iteration 4100 / 5000: loss 4.476903
iteration 4200 / 5000: loss 4.907544
iteration 4300 / 5000: loss 4.611584
iteration 4400 / 5000: loss 4.746619
iteration 4500 / 5000: loss 4.750557
iteration 4600 / 5000: loss 4.731685
iteration 4700 / 5000: loss 4.615782
iteration 4800 / 5000: loss 5.031203
iteration 4900 / 5000: loss 4.811939
lr 1.000000e-02 reg 9.000000e-01 train_accuracy: 0.481122 val_accuracy: 0.486000
iteration 0 / 5000: loss 9.015159
iteration 100 / 5000: loss 5.166214
iteration 200 / 5000: loss 4.924538
iteration 300 / 5000: loss 4.929631
iteration 400 / 5000: loss 5.045435
iteration 500 / 5000: loss 4.877979
iteration 600 / 5000: loss 4.645149
iteration 700 / 5000: loss 4.580576
iteration 800 / 5000: loss 4.740631
iteration 900 / 5000: loss 4.644215
iteration 1000 / 5000: loss 4.980508
iteration 1100 / 5000: loss 4.664470
iteration 1200 / 5000: loss 4.864926
iteration 1300 / 5000: loss 4.810230
iteration 1400 / 5000: loss 4.804115
iteration 1500 / 5000: loss 4.776641
iteration 1600 / 5000: loss 4.471200
iteration 1700 / 5000: loss 4.762462
iteration 1800 / 5000: loss 4.919286
iteration 1900 / 5000: loss 4.901341
iteration 2000 / 5000: loss 5.017758
iteration 2100 / 5000: loss 4.568858
iteration 2200 / 5000: loss 4.811672
iteration 2300 / 5000: loss 4.816774
iteration 2400 / 5000: loss 4.755116
iteration 2500 / 5000: loss 4.913320
iteration 2600 / 5000: loss 4.576023
iteration 2700 / 5000: loss 4.941674
iteration 2800 / 5000: loss 4.879613
iteration 2900 / 5000: loss 5.251541
iteration 3000 / 5000: loss 4.738724
iteration 3100 / 5000: loss 4.590703
iteration 3200 / 5000: loss 4.507461
iteration 3300 / 5000: loss 4.971978
iteration 3400 / 5000: loss 4.754172
iteration 3500 / 5000: loss 4.661377
iteration 3600 / 5000: loss 5.039711
iteration 3700 / 5000: loss 4.946490
iteration 3800 / 5000: loss 4.908614
iteration 3900 / 5000: loss 5.017508
iteration 4000 / 5000: loss 4.758461
iteration 4100 / 5000: loss 4.813967
iteration 4200 / 5000: loss 4.462900
iteration 4300 / 5000: loss 4.590998
iteration 4400 / 5000: loss 4.918409
iteration 4500 / 5000: loss 4.927976
iteration 4600 / 5000: loss 4.685365
iteration 4700 / 5000: loss 5.176892
iteration 4800 / 5000: loss 4.399264
iteration 4900 / 5000: loss 4.676127
lr 1.000000e-02 reg 1.000000e+00 train_accuracy: 0.479959 val_accuracy: 0.469000
iteration 0 / 5000: loss 8.997035
iteration 100 / 5000: loss 5.843221
iteration 200 / 5000: loss 5.997841
iteration 300 / 5000: loss 5.907040
iteration 400 / 5000: loss 6.075411
iteration 500 / 5000: loss 6.015745
iteration 600 / 5000: loss 5.668903
iteration 700 / 5000: loss 6.124172
iteration 800 / 5000: loss 6.012853
iteration 900 / 5000: loss 5.905423
iteration 1000 / 5000: loss 5.751761
iteration 1100 / 5000: loss 6.103768
iteration 1200 / 5000: loss 5.669411
iteration 1300 / 5000: loss 5.827976
iteration 1400 / 5000: loss 5.876980
iteration 1500 / 5000: loss 6.226225
iteration 1600 / 5000: loss 6.143148
iteration 1700 / 5000: loss 6.003961
iteration 1800 / 5000: loss 6.005933
iteration 1900 / 5000: loss 6.036098
iteration 2000 / 5000: loss 6.151027
iteration 2100 / 5000: loss 5.783414
iteration 2200 / 5000: loss 5.911420
iteration 2300 / 5000: loss 6.045338
iteration 2400 / 5000: loss 6.232038
iteration 2500 / 5000: loss 5.949231
iteration 2600 / 5000: loss 6.090792
iteration 2700 / 5000: loss 6.012041
iteration 2800 / 5000: loss 6.001966
iteration 2900 / 5000: loss 5.810524
iteration 3000 / 5000: loss 6.184625
iteration 3100 / 5000: loss 6.229187
iteration 3200 / 5000: loss 6.104957
iteration 3300 / 5000: loss 6.051062
iteration 3400 / 5000: loss 6.025026
iteration 3500 / 5000: loss 5.625090
iteration 3600 / 5000: loss 5.794204
iteration 3700 / 5000: loss 5.987555
iteration 3800 / 5000: loss 5.670532
iteration 3900 / 5000: loss 6.056741
iteration 4000 / 5000: loss 5.917743
iteration 4100 / 5000: loss 6.174645
iteration 4200 / 5000: loss 5.973949
iteration 4300 / 5000: loss 6.020719
iteration 4400 / 5000: loss 5.597507
iteration 4500 / 5000: loss 5.988808
iteration 4600 / 5000: loss 5.975981
iteration 4700 / 5000: loss 5.711747
iteration 4800 / 5000: loss 6.197458
iteration 4900 / 5000: loss 6.039167
lr 1.000000e-02 reg 3.000000e+00 train_accuracy: 0.461082 val_accuracy: 0.450000
iteration 0 / 5000: loss 9.006339
iteration 100 / 5000: loss 7.466709
iteration 200 / 5000: loss 7.382438
iteration 300 / 5000: loss 7.415950
iteration 400 / 5000: loss 7.429417
iteration 500 / 5000: loss 7.460489
iteration 600 / 5000: loss 7.189807
iteration 700 / 5000: loss 7.360419
iteration 800 / 5000: loss 7.377778
iteration 900 / 5000: loss 7.219027
iteration 1000 / 5000: loss 7.213698
iteration 1100 / 5000: loss 7.453368
iteration 1200 / 5000: loss 7.719393
iteration 1300 / 5000: loss 7.617933
iteration 1400 / 5000: loss 7.405832
iteration 1500 / 5000: loss 7.353340
iteration 1600 / 5000: loss 7.498221
iteration 1700 / 5000: loss 7.179200
iteration 1800 / 5000: loss 7.348476
iteration 1900 / 5000: loss 7.381360
iteration 2000 / 5000: loss 7.725725
iteration 2100 / 5000: loss 7.309944
iteration 2200 / 5000: loss 7.263549
iteration 2300 / 5000: loss 7.097933
iteration 2400 / 5000: loss 7.607391
iteration 2500 / 5000: loss 7.380006
iteration 2600 / 5000: loss 7.399159
iteration 2700 / 5000: loss 7.603333
iteration 2800 / 5000: loss 7.401287
iteration 2900 / 5000: loss 7.185649
iteration 3000 / 5000: loss 7.200209
iteration 3100 / 5000: loss 7.159148
iteration 3200 / 5000: loss 7.430979
iteration 3300 / 5000: loss 7.347316
iteration 3400 / 5000: loss 7.072108
iteration 3500 / 5000: loss 7.402644
iteration 3600 / 5000: loss 7.136581
iteration 3700 / 5000: loss 7.355703
iteration 3800 / 5000: loss 7.231955
iteration 3900 / 5000: loss 7.340327
iteration 4000 / 5000: loss 7.486244
iteration 4100 / 5000: loss 7.480286
iteration 4200 / 5000: loss 7.185331
iteration 4300 / 5000: loss 7.042138
iteration 4400 / 5000: loss 7.394749
iteration 4500 / 5000: loss 7.484358
iteration 4600 / 5000: loss 7.359065
iteration 4700 / 5000: loss 7.438508
iteration 4800 / 5000: loss 7.177767
iteration 4900 / 5000: loss 7.293398
lr 1.000000e-02 reg 9.000000e+00 train_accuracy: 0.410143 val_accuracy: 0.400000

iteration 0 / 5000: loss 9.024649
iteration 100 / 5000: loss 3.495652
iteration 200 / 5000: loss 3.400967
iteration 300 / 5000: loss 3.430366
iteration 400 / 5000: loss 3.621128
iteration 500 / 5000: loss 3.299503
iteration 600 / 5000: loss 3.554025
iteration 700 / 5000: loss 3.645478
iteration 800 / 5000: loss 3.511873
iteration 900 / 5000: loss 3.754857
iteration 1000 / 5000: loss 3.575806
iteration 1100 / 5000: loss 3.409546
iteration 1200 / 5000: loss 3.388418
iteration 1300 / 5000: loss 3.596604
iteration 1400 / 5000: loss 3.411689
iteration 1500 / 5000: loss 3.584095
iteration 1600 / 5000: loss 3.466606
iteration 1700 / 5000: loss 3.376632
iteration 1800 / 5000: loss 3.475877
iteration 1900 / 5000: loss 3.642242
iteration 2000 / 5000: loss 3.566914
iteration 2100 / 5000: loss 3.499711
iteration 2200 / 5000: loss 3.715451
iteration 2300 / 5000: loss 3.650959
iteration 2400 / 5000: loss 3.598651
iteration 2500 / 5000: loss 3.851910
iteration 2600 / 5000: loss 3.692822
iteration 2700 / 5000: loss 2.993619
iteration 2800 / 5000: loss 3.225693
iteration 2900 / 5000: loss 3.558256
iteration 3000 / 5000: loss 3.486034
iteration 3100 / 5000: loss 3.513697
iteration 3200 / 5000: loss 3.488847
iteration 3300 / 5000: loss 3.316503
iteration 3400 / 5000: loss 3.690808
iteration 3500 / 5000: loss 3.510623
iteration 3600 / 5000: loss 3.403585
iteration 3700 / 5000: loss 3.438669
iteration 3800 / 5000: loss 3.616557
iteration 3900 / 5000: loss 3.504805
iteration 4000 / 5000: loss 3.297352
iteration 4100 / 5000: loss 3.627029
iteration 4200 / 5000: loss 3.361474
iteration 4300 / 5000: loss 3.305979
iteration 4400 / 5000: loss 3.105292
iteration 4500 / 5000: loss 3.603225
iteration 4600 / 5000: loss 3.377814
iteration 4700 / 5000: loss 3.368074
iteration 4800 / 5000: loss 3.499787
iteration 4900 / 5000: loss 3.312771
lr 3.000000e-02 reg 1.000000e-01 train_accuracy: 0.496163 val_accuracy: 0.496000
iteration 0 / 5000: loss 9.005784
iteration 100 / 5000: loss 4.145720
iteration 200 / 5000: loss 4.019196
iteration 300 / 5000: loss 3.985029
iteration 400 / 5000: loss 4.069482
iteration 500 / 5000: loss 3.957903
iteration 600 / 5000: loss 3.763464
iteration 700 / 5000: loss 3.969347
iteration 800 / 5000: loss 4.245621
iteration 900 / 5000: loss 4.153715
iteration 1000 / 5000: loss 3.779031
iteration 1100 / 5000: loss 3.899397
iteration 1200 / 5000: loss 4.204455
iteration 1300 / 5000: loss 3.974229
iteration 1400 / 5000: loss 4.184440
iteration 1500 / 5000: loss 3.937073
iteration 1600 / 5000: loss 4.059356
iteration 1700 / 5000: loss 4.012441
iteration 1800 / 5000: loss 4.220724
iteration 1900 / 5000: loss 4.164030
iteration 2000 / 5000: loss 3.938456
iteration 2100 / 5000: loss 4.072913
iteration 2200 / 5000: loss 4.155041
iteration 2300 / 5000: loss 3.737635
iteration 2400 / 5000: loss 4.098384
iteration 2500 / 5000: loss 4.106274
iteration 2600 / 5000: loss 3.795910
iteration 2700 / 5000: loss 4.132892
iteration 2800 / 5000: loss 3.908182
iteration 2900 / 5000: loss 4.009743
iteration 3000 / 5000: loss 3.825700
iteration 3100 / 5000: loss 4.180204
iteration 3200 / 5000: loss 3.890908
iteration 3300 / 5000: loss 4.464553
iteration 3400 / 5000: loss 4.199809
iteration 3500 / 5000: loss 3.984978
iteration 3600 / 5000: loss 4.019114
iteration 3700 / 5000: loss 4.265255
iteration 3800 / 5000: loss 3.994733
iteration 3900 / 5000: loss 3.916794
iteration 4000 / 5000: loss 3.991109
iteration 4100 / 5000: loss 4.040022
iteration 4200 / 5000: loss 4.067578
iteration 4300 / 5000: loss 4.015643
iteration 4400 / 5000: loss 3.744799
iteration 4500 / 5000: loss 4.245806
iteration 4600 / 5000: loss 4.116121
iteration 4700 / 5000: loss 3.827599
iteration 4800 / 5000: loss 4.334154
iteration 4900 / 5000: loss 3.869227
lr 3.000000e-02 reg 3.000000e-01 train_accuracy: 0.490959 val_accuracy: 0.484000
iteration 0 / 5000: loss 9.007466
iteration 100 / 5000: loss 4.785802
iteration 200 / 5000: loss 4.797304
iteration 300 / 5000: loss 4.764431
iteration 400 / 5000: loss 4.711419
iteration 500 / 5000: loss 4.567169
iteration 600 / 5000: loss 5.020060
iteration 700 / 5000: loss 4.781586
iteration 800 / 5000: loss 4.848671
iteration 900 / 5000: loss 4.934785
iteration 1000 / 5000: loss 4.943235
iteration 1100 / 5000: loss 4.694187
iteration 1200 / 5000: loss 4.723806
iteration 1300 / 5000: loss 4.894803
iteration 1400 / 5000: loss 4.963300
iteration 1500 / 5000: loss 4.771956
iteration 1600 / 5000: loss 5.056611
iteration 1700 / 5000: loss 5.004536
iteration 1800 / 5000: loss 4.552701
iteration 1900 / 5000: loss 4.800281
iteration 2000 / 5000: loss 4.637618
iteration 2100 / 5000: loss 4.988251
iteration 2200 / 5000: loss 4.905779
iteration 2300 / 5000: loss 4.922316
iteration 2400 / 5000: loss 4.639805
iteration 2500 / 5000: loss 5.308065
iteration 2600 / 5000: loss 4.617761
iteration 2700 / 5000: loss 4.776515
iteration 2800 / 5000: loss 4.954821
iteration 2900 / 5000: loss 4.879465
iteration 3000 / 5000: loss 4.962719
iteration 3100 / 5000: loss 4.662515
iteration 3200 / 5000: loss 4.695933
iteration 3300 / 5000: loss 4.869959
iteration 3400 / 5000: loss 4.948366
iteration 3500 / 5000: loss 5.053163
iteration 3600 / 5000: loss 4.725928
iteration 3700 / 5000: loss 4.924982
iteration 3800 / 5000: loss 4.894923
iteration 3900 / 5000: loss 5.041162
iteration 4000 / 5000: loss 4.464270
iteration 4100 / 5000: loss 4.866778
iteration 4200 / 5000: loss 4.676706
iteration 4300 / 5000: loss 4.912365
iteration 4400 / 5000: loss 4.706949
iteration 4500 / 5000: loss 5.073586
iteration 4600 / 5000: loss 4.906230
iteration 4700 / 5000: loss 4.810932
iteration 4800 / 5000: loss 4.870954
iteration 4900 / 5000: loss 5.080807
lr 3.000000e-02 reg 9.000000e-01 train_accuracy: 0.470959 val_accuracy: 0.465000
iteration 0 / 5000: loss 8.990552
iteration 100 / 5000: loss 4.624786
iteration 200 / 5000: loss 5.210547
iteration 300 / 5000: loss 5.301135
iteration 400 / 5000: loss 4.984939
iteration 500 / 5000: loss 4.718938
iteration 600 / 5000: loss 5.045975
iteration 700 / 5000: loss 4.986672
iteration 800 / 5000: loss 4.926749
iteration 900 / 5000: loss 4.633659
iteration 1000 / 5000: loss 4.997198
iteration 1100 / 5000: loss 4.894407
iteration 1200 / 5000: loss 5.068533
iteration 1300 / 5000: loss 5.027690
iteration 1400 / 5000: loss 5.400768
iteration 1500 / 5000: loss 5.074342
iteration 1600 / 5000: loss 5.118100
iteration 1700 / 5000: loss 4.648422
iteration 1800 / 5000: loss 5.040199
iteration 1900 / 5000: loss 4.734811
iteration 2000 / 5000: loss 4.652929
iteration 2100 / 5000: loss 5.136832
iteration 2200 / 5000: loss 4.764198
iteration 2300 / 5000: loss 4.846451
iteration 2400 / 5000: loss 5.062284
iteration 2500 / 5000: loss 5.211421
iteration 2600 / 5000: loss 4.994556
iteration 2700 / 5000: loss 5.210913
iteration 2800 / 5000: loss 4.768790
iteration 2900 / 5000: loss 4.804464
iteration 3000 / 5000: loss 5.085106
iteration 3100 / 5000: loss 5.119723
iteration 3200 / 5000: loss 4.708387
iteration 3300 / 5000: loss 4.660712
iteration 3400 / 5000: loss 4.896666
iteration 3500 / 5000: loss 4.549913
iteration 3600 / 5000: loss 4.664884
iteration 3700 / 5000: loss 4.857754
iteration 3800 / 5000: loss 5.296881
iteration 3900 / 5000: loss 5.082804
iteration 4000 / 5000: loss 5.016835
iteration 4100 / 5000: loss 4.851344
iteration 4200 / 5000: loss 4.695574
iteration 4300 / 5000: loss 5.029625
iteration 4400 / 5000: loss 4.846924
iteration 4500 / 5000: loss 4.965506
iteration 4600 / 5000: loss 5.068939
iteration 4700 / 5000: loss 4.796460
iteration 4800 / 5000: loss 4.677138
iteration 4900 / 5000: loss 5.052642
lr 3.000000e-02 reg 1.000000e+00 train_accuracy: 0.465816 val_accuracy: 0.472000
iteration 0 / 5000: loss 9.001759
iteration 100 / 5000: loss 6.428148
iteration 200 / 5000: loss 6.370850
iteration 300 / 5000: loss 5.950575
iteration 400 / 5000: loss 6.313059
iteration 500 / 5000: loss 6.069484
iteration 600 / 5000: loss 6.296304
iteration 700 / 5000: loss 5.996113
iteration 800 / 5000: loss 6.283708
iteration 900 / 5000: loss 6.093954
iteration 1000 / 5000: loss 6.201705
iteration 1100 / 5000: loss 6.207888
iteration 1200 / 5000: loss 6.343364
iteration 1300 / 5000: loss 5.914000
iteration 1400 / 5000: loss 5.924373
iteration 1500 / 5000: loss 6.512131
iteration 1600 / 5000: loss 6.231083
iteration 1700 / 5000: loss 6.214275
iteration 1800 / 5000: loss 6.106806
iteration 1900 / 5000: loss 6.408332
iteration 2000 / 5000: loss 6.079231
iteration 2100 / 5000: loss 6.142896
iteration 2200 / 5000: loss 5.819200
iteration 2300 / 5000: loss 6.095591
iteration 2400 / 5000: loss 6.086364
iteration 2500 / 5000: loss 6.047742
iteration 2600 / 5000: loss 6.286931
iteration 2700 / 5000: loss 6.112118
iteration 2800 / 5000: loss 6.094709
iteration 2900 / 5000: loss 5.962891
iteration 3000 / 5000: loss 5.985313
iteration 3100 / 5000: loss 6.216039
iteration 3200 / 5000: loss 6.122448
iteration 3300 / 5000: loss 6.192602
iteration 3400 / 5000: loss 6.409335
iteration 3500 / 5000: loss 6.266842
iteration 3600 / 5000: loss 5.970354
iteration 3700 / 5000: loss 6.218444
iteration 3800 / 5000: loss 6.186254
iteration 3900 / 5000: loss 6.485269
iteration 4000 / 5000: loss 6.125687
iteration 4100 / 5000: loss 6.628030
iteration 4200 / 5000: loss 6.013200
iteration 4300 / 5000: loss 6.287231
iteration 4400 / 5000: loss 6.044820
iteration 4500 / 5000: loss 6.280581
iteration 4600 / 5000: loss 6.138379
iteration 4700 / 5000: loss 6.550295
iteration 4800 / 5000: loss 5.966823
iteration 4900 / 5000: loss 6.321261
lr 3.000000e-02 reg 3.000000e+00 train_accuracy: 0.439939 val_accuracy: 0.451000
iteration 0 / 5000: loss 9.020980
iteration 100 / 5000: loss 7.748047
iteration 200 / 5000: loss 7.830755
iteration 300 / 5000: loss 7.750583
iteration 400 / 5000: loss 7.631802
iteration 500 / 5000: loss 7.736118
iteration 600 / 5000: loss 7.682005
iteration 700 / 5000: loss 7.811244
iteration 800 / 5000: loss 7.937837
iteration 900 / 5000: loss 7.549304
iteration 1000 / 5000: loss 7.425014
iteration 1100 / 5000: loss 7.446669
iteration 1200 / 5000: loss 7.818173
iteration 1300 / 5000: loss 7.876827
iteration 1400 / 5000: loss 7.899425
iteration 1500 / 5000: loss 7.633582
iteration 1600 / 5000: loss 7.645114
iteration 1700 / 5000: loss 7.793663
iteration 1800 / 5000: loss 7.650702
iteration 1900 / 5000: loss 7.909158
iteration 2000 / 5000: loss 7.491395
iteration 2100 / 5000: loss 7.554958
iteration 2200 / 5000: loss 7.771035
iteration 2300 / 5000: loss 7.755209
iteration 2400 / 5000: loss 7.653679
iteration 2500 / 5000: loss 7.354919
iteration 2600 / 5000: loss 7.715613
iteration 2700 / 5000: loss 7.847216
iteration 2800 / 5000: loss 7.696524
iteration 2900 / 5000: loss 7.714006
iteration 3000 / 5000: loss 7.463933
iteration 3100 / 5000: loss 7.693472
iteration 3200 / 5000: loss 7.592172
iteration 3300 / 5000: loss 7.668340
iteration 3400 / 5000: loss 7.950128
iteration 3500 / 5000: loss 7.598237
iteration 3600 / 5000: loss 7.873103
iteration 3700 / 5000: loss 7.737397
iteration 3800 / 5000: loss 7.731957
iteration 3900 / 5000: loss 7.612335
iteration 4000 / 5000: loss 7.609233
iteration 4100 / 5000: loss 7.709183
iteration 4200 / 5000: loss 8.061469
iteration 4300 / 5000: loss 7.749936
iteration 4400 / 5000: loss 7.865318
iteration 4500 / 5000: loss 7.633126
iteration 4600 / 5000: loss 7.639399
iteration 4700 / 5000: loss 7.590911
iteration 4800 / 5000: loss 7.751212
iteration 4900 / 5000: loss 7.565723
lr 3.000000e-02 reg 9.000000e+00 train_accuracy: 0.393918 val_accuracy: 0.391000

iteration 0 / 5000: loss 9.003909
iteration 100 / 5000: loss 3.483607
iteration 200 / 5000: loss 3.590212
iteration 300 / 5000: loss 3.479825
iteration 400 / 5000: loss 3.211656
iteration 500 / 5000: loss 3.962322
iteration 600 / 5000: loss 3.832054
iteration 700 / 5000: loss 3.539543
iteration 800 / 5000: loss 3.787776
iteration 900 / 5000: loss 3.365309
iteration 1000 / 5000: loss 3.389566
iteration 1100 / 5000: loss 3.501695
iteration 1200 / 5000: loss 3.453372
iteration 1300 / 5000: loss 3.640366
iteration 1400 / 5000: loss 3.957190
iteration 1500 / 5000: loss 3.584225
iteration 1600 / 5000: loss 3.773115
iteration 1700 / 5000: loss 3.737974
iteration 1800 / 5000: loss 3.485993
iteration 1900 / 5000: loss 3.620082
iteration 2000 / 5000: loss 3.346134
iteration 2100 / 5000: loss 3.519767
iteration 2200 / 5000: loss 3.346888
iteration 2300 / 5000: loss 3.322010
iteration 2400 / 5000: loss 3.866210
iteration 2500 / 5000: loss 3.257378
iteration 2600 / 5000: loss 3.644761
iteration 2700 / 5000: loss 3.697390
iteration 2800 / 5000: loss 3.650781
iteration 2900 / 5000: loss 3.524858
iteration 3000 / 5000: loss 3.793308
iteration 3100 / 5000: loss 3.378376
iteration 3200 / 5000: loss 3.420396
iteration 3300 / 5000: loss 3.337381
iteration 3400 / 5000: loss 3.366468
iteration 3500 / 5000: loss 3.451091
iteration 3600 / 5000: loss 3.906481
iteration 3700 / 5000: loss 3.655675
iteration 3800 / 5000: loss 3.448786
iteration 3900 / 5000: loss 3.585363
iteration 4000 / 5000: loss 3.683759
iteration 4100 / 5000: loss 3.677524
iteration 4200 / 5000: loss 3.609100
iteration 4300 / 5000: loss 3.508398
iteration 4400 / 5000: loss 3.719953
iteration 4500 / 5000: loss 3.625574
iteration 4600 / 5000: loss 3.720860
iteration 4700 / 5000: loss 3.411540
iteration 4800 / 5000: loss 3.317961
iteration 4900 / 5000: loss 3.610685
lr 9.000000e-02 reg 1.000000e-01 train_accuracy: 0.476755 val_accuracy: 0.490000
iteration 0 / 5000: loss 8.992084
iteration 100 / 5000: loss 4.263589
iteration 200 / 5000: loss 4.075241
iteration 300 / 5000: loss 4.075227
iteration 400 / 5000: loss 4.191248
iteration 500 / 5000: loss 4.253262
iteration 600 / 5000: loss 4.158030
iteration 700 / 5000: loss 4.310224
iteration 800 / 5000: loss 4.269081
iteration 900 / 5000: loss 4.387166
iteration 1000 / 5000: loss 4.311082
iteration 1100 / 5000: loss 4.425164
iteration 1200 / 5000: loss 4.392960
iteration 1300 / 5000: loss 4.562031
iteration 1400 / 5000: loss 4.466252
iteration 1500 / 5000: loss 4.760254
iteration 1600 / 5000: loss 4.318528
iteration 1700 / 5000: loss 4.410461
iteration 1800 / 5000: loss 4.674437
iteration 1900 / 5000: loss 3.938098
iteration 2000 / 5000: loss 4.159418
iteration 2100 / 5000: loss 4.375605
iteration 2200 / 5000: loss 4.303190
iteration 2300 / 5000: loss 4.377586
iteration 2400 / 5000: loss 3.946243
iteration 2500 / 5000: loss 4.644820
iteration 2600 / 5000: loss 4.592194
iteration 2700 / 5000: loss 4.255906
iteration 2800 / 5000: loss 4.485708
iteration 2900 / 5000: loss 4.287797
iteration 3000 / 5000: loss 4.451449
iteration 3100 / 5000: loss 4.070862
iteration 3200 / 5000: loss 4.534213
iteration 3300 / 5000: loss 4.441392
iteration 3400 / 5000: loss 4.336078
iteration 3500 / 5000: loss 4.018143
iteration 3600 / 5000: loss 4.169587
iteration 3700 / 5000: loss 4.539037
iteration 3800 / 5000: loss 4.456948
iteration 3900 / 5000: loss 3.989816
iteration 4000 / 5000: loss 4.386345
iteration 4100 / 5000: loss 4.053544
iteration 4200 / 5000: loss 4.145897
iteration 4300 / 5000: loss 4.125062
iteration 4400 / 5000: loss 3.867048
iteration 4500 / 5000: loss 4.332177
iteration 4600 / 5000: loss 4.246165
iteration 4700 / 5000: loss 4.445153
iteration 4800 / 5000: loss 4.329971
iteration 4900 / 5000: loss 4.431659
lr 9.000000e-02 reg 3.000000e-01 train_accuracy: 0.465673 val_accuracy: 0.451000
iteration 0 / 5000: loss 8.992842
iteration 100 / 5000: loss 5.591381
iteration 200 / 5000: loss 5.455543
iteration 300 / 5000: loss 5.520173
iteration 400 / 5000: loss 5.196610
iteration 500 / 5000: loss 5.507184
iteration 600 / 5000: loss 5.432123
iteration 700 / 5000: loss 5.327855
iteration 800 / 5000: loss 5.048078
iteration 900 / 5000: loss 5.489673
iteration 1000 / 5000: loss 4.808735
iteration 1100 / 5000: loss 5.402425
iteration 1200 / 5000: loss 5.743514
iteration 1300 / 5000: loss 5.311452
iteration 1400 / 5000: loss 4.898727
iteration 1500 / 5000: loss 5.182588
iteration 1600 / 5000: loss 5.138954
iteration 1700 / 5000: loss 5.485009
iteration 1800 / 5000: loss 5.195099
iteration 1900 / 5000: loss 5.270619
iteration 2000 / 5000: loss 5.189535
iteration 2100 / 5000: loss 5.245843
iteration 2200 / 5000: loss 5.209635
iteration 2300 / 5000: loss 5.422800
iteration 2400 / 5000: loss 5.233223
iteration 2500 / 5000: loss 5.493037
iteration 2600 / 5000: loss 5.356293
iteration 2700 / 5000: loss 5.213312
iteration 2800 / 5000: loss 5.156221
iteration 2900 / 5000: loss 5.114076
iteration 3000 / 5000: loss 5.195695
iteration 3100 / 5000: loss 5.267874
iteration 3200 / 5000: loss 5.197172
iteration 3300 / 5000: loss 5.092157
iteration 3400 / 5000: loss 5.019541
iteration 3500 / 5000: loss 4.953839
iteration 3600 / 5000: loss 5.022213
iteration 3700 / 5000: loss 5.326481
iteration 3800 / 5000: loss 5.029231
iteration 3900 / 5000: loss 5.074592
iteration 4000 / 5000: loss 5.490841
iteration 4100 / 5000: loss 5.437386
iteration 4200 / 5000: loss 5.210156
iteration 4300 / 5000: loss 5.278911
iteration 4400 / 5000: loss 5.496124
iteration 4500 / 5000: loss 5.289352
iteration 4600 / 5000: loss 5.053233
iteration 4700 / 5000: loss 4.969494
iteration 4800 / 5000: loss 5.316798
iteration 4900 / 5000: loss 5.161878
lr 9.000000e-02 reg 9.000000e-01 train_accuracy: 0.432796 val_accuracy: 0.427000
iteration 0 / 5000: loss 9.008254
iteration 100 / 5000: loss 5.352700
iteration 200 / 5000: loss 5.595005
iteration 300 / 5000: loss 5.077264
iteration 400 / 5000: loss 4.981355
iteration 500 / 5000: loss 5.342299
iteration 600 / 5000: loss 5.113128
iteration 700 / 5000: loss 5.473625
iteration 800 / 5000: loss 5.106494
iteration 900 / 5000: loss 5.165725
iteration 1000 / 5000: loss 5.211673
iteration 1100 / 5000: loss 5.352216
iteration 1200 / 5000: loss 5.386664
iteration 1300 / 5000: loss 5.122710
iteration 1400 / 5000: loss 5.443683
iteration 1500 / 5000: loss 4.926511
iteration 1600 / 5000: loss 5.393194
iteration 1700 / 5000: loss 5.672474
iteration 1800 / 5000: loss 5.243794
iteration 1900 / 5000: loss 5.098456
iteration 2000 / 5000: loss 5.523808
iteration 2100 / 5000: loss 5.228133
iteration 2200 / 5000: loss 4.913645
iteration 2300 / 5000: loss 4.944451
iteration 2400 / 5000: loss 5.532358
iteration 2500 / 5000: loss 5.268211
iteration 2600 / 5000: loss 5.277105
iteration 2700 / 5000: loss 5.437102
iteration 2800 / 5000: loss 5.659905
iteration 2900 / 5000: loss 5.321581
iteration 3000 / 5000: loss 5.565497
iteration 3100 / 5000: loss 5.286274
iteration 3200 / 5000: loss 5.049862
iteration 3300 / 5000: loss 5.699939
iteration 3400 / 5000: loss 5.321785
iteration 3500 / 5000: loss 5.243683
iteration 3600 / 5000: loss 5.346927
iteration 3700 / 5000: loss 5.558087
iteration 3800 / 5000: loss 5.024710
iteration 3900 / 5000: loss 5.638678
iteration 4000 / 5000: loss 5.119666
iteration 4100 / 5000: loss 5.582897
iteration 4200 / 5000: loss 5.300440
iteration 4300 / 5000: loss 5.152688
iteration 4400 / 5000: loss 5.418199
iteration 4500 / 5000: loss 5.293875
iteration 4600 / 5000: loss 5.287345
iteration 4700 / 5000: loss 5.278567
iteration 4800 / 5000: loss 5.205930
iteration 4900 / 5000: loss 5.424493
lr 9.000000e-02 reg 1.000000e+00 train_accuracy: 0.434878 val_accuracy: 0.423000
iteration 0 / 5000: loss 8.989239
iteration 100 / 5000: loss 6.732992
iteration 200 / 5000: loss 6.742985
iteration 300 / 5000: loss 7.175884
iteration 400 / 5000: loss 6.854049
iteration 500 / 5000: loss 7.315666
iteration 600 / 5000: loss 6.925786
iteration 700 / 5000: loss 6.915961
iteration 800 / 5000: loss 7.105986
iteration 900 / 5000: loss 7.396198
iteration 1000 / 5000: loss 6.987889
iteration 1100 / 5000: loss 6.790732
iteration 1200 / 5000: loss 7.027695
iteration 1300 / 5000: loss 7.071258
iteration 1400 / 5000: loss 7.217566
iteration 1500 / 5000: loss 7.016586
iteration 1600 / 5000: loss 7.330764
iteration 1700 / 5000: loss 7.135956
iteration 1800 / 5000: loss 6.724256
iteration 1900 / 5000: loss 6.897607
iteration 2000 / 5000: loss 7.028266
iteration 2100 / 5000: loss 6.850079
iteration 2200 / 5000: loss 6.707773
iteration 2300 / 5000: loss 6.881942
iteration 2400 / 5000: loss 6.907860
iteration 2500 / 5000: loss 7.090359
iteration 2600 / 5000: loss 6.721774
iteration 2700 / 5000: loss 6.683811
iteration 2800 / 5000: loss 6.831118
iteration 2900 / 5000: loss 6.585051
iteration 3000 / 5000: loss 6.989682
iteration 3100 / 5000: loss 6.952745
iteration 3200 / 5000: loss 7.319209
iteration 3300 / 5000: loss 6.710027
iteration 3400 / 5000: loss 7.185427
iteration 3500 / 5000: loss 7.143579
iteration 3600 / 5000: loss 7.138532
iteration 3700 / 5000: loss 7.116427
iteration 3800 / 5000: loss 7.096680
iteration 3900 / 5000: loss 6.773230
iteration 4000 / 5000: loss 7.087162
iteration 4100 / 5000: loss 6.729331
iteration 4200 / 5000: loss 6.765011
iteration 4300 / 5000: loss 6.932883
iteration 4400 / 5000: loss 6.823617
iteration 4500 / 5000: loss 7.161401
iteration 4600 / 5000: loss 6.769672
iteration 4700 / 5000: loss 6.902824
iteration 4800 / 5000: loss 7.133170
iteration 4900 / 5000: loss 6.783567
lr 9.000000e-02 reg 3.000000e+00 train_accuracy: 0.361551 val_accuracy: 0.337000
iteration 0 / 5000: loss 8.997321
iteration 100 / 5000: loss 25.820759
iteration 200 / 5000: loss 24.377258
iteration 300 / 5000: loss 24.388571
iteration 400 / 5000: loss 24.398874
iteration 500 / 5000: loss 24.534252
iteration 600 / 5000: loss 24.823339
iteration 700 / 5000: loss 24.053567
iteration 800 / 5000: loss 24.753346
iteration 900 / 5000: loss 24.423265
iteration 1000 / 5000: loss 23.566925
iteration 1100 / 5000: loss 24.038413
iteration 1200 / 5000: loss 25.757328
iteration 1300 / 5000: loss 24.678687
iteration 1400 / 5000: loss 25.350486
iteration 1500 / 5000: loss 25.549011
iteration 1600 / 5000: loss 24.183515
iteration 1700 / 5000: loss 25.053390
iteration 1800 / 5000: loss 24.794510
iteration 1900 / 5000: loss 23.070711
iteration 2000 / 5000: loss 24.594223
iteration 2100 / 5000: loss 25.142421
iteration 2200 / 5000: loss 25.248665
iteration 2300 / 5000: loss 24.509283
iteration 2400 / 5000: loss 24.362650
iteration 2500 / 5000: loss 25.239822
iteration 2600 / 5000: loss 24.603124
iteration 2700 / 5000: loss 25.684018
iteration 2800 / 5000: loss 24.347165
iteration 2900 / 5000: loss 25.650179
iteration 3000 / 5000: loss 26.215852
iteration 3100 / 5000: loss 22.526019
iteration 3200 / 5000: loss 25.312426
iteration 3300 / 5000: loss 25.462469
iteration 3400 / 5000: loss 21.483773
iteration 3500 / 5000: loss 22.969683
iteration 3600 / 5000: loss 22.209063
iteration 3700 / 5000: loss 24.852183
iteration 3800 / 5000: loss 25.014160
iteration 3900 / 5000: loss 24.115704
iteration 4000 / 5000: loss 24.365296
iteration 4100 / 5000: loss 23.927835
iteration 4200 / 5000: loss 25.679442
iteration 4300 / 5000: loss 24.197283
iteration 4400 / 5000: loss 24.001174
iteration 4500 / 5000: loss 25.296215
iteration 4600 / 5000: loss 26.043940
iteration 4700 / 5000: loss 26.031542
iteration 4800 / 5000: loss 24.115890
iteration 4900 / 5000: loss 23.381175
lr 9.000000e-02 reg 9.000000e+00 train_accuracy: 0.091878 val_accuracy: 0.094000
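
The summary lines above come from a validation sweep over learning rates and regularization strengths, each setting trained for 5000 iterations on the feature vectors. As a rough guide to how such a sweep is structured, below is a minimal NumPy sketch that trains a multiclass SVM with mini-batch SGD for each (lr, reg) pair and records train/validation accuracy. It reuses the X_train_feats, y_train, X_val_feats, y_val arrays from the cells above; the hand-rolled SGD loop is only an illustrative stand-in for whatever linear classifier the notebook actually calls here, and the grid values are simply those visible in the log.

import numpy as np

# Grid of hyperparameters matching the settings printed in the log above.
learning_rates = [1e-3, 3e-3, 9e-3, 1e-2, 3e-2, 9e-2, 1e-1]
regularization_strengths = [1e-1, 3e-1, 9e-1, 1e0, 3e0, 9e0]

def train_svm(X, y, lr, reg, num_iters=5000, batch_size=200, num_classes=10):
    """Illustrative mini-batch SGD on the multiclass hinge loss; returns W."""
    num_train, dim = X.shape
    W = 0.001 * np.random.randn(dim, num_classes)
    for it in range(num_iters):
        idx = np.random.choice(num_train, batch_size)
        Xb, yb = X[idx], y[idx]
        scores = Xb.dot(W)
        correct = scores[np.arange(batch_size), yb][:, None]
        margins = np.maximum(0, scores - correct + 1)
        margins[np.arange(batch_size), yb] = 0
        loss = margins.sum() / batch_size + 0.5 * reg * np.sum(W * W)
        # Gradient of the hinge loss plus L2 regularization.
        binary = (margins > 0).astype(float)
        binary[np.arange(batch_size), yb] = -binary.sum(axis=1)
        dW = Xb.T.dot(binary) / batch_size + reg * W
        W -= lr * dW
        if it % 100 == 0:
            print('iteration %d / %d: loss %f' % (it, num_iters, loss))
    return W

results = {}
best_val, best_W = -1.0, None
for lr in learning_rates:
    for reg in regularization_strengths:
        # Assumes the feature matrices defined earlier in the notebook.
        W = train_svm(X_train_feats, y_train, lr, reg)
        train_acc = np.mean(np.argmax(X_train_feats.dot(W), axis=1) == y_train)
        val_acc = np.mean(np.argmax(X_val_feats.dot(W), axis=1) == y_val)
        results[(lr, reg)] = (train_acc, val_acc)
        print('lr %e reg %e train_accuracy: %f val_accuracy: %f' % (lr, reg, train_acc, val_acc))
        if val_acc > best_val:
            best_val, best_W = val_acc, W

Among the settings printed so far, the best validation accuracy (about 0.50) comes from the smallest regularization strengths with learning rates around 3e-3 to 3e-2, while the most aggressive setting (lr 9e-2 with reg 9.0) diverges to chance-level accuracy (~0.09).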

iteration 0 / 5000: loss 8.998553
iteration 100 / 5000: loss 3.650637
iteration 200 / 5000: loss 3.713241
iteration 300 / 5000: loss 3.771037
iteration 400 / 5000: loss 3.666281
iteration 500 / 5000: loss 3.838556
iteration 600 / 5000: loss 3.800231
iteration 700 / 5000: loss 3.943909
iteration 800 / 5000: loss 3.350277
iteration 900 / 5000: loss 3.612993
iteration 1000 / 5000: loss 3.945169
iteration 1100 / 5000: loss 3.522750
iteration 1200 / 5000: loss 3.794770
iteration 1300 / 5000: loss 3.756455
iteration 1400 / 5000: loss 3.654074
iteration 1500 / 5000: loss 3.715650
iteration 1600 / 5000: loss 3.470696
iteration 1700 / 5000: loss 3.885236
iteration 1800 / 5000: loss 3.499870
iteration 1900 / 5000: loss 3.525467
iteration 2000 / 5000: loss 3.869140
iteration 2100 / 5000: loss 3.640016
iteration 2200 / 5000: loss 4.063625
iteration 2300 / 5000: loss 3.891080
iteration 2400 / 5000: loss 3.820253
iteration 2500 / 5000: loss 3.887443
iteration 2600 / 5000: loss 3.909972
iteration 2700 / 5000: loss 3.547064
iteration 2800 / 5000: loss 3.981085
iteration 2900 / 5000: loss 3.614826
iteration 3000 / 5000: loss 3.827621
iteration 3100 / 5000: loss 3.914928
iteration 3200 / 5000: loss 3.417502
iteration 3300 / 5000: loss 3.774295
iteration 3400 / 5000: loss 3.925289
iteration 3500 / 5000: loss 3.006028
iteration 3600 / 5000: loss 3.803166
iteration 3700 / 5000: loss 3.915741
iteration 3800 / 5000: loss 3.552021
iteration 3900 / 5000: loss 3.770847
iteration 4000 / 5000: loss 3.778116
iteration 4100 / 5000: loss 3.903122
iteration 4200 / 5000: loss 3.490398
iteration 4300 / 5000: loss 3.529165
iteration 4400 / 5000: loss 3.571023
iteration 4500 / 5000: loss 3.667356
iteration 4600 / 5000: loss 3.643565
iteration 4700 / 5000: loss 3.693731
iteration 4800 / 5000: loss 4.142742
iteration 4900 / 5000: loss 3.839845
lr 1.000000e-01 reg 1.000000e-01 train_accuracy: 0.474857 val_accuracy: 0.449000
iteration 0 / 5000: loss 8.997548
iteration 100 / 5000: loss 4.299607
iteration 200 / 5000: loss 4.013786
iteration 300 / 5000: loss 4.464734
iteration 400 / 5000: loss 4.291689
iteration 500 / 5000: loss 4.221840
iteration 600 / 5000: loss 4.636206
iteration 700 / 5000: loss 4.422419
iteration 800 / 5000: loss 4.344226
iteration 900 / 5000: loss 4.287278
iteration 1000 / 5000: loss 4.338910
iteration 1100 / 5000: loss 4.075113
iteration 1200 / 5000: loss 4.155915
iteration 1300 / 5000: loss 4.378754
iteration 1400 / 5000: loss 3.907189
iteration 1500 / 5000: loss 4.320622
iteration 1600 / 5000: loss 3.934694
iteration 1700 / 5000: loss 4.430156
iteration 1800 / 5000: loss 4.272437
iteration 1900 / 5000: loss 4.290158
iteration 2000 / 5000: loss 4.056436
iteration 2100 / 5000: loss 4.499981
iteration 2200 / 5000: loss 4.191527
iteration 2300 / 5000: loss 4.307839
iteration 2400 / 5000: loss 4.438489
iteration 2500 / 5000: loss 4.511381
iteration 2600 / 5000: loss 4.368979
iteration 2700 / 5000: loss 4.016594
iteration 2800 / 5000: loss 4.571133
iteration 2900 / 5000: loss 4.284455
iteration 3000 / 5000: loss 4.090643
iteration 3100 / 5000: loss 3.952441
iteration 3200 / 5000: loss 4.438389
iteration 3300 / 5000: loss 4.583410
iteration 3400 / 5000: loss 4.173855
iteration 3500 / 5000: loss 4.419597
iteration 3600 / 5000: loss 4.190132
iteration 3700 / 5000: loss 4.317217
iteration 3800 / 5000: loss 4.031577
iteration 3900 / 5000: loss 4.543652
iteration 4000 / 5000: loss 4.317362
iteration 4100 / 5000: loss 4.536889
iteration 4200 / 5000: loss 3.986102
iteration 4300 / 5000: loss 4.277683
iteration 4400 / 5000: loss 4.552007
iteration 4500 / 5000: loss 4.382341
iteration 4600 / 5000: loss 4.641451
iteration 4700 / 5000: loss 3.909156
iteration 4800 / 5000: loss 4.116632
iteration 4900 / 5000: loss 4.531365
lr 1.000000e-01 reg 3.000000e-01 train_accuracy: 0.456694 val_accuracy: 0.455000
iteration 0 / 5000: loss 8.996154
iteration 100 / 5000: loss 5.221267
iteration 200 / 5000: loss 5.697152
iteration 300 / 5000: loss 5.181737
iteration 400 / 5000: loss 5.512568
iteration 500 / 5000: loss 5.659756
iteration 600 / 5000: loss 5.336472
iteration 700 / 5000: loss 4.929920
iteration 800 / 5000: loss 4.988014
iteration 900 / 5000: loss 5.072111
iteration 1000 / 5000: loss 5.440132
iteration 1100 / 5000: loss 4.977790
iteration 1200 / 5000: loss 4.955277
iteration 1300 / 5000: loss 5.612217
iteration 1400 / 5000: loss 4.933659
iteration 1500 / 5000: loss 5.576227
iteration 1600 / 5000: loss 5.397871
iteration 1700 / 5000: loss 5.417223
iteration 1800 / 5000: loss 5.476000
iteration 1900 / 5000: loss 5.498919
iteration 2000 / 5000: loss 5.261152
iteration 2100 / 5000: loss 5.719680
iteration 2200 / 5000: loss 5.338520
iteration 2300 / 5000: loss 5.524358
iteration 2400 / 5000: loss 5.456832
iteration 2500 / 5000: loss 5.502613
iteration 2600 / 5000: loss 5.418102
iteration 2700 / 5000: loss 5.468528
iteration 2800 / 5000: loss 5.314881
iteration 2900 / 5000: loss 5.227134
iteration 3000 / 5000: loss 5.429487
iteration 3100 / 5000: loss 5.490712
iteration 3200 / 5000: loss 5.018560
iteration 3300 / 5000: loss 5.547542
iteration 3400 / 5000: loss 5.082282
iteration 3500 / 5000: loss 5.217386
iteration 3600 / 5000: loss 5.439582
iteration 3700 / 5000: loss 5.778960
iteration 3800 / 5000: loss 5.139342
iteration 3900 / 5000: loss 5.549178
iteration 4000 / 5000: loss 5.151996
iteration 4100 / 5000: loss 5.674570
iteration 4200 / 5000: loss 5.353894
iteration 4300 / 5000: loss 5.253734
iteration 4400 / 5000: loss 5.366656
iteration 4500 / 5000: loss 5.484266
iteration 4600 / 5000: loss 5.101137
iteration 4700 / 5000: loss 5.311460
iteration 4800 / 5000: loss 5.448708
iteration 4900 / 5000: loss 5.395234
lr 1.000000e-01 reg 9.000000e-01 train_accuracy: 0.417469 val_accuracy: 0.411000
iteration 0 / 5000: loss 8.987770
iteration 100 / 5000: loss 5.109963
iteration 200 / 5000: loss 5.034596
iteration 300 / 5000: loss 5.240879
iteration 400 / 5000: loss 5.602025
iteration 500 / 5000: loss 5.800564
iteration 600 / 5000: loss 5.299615
iteration 700 / 5000: loss 5.350501
iteration 800 / 5000: loss 5.544149
iteration 900 / 5000: loss 5.823963
iteration 1000 / 5000: loss 5.552456
iteration 1100 / 5000: loss 5.315252
iteration 1200 / 5000: loss 5.233748
iteration 1300 / 5000: loss 5.718673
iteration 1400 / 5000: loss 5.624136
iteration 1500 / 5000: loss 5.387983
iteration 1600 / 5000: loss 5.486802
iteration 1700 / 5000: loss 5.204223
iteration 1800 / 5000: loss 5.570201
iteration 1900 / 5000: loss 5.410219
iteration 2000 / 5000: loss 5.703974
iteration 2100 / 5000: loss 6.002492
iteration 2200 / 5000: loss 5.425233
iteration 2300 / 5000: loss 5.052488
iteration 2400 / 5000: loss 5.382206
iteration 2500 / 5000: loss 5.774868
iteration 2600 / 5000: loss 5.542823
iteration 2700 / 5000: loss 5.262646
iteration 2800 / 5000: loss 5.389329
iteration 2900 / 5000: loss 5.592898
iteration 3000 / 5000: loss 5.267297
iteration 3100 / 5000: loss 5.448903
iteration 3200 / 5000: loss 5.462782
iteration 3300 / 5000: loss 4.927355
iteration 3400 / 5000: loss 5.360934
iteration 3500 / 5000: loss 5.492179
iteration 3600 / 5000: loss 5.663761
iteration 3700 / 5000: loss 5.279923
iteration 3800 / 5000: loss 5.294444
iteration 3900 / 5000: loss 5.089125
iteration 4000 / 5000: loss 5.380644
iteration 4100 / 5000: loss 5.686325
iteration 4200 / 5000: loss 5.497468
iteration 4300 / 5000: loss 5.413092
iteration 4400 / 5000: loss 5.131454
iteration 4500 / 5000: loss 5.512316
iteration 4600 / 5000: loss 5.467676
iteration 4700 / 5000: loss 5.821723
iteration 4800 / 5000: loss 5.417067
iteration 4900 / 5000: loss 5.497598
lr 1.000000e-01 reg 1.000000e+00 train_accuracy: 0.429531 val_accuracy: 0.415000
iteration 0 / 5000: loss 9.009836
iteration 100 / 5000: loss 7.212230
iteration 200 / 5000: loss 6.857761
iteration 300 / 5000: loss 7.586573
iteration 400 / 5000: loss 7.654984
iteration 500 / 5000: loss 7.451734
iteration 600 / 5000: loss 6.816671
iteration 700 / 5000: loss 7.011642
iteration 800 / 5000: loss 7.640332
iteration 900 / 5000: loss 6.985053
iteration 1000 / 5000: loss 7.645963
iteration 1100 / 5000: loss 7.297339
iteration 1200 / 5000: loss 7.296484
iteration 1300 / 5000: loss 6.758161
iteration 1400 / 5000: loss 6.987005
iteration 1500 / 5000: loss 7.297815
iteration 1600 / 5000: loss 7.155596
iteration 1700 / 5000: loss 7.115817
iteration 1800 / 5000: loss 7.177594
iteration 1900 / 5000: loss 7.788502
iteration 2000 / 5000: loss 7.138244
iteration 2100 / 5000: loss 7.113615
iteration 2200 / 5000: loss 7.270891
iteration 2300 / 5000: loss 7.453228
iteration 2400 / 5000: loss 7.509715
iteration 2500 / 5000: loss 7.169032
iteration 2600 / 5000: loss 7.476811
iteration 2700 / 5000: loss 7.628326
iteration 2800 / 5000: loss 7.000861
iteration 2900 / 5000: loss 7.291332
iteration 3000 / 5000: loss 7.161356
iteration 3100 / 5000: loss 7.055693
iteration 3200 / 5000: loss 7.380765
iteration 3300 / 5000: loss 6.806375
iteration 3400 / 5000: loss 6.851770
iteration 3500 / 5000: loss 6.888458
iteration 3600 / 5000: loss 7.239255
iteration 3700 / 5000: loss 7.044938
iteration 3800 / 5000: loss 7.326763
iteration 3900 / 5000: loss 7.306694
iteration 4000 / 5000: loss 6.703245
iteration 4100 / 5000: loss 7.454048
iteration 4200 / 5000: loss 6.971675
iteration 4300 / 5000: loss 7.015172
iteration 4400 / 5000: loss 7.118105
iteration 4500 / 5000: loss 7.424615
iteration 4600 / 5000: loss 7.390719
iteration 4700 / 5000: loss 7.177814
iteration 4800 / 5000: loss 7.215651
iteration 4900 / 5000: loss 7.414133
lr 1.000000e-01 reg 3.000000e+00 train_accuracy: 0.365531 val_accuracy: 0.396000
iteration 0 / 5000: loss 9.005496
iteration 100 / 5000: loss 80.725717
iteration 200 / 5000: loss 84.311633
iteration 300 / 5000: loss 79.678773
iteration 400 / 5000: loss 81.391446
iteration 500 / 5000: loss 84.764294
iteration 600 / 5000: loss 81.615564
iteration 700 / 5000: loss 83.805180
iteration 800 / 5000: loss 81.536541
iteration 900 / 5000: loss 81.272990
iteration 1000 / 5000: loss 77.856871
iteration 1100 / 5000: loss 84.175164
iteration 1200 / 5000: loss 87.101556
iteration 1300 / 5000: loss 84.655054
iteration 1400 / 5000: loss 85.052486
iteration 1500 / 5000: loss 80.408211
iteration 1600 / 5000: loss 83.875265
iteration 1700 / 5000: loss 82.195259
iteration 1800 / 5000: loss 81.010879
iteration 1900 / 5000: loss 81.390676
iteration 2000 / 5000: loss 83.964641
iteration 2100 / 5000: loss 84.197254
iteration 2200 / 5000: loss 83.349643
iteration 2300 / 5000: loss 85.613949
iteration 2400 / 5000: loss 81.049359
iteration 2500 / 5000: loss 88.751547
iteration 2600 / 5000: loss 82.267331
iteration 2700 / 5000: loss 86.495366
iteration 2800 / 5000: loss 80.674229
iteration 2900 / 5000: loss 84.094855
iteration 3000 / 5000: loss 76.374805
iteration 3100 / 5000: loss 86.141792
iteration 3200 / 5000: loss 81.813209
iteration 3300 / 5000: loss 87.702463
iteration 3400 / 5000: loss 80.875060
iteration 3500 / 5000: loss 81.895536
iteration 3600 / 5000: loss 86.381931
iteration 3700 / 5000: loss 80.875432
iteration 3800 / 5000: loss 81.891109
iteration 3900 / 5000: loss 82.919028
iteration 4000 / 5000: loss 84.490905
iteration 4100 / 5000: loss 79.420247
iteration 4200 / 5000: loss 88.202518
iteration 4300 / 5000: loss 83.832601
iteration 4400 / 5000: loss 77.098358
iteration 4500 / 5000: loss 84.632501
iteration 4600 / 5000: loss 84.141174
iteration 4700 / 5000: loss 78.106879
iteration 4800 / 5000: loss 84.990832
iteration 4900 / 5000: loss 82.910051
lr 1.000000e-01 reg 9.000000e+00 train_accuracy: 0.022796 val_accuracy: 0.021000

lr 1.000000e-04 reg 1.000000e-01 train_accuracy: 0.482980 val_accuracy: 0.470000
lr 1.000000e-04 reg 3.000000e-01 train_accuracy: 0.480918 val_accuracy: 0.471000
lr 1.000000e-04 reg 9.000000e-01 train_accuracy: 0.476939 val_accuracy: 0.469000
lr 1.000000e-04 reg 1.000000e+00 train_accuracy: 0.476367 val_accuracy: 0.472000
lr 1.000000e-04 reg 3.000000e+00 train_accuracy: 0.463429 val_accuracy: 0.452000
lr 1.000000e-04 reg 9.000000e+00 train_accuracy: 0.432837 val_accuracy: 0.435000
lr 3.000000e-04 reg 1.000000e-01 train_accuracy: 0.500755 val_accuracy: 0.485000
lr 3.000000e-04 reg 3.000000e-01 train_accuracy: 0.495714 val_accuracy: 0.485000
lr 3.000000e-04 reg 9.000000e-01 train_accuracy: 0.485837 val_accuracy: 0.476000
lr 3.000000e-04 reg 1.000000e+00 train_accuracy: 0.483571 val_accuracy: 0.477000
lr 3.000000e-04 reg 3.000000e+00 train_accuracy: 0.463653 val_accuracy: 0.457000
lr 3.000000e-04 reg 9.000000e+00 train_accuracy: 0.433776 val_accuracy: 0.435000
lr 9.000000e-04 reg 1.000000e-01 train_accuracy: 0.507776 val_accuracy: 0.496000
lr 9.000000e-04 reg 3.000000e-01 train_accuracy: 0.499918 val_accuracy: 0.489000
lr 9.000000e-04 reg 9.000000e-01 train_accuracy: 0.485122 val_accuracy: 0.475000
lr 9.000000e-04 reg 1.000000e+00 train_accuracy: 0.484367 val_accuracy: 0.480000
lr 9.000000e-04 reg 3.000000e+00 train_accuracy: 0.464286 val_accuracy: 0.460000
lr 9.000000e-04 reg 9.000000e+00 train_accuracy: 0.433776 val_accuracy: 0.436000
lr 1.000000e-03 reg 1.000000e-01 train_accuracy: 0.508755 val_accuracy: 0.496000
lr 1.000000e-03 reg 3.000000e-01 train_accuracy: 0.500388 val_accuracy: 0.482000
lr 1.000000e-03 reg 9.000000e-01 train_accuracy: 0.486000 val_accuracy: 0.479000
lr 1.000000e-03 reg 1.000000e+00 train_accuracy: 0.482878 val_accuracy: 0.473000
lr 1.000000e-03 reg 3.000000e+00 train_accuracy: 0.462551 val_accuracy: 0.455000
lr 1.000000e-03 reg 9.000000e+00 train_accuracy: 0.433571 val_accuracy: 0.436000
lr 3.000000e-03 reg 1.000000e-01 train_accuracy: 0.509102 val_accuracy: 0.498000
lr 3.000000e-03 reg 3.000000e-01 train_accuracy: 0.499898 val_accuracy: 0.480000
lr 3.000000e-03 reg 9.000000e-01 train_accuracy: 0.486367 val_accuracy: 0.474000
lr 3.000000e-03 reg 1.000000e+00 train_accuracy: 0.481633 val_accuracy: 0.472000
lr 3.000000e-03 reg 3.000000e+00 train_accuracy: 0.460939 val_accuracy: 0.454000
lr 3.000000e-03 reg 9.000000e+00 train_accuracy: 0.437122 val_accuracy: 0.431000
lr 9.000000e-03 reg 1.000000e-01 train_accuracy: 0.505673 val_accuracy: 0.486000
lr 9.000000e-03 reg 3.000000e-01 train_accuracy: 0.497469 val_accuracy: 0.483000
lr 9.000000e-03 reg 9.000000e-01 train_accuracy: 0.482224 val_accuracy: 0.470000
lr 9.000000e-03 reg 1.000000e+00 train_accuracy: 0.481918 val_accuracy: 0.469000
lr 9.000000e-03 reg 3.000000e+00 train_accuracy: 0.458816 val_accuracy: 0.456000
lr 9.000000e-03 reg 9.000000e+00 train_accuracy: 0.422857 val_accuracy: 0.424000
lr 1.000000e-02 reg 1.000000e-01 train_accuracy: 0.504531 val_accuracy: 0.492000
lr 1.000000e-02 reg 3.000000e-01 train_accuracy: 0.494694 val_accuracy: 0.488000
lr 1.000000e-02 reg 9.000000e-01 train_accuracy: 0.481122 val_accuracy: 0.486000
lr 1.000000e-02 reg 1.000000e+00 train_accuracy: 0.479959 val_accuracy: 0.469000
lr 1.000000e-02 reg 3.000000e+00 train_accuracy: 0.461082 val_accuracy: 0.450000
lr 1.000000e-02 reg 9.000000e+00 train_accuracy: 0.410143 val_accuracy: 0.400000
lr 3.000000e-02 reg 1.000000e-01 train_accuracy: 0.496163 val_accuracy: 0.496000
lr 3.000000e-02 reg 3.000000e-01 train_accuracy: 0.490959 val_accuracy: 0.484000
lr 3.000000e-02 reg 9.000000e-01 train_accuracy: 0.470959 val_accuracy: 0.465000
lr 3.000000e-02 reg 1.000000e+00 train_accuracy: 0.465816 val_accuracy: 0.472000
lr 3.000000e-02 reg 3.000000e+00 train_accuracy: 0.439939 val_accuracy: 0.451000
lr 3.000000e-02 reg 9.000000e+00 train_accuracy: 0.393918 val_accuracy: 0.391000
lr 9.000000e-02 reg 1.000000e-01 train_accuracy: 0.476755 val_accuracy: 0.490000
lr 9.000000e-02 reg 3.000000e-01 train_accuracy: 0.465673 val_accuracy: 0.451000
lr 9.000000e-02 reg 9.000000e-01 train_accuracy: 0.432796 val_accuracy: 0.427000
lr 9.000000e-02 reg 1.000000e+00 train_accuracy: 0.434878 val_accuracy: 0.423000
lr 9.000000e-02 reg 3.000000e+00 train_accuracy: 0.361551 val_accuracy: 0.337000
lr 9.000000e-02 reg 9.000000e+00 train_accuracy: 0.091878 val_accuracy: 0.094000
lr 1.000000e-01 reg 1.000000e-01 train_accuracy: 0.474857 val_accuracy: 0.449000
lr 1.000000e-01 reg 3.000000e-01 train_accuracy: 0.456694 val_accuracy: 0.455000
lr 1.000000e-01 reg 9.000000e-01 train_accuracy: 0.417469 val_accuracy: 0.411000
lr 1.000000e-01 reg 1.000000e+00 train_accuracy: 0.429531 val_accuracy: 0.415000
lr 1.000000e-01 reg 3.000000e+00 train_accuracy: 0.365531 val_accuracy: 0.396000
lr 1.000000e-01 reg 9.000000e+00 train_accuracy: 0.022796 val_accuracy: 0.021000
best validation accuracy achieved during cross-validation: 0.498000

In [5]:
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train_accuracy: %f val_accuracy: %f' % (lr, reg, train_accuracy, val_accuracy))


lr 1.000000e-04 reg 1.000000e-01 train_accuracy: 0.482980 val_accuracy: 0.470000
lr 1.000000e-04 reg 3.000000e-01 train_accuracy: 0.480918 val_accuracy: 0.471000
lr 1.000000e-04 reg 9.000000e-01 train_accuracy: 0.476939 val_accuracy: 0.469000
lr 1.000000e-04 reg 1.000000e+00 train_accuracy: 0.476367 val_accuracy: 0.472000
lr 1.000000e-04 reg 3.000000e+00 train_accuracy: 0.463429 val_accuracy: 0.452000
lr 1.000000e-04 reg 9.000000e+00 train_accuracy: 0.432837 val_accuracy: 0.435000
lr 3.000000e-04 reg 1.000000e-01 train_accuracy: 0.500755 val_accuracy: 0.485000
lr 3.000000e-04 reg 3.000000e-01 train_accuracy: 0.495714 val_accuracy: 0.485000
lr 3.000000e-04 reg 9.000000e-01 train_accuracy: 0.485837 val_accuracy: 0.476000
lr 3.000000e-04 reg 1.000000e+00 train_accuracy: 0.483571 val_accuracy: 0.477000
lr 3.000000e-04 reg 3.000000e+00 train_accuracy: 0.463653 val_accuracy: 0.457000
lr 3.000000e-04 reg 9.000000e+00 train_accuracy: 0.433776 val_accuracy: 0.435000
lr 9.000000e-04 reg 1.000000e-01 train_accuracy: 0.507776 val_accuracy: 0.496000
lr 9.000000e-04 reg 3.000000e-01 train_accuracy: 0.499918 val_accuracy: 0.489000
lr 9.000000e-04 reg 9.000000e-01 train_accuracy: 0.485122 val_accuracy: 0.475000
lr 9.000000e-04 reg 1.000000e+00 train_accuracy: 0.484367 val_accuracy: 0.480000
lr 9.000000e-04 reg 3.000000e+00 train_accuracy: 0.464286 val_accuracy: 0.460000
lr 9.000000e-04 reg 9.000000e+00 train_accuracy: 0.433776 val_accuracy: 0.436000
lr 1.000000e-03 reg 1.000000e-01 train_accuracy: 0.508755 val_accuracy: 0.496000
lr 1.000000e-03 reg 3.000000e-01 train_accuracy: 0.500388 val_accuracy: 0.482000
lr 1.000000e-03 reg 9.000000e-01 train_accuracy: 0.486000 val_accuracy: 0.479000
lr 1.000000e-03 reg 1.000000e+00 train_accuracy: 0.482878 val_accuracy: 0.473000
lr 1.000000e-03 reg 3.000000e+00 train_accuracy: 0.462551 val_accuracy: 0.455000
lr 1.000000e-03 reg 9.000000e+00 train_accuracy: 0.433571 val_accuracy: 0.436000
lr 3.000000e-03 reg 1.000000e-01 train_accuracy: 0.509102 val_accuracy: 0.498000
lr 3.000000e-03 reg 3.000000e-01 train_accuracy: 0.499898 val_accuracy: 0.480000
lr 3.000000e-03 reg 9.000000e-01 train_accuracy: 0.486367 val_accuracy: 0.474000
lr 3.000000e-03 reg 1.000000e+00 train_accuracy: 0.481633 val_accuracy: 0.472000
lr 3.000000e-03 reg 3.000000e+00 train_accuracy: 0.460939 val_accuracy: 0.454000
lr 3.000000e-03 reg 9.000000e+00 train_accuracy: 0.437122 val_accuracy: 0.431000
lr 9.000000e-03 reg 1.000000e-01 train_accuracy: 0.505673 val_accuracy: 0.486000
lr 9.000000e-03 reg 3.000000e-01 train_accuracy: 0.497469 val_accuracy: 0.483000
lr 9.000000e-03 reg 9.000000e-01 train_accuracy: 0.482224 val_accuracy: 0.470000
lr 9.000000e-03 reg 1.000000e+00 train_accuracy: 0.481918 val_accuracy: 0.469000
lr 9.000000e-03 reg 3.000000e+00 train_accuracy: 0.458816 val_accuracy: 0.456000
lr 9.000000e-03 reg 9.000000e+00 train_accuracy: 0.422857 val_accuracy: 0.424000
lr 1.000000e-02 reg 1.000000e-01 train_accuracy: 0.504531 val_accuracy: 0.492000
lr 1.000000e-02 reg 3.000000e-01 train_accuracy: 0.494694 val_accuracy: 0.488000
lr 1.000000e-02 reg 9.000000e-01 train_accuracy: 0.481122 val_accuracy: 0.486000
lr 1.000000e-02 reg 1.000000e+00 train_accuracy: 0.479959 val_accuracy: 0.469000
lr 1.000000e-02 reg 3.000000e+00 train_accuracy: 0.461082 val_accuracy: 0.450000
lr 1.000000e-02 reg 9.000000e+00 train_accuracy: 0.410143 val_accuracy: 0.400000
lr 3.000000e-02 reg 1.000000e-01 train_accuracy: 0.496163 val_accuracy: 0.496000
lr 3.000000e-02 reg 3.000000e-01 train_accuracy: 0.490959 val_accuracy: 0.484000
lr 3.000000e-02 reg 9.000000e-01 train_accuracy: 0.470959 val_accuracy: 0.465000
lr 3.000000e-02 reg 1.000000e+00 train_accuracy: 0.465816 val_accuracy: 0.472000
lr 3.000000e-02 reg 3.000000e+00 train_accuracy: 0.439939 val_accuracy: 0.451000
lr 3.000000e-02 reg 9.000000e+00 train_accuracy: 0.393918 val_accuracy: 0.391000
lr 9.000000e-02 reg 1.000000e-01 train_accuracy: 0.476755 val_accuracy: 0.490000
lr 9.000000e-02 reg 3.000000e-01 train_accuracy: 0.465673 val_accuracy: 0.451000
lr 9.000000e-02 reg 9.000000e-01 train_accuracy: 0.432796 val_accuracy: 0.427000
lr 9.000000e-02 reg 1.000000e+00 train_accuracy: 0.434878 val_accuracy: 0.423000
lr 9.000000e-02 reg 3.000000e+00 train_accuracy: 0.361551 val_accuracy: 0.337000
lr 9.000000e-02 reg 9.000000e+00 train_accuracy: 0.091878 val_accuracy: 0.094000
lr 1.000000e-01 reg 1.000000e-01 train_accuracy: 0.474857 val_accuracy: 0.449000
lr 1.000000e-01 reg 3.000000e-01 train_accuracy: 0.456694 val_accuracy: 0.455000
lr 1.000000e-01 reg 9.000000e-01 train_accuracy: 0.417469 val_accuracy: 0.411000
lr 1.000000e-01 reg 1.000000e+00 train_accuracy: 0.429531 val_accuracy: 0.415000
lr 1.000000e-01 reg 3.000000e+00 train_accuracy: 0.365531 val_accuracy: 0.396000
lr 1.000000e-01 reg 9.000000e+00 train_accuracy: 0.022796 val_accuracy: 0.021000
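
Before looking at test performance, note that the cell below relies on best_svm, the classifier kept during the sweep above because it reached the highest validation accuracy. A minimal sketch of how that hyperparameter pair could be recovered from the results dict printed by the previous cell (an added illustration, assuming only the results structure shown above):

# Sketch: pick the (lr, reg) pair with the highest validation accuracy.
# results[(lr, reg)] == (train_accuracy, val_accuracy), as printed above.
best_lr, best_reg = max(results, key=lambda k: results[k][1])
print('best hyperparameters: lr %e, reg %e (val_accuracy %f)'
      % (best_lr, best_reg, results[(best_lr, best_reg)][1]))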

In [6]:
# Evaluate the best SVM on the test set (the classifier was trained on image features)
y_test_pred = best_svm.predict(X_test_feats)
test_accuracy = np.mean(y_test == y_test_pred)
print('linear SVM on image features final test set accuracy: %f' % test_accuracy)


linear SVM on image features final test set accuracy: 0.489000

In [7]:
# An important way to gain intuition about how an algorithm works is to
# visualize the mistakes that it makes. In this visualization, we show examples
# of images that are misclassified by our current system. The first column
# shows images that our system labeled as "plane" but whose true label is
# something other than "plane".

examples_per_class = 8
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for cls, cls_name in enumerate(classes):
    idxs = np.where((y_test != cls) & (y_test_pred == cls))[0]
    idxs = np.random.choice(idxs, examples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt.subplot(examples_per_class, len(classes), i * len(classes) + cls + 1)
        plt.imshow(X_test[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls_name)
            
plt.show()


Inline question 1:

Describe the misclassification results that you see. Do they make sense?

Answer: They do not make obvious sense. Some mistakes are plausible given the features we use (images that share a dominant hue or coarse texture with the predicted class end up in the wrong column), but many others have no clear explanation, which is unsurprising since HOG plus a hue histogram discards most shape and spatial detail.
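
To make this answer less anecdotal, one could count where each predicted class's mistakes actually come from. A minimal sketch, reusing y_test, y_test_pred, and classes from the cell above (a hypothetical follow-up, not one of the notebook's own cells):

# Hypothetical follow-up: for each predicted class, tally the true labels of its mistakes.
for cls, cls_name in enumerate(classes):
    wrong_true = y_test[(y_test_pred == cls) & (y_test != cls)]
    counts = np.bincount(wrong_true, minlength=len(classes))
    top3 = np.argsort(counts)[::-1][:3]
    print('%s mistakes mostly come from: %s'
          % (cls_name, ', '.join(classes[t] for t in top3)))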

Neural Network on image features

Earlier in this assignment we saw that training a two-layer neural network on raw pixels achieved better classification performance than linear classifiers on raw pixels. In this notebook we have seen that linear classifiers on image features outperform linear classifiers on raw pixels.

For completeness, we should also try training a neural network on image features. This approach should outperform all previous approaches: you should easily be able to achieve over 55% classification accuracy on the test set; our best model achieves about 60% classification accuracy.
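
Once the sweep below has selected best_model, the test-set check mirrors the SVM evaluation above. A minimal sketch, assuming best_model is the TwoLayerNet kept at the highest validation accuracy:

# Sketch of the final evaluation step (run after the sweep below completes).
test_acc = (best_model.predict(X_test_feats) == y_test).mean()
print('two-layer net on image features test set accuracy: %f' % test_acc)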


In [8]:
print(X_train_feats.shape)


(49000, 155)

In [22]:
from classifiers.neural_net import TwoLayerNet

input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10

learning_rates = [3e-1, 9e-1, 1]
regularization_strengths = [3e-3, 4e-3, 5e-3, 6e-3, 7e-3, 8e-3, 9e-3, 1e-2]

results = {}

best_model = None
best_val = -1

# Grid search over learning rate and regularization strength; keep the model
# with the highest validation accuracy.
for lr in learning_rates:
    for reg in regularization_strengths:
        model = TwoLayerNet(input_dim, hidden_dim, num_classes, std=1e-1)
        stats = model.train(X_train_feats, y_train, X_val_feats, y_val,
                            learning_rate=lr, learning_rate_decay=0.95,
                            reg=reg, num_iters=5000, batch_size=200, verbose=True)

        train_acc = (model.predict(X_train_feats) == y_train).mean()
        val_acc = (model.predict(X_val_feats) == y_val).mean()
        print('lr: %e, reg: %e, train_acc: %f, val_acc: %f' % (lr, reg, train_acc, val_acc))

        results[(lr, reg)] = (train_acc, val_acc)
        if val_acc > best_val:
            best_val = val_acc
            best_model = model
print()

print('best val_acc: %f' % (best_val))

# Print the full grid of results, grouped by learning rate.
old_lr = -1
for lr, reg in sorted(results):
    if old_lr != lr:
        old_lr = lr
        print()

    train_acc, val_acc = results[(lr, reg)]
    print('lr: %e, reg: %e, train_acc: %f, val_acc: %f' % (lr, reg, train_acc, val_acc))


iteration 1 / 5000: loss 5.976893
iteration 101 / 5000: loss 3.176106
iteration 201 / 5000: loss 2.409488
iteration 301 / 5000: loss 2.191569
iteration 401 / 5000: loss 1.842398
iteration 501 / 5000: loss 1.844388
iteration 601 / 5000: loss 1.563744
iteration 701 / 5000: loss 1.605641
iteration 801 / 5000: loss 1.553624
iteration 901 / 5000: loss 1.399533
iteration 1001 / 5000: loss 1.343294
iteration 1101 / 5000: loss 1.312609
iteration 1201 / 5000: loss 1.338947
iteration 1301 / 5000: loss 1.336176
iteration 1401 / 5000: loss 1.384308
iteration 1501 / 5000: loss 1.358256
iteration 1601 / 5000: loss 1.393873
iteration 1701 / 5000: loss 1.275851
iteration 1801 / 5000: loss 1.271719
iteration 1901 / 5000: loss 1.446877
iteration 2001 / 5000: loss 1.340202
iteration 2101 / 5000: loss 1.263369
iteration 2201 / 5000: loss 1.347515
iteration 2301 / 5000: loss 1.301203
iteration 2401 / 5000: loss 1.300777
iteration 2501 / 5000: loss 1.158752
iteration 2601 / 5000: loss 1.310510
iteration 2701 / 5000: loss 1.364277
iteration 2801 / 5000: loss 1.453515
iteration 2901 / 5000: loss 1.287860
iteration 3001 / 5000: loss 1.183359
iteration 3101 / 5000: loss 1.295005
iteration 3201 / 5000: loss 1.327975
iteration 3301 / 5000: loss 1.260126
iteration 3401 / 5000: loss 1.327847
iteration 3501 / 5000: loss 1.339530
iteration 3601 / 5000: loss 1.269920
iteration 3701 / 5000: loss 1.312160
iteration 3801 / 5000: loss 1.373669
iteration 3901 / 5000: loss 1.335743
iteration 4001 / 5000: loss 1.296956
iteration 4101 / 5000: loss 1.327416
iteration 4201 / 5000: loss 1.219179
iteration 4301 / 5000: loss 1.217967
iteration 4401 / 5000: loss 1.365919
iteration 4501 / 5000: loss 1.381735
iteration 4601 / 5000: loss 1.279982
iteration 4701 / 5000: loss 1.360686
iteration 4801 / 5000: loss 1.365884
iteration 4901 / 5000: loss 1.324592
lr: 3.000000e-01, reg: 3.000000e-03, train_acc: 0.653959, val_acc: 0.608000
iteration 1 / 5000: loss 6.608914
iteration 101 / 5000: loss 3.486755
iteration 201 / 5000: loss 2.620449
iteration 301 / 5000: loss 2.180909
iteration 401 / 5000: loss 1.664800
iteration 501 / 5000: loss 1.831278
iteration 601 / 5000: loss 1.774015
iteration 701 / 5000: loss 1.511479
iteration 801 / 5000: loss 1.528965
iteration 901 / 5000: loss 1.477823
iteration 1001 / 5000: loss 1.430294
iteration 1101 / 5000: loss 1.468080
iteration 1201 / 5000: loss 1.409400
iteration 1301 / 5000: loss 1.331770
iteration 1401 / 5000: loss 1.314646
iteration 1501 / 5000: loss 1.514493
iteration 1601 / 5000: loss 1.503904
iteration 1701 / 5000: loss 1.438992
iteration 1801 / 5000: loss 1.408476
iteration 1901 / 5000: loss 1.314768
iteration 2001 / 5000: loss 1.409249
iteration 2101 / 5000: loss 1.350146
iteration 2201 / 5000: loss 1.327511
iteration 2301 / 5000: loss 1.377120
iteration 2401 / 5000: loss 1.324493
iteration 2501 / 5000: loss 1.414821
iteration 2601 / 5000: loss 1.350065
iteration 2701 / 5000: loss 1.365519
iteration 2801 / 5000: loss 1.434727
iteration 2901 / 5000: loss 1.374253
iteration 3001 / 5000: loss 1.371185
iteration 3101 / 5000: loss 1.313715
iteration 3201 / 5000: loss 1.429455
iteration 3301 / 5000: loss 1.323189
iteration 3401 / 5000: loss 1.307528
iteration 3501 / 5000: loss 1.385178
iteration 3601 / 5000: loss 1.427703
iteration 3701 / 5000: loss 1.263476
iteration 3801 / 5000: loss 1.375188
iteration 3901 / 5000: loss 1.386094
iteration 4001 / 5000: loss 1.356756
iteration 4101 / 5000: loss 1.386348
iteration 4201 / 5000: loss 1.353201
iteration 4301 / 5000: loss 1.335807
iteration 4401 / 5000: loss 1.246183
iteration 4501 / 5000: loss 1.355685
iteration 4601 / 5000: loss 1.288583
iteration 4701 / 5000: loss 1.327174
iteration 4801 / 5000: loss 1.328294
iteration 4901 / 5000: loss 1.361624
lr: 3.000000e-01, reg: 4.000000e-03, train_acc: 0.621939, val_acc: 0.578000
iteration 1 / 5000: loss 7.797288
iteration 101 / 5000: loss 3.624969
iteration 201 / 5000: loss 2.629429
iteration 301 / 5000: loss 2.056187
iteration 401 / 5000: loss 1.837083
iteration 501 / 5000: loss 1.716248
iteration 601 / 5000: loss 1.544325
iteration 701 / 5000: loss 1.508487
iteration 801 / 5000: loss 1.461666
iteration 901 / 5000: loss 1.464779
iteration 1001 / 5000: loss 1.549870
iteration 1101 / 5000: loss 1.326794
iteration 1201 / 5000: loss 1.409093
iteration 1301 / 5000: loss 1.480360
iteration 1401 / 5000: loss 1.431782
iteration 1501 / 5000: loss 1.471791
iteration 1601 / 5000: loss 1.416977
iteration 1701 / 5000: loss 1.537763
iteration 1801 / 5000: loss 1.392261
iteration 1901 / 5000: loss 1.476753
iteration 2001 / 5000: loss 1.538376
iteration 2101 / 5000: loss 1.371274
iteration 2201 / 5000: loss 1.407080
iteration 2301 / 5000: loss 1.431599
iteration 2401 / 5000: loss 1.384743
iteration 2501 / 5000: loss 1.373009
iteration 2601 / 5000: loss 1.436576
iteration 2701 / 5000: loss 1.340022
iteration 2801 / 5000: loss 1.315933
iteration 2901 / 5000: loss 1.365652
iteration 3001 / 5000: loss 1.390318
iteration 3101 / 5000: loss 1.389607
iteration 3201 / 5000: loss 1.398047
iteration 3301 / 5000: loss 1.398488
iteration 3401 / 5000: loss 1.355666
iteration 3501 / 5000: loss 1.370937
iteration 3601 / 5000: loss 1.369134
iteration 3701 / 5000: loss 1.403616
iteration 3801 / 5000: loss 1.420545
iteration 3901 / 5000: loss 1.445742
iteration 4001 / 5000: loss 1.569441
iteration 4101 / 5000: loss 1.376922
iteration 4201 / 5000: loss 1.463793
iteration 4301 / 5000: loss 1.528781
iteration 4401 / 5000: loss 1.437253
iteration 4501 / 5000: loss 1.495386
iteration 4601 / 5000: loss 1.490764
iteration 4701 / 5000: loss 1.379064
iteration 4801 / 5000: loss 1.328630
iteration 4901 / 5000: loss 1.454305
lr: 3.000000e-01, reg: 5.000000e-03, train_acc: 0.601184, val_acc: 0.588000
iteration 1 / 5000: loss 8.565681
iteration 101 / 5000: loss 3.716991
iteration 201 / 5000: loss 2.510470
iteration 301 / 5000: loss 2.037523
iteration 401 / 5000: loss 1.819094
iteration 501 / 5000: loss 1.592694
iteration 601 / 5000: loss 1.605759
iteration 701 / 5000: loss 1.524681
iteration 801 / 5000: loss 1.510712
iteration 901 / 5000: loss 1.474469
iteration 1001 / 5000: loss 1.394888
iteration 1101 / 5000: loss 1.480247
iteration 1201 / 5000: loss 1.524712
iteration 1301 / 5000: loss 1.421223
iteration 1401 / 5000: loss 1.491432
iteration 1501 / 5000: loss 1.447841
iteration 1601 / 5000: loss 1.515791
iteration 1701 / 5000: loss 1.498459
iteration 1801 / 5000: loss 1.405598
iteration 1901 / 5000: loss 1.443098
iteration 2001 / 5000: loss 1.486293
iteration 2101 / 5000: loss 1.381143
iteration 2201 / 5000: loss 1.405138
iteration 2301 / 5000: loss 1.416990
iteration 2401 / 5000: loss 1.543579
iteration 2501 / 5000: loss 1.442629
iteration 2601 / 5000: loss 1.381157
iteration 2701 / 5000: loss 1.320718
iteration 2801 / 5000: loss 1.412815
iteration 2901 / 5000: loss 1.447465
iteration 3001 / 5000: loss 1.408722
iteration 3101 / 5000: loss 1.473474
iteration 3201 / 5000: loss 1.595401
iteration 3301 / 5000: loss 1.407081
iteration 3401 / 5000: loss 1.410752
iteration 3501 / 5000: loss 1.440733
iteration 3601 / 5000: loss 1.500512
iteration 3701 / 5000: loss 1.396875
iteration 3801 / 5000: loss 1.470133
iteration 3901 / 5000: loss 1.362828
iteration 4001 / 5000: loss 1.351434
iteration 4101 / 5000: loss 1.489164
iteration 4201 / 5000: loss 1.344203
iteration 4301 / 5000: loss 1.465472
iteration 4401 / 5000: loss 1.533532
iteration 4501 / 5000: loss 1.430678
iteration 4601 / 5000: loss 1.346759
iteration 4701 / 5000: loss 1.435949
iteration 4801 / 5000: loss 1.395991
iteration 4901 / 5000: loss 1.487862
lr: 3.000000e-01, reg: 6.000000e-03, train_acc: 0.580816, val_acc: 0.559000
iteration 1 / 5000: loss 9.520098
iteration 101 / 5000: loss 4.017190
iteration 201 / 5000: loss 2.503164
iteration 301 / 5000: loss 2.051990
iteration 401 / 5000: loss 1.847102
iteration 501 / 5000: loss 1.653212
iteration 601 / 5000: loss 1.716098
iteration 701 / 5000: loss 1.439105
iteration 801 / 5000: loss 1.454261
iteration 901 / 5000: loss 1.503200
iteration 1001 / 5000: loss 1.463070
iteration 1101 / 5000: loss 1.545180
iteration 1201 / 5000: loss 1.545501
iteration 1301 / 5000: loss 1.495125
iteration 1401 / 5000: loss 1.472176
iteration 1501 / 5000: loss 1.475723
iteration 1601 / 5000: loss 1.381536
iteration 1701 / 5000: loss 1.501003
iteration 1801 / 5000: loss 1.431175
iteration 1901 / 5000: loss 1.547349
iteration 2001 / 5000: loss 1.485895
iteration 2101 / 5000: loss 1.508487
iteration 2201 / 5000: loss 1.440660
iteration 2301 / 5000: loss 1.434699
iteration 2401 / 5000: loss 1.415479
iteration 2501 / 5000: loss 1.397006
iteration 2601 / 5000: loss 1.418416
iteration 2701 / 5000: loss 1.420885
iteration 2801 / 5000: loss 1.523226
iteration 2901 / 5000: loss 1.601218
iteration 3001 / 5000: loss 1.485383
iteration 3101 / 5000: loss 1.443116
iteration 3201 / 5000: loss 1.513479
iteration 3301 / 5000: loss 1.533376
iteration 3401 / 5000: loss 1.479350
iteration 3501 / 5000: loss 1.489510
iteration 3601 / 5000: loss 1.361460
iteration 3701 / 5000: loss 1.501677
iteration 3801 / 5000: loss 1.471041
iteration 3901 / 5000: loss 1.554511
iteration 4001 / 5000: loss 1.444689
iteration 4101 / 5000: loss 1.559395
iteration 4201 / 5000: loss 1.510311
iteration 4301 / 5000: loss 1.631092
iteration 4401 / 5000: loss 1.498377
iteration 4501 / 5000: loss 1.514839
iteration 4601 / 5000: loss 1.532532
iteration 4701 / 5000: loss 1.560516
iteration 4801 / 5000: loss 1.516291
iteration 4901 / 5000: loss 1.520946
lr: 3.000000e-01, reg: 7.000000e-03, train_acc: 0.569612, val_acc: 0.552000
iteration 1 / 5000: loss 10.453917
iteration 101 / 5000: loss 3.940095
iteration 201 / 5000: loss 2.551183
iteration 301 / 5000: loss 1.860946
iteration 401 / 5000: loss 1.629604
iteration 501 / 5000: loss 1.594513
iteration 601 / 5000: loss 1.540737
iteration 701 / 5000: loss 1.617719
iteration 801 / 5000: loss 1.411333
iteration 901 / 5000: loss 1.549083
iteration 1001 / 5000: loss 1.670860
iteration 1101 / 5000: loss 1.465450
iteration 1201 / 5000: loss 1.544621
iteration 1301 / 5000: loss 1.539724
iteration 1401 / 5000: loss 1.556697
iteration 1501 / 5000: loss 1.447601
iteration 1601 / 5000: loss 1.613463
iteration 1701 / 5000: loss 1.466887
iteration 1801 / 5000: loss 1.582700
iteration 1901 / 5000: loss 1.543431
iteration 2001 / 5000: loss 1.506277
iteration 2101 / 5000: loss 1.520057
iteration 2201 / 5000: loss 1.599215
iteration 2301 / 5000: loss 1.503740
iteration 2401 / 5000: loss 1.520180
iteration 2501 / 5000: loss 1.444261
iteration 2601 / 5000: loss 1.489411
iteration 2701 / 5000: loss 1.444586
iteration 2801 / 5000: loss 1.492453
iteration 2901 / 5000: loss 1.567049
iteration 3001 / 5000: loss 1.460246
iteration 3101 / 5000: loss 1.601251
iteration 3201 / 5000: loss 1.579567
iteration 3301 / 5000: loss 1.476020
iteration 3401 / 5000: loss 1.502379
iteration 3501 / 5000: loss 1.552930
iteration 3601 / 5000: loss 1.542233
iteration 3701 / 5000: loss 1.576618
iteration 3801 / 5000: loss 1.534457
iteration 3901 / 5000: loss 1.430745
iteration 4001 / 5000: loss 1.490478
iteration 4101 / 5000: loss 1.456958
iteration 4201 / 5000: loss 1.418502
iteration 4301 / 5000: loss 1.485078
iteration 4401 / 5000: loss 1.583005
iteration 4501 / 5000: loss 1.538947
iteration 4601 / 5000: loss 1.533634
iteration 4701 / 5000: loss 1.480287
iteration 4801 / 5000: loss 1.400193
iteration 4901 / 5000: loss 1.457178
lr: 3.000000e-01, reg: 8.000000e-03, train_acc: 0.554163, val_acc: 0.543000
iteration 1 / 5000: loss 11.512526
iteration 101 / 5000: loss 3.993178
iteration 201 / 5000: loss 2.531911
iteration 301 / 5000: loss 1.966231
iteration 401 / 5000: loss 1.702296
iteration 501 / 5000: loss 1.612955
iteration 601 / 5000: loss 1.628916
iteration 701 / 5000: loss 1.547729
iteration 801 / 5000: loss 1.514786
iteration 901 / 5000: loss 1.580476
iteration 1001 / 5000: loss 1.575579
iteration 1101 / 5000: loss 1.599735
iteration 1201 / 5000: loss 1.533524
iteration 1301 / 5000: loss 1.539628
iteration 1401 / 5000: loss 1.450032
iteration 1501 / 5000: loss 1.557624
iteration 1601 / 5000: loss 1.614710
iteration 1701 / 5000: loss 1.526902
iteration 1801 / 5000: loss 1.481174
iteration 1901 / 5000: loss 1.464210
iteration 2001 / 5000: loss 1.561432
iteration 2101 / 5000: loss 1.475249
iteration 2201 / 5000: loss 1.456523
iteration 2301 / 5000: loss 1.554604
iteration 2401 / 5000: loss 1.516589
iteration 2501 / 5000: loss 1.482731
iteration 2601 / 5000: loss 1.445529
iteration 2701 / 5000: loss 1.469946
iteration 2801 / 5000: loss 1.535683
iteration 2901 / 5000: loss 1.540455
iteration 3001 / 5000: loss 1.609573
iteration 3101 / 5000: loss 1.527749
iteration 3201 / 5000: loss 1.608911
iteration 3301 / 5000: loss 1.574150
iteration 3401 / 5000: loss 1.461939
iteration 3501 / 5000: loss 1.518704
iteration 3601 / 5000: loss 1.543338
iteration 3701 / 5000: loss 1.620500
iteration 3801 / 5000: loss 1.512646
iteration 3901 / 5000: loss 1.477730
iteration 4001 / 5000: loss 1.551104
iteration 4101 / 5000: loss 1.496000
iteration 4201 / 5000: loss 1.485154
iteration 4301 / 5000: loss 1.494288
iteration 4401 / 5000: loss 1.481890
iteration 4501 / 5000: loss 1.518291
iteration 4601 / 5000: loss 1.502681
iteration 4701 / 5000: loss 1.581513
iteration 4801 / 5000: loss 1.468763
iteration 4901 / 5000: loss 1.597745
lr: 3.000000e-01, reg: 9.000000e-03, train_acc: 0.548959, val_acc: 0.531000
iteration 1 / 5000: loss 11.953357
iteration 101 / 5000: loss 3.973483
iteration 201 / 5000: loss 2.304825
iteration 301 / 5000: loss 1.864477
iteration 401 / 5000: loss 1.691630
iteration 501 / 5000: loss 1.547534
iteration 601 / 5000: loss 1.599025
iteration 701 / 5000: loss 1.678731
iteration 801 / 5000: loss 1.596196
iteration 901 / 5000: loss 1.432903
iteration 1001 / 5000: loss 1.465457
iteration 1101 / 5000: loss 1.553630
iteration 1201 / 5000: loss 1.643945
iteration 1301 / 5000: loss 1.503156
iteration 1401 / 5000: loss 1.565765
iteration 1501 / 5000: loss 1.527663
iteration 1601 / 5000: loss 1.469213
iteration 1701 / 5000: loss 1.526132
iteration 1801 / 5000: loss 1.565082
iteration 1901 / 5000: loss 1.515181
iteration 2001 / 5000: loss 1.546807
iteration 2101 / 5000: loss 1.510768
iteration 2201 / 5000: loss 1.621963
iteration 2301 / 5000: loss 1.470760
iteration 2401 / 5000: loss 1.456336
iteration 2501 / 5000: loss 1.509638
iteration 2601 / 5000: loss 1.516522
iteration 2701 / 5000: loss 1.679268
iteration 2801 / 5000: loss 1.481718
iteration 2901 / 5000: loss 1.452614
iteration 3001 / 5000: loss 1.520980
iteration 3101 / 5000: loss 1.642225
iteration 3201 / 5000: loss 1.602498
iteration 3301 / 5000: loss 1.642649
iteration 3401 / 5000: loss 1.511714
iteration 3501 / 5000: loss 1.445759
iteration 3601 / 5000: loss 1.517049
iteration 3701 / 5000: loss 1.563989
iteration 3801 / 5000: loss 1.485167
iteration 3901 / 5000: loss 1.408767
iteration 4001 / 5000: loss 1.531469
iteration 4101 / 5000: loss 1.601732
iteration 4201 / 5000: loss 1.544175
iteration 4301 / 5000: loss 1.545530
iteration 4401 / 5000: loss 1.498042
iteration 4501 / 5000: loss 1.574887
iteration 4601 / 5000: loss 1.555472
iteration 4701 / 5000: loss 1.475973
iteration 4801 / 5000: loss 1.604490
iteration 4901 / 5000: loss 1.468283
lr: 3.000000e-01, reg: 1.000000e-02, train_acc: 0.539224, val_acc: 0.530000
iteration 1 / 5000: loss 6.268766
iteration 101 / 5000: loss 2.252398
iteration 201 / 5000: loss 1.713535
iteration 301 / 5000: loss 1.486839
iteration 401 / 5000: loss 1.455689
iteration 501 / 5000: loss 1.492949
iteration 601 / 5000: loss 1.448427
iteration 701 / 5000: loss 1.309270
iteration 801 / 5000: loss 1.444991
iteration 901 / 5000: loss 1.484152
iteration 1001 / 5000: loss 1.403101
iteration 1101 / 5000: loss 1.377768
iteration 1201 / 5000: loss 1.391160
iteration 1301 / 5000: loss 1.453950
iteration 1401 / 5000: loss 1.470054
iteration 1501 / 5000: loss 1.272617
iteration 1601 / 5000: loss 1.363234
iteration 1701 / 5000: loss 1.328086
iteration 1801 / 5000: loss 1.458348
iteration 1901 / 5000: loss 1.499629
iteration 2001 / 5000: loss 1.435267
iteration 2101 / 5000: loss 1.354333
iteration 2201 / 5000: loss 1.423319
iteration 2301 / 5000: loss 1.355759
iteration 2401 / 5000: loss 1.491013
iteration 2501 / 5000: loss 1.393662
iteration 2601 / 5000: loss 1.263396
iteration 2701 / 5000: loss 1.392271
iteration 2801 / 5000: loss 1.363418
iteration 2901 / 5000: loss 1.347372
iteration 3001 / 5000: loss 1.336383
iteration 3101 / 5000: loss 1.442822
iteration 3201 / 5000: loss 1.248980
iteration 3301 / 5000: loss 1.432126
iteration 3401 / 5000: loss 1.241177
iteration 3501 / 5000: loss 1.238256
iteration 3601 / 5000: loss 1.338733
iteration 3701 / 5000: loss 1.403721
iteration 3801 / 5000: loss 1.378610
iteration 3901 / 5000: loss 1.289740
iteration 4001 / 5000: loss 1.361848
iteration 4101 / 5000: loss 1.366259
iteration 4201 / 5000: loss 1.377318
iteration 4301 / 5000: loss 1.304119
iteration 4401 / 5000: loss 1.282492
iteration 4501 / 5000: loss 1.373076
iteration 4601 / 5000: loss 1.229434
iteration 4701 / 5000: loss 1.358707
iteration 4801 / 5000: loss 1.245780
iteration 4901 / 5000: loss 1.308604
lr: 9.000000e-01, reg: 3.000000e-03, train_acc: 0.635122, val_acc: 0.587000
iteration 1 / 5000: loss 7.302421
iteration 101 / 5000: loss 2.216459
iteration 201 / 5000: loss 1.746704
iteration 301 / 5000: loss 1.583059
iteration 401 / 5000: loss 1.436541
iteration 501 / 5000: loss 1.419534
iteration 601 / 5000: loss 1.474017
iteration 701 / 5000: loss 1.485721
iteration 801 / 5000: loss 1.442849
iteration 901 / 5000: loss 1.475003
iteration 1001 / 5000: loss 1.508172
iteration 1101 / 5000: loss 1.392827
iteration 1201 / 5000: loss 1.433941
iteration 1301 / 5000: loss 1.502353
iteration 1401 / 5000: loss 1.508924
iteration 1501 / 5000: loss 1.349740
iteration 1601 / 5000: loss 1.461819
iteration 1701 / 5000: loss 1.366886
iteration 1801 / 5000: loss 1.426734
iteration 1901 / 5000: loss 1.381485
iteration 2001 / 5000: loss 1.583229
iteration 2101 / 5000: loss 1.333221
iteration 2201 / 5000: loss 1.375252
iteration 2301 / 5000: loss 1.329685
iteration 2401 / 5000: loss 1.558156
iteration 2501 / 5000: loss 1.467762
iteration 2601 / 5000: loss 1.442445
iteration 2701 / 5000: loss 1.440168
iteration 2801 / 5000: loss 1.421114
iteration 2901 / 5000: loss 1.450699
iteration 3001 / 5000: loss 1.378511
iteration 3101 / 5000: loss 1.277303
iteration 3201 / 5000: loss 1.529920
iteration 3301 / 5000: loss 1.447222
iteration 3401 / 5000: loss 1.513097
iteration 3501 / 5000: loss 1.424672
iteration 3601 / 5000: loss 1.436346
iteration 3701 / 5000: loss 1.577261
iteration 3801 / 5000: loss 1.347788
iteration 3901 / 5000: loss 1.357533
iteration 4001 / 5000: loss 1.298038
iteration 4101 / 5000: loss 1.377535
iteration 4201 / 5000: loss 1.483625
iteration 4301 / 5000: loss 1.403335
iteration 4401 / 5000: loss 1.361174
iteration 4501 / 5000: loss 1.260081
iteration 4601 / 5000: loss 1.475335
iteration 4701 / 5000: loss 1.357455
iteration 4801 / 5000: loss 1.353356
iteration 4901 / 5000: loss 1.407337
lr: 9.000000e-01, reg: 4.000000e-03, train_acc: 0.602755, val_acc: 0.566000
iteration 1 / 5000: loss 7.771049
iteration 101 / 5000: loss 2.178657
iteration 201 / 5000: loss 1.673375
iteration 301 / 5000: loss 1.646062
iteration 401 / 5000: loss 1.558514
iteration 501 / 5000: loss 1.526562
iteration 601 / 5000: loss 1.630743
iteration 701 / 5000: loss 1.518663
iteration 801 / 5000: loss 1.534133
iteration 901 / 5000: loss 1.441927
iteration 1001 / 5000: loss 1.466740
iteration 1101 / 5000: loss 1.430941
iteration 1201 / 5000: loss 1.433366
iteration 1301 / 5000: loss 1.485460
iteration 1401 / 5000: loss 1.460345
iteration 1501 / 5000: loss 1.553676
iteration 1601 / 5000: loss 1.549433
iteration 1701 / 5000: loss 1.457459
iteration 1801 / 5000: loss 1.568247
iteration 1901 / 5000: loss 1.469430
iteration 2001 / 5000: loss 1.475117
iteration 2101 / 5000: loss 1.520671
iteration 2201 / 5000: loss 1.387822
iteration 2301 / 5000: loss 1.322983
iteration 2401 / 5000: loss 1.506310
iteration 2501 / 5000: loss 1.409990
iteration 2601 / 5000: loss 1.510527
iteration 2701 / 5000: loss 1.469654
iteration 2801 / 5000: loss 1.404685
iteration 2901 / 5000: loss 1.464618
iteration 3001 / 5000: loss 1.504778
iteration 3101 / 5000: loss 1.453661
iteration 3201 / 5000: loss 1.533217
iteration 3301 / 5000: loss 1.523240
iteration 3401 / 5000: loss 1.437986
iteration 3501 / 5000: loss 1.516363
iteration 3601 / 5000: loss 1.440272
iteration 3701 / 5000: loss 1.349193
iteration 3801 / 5000: loss 1.410229
iteration 3901 / 5000: loss 1.513633
iteration 4001 / 5000: loss 1.508461
iteration 4101 / 5000: loss 1.505209
iteration 4201 / 5000: loss 1.351125
iteration 4301 / 5000: loss 1.440719
iteration 4401 / 5000: loss 1.444376
iteration 4501 / 5000: loss 1.487445
iteration 4601 / 5000: loss 1.412891
iteration 4701 / 5000: loss 1.489404
iteration 4801 / 5000: loss 1.452336
iteration 4901 / 5000: loss 1.455271
lr: 9.000000e-01, reg: 5.000000e-03, train_acc: 0.586265, val_acc: 0.570000
iteration 1 / 5000: loss 8.516044
iteration 101 / 5000: loss 1.989632
iteration 201 / 5000: loss 1.639260
iteration 301 / 5000: loss 1.581048
iteration 401 / 5000: loss 1.461112
iteration 501 / 5000: loss 1.494138
iteration 601 / 5000: loss 1.553428
iteration 701 / 5000: loss 1.581229
iteration 801 / 5000: loss 1.516893
iteration 901 / 5000: loss 1.531639
iteration 1001 / 5000: loss 1.595127
iteration 1101 / 5000: loss 1.514021
iteration 1201 / 5000: loss 1.538046
iteration 1301 / 5000: loss 1.459640
iteration 1401 / 5000: loss 1.624554
iteration 1501 / 5000: loss 1.460488
iteration 1601 / 5000: loss 1.524950
iteration 1701 / 5000: loss 1.479527
iteration 1801 / 5000: loss 1.473206
iteration 1901 / 5000: loss 1.480578
iteration 2001 / 5000: loss 1.519662
iteration 2101 / 5000: loss 1.532796
iteration 2201 / 5000: loss 1.432747
iteration 2301 / 5000: loss 1.444047
iteration 2401 / 5000: loss 1.600006
iteration 2501 / 5000: loss 1.537653
iteration 2601 / 5000: loss 1.495020
iteration 2701 / 5000: loss 1.430172
iteration 2801 / 5000: loss 1.637256
iteration 2901 / 5000: loss 1.557195
iteration 3001 / 5000: loss 1.561161
iteration 3101 / 5000: loss 1.461615
iteration 3201 / 5000: loss 1.532297
iteration 3301 / 5000: loss 1.498475
iteration 3401 / 5000: loss 1.585162
iteration 3501 / 5000: loss 1.554870
iteration 3601 / 5000: loss 1.468829
iteration 3701 / 5000: loss 1.680389
iteration 3801 / 5000: loss 1.614086
iteration 3901 / 5000: loss 1.490886
iteration 4001 / 5000: loss 1.451468
iteration 4101 / 5000: loss 1.518062
iteration 4201 / 5000: loss 1.633482
iteration 4301 / 5000: loss 1.419342
iteration 4401 / 5000: loss 1.412435
iteration 4501 / 5000: loss 1.291359
iteration 4601 / 5000: loss 1.535861
iteration 4701 / 5000: loss 1.494185
iteration 4801 / 5000: loss 1.451139
iteration 4901 / 5000: loss 1.471136
lr: 9.000000e-01, reg: 6.000000e-03, train_acc: 0.571020, val_acc: 0.551000
iteration 1 / 5000: loss 9.421499
iteration 101 / 5000: loss 2.166058
iteration 201 / 5000: loss 1.686078
iteration 301 / 5000: loss 1.493867
iteration 401 / 5000: loss 1.662787
iteration 501 / 5000: loss 1.416300
iteration 601 / 5000: loss 1.535188
iteration 701 / 5000: loss 1.563164
iteration 801 / 5000: loss 1.522354
iteration 901 / 5000: loss 1.544725
iteration 1001 / 5000: loss 1.543058
iteration 1101 / 5000: loss 1.593262
iteration 1201 / 5000: loss 1.684859
iteration 1301 / 5000: loss 1.456047
iteration 1401 / 5000: loss 1.648793
iteration 1501 / 5000: loss 1.517432
iteration 1601 / 5000: loss 1.484938
iteration 1701 / 5000: loss 1.541217
iteration 1801 / 5000: loss 1.504794
iteration 1901 / 5000: loss 1.522006
iteration 2001 / 5000: loss 1.495639
iteration 2101 / 5000: loss 1.530830
iteration 2201 / 5000: loss 1.622070
iteration 2301 / 5000: loss 1.423841
iteration 2401 / 5000: loss 1.455045
iteration 2501 / 5000: loss 1.608624
iteration 2601 / 5000: loss 1.554349
iteration 2701 / 5000: loss 1.592068
iteration 2801 / 5000: loss 1.520039
iteration 2901 / 5000: loss 1.493894
iteration 3001 / 5000: loss 1.511770
iteration 3101 / 5000: loss 1.581664
iteration 3201 / 5000: loss 1.490187
iteration 3301 / 5000: loss 1.599731
iteration 3401 / 5000: loss 1.452431
iteration 3501 / 5000: loss 1.441364
iteration 3601 / 5000: loss 1.500256
iteration 3701 / 5000: loss 1.514728
iteration 3801 / 5000: loss 1.513387
iteration 3901 / 5000: loss 1.467671
iteration 4001 / 5000: loss 1.550575
iteration 4101 / 5000: loss 1.492839
iteration 4201 / 5000: loss 1.532418
iteration 4301 / 5000: loss 1.506158
iteration 4401 / 5000: loss 1.493867
iteration 4501 / 5000: loss 1.362448
iteration 4601 / 5000: loss 1.435119
iteration 4701 / 5000: loss 1.549475
iteration 4801 / 5000: loss 1.475522
iteration 4901 / 5000: loss 1.585731
lr: 9.000000e-01, reg: 7.000000e-03, train_acc: 0.553061, val_acc: 0.540000
iteration 1 / 5000: loss 11.051557
iteration 101 / 5000: loss 2.006370
iteration 201 / 5000: loss 1.672064
iteration 301 / 5000: loss 1.616106
iteration 401 / 5000: loss 1.641281
iteration 501 / 5000: loss 1.501034
iteration 601 / 5000: loss 1.606754
iteration 701 / 5000: loss 1.681812
iteration 801 / 5000: loss 1.715446
iteration 901 / 5000: loss 1.723558
iteration 1001 / 5000: loss 1.546180
iteration 1101 / 5000: loss 1.544212
iteration 1201 / 5000: loss 1.485885
iteration 1301 / 5000: loss 1.549791
iteration 1401 / 5000: loss 1.535424
iteration 1501 / 5000: loss 1.650489
iteration 1601 / 5000: loss 1.670646
iteration 1701 / 5000: loss 1.589736
iteration 1801 / 5000: loss 1.597595
iteration 1901 / 5000: loss 1.586004
iteration 2001 / 5000: loss 1.549728
iteration 2101 / 5000: loss 1.565034
iteration 2201 / 5000: loss 1.535454
iteration 2301 / 5000: loss 1.556697
iteration 2401 / 5000: loss 1.561714
iteration 2501 / 5000: loss 1.606537
iteration 2601 / 5000: loss 1.551740
iteration 2701 / 5000: loss 1.568745
iteration 2801 / 5000: loss 1.429477
iteration 2901 / 5000: loss 1.497615
iteration 3001 / 5000: loss 1.514416
iteration 3101 / 5000: loss 1.484609
iteration 3201 / 5000: loss 1.571442
iteration 3301 / 5000: loss 1.467768
iteration 3401 / 5000: loss 1.535591
iteration 3501 / 5000: loss 1.628153
iteration 3601 / 5000: loss 1.453133
iteration 3701 / 5000: loss 1.471344
iteration 3801 / 5000: loss 1.634972
iteration 3901 / 5000: loss 1.590876
iteration 4001 / 5000: loss 1.578271
iteration 4101 / 5000: loss 1.598594
iteration 4201 / 5000: loss 1.515147
iteration 4301 / 5000: loss 1.548295
iteration 4401 / 5000: loss 1.596424
iteration 4501 / 5000: loss 1.507939
iteration 4601 / 5000: loss 1.584148
iteration 4701 / 5000: loss 1.584936
iteration 4801 / 5000: loss 1.475735
iteration 4901 / 5000: loss 1.448272
lr: 9.000000e-01, reg: 8.000000e-03, train_acc: 0.542714, val_acc: 0.526000
iteration 1 / 5000: loss 11.004438
iteration 101 / 5000: loss 1.905966
iteration 201 / 5000: loss 1.676897
iteration 301 / 5000: loss 1.637396
iteration 401 / 5000: loss 1.695048
iteration 501 / 5000: loss 1.687142
iteration 601 / 5000: loss 1.644878
iteration 701 / 5000: loss 1.610463
iteration 801 / 5000: loss 1.730429
iteration 901 / 5000: loss 1.834781
iteration 1001 / 5000: loss 1.578199
iteration 1101 / 5000: loss 1.611961
iteration 1201 / 5000: loss 1.538785
iteration 1301 / 5000: loss 1.583312
iteration 1401 / 5000: loss 1.554207
iteration 1501 / 5000: loss 1.668094
iteration 1601 / 5000: loss 1.610181
iteration 1701 / 5000: loss 1.662525
iteration 1801 / 5000: loss 1.676410
iteration 1901 / 5000: loss 1.569738
iteration 2001 / 5000: loss 1.567164
iteration 2101 / 5000: loss 1.651561
iteration 2201 / 5000: loss 1.608773
iteration 2301 / 5000: loss 1.587612
iteration 2401 / 5000: loss 1.535409
iteration 2501 / 5000: loss 1.582973
iteration 2601 / 5000: loss 1.640607
iteration 2701 / 5000: loss 1.602801
iteration 2801 / 5000: loss 1.569958
iteration 2901 / 5000: loss 1.598158
iteration 3001 / 5000: loss 1.665773
iteration 3101 / 5000: loss 1.681514
iteration 3201 / 5000: loss 1.575646
iteration 3301 / 5000: loss 1.653465
iteration 3401 / 5000: loss 1.548499
iteration 3501 / 5000: loss 1.547217
iteration 3601 / 5000: loss 1.589463
iteration 3701 / 5000: loss 1.635316
iteration 3801 / 5000: loss 1.471500
iteration 3901 / 5000: loss 1.637900
iteration 4001 / 5000: loss 1.539892
iteration 4101 / 5000: loss 1.549891
iteration 4201 / 5000: loss 1.574213
iteration 4301 / 5000: loss 1.529607
iteration 4401 / 5000: loss 1.530296
iteration 4501 / 5000: loss 1.654703
iteration 4601 / 5000: loss 1.556442
iteration 4701 / 5000: loss 1.589773
iteration 4801 / 5000: loss 1.515955
iteration 4901 / 5000: loss 1.607362
lr: 9.000000e-01, reg: 9.000000e-03, train_acc: 0.531041, val_acc: 0.514000
iteration 1 / 5000: loss 12.047566
iteration 101 / 5000: loss 1.883716
iteration 201 / 5000: loss 1.590744
iteration 301 / 5000: loss 1.562205
iteration 401 / 5000: loss 1.438070
iteration 501 / 5000: loss 1.609871
iteration 601 / 5000: loss 1.694870
iteration 701 / 5000: loss 1.686718
iteration 801 / 5000: loss 1.605962
iteration 901 / 5000: loss 1.700719
iteration 1001 / 5000: loss 1.576952
iteration 1101 / 5000: loss 1.788619
iteration 1201 / 5000: loss 1.519053
iteration 1301 / 5000: loss 1.541651
iteration 1401 / 5000: loss 1.581252
iteration 1501 / 5000: loss 1.654099
iteration 1601 / 5000: loss 1.631253
iteration 1701 / 5000: loss 1.712760
iteration 1801 / 5000: loss 1.575814
iteration 1901 / 5000: loss 1.526000
iteration 2001 / 5000: loss 1.609736
iteration 2101 / 5000: loss 1.584818
iteration 2201 / 5000: loss 1.623562
iteration 2301 / 5000: loss 1.628933
iteration 2401 / 5000: loss 1.456375
iteration 2501 / 5000: loss 1.586979
iteration 2601 / 5000: loss 1.546671
iteration 2701 / 5000: loss 1.625763
iteration 2801 / 5000: loss 1.543706
iteration 2901 / 5000: loss 1.537122
iteration 3001 / 5000: loss 1.638183
iteration 3101 / 5000: loss 1.487963
iteration 3201 / 5000: loss 1.575658
iteration 3301 / 5000: loss 1.590181
iteration 3401 / 5000: loss 1.670580
iteration 3501 / 5000: loss 1.565788
iteration 3601 / 5000: loss 1.657711
iteration 3701 / 5000: loss 1.615690
iteration 3801 / 5000: loss 1.570084
iteration 3901 / 5000: loss 1.571889
iteration 4001 / 5000: loss 1.457292
iteration 4101 / 5000: loss 1.573553
iteration 4201 / 5000: loss 1.610961
iteration 4301 / 5000: loss 1.534717
iteration 4401 / 5000: loss 1.570297
iteration 4501 / 5000: loss 1.604438
iteration 4601 / 5000: loss 1.448956
iteration 4701 / 5000: loss 1.572187
iteration 4801 / 5000: loss 1.544444
iteration 4901 / 5000: loss 1.637129
lr: 9.000000e-01, reg: 1.000000e-02, train_acc: 0.518755, val_acc: 0.510000
iteration 1 / 5000: loss 6.625941
iteration 101 / 5000: loss 2.158345
iteration 201 / 5000: loss 1.728991
iteration 301 / 5000: loss 1.653094
iteration 401 / 5000: loss 1.484523
iteration 501 / 5000: loss 1.469444
iteration 601 / 5000: loss 1.443074
iteration 701 / 5000: loss 1.544738
iteration 801 / 5000: loss 1.510045
iteration 901 / 5000: loss 1.465542
iteration 1001 / 5000: loss 1.433352
iteration 1101 / 5000: loss 1.408401
iteration 1201 / 5000: loss 1.294366
iteration 1301 / 5000: loss 1.431043
iteration 1401 / 5000: loss 1.502541
iteration 1501 / 5000: loss 1.474094
iteration 1601 / 5000: loss 1.365099
iteration 1701 / 5000: loss 1.477533
iteration 1801 / 5000: loss 1.274995
iteration 1901 / 5000: loss 1.466081
iteration 2001 / 5000: loss 1.368208
iteration 2101 / 5000: loss 1.513390
iteration 2201 / 5000: loss 1.339934
iteration 2301 / 5000: loss 1.407885
iteration 2401 / 5000: loss 1.475937
iteration 2501 / 5000: loss 1.279466
iteration 2601 / 5000: loss 1.420950
iteration 2701 / 5000: loss 1.245773
iteration 2801 / 5000: loss 1.439854
iteration 2901 / 5000: loss 1.372586
iteration 3001 / 5000: loss 1.361561
iteration 3101 / 5000: loss 1.339442
iteration 3201 / 5000: loss 1.289725
iteration 3301 / 5000: loss 1.435471
iteration 3401 / 5000: loss 1.342855
iteration 3501 / 5000: loss 1.361409
iteration 3601 / 5000: loss 1.327349
iteration 3701 / 5000: loss 1.340036
iteration 3801 / 5000: loss 1.358362
iteration 3901 / 5000: loss 1.443276
iteration 4001 / 5000: loss 1.274959
iteration 4101 / 5000: loss 1.292112
iteration 4201 / 5000: loss 1.390202
iteration 4301 / 5000: loss 1.378868
iteration 4401 / 5000: loss 1.304568
iteration 4501 / 5000: loss 1.317260
iteration 4601 / 5000: loss 1.353015
iteration 4701 / 5000: loss 1.255664
iteration 4801 / 5000: loss 1.306043
iteration 4901 / 5000: loss 1.349149
lr: 1.000000e+00, reg: 3.000000e-03, train_acc: 0.628449, val_acc: 0.571000
iteration 1 / 5000: loss 6.765817
iteration 101 / 5000: loss 2.054274
iteration 201 / 5000: loss 1.654786
iteration 301 / 5000: loss 1.598898
iteration 401 / 5000: loss 1.538643
iteration 501 / 5000: loss 1.470399
iteration 601 / 5000: loss 1.691675
iteration 701 / 5000: loss 1.389679
iteration 801 / 5000: loss 1.503692
iteration 901 / 5000: loss 1.648631
iteration 1001 / 5000: loss 1.450592
iteration 1101 / 5000: loss 1.401304
iteration 1201 / 5000: loss 1.431570
iteration 1301 / 5000: loss 1.483510
iteration 1401 / 5000: loss 1.459341
iteration 1501 / 5000: loss 1.488366
iteration 1601 / 5000: loss 1.423326
iteration 1701 / 5000: loss 1.444689
iteration 1801 / 5000: loss 1.473582
iteration 1901 / 5000: loss 1.504336
iteration 2001 / 5000: loss 1.414355
iteration 2101 / 5000: loss 1.369981
iteration 2201 / 5000: loss 1.379859
iteration 2301 / 5000: loss 1.479912
iteration 2401 / 5000: loss 1.550461
iteration 2501 / 5000: loss 1.398421
iteration 2601 / 5000: loss 1.382833
iteration 2701 / 5000: loss 1.538333
iteration 2801 / 5000: loss 1.515627
iteration 2901 / 5000: loss 1.414494
iteration 3001 / 5000: loss 1.534199
iteration 3101 / 5000: loss 1.430276
iteration 3201 / 5000: loss 1.482066
iteration 3301 / 5000: loss 1.445656
iteration 3401 / 5000: loss 1.475302
iteration 3501 / 5000: loss 1.433016
iteration 3601 / 5000: loss 1.459778
iteration 3701 / 5000: loss 1.466512
iteration 3801 / 5000: loss 1.525145
iteration 3901 / 5000: loss 1.397600
iteration 4001 / 5000: loss 1.576290
iteration 4101 / 5000: loss 1.277617
iteration 4201 / 5000: loss 1.357686
iteration 4301 / 5000: loss 1.503182
iteration 4401 / 5000: loss 1.401561
iteration 4501 / 5000: loss 1.388297
iteration 4601 / 5000: loss 1.318189
iteration 4701 / 5000: loss 1.320861
iteration 4801 / 5000: loss 1.311049
iteration 4901 / 5000: loss 1.388448
lr: 1.000000e+00, reg: 4.000000e-03, train_acc: 0.598449, val_acc: 0.567000
iteration 1 / 5000: loss 7.534077
iteration 101 / 5000: loss 2.072019
iteration 201 / 5000: loss 1.668069
iteration 301 / 5000: loss 1.571895
iteration 401 / 5000: loss 1.516849
iteration 501 / 5000: loss 1.587410
iteration 601 / 5000: loss 1.518399
iteration 701 / 5000: loss 1.609876
iteration 801 / 5000: loss 1.557320
iteration 901 / 5000: loss 1.472115
iteration 1001 / 5000: loss 1.498841
iteration 1101 / 5000: loss 1.544969
iteration 1201 / 5000: loss 1.411483
iteration 1301 / 5000: loss 1.449303
iteration 1401 / 5000: loss 1.614241
iteration 1501 / 5000: loss 1.405017
iteration 1601 / 5000: loss 1.608493
iteration 1701 / 5000: loss 1.528527
iteration 1801 / 5000: loss 1.447667
iteration 1901 / 5000: loss 1.483875
iteration 2001 / 5000: loss 1.379344
iteration 2101 / 5000: loss 1.567486
iteration 2201 / 5000: loss 1.425148
iteration 2301 / 5000: loss 1.407365
iteration 2401 / 5000: loss 1.454369
iteration 2501 / 5000: loss 1.438267
iteration 2601 / 5000: loss 1.496577
iteration 2701 / 5000: loss 1.311931
iteration 2801 / 5000: loss 1.494751
iteration 2901 / 5000: loss 1.522215
iteration 3001 / 5000: loss 1.410947
iteration 3101 / 5000: loss 1.333287
iteration 3201 / 5000: loss 1.542306
iteration 3301 / 5000: loss 1.471937
iteration 3401 / 5000: loss 1.472484
iteration 3501 / 5000: loss 1.620736
iteration 3601 / 5000: loss 1.517435
iteration 3701 / 5000: loss 1.400214
iteration 3801 / 5000: loss 1.539967
iteration 3901 / 5000: loss 1.398100
iteration 4001 / 5000: loss 1.433666
iteration 4101 / 5000: loss 1.413922
iteration 4201 / 5000: loss 1.487616
iteration 4301 / 5000: loss 1.473182
iteration 4401 / 5000: loss 1.436633
iteration 4501 / 5000: loss 1.442587
iteration 4601 / 5000: loss 1.497850
iteration 4701 / 5000: loss 1.566110
iteration 4801 / 5000: loss 1.528461
iteration 4901 / 5000: loss 1.414388
lr: 1.000000e+00, reg: 5.000000e-03, train_acc: 0.577082, val_acc: 0.543000
iteration 1 / 5000: loss 8.797585
iteration 101 / 5000: loss 2.051815
iteration 201 / 5000: loss 1.707182
iteration 301 / 5000: loss 1.646449
iteration 401 / 5000: loss 1.585067
iteration 501 / 5000: loss 1.570148
iteration 601 / 5000: loss 1.514796
iteration 701 / 5000: loss 1.658714
iteration 801 / 5000: loss 1.555812
iteration 901 / 5000: loss 1.614709
iteration 1001 / 5000: loss 1.556123
iteration 1101 / 5000: loss 1.504025
iteration 1201 / 5000: loss 1.541883
iteration 1301 / 5000: loss 1.633113
iteration 1401 / 5000: loss 1.665444
iteration 1501 / 5000: loss 1.460821
iteration 1601 / 5000: loss 1.658201
iteration 1701 / 5000: loss 1.564300
iteration 1801 / 5000: loss 1.499430
iteration 1901 / 5000: loss 1.541047
iteration 2001 / 5000: loss 1.415727
iteration 2101 / 5000: loss 1.616763
iteration 2201 / 5000: loss 1.607229
iteration 2301 / 5000: loss 1.457236
iteration 2401 / 5000: loss 1.531337
iteration 2501 / 5000: loss 1.441085
iteration 2601 / 5000: loss 1.613223
iteration 2701 / 5000: loss 1.403407
iteration 2801 / 5000: loss 1.491316
iteration 2901 / 5000: loss 1.471440
iteration 3001 / 5000: loss 1.624869
iteration 3101 / 5000: loss 1.563876
iteration 3201 / 5000: loss 1.577799
iteration 3301 / 5000: loss 1.474542
iteration 3401 / 5000: loss 1.538777
iteration 3501 / 5000: loss 1.657981
iteration 3601 / 5000: loss 1.519392
iteration 3701 / 5000: loss 1.561830
iteration 3801 / 5000: loss 1.553125
iteration 3901 / 5000: loss 1.463132
iteration 4001 / 5000: loss 1.540110
iteration 4101 / 5000: loss 1.479949
iteration 4201 / 5000: loss 1.353442
iteration 4301 / 5000: loss 1.526947
iteration 4401 / 5000: loss 1.467703
iteration 4501 / 5000: loss 1.392157
iteration 4601 / 5000: loss 1.447909
iteration 4701 / 5000: loss 1.392038
iteration 4801 / 5000: loss 1.424005
iteration 4901 / 5000: loss 1.522379
lr: 1.000000e+00, reg: 6.000000e-03, train_acc: 0.567184, val_acc: 0.541000
iteration 1 / 5000: loss 9.739913
iteration 101 / 5000: loss 2.053661
iteration 201 / 5000: loss 1.628144
iteration 301 / 5000: loss 1.602915
iteration 401 / 5000: loss 1.658345
iteration 501 / 5000: loss 1.633255
iteration 601 / 5000: loss 1.643926
iteration 701 / 5000: loss 1.748797
iteration 801 / 5000: loss 1.635467
iteration 901 / 5000: loss 1.599655
iteration 1001 / 5000: loss 1.568922
iteration 1101 / 5000: loss 1.529116
iteration 1201 / 5000: loss 1.546839
iteration 1301 / 5000: loss 1.564483
iteration 1401 / 5000: loss 1.580723
iteration 1501 / 5000: loss 1.646589
iteration 1601 / 5000: loss 1.689023
iteration 1701 / 5000: loss 1.522576
iteration 1801 / 5000: loss 1.547278
iteration 1901 / 5000: loss 1.665191
iteration 2001 / 5000: loss 1.486657
iteration 2101 / 5000: loss 1.662863
iteration 2201 / 5000: loss 1.591585
iteration 2301 / 5000: loss 1.600170
iteration 2401 / 5000: loss 1.572904
iteration 2501 / 5000: loss 1.579912
iteration 2601 / 5000: loss 1.487491
iteration 2701 / 5000: loss 1.591081
iteration 2801 / 5000: loss 1.520962
iteration 2901 / 5000: loss 1.574504
iteration 3001 / 5000: loss 1.460645
iteration 3101 / 5000: loss 1.544241
iteration 3201 / 5000: loss 1.512085
iteration 3301 / 5000: loss 1.486677
iteration 3401 / 5000: loss 1.499599
iteration 3501 / 5000: loss 1.585546
iteration 3601 / 5000: loss 1.453281
iteration 3701 / 5000: loss 1.657587
iteration 3801 / 5000: loss 1.545307
iteration 3901 / 5000: loss 1.571105
iteration 4001 / 5000: loss 1.539964
iteration 4101 / 5000: loss 1.537112
iteration 4201 / 5000: loss 1.511104
iteration 4301 / 5000: loss 1.520958
iteration 4401 / 5000: loss 1.522474
iteration 4501 / 5000: loss 1.600474
iteration 4601 / 5000: loss 1.521366
iteration 4701 / 5000: loss 1.563465
iteration 4801 / 5000: loss 1.398148
iteration 4901 / 5000: loss 1.583769
lr: 1.000000e+00, reg: 7.000000e-03, train_acc: 0.552571, val_acc: 0.525000
iteration 1 / 5000: loss 10.294179
iteration 101 / 5000: loss 1.947789
iteration 201 / 5000: loss 1.652871
iteration 301 / 5000: loss 1.593570
iteration 401 / 5000: loss 1.663740
iteration 501 / 5000: loss 1.693705
iteration 601 / 5000: loss 1.579535
iteration 701 / 5000: loss 1.612782
iteration 801 / 5000: loss 1.604312
iteration 901 / 5000: loss 1.679793
iteration 1001 / 5000: loss 1.638058
iteration 1101 / 5000: loss 1.628208
iteration 1201 / 5000: loss 1.600666
iteration 1301 / 5000: loss 1.638695
iteration 1401 / 5000: loss 1.610939
iteration 1501 / 5000: loss 1.517665
iteration 1601 / 5000: loss 1.559048
iteration 1701 / 5000: loss 1.651261
iteration 1801 / 5000: loss 1.586628
iteration 1901 / 5000: loss 1.531228
iteration 2001 / 5000: loss 1.624375
iteration 2101 / 5000: loss 1.541137
iteration 2201 / 5000: loss 1.530832
iteration 2301 / 5000: loss 1.582378
iteration 2401 / 5000: loss 1.685098
iteration 2501 / 5000: loss 1.538628
iteration 2601 / 5000: loss 1.541544
iteration 2701 / 5000: loss 1.615218
iteration 2801 / 5000: loss 1.600967
iteration 2901 / 5000: loss 1.535616
iteration 3001 / 5000: loss 1.572341
iteration 3101 / 5000: loss 1.545699
iteration 3201 / 5000: loss 1.565373
iteration 3301 / 5000: loss 1.507154
iteration 3401 / 5000: loss 1.486448
iteration 3501 / 5000: loss 1.476508
iteration 3601 / 5000: loss 1.536281
iteration 3701 / 5000: loss 1.595767
iteration 3801 / 5000: loss 1.515730
iteration 3901 / 5000: loss 1.474182
iteration 4001 / 5000: loss 1.432894
iteration 4101 / 5000: loss 1.539558
iteration 4201 / 5000: loss 1.528164
iteration 4301 / 5000: loss 1.509131
iteration 4401 / 5000: loss 1.489696
iteration 4501 / 5000: loss 1.583009
iteration 4601 / 5000: loss 1.545451
iteration 4701 / 5000: loss 1.544858
iteration 4801 / 5000: loss 1.462222
iteration 4901 / 5000: loss 1.602632
lr: 1.000000e+00, reg: 8.000000e-03, train_acc: 0.545571, val_acc: 0.534000
iteration 1 / 5000: loss 11.025876
iteration 101 / 5000: loss 2.006682
iteration 201 / 5000: loss 1.715454
iteration 301 / 5000: loss 1.641521
iteration 401 / 5000: loss 1.579472
iteration 501 / 5000: loss 1.651007
iteration 601 / 5000: loss 1.652821
iteration 701 / 5000: loss 1.648118
iteration 801 / 5000: loss 1.609789
iteration 901 / 5000: loss 1.521285
iteration 1001 / 5000: loss 1.569382
iteration 1101 / 5000: loss 1.698587
iteration 1201 / 5000: loss 1.610584
iteration 1301 / 5000: loss 1.606626
iteration 1401 / 5000: loss 1.602792
iteration 1501 / 5000: loss 1.585883
iteration 1601 / 5000: loss 1.608465
iteration 1701 / 5000: loss 1.608768
iteration 1801 / 5000: loss 1.538468
iteration 1901 / 5000: loss 1.516430
iteration 2001 / 5000: loss 1.631782
iteration 2101 / 5000: loss 1.618054
iteration 2201 / 5000: loss 1.678393
iteration 2301 / 5000: loss 1.620475
iteration 2401 / 5000: loss 1.569453
iteration 2501 / 5000: loss 1.750159
iteration 2601 / 5000: loss 1.620376
iteration 2701 / 5000: loss 1.522082
iteration 2801 / 5000: loss 1.632084
iteration 2901 / 5000: loss 1.592964
iteration 3001 / 5000: loss 1.550895
iteration 3101 / 5000: loss 1.557906
iteration 3201 / 5000: loss 1.487731
iteration 3301 / 5000: loss 1.626526
iteration 3401 / 5000: loss 1.549835
iteration 3501 / 5000: loss 1.641229
iteration 3601 / 5000: loss 1.435382
iteration 3701 / 5000: loss 1.537213
iteration 3801 / 5000: loss 1.454920
iteration 3901 / 5000: loss 1.634948
iteration 4001 / 5000: loss 1.527882
iteration 4101 / 5000: loss 1.612366
iteration 4201 / 5000: loss 1.539028
iteration 4301 / 5000: loss 1.529013
iteration 4401 / 5000: loss 1.732342
iteration 4501 / 5000: loss 1.590813
iteration 4601 / 5000: loss 1.549603
iteration 4701 / 5000: loss 1.395007
iteration 4801 / 5000: loss 1.516953
iteration 4901 / 5000: loss 1.562419
lr: 1.000000e+00, reg: 9.000000e-03, train_acc: 0.533918, val_acc: 0.523000
iteration 1 / 5000: loss 11.806633
iteration 101 / 5000: loss 1.808519
iteration 201 / 5000: loss 1.664423
iteration 301 / 5000: loss 1.742873
iteration 401 / 5000: loss 1.646826
iteration 501 / 5000: loss 1.685652
iteration 601 / 5000: loss 1.760762
iteration 701 / 5000: loss 1.596380
iteration 801 / 5000: loss 1.557886
iteration 901 / 5000: loss 1.660967
iteration 1001 / 5000: loss 1.647109
iteration 1101 / 5000: loss 1.641356
iteration 1201 / 5000: loss 1.680670
iteration 1301 / 5000: loss 1.738958
iteration 1401 / 5000: loss 1.628444
iteration 1501 / 5000: loss 1.533391
iteration 1601 / 5000: loss 1.609930
iteration 1701 / 5000: loss 1.599160
iteration 1801 / 5000: loss 1.756457
iteration 1901 / 5000: loss 1.803932
iteration 2001 / 5000: loss 1.602292
iteration 2101 / 5000: loss 1.760809
iteration 2201 / 5000: loss 1.716649
iteration 2301 / 5000: loss 1.728502
iteration 2401 / 5000: loss 1.507963
iteration 2501 / 5000: loss 1.610911
iteration 2601 / 5000: loss 1.584032
iteration 2701 / 5000: loss 1.569955
iteration 2801 / 5000: loss 1.567282
iteration 2901 / 5000: loss 1.820090
iteration 3001 / 5000: loss 1.593236
iteration 3101 / 5000: loss 1.686703
iteration 3201 / 5000: loss 1.567861
iteration 3301 / 5000: loss 1.522576
iteration 3401 / 5000: loss 1.620993
iteration 3501 / 5000: loss 1.654502
iteration 3601 / 5000: loss 1.674878
iteration 3701 / 5000: loss 1.571107
iteration 3801 / 5000: loss 1.672683
iteration 3901 / 5000: loss 1.479040
iteration 4001 / 5000: loss 1.463092
iteration 4101 / 5000: loss 1.477329
iteration 4201 / 5000: loss 1.627789
iteration 4301 / 5000: loss 1.588412
iteration 4401 / 5000: loss 1.588905
iteration 4501 / 5000: loss 1.647712
iteration 4601 / 5000: loss 1.619318
iteration 4701 / 5000: loss 1.550806
iteration 4801 / 5000: loss 1.568869
iteration 4901 / 5000: loss 1.648878
lr: 1.000000e+00, reg: 1.000000e-02, train_acc: 0.524531, val_acc: 0.516000

best val_acc: 0.608000

lr: 3.000000e-01, reg: 3.000000e-03, train_acc: 0.653959, val_acc: 0.608000
lr: 3.000000e-01, reg: 4.000000e-03, train_acc: 0.621939, val_acc: 0.578000
lr: 3.000000e-01, reg: 5.000000e-03, train_acc: 0.601184, val_acc: 0.588000
lr: 3.000000e-01, reg: 6.000000e-03, train_acc: 0.580816, val_acc: 0.559000
lr: 3.000000e-01, reg: 7.000000e-03, train_acc: 0.569612, val_acc: 0.552000
lr: 3.000000e-01, reg: 8.000000e-03, train_acc: 0.554163, val_acc: 0.543000
lr: 3.000000e-01, reg: 9.000000e-03, train_acc: 0.548959, val_acc: 0.531000
lr: 3.000000e-01, reg: 1.000000e-02, train_acc: 0.539224, val_acc: 0.530000

lr: 9.000000e-01, reg: 3.000000e-03, train_acc: 0.635122, val_acc: 0.587000
lr: 9.000000e-01, reg: 4.000000e-03, train_acc: 0.602755, val_acc: 0.566000
lr: 9.000000e-01, reg: 5.000000e-03, train_acc: 0.586265, val_acc: 0.570000
lr: 9.000000e-01, reg: 6.000000e-03, train_acc: 0.571020, val_acc: 0.551000
lr: 9.000000e-01, reg: 7.000000e-03, train_acc: 0.553061, val_acc: 0.540000
lr: 9.000000e-01, reg: 8.000000e-03, train_acc: 0.542714, val_acc: 0.526000
lr: 9.000000e-01, reg: 9.000000e-03, train_acc: 0.531041, val_acc: 0.514000
lr: 9.000000e-01, reg: 1.000000e-02, train_acc: 0.518755, val_acc: 0.510000

lr: 1.000000e+00, reg: 3.000000e-03, train_acc: 0.628449, val_acc: 0.571000
lr: 1.000000e+00, reg: 4.000000e-03, train_acc: 0.598449, val_acc: 0.567000
lr: 1.000000e+00, reg: 5.000000e-03, train_acc: 0.577082, val_acc: 0.543000
lr: 1.000000e+00, reg: 6.000000e-03, train_acc: 0.567184, val_acc: 0.541000
lr: 1.000000e+00, reg: 7.000000e-03, train_acc: 0.552571, val_acc: 0.525000
lr: 1.000000e+00, reg: 8.000000e-03, train_acc: 0.545571, val_acc: 0.534000
lr: 1.000000e+00, reg: 9.000000e-03, train_acc: 0.533918, val_acc: 0.523000
lr: 1.000000e+00, reg: 1.000000e-02, train_acc: 0.524531, val_acc: 0.516000

In [23]:
# Print the cross-validation results again, grouped by learning rate.
old_lr = -1
for lr, reg in sorted(results):
    if old_lr != lr:
        old_lr = lr
        print('')  # blank line between learning-rate groups

    train_acc, val_acc = results[(lr, reg)]
    print('lr: %e, reg: %e, train_acc: %f, val_acc: %f' % (lr, reg, train_acc, val_acc))


lr: 3.000000e-01, reg: 3.000000e-03, train_acc: 0.653959, val_acc: 0.608000
lr: 3.000000e-01, reg: 4.000000e-03, train_acc: 0.621939, val_acc: 0.578000
lr: 3.000000e-01, reg: 5.000000e-03, train_acc: 0.601184, val_acc: 0.588000
lr: 3.000000e-01, reg: 6.000000e-03, train_acc: 0.580816, val_acc: 0.559000
lr: 3.000000e-01, reg: 7.000000e-03, train_acc: 0.569612, val_acc: 0.552000
lr: 3.000000e-01, reg: 8.000000e-03, train_acc: 0.554163, val_acc: 0.543000
lr: 3.000000e-01, reg: 9.000000e-03, train_acc: 0.548959, val_acc: 0.531000
lr: 3.000000e-01, reg: 1.000000e-02, train_acc: 0.539224, val_acc: 0.530000

lr: 9.000000e-01, reg: 3.000000e-03, train_acc: 0.635122, val_acc: 0.587000
lr: 9.000000e-01, reg: 4.000000e-03, train_acc: 0.602755, val_acc: 0.566000
lr: 9.000000e-01, reg: 5.000000e-03, train_acc: 0.586265, val_acc: 0.570000
lr: 9.000000e-01, reg: 6.000000e-03, train_acc: 0.571020, val_acc: 0.551000
lr: 9.000000e-01, reg: 7.000000e-03, train_acc: 0.553061, val_acc: 0.540000
lr: 9.000000e-01, reg: 8.000000e-03, train_acc: 0.542714, val_acc: 0.526000
lr: 9.000000e-01, reg: 9.000000e-03, train_acc: 0.531041, val_acc: 0.514000
lr: 9.000000e-01, reg: 1.000000e-02, train_acc: 0.518755, val_acc: 0.510000

lr: 1.000000e+00, reg: 3.000000e-03, train_acc: 0.628449, val_acc: 0.571000
lr: 1.000000e+00, reg: 4.000000e-03, train_acc: 0.598449, val_acc: 0.567000
lr: 1.000000e+00, reg: 5.000000e-03, train_acc: 0.577082, val_acc: 0.543000
lr: 1.000000e+00, reg: 6.000000e-03, train_acc: 0.567184, val_acc: 0.541000
lr: 1.000000e+00, reg: 7.000000e-03, train_acc: 0.552571, val_acc: 0.525000
lr: 1.000000e+00, reg: 8.000000e-03, train_acc: 0.545571, val_acc: 0.534000
lr: 1.000000e+00, reg: 9.000000e-03, train_acc: 0.533918, val_acc: 0.523000
lr: 1.000000e+00, reg: 1.000000e-02, train_acc: 0.524531, val_acc: 0.516000

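Before the test-set evaluation below, the winning hyperparameters can be read off the results dictionary. The following is a minimal sketch and not part of the original search code: it assumes results maps (lr, reg) pairs to (train_acc, val_acc) tuples, as in the printing cell above, and that the classifier trained with the best pair was kept as best_model during the sweep.

# Pick the (lr, reg) pair with the highest validation accuracy.
best_lr, best_reg = max(results, key=lambda k: results[k][1])
best_train_acc, best_val_acc = results[(best_lr, best_reg)]
print('best lr: %e, reg: %e, train_acc: %f, val_acc: %f'
      % (best_lr, best_reg, best_train_acc, best_val_acc))
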
In [ ]:
# Evaluate the best model on the test set; the goal is to exceed 55% accuracy.
test_acc = (best_model.predict(X_test_feats) == y_test).mean()
print('Test accuracy: ', test_acc)

Bonus: Design your own features!

You have seen that simple image features can improve classification performance. So far we have tried HOG and color histograms, but other types of features may be able to achieve even better classification performance.

For bonus points, design and implement a new type of feature and use it for image classification on CIFAR-10. Explain how your feature works and why you expect it to be useful for image classification. Implement it in this notebook, cross-validate any hyperparameters, and compare its performance to the HOG + Color histogram baseline.
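As a starting point, here is a minimal sketch of one possible extra feature: a grayscale intensity histogram. It is only an illustration, not part of the assignment code, and it makes two assumptions: each image is an H x W x 3 array with pixel values in [0, 255], and a feature function returns a 1-D vector so it can be appended to the list of feature functions passed to extract_features.

import numpy as np

def gray_histogram_feature(im, nbin=16):
    # Hypothetical example feature: a histogram of grayscale intensities.
    # Collapse color by averaging the channels, then bin the intensities.
    gray = im.astype(np.float64).mean(axis=2)
    hist, _ = np.histogram(gray, bins=nbin, range=(0, 255), density=True)
    return hist

# Example usage (hypothetical): append gray_histogram_feature to the list of
# feature functions, then re-run feature extraction and cross-validation to
# compare against the baseline.
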

Bonus: Do something extra!

Use the material and code we have presented in this assignment to do something interesting. Was there another question we should have asked? Did any cool ideas pop into your head as you were working on the assignment? This is your chance to show off!


In [ ]: