Neural Network Training

This notebook summarizes general approaches and strategies for dealing with issues that arise when training a neural network.

----- Architecture Sizing ------------------------------------------------------------

  • Adding depth (more than three or four fully connected layers) rarely helps on simple data; extra layers pay off mainly when the patterns are very complex, as with convolutional neural networks on images.
  • Prefer tuning regularization strength to control overfitting rather than adjusting the number of neurons.
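The "regularization strength" knob above can be sketched as an L2 weight penalty added to the loss. This is a minimal NumPy illustration, not any framework's API; the function names are made up for this example.

```python
import numpy as np

def l2_penalty(weights, lam):
    """Sum of squared weights, scaled by regularization strength lam."""
    return lam * sum(np.sum(w ** 2) for w in weights)

def regularized_loss(data_loss, weights, lam):
    # A larger lam pulls weights toward zero, limiting effective model
    # capacity without changing the layer sizes at all.
    return data_loss + l2_penalty(weights, lam)
```

Tuning `lam` up or down is usually a gentler control on overfitting than adding or removing neurons.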

----- Underfitting ----------------------------------------------------------------------

The model lacks capacity and does not fit the data well (lower accuracy, higher loss). Validation accuracy may even be higher than training accuracy.

[Solution]
  • Decrease or eliminate Dropout (and other regularization)
  • Increase model capacity - more neurons or layers
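Decreasing or eliminating dropout can be seen concretely in a minimal NumPy sketch of inverted dropout (not a framework API; the function name is made up here). Setting the rate to zero makes the layer an identity, which is exactly the underfitting fix above.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return activations            # eliminating dropout = identity
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep
    return activations * mask / keep  # rescale so expected value is unchanged
```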

----- Overfitting ------------------------------------------------------------------------

The model has been trained too long and has memorized the training set, so it does not generalize well to data it hasn't seen before. Training accuracy is higher than validation accuracy.

[Solution]
  • More Data - if you can find it
  • Data Augmentation - apply minor random alterations (flips, shifts, rotations) to each batch so the model never sees exactly the same examples twice
  • Batch Normalization - normalize layer activations over each mini-batch to stabilize optimization; it also has a mild regularizing effect
    • subtract the batch mean and divide by the batch standard deviation
  • Reduce architecture complexity
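The batch normalization step above ("subtract the mean and divide by standard deviation") can be sketched in NumPy. This is the core normalization only, without the learned scale/shift parameters a full implementation would add.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the mini-batch: zero mean, unit variance.

    x has shape (batch_size, n_features); eps avoids division by zero.
    """
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps)
```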

----- Fine Tuning ----------------------------------------------------------------------

Pop off the last dense classification layer so an already trained model can predict a different set of outputs. All earlier layers can be frozen, meaning their weights don't get updated, yet everything they've learned still applies to the new classification task.
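The freeze-and-replace idea can be sketched in NumPy (not a framework API; the function names are made up, and the new head is fit by least squares purely for brevity where a real setup would train it by gradient descent):

```python
import numpy as np

def frozen_forward(x, frozen_weights):
    # Frozen layers: their weights are applied but never updated.
    h = x
    for w in frozen_weights:
        h = np.maximum(h @ w, 0.0)  # ReLU layers with fixed weights
    return h

def fit_new_head(features, targets):
    # The popped-off classification layer is replaced by a fresh linear
    # head trained only on the frozen features.
    head, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return head
```

Only the new head's parameters change; the frozen stack acts as a fixed feature extractor.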

----- Training Time -------------------------------------------------------------------

  • If the CONV layers are frozen after full training, pre-calculate their outputs once and cache them, then train the updated final Dense layers on those cached features instead of re-running the CONV stack every epoch
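The caching trick above amounts to one pass over the data through the frozen stack. A minimal sketch, where `conv_forward` stands in for whatever frozen feature extractor is in use (a hypothetical name for this example):

```python
import numpy as np

def precompute_features(batches, conv_forward):
    """Run the frozen CONV stack once and cache its outputs.

    The returned array can then feed the Dense-layer training loop
    directly, so the expensive CONV forward pass is paid only once.
    """
    return np.concatenate([conv_forward(b) for b in batches])
```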
