Neural Network Training

This notebook summarizes general approaches and strategies for dealing with issues that arise when training a neural network.

----- Architecture Sizing ------------------------------------------------------------

  • Adding depth (more than three or four fully connected layers) rarely helps on simple data; extra layers pay off mainly when the patterns are very complex, as with convolutional neural networks on images.
  • Prefer tuning regularization strength to control overfitting rather than adjusting the number of neurons.
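The "regularization strength" knob above can be sketched as an L2 weight penalty added to the loss. This is a minimal NumPy illustration, not any framework's API; the function names are made up for this example.

```python
import numpy as np

def l2_penalty(weights, lam):
    """Sum of squared weights, scaled by regularization strength lam."""
    return lam * sum(np.sum(w ** 2) for w in weights)

def regularized_loss(data_loss, weights, lam):
    # A larger lam pulls weights toward zero, limiting effective model
    # capacity without changing the layer sizes at all.
    return data_loss + l2_penalty(weights, lam)
```

Tuning `lam` up or down is usually a gentler control on overfitting than adding or removing neurons.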

----- Underfitting ----------------------------------------------------------------------

The model lacks capacity and does not fit the data well (lower accuracy, higher loss). Validation accuracy may even be higher than training accuracy.

[Solution]
  • Decrease or eliminate Dropout (and other regularization)
  • Increase model capacity - more neurons or layers
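Decreasing or eliminating dropout can be seen concretely in a minimal NumPy sketch of inverted dropout (not a framework API; the function name is made up here). Setting the rate to zero makes the layer an identity, which is exactly the underfitting fix above.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return activations            # eliminating dropout = identity
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep
    return activations * mask / keep  # rescale so expected value is unchanged
```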

----- Overfitting ------------------------------------------------------------------------

The model has been trained too long and has memorized the training set, so it does not generalize well to data it hasn't seen before. Training accuracy is higher than validation accuracy.

[Solution]
  • More Data - if you can find it
  • Data Augmentation - apply minor random alterations (flips, shifts, rotations) to each batch so the model never sees exactly the same examples twice
  • Batch Normalization - normalize layer activations over each mini-batch to stabilize optimization; it also has a mild regularizing effect
    • subtract the batch mean and divide by the batch standard deviation
  • Reduce architecture complexity
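The batch normalization step above ("subtract the mean and divide by standard deviation") can be sketched in NumPy. This is the core normalization only, without the learned scale/shift parameters a full implementation would add.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the mini-batch: zero mean, unit variance.

    x has shape (batch_size, n_features); eps avoids division by zero.
    """
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps)
```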

----- Fine Tuning ----------------------------------------------------------------------

Pop off the last dense classification layer so an already trained model can predict a different set of outputs. All earlier layers can be frozen, meaning their weights don't get updated, yet everything they've learned still applies to the new classification task.
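The freeze-and-replace idea can be sketched in NumPy (not a framework API; the function names are made up, and the new head is fit by least squares purely for brevity where a real setup would train it by gradient descent):

```python
import numpy as np

def frozen_forward(x, frozen_weights):
    # Frozen layers: their weights are applied but never updated.
    h = x
    for w in frozen_weights:
        h = np.maximum(h @ w, 0.0)  # ReLU layers with fixed weights
    return h

def fit_new_head(features, targets):
    # The popped-off classification layer is replaced by a fresh linear
    # head trained only on the frozen features.
    head, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return head
```

Only the new head's parameters change; the frozen stack acts as a fixed feature extractor.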

----- Training Time -------------------------------------------------------------------

  • If the CONV layers are frozen after full training, pre-calculate their outputs once and cache them, then train the updated final Dense layers on those cached features instead of re-running the CONV stack every epoch
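The caching trick above amounts to one pass over the data through the frozen stack. A minimal sketch, where `conv_forward` stands in for whatever frozen feature extractor is in use (a hypothetical name for this example):

```python
import numpy as np

def precompute_features(batches, conv_forward):
    """Run the frozen CONV stack once and cache its outputs.

    The returned array can then feed the Dense-layer training loop
    directly, so the expensive CONV forward pass is paid only once.
    """
    return np.concatenate([conv_forward(b) for b in batches])
```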
