Better Deep Learning
An internal place to share deep learning modeling experience. Feel free to add to or prune the content frequently.
- Better Training
  - Configure capacity with nodes and layers
  - Configure what to optimize with loss functions
  - Configure gradient precision with batch size
  - Configure the speed of learning with learning rate
  - Stabilize learning with data scaling (scaling sketch after this list)
  - Fix vanishing gradients with ReLU (saturating activations such as sigmoid and hyperbolic tangent are poor choices for deep hidden layers)
  - Fix exploding gradients with gradient clipping (clipping sketch after this list)
  - Accelerate learning with batch normalization (batch-norm sketch after this list)
  - Greedy layer-wise pre-training
  - Jump-start training with transfer learning
  - Issues Log
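
A few sketches of the training techniques above follow. First, data scaling: a minimal sketch using scikit-learn on assumed synthetic data; the point is that the scaler's statistics are learned from the training split only.

```python
# Minimal sketch: standardize inputs before training (synthetic data assumed).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(1000, 5))  # raw, unscaled features
y = (X.sum(axis=1) > 250).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # learn mean/std on the training split only
X_test = scaler.transform(X_test)        # reuse the same statistics; no test-set leakage
```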
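
For exploding gradients, a minimal gradient clipping sketch in Keras; the toy regression data and layer sizes are assumptions, and `clipvalue` could be used instead of `clipnorm` to clip element-wise.

```python
# Minimal sketch: gradient clipping via the Keras optimizer (toy data assumed).
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
y = (3.0 * X[:, 0] - X[:, 1]).reshape(-1, 1)

model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(25, activation="relu"),
    keras.layers.Dense(1),
])

# clipnorm=1.0 rescales any gradient whose L2 norm exceeds 1.0;
# clipvalue=0.5 would instead clip each gradient element to [-0.5, 0.5].
opt = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, clipnorm=1.0)
model.compile(optimizer=opt, loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
```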
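
And a batch normalization sketch; placing `BatchNormalization` before the activation is one common convention (after the activation is the other), and the input width of 20 is an arbitrary assumption.

```python
# Minimal sketch: BatchNormalization between each linear layer and its activation.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(50),
    keras.layers.BatchNormalization(),  # normalize pre-activations per mini-batch
    keras.layers.Activation("relu"),
    keras.layers.Dense(50),
    keras.layers.BatchNormalization(),
    keras.layers.Activation("relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```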
- Better Generalization
- Better Prediction
  - Reduce model variance with ensemble learning
  - Combine models from multiple runs with model averaging ensemble (model-averaging sketch after this list)
  - Contribute in proportion to trust with weighted average ensemble (weighted-average sketch after this list)
  - Fit models on different samples with resampling ensembles
  - Combine models from contiguous epochs with horizontal voting ensembles
  - Save ensemble members within a single run with cyclic learning rates and snapshot ensembles (snapshot sketch after this list)
  - Learn to combine predictions with stacked generalization ensembles
  - Combine model parameters with average model weights ensemble (weight-averaging sketch after this list)
  - Issues Log
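
Sketches of the ensemble techniques above follow. Model averaging first: train the same architecture several times from different random initializations and average the predicted class probabilities; the blobs dataset and the member count of 5 are assumptions.

```python
# Minimal sketch: model averaging ensemble over 5 independently trained models.
import numpy as np
from sklearn.datasets import make_blobs
from tensorflow import keras

X, y = make_blobs(n_samples=1000, centers=3, n_features=2, random_state=2)
y_onehot = keras.utils.to_categorical(y)

def fit_member(X, y):
    model = keras.Sequential([
        keras.Input(shape=(2,)),
        keras.layers.Dense(25, activation="relu"),
        keras.layers.Dense(3, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    model.fit(X, y, epochs=50, verbose=0)
    return model

members = [fit_member(X, y_onehot) for _ in range(5)]

# Stack per-member class probabilities, average them, then take the argmax.
stacked = np.stack([m.predict(X, verbose=0) for m in members])  # (5, n, 3)
yhat = np.argmax(stacked.mean(axis=0), axis=1)
print("ensemble accuracy:", (yhat == y).mean())
```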
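
The weighted average variant reuses `stacked` from the sketch above. The weights here are illustrative; in practice they would be tuned on a held-out validation set, for example by grid search or by normalizing each member's validation accuracy.

```python
# Minimal sketch: weighted average ensemble (weights assumed, not tuned).
import numpy as np

weights = np.array([0.4, 0.3, 0.1, 0.1, 0.1])  # one weight per member, summing to 1

# Weighted sum over the member axis of the (members, samples, classes) tensor.
weighted = np.tensordot(weights, stacked, axes=(0, 0))  # -> (samples, classes)
yhat = np.argmax(weighted, axis=1)
```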
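
For snapshot ensembles, a sketch of a cosine-annealed cyclic learning rate callback that stores one copy of the weights at the end of each cycle. The class name `SnapshotSaver` and its constructor arguments are made up for illustration, not a Keras API, and the way the learning rate variable is assigned may differ across Keras versions.

```python
# Minimal sketch: cyclic (cosine-annealed) learning rate with one snapshot
# saved per cycle; the snapshots can later be restored as ensemble members.
import math
from tensorflow import keras

class SnapshotSaver(keras.callbacks.Callback):  # illustrative name, not a Keras API
    def __init__(self, n_epochs, n_cycles, max_lr):
        super().__init__()
        self.cycle_len = n_epochs // n_cycles
        self.max_lr = max_lr
        self.snapshots = []

    def on_epoch_begin(self, epoch, logs=None):
        # Cosine annealing within the current cycle: from max_lr down toward 0.
        t = (epoch % self.cycle_len) / self.cycle_len
        self.model.optimizer.learning_rate.assign(self.max_lr / 2 * (math.cos(math.pi * t) + 1))

    def on_epoch_end(self, epoch, logs=None):
        if (epoch + 1) % self.cycle_len == 0:
            # The rate has just annealed to its minimum: keep this local minimum.
            self.snapshots.append(self.model.get_weights())

# Usage: model.fit(X, y_onehot, epochs=50, callbacks=[SnapshotSaver(50, 5, 0.01)], verbose=0)
```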
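
Finally, averaging model parameters rather than predictions: given models with identical architectures (for example, snapshots loaded back into models, or the `members` list above), build one model whose weights are the element-wise mean. `average_weights` is a hypothetical helper, not a library function.

```python
# Minimal sketch: average model weights ensemble (identical architectures required).
import numpy as np
from tensorflow import keras

def average_weights(members):  # hypothetical helper, not a library function
    all_weights = [m.get_weights() for m in members]
    # Element-wise mean of each weight array across the member models.
    avg = [np.mean(np.stack(arrays), axis=0) for arrays in zip(*all_weights)]
    model = keras.models.clone_model(members[0])  # same architecture, fresh weights
    model.set_weights(avg)
    return model

avg_model = average_weights(members)  # `members` from the model-averaging sketch
```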