This post explores how the parameters of Torch's stochastic gradient descent optimizer affect model training. It works through a toy linear-regression problem, visualizes the loss function, and shows the impact of parameters such as momentum, weight decay, dampening, and Nesterov momentum.
7 min read · From towardsdatascience.com
Table of contents
- Visualizing Gradient Descent Parameters in Torch
- Toy Problem
- Visualizing the Loss Function
- Visualizing the Other Parameters
- References
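The parameters the post visualizes are all exposed directly on `torch.optim.SGD`. As a minimal sketch of the kind of toy setup described, here is a linear regression fit with SGD; the data, hyperparameter values, and model are illustrative assumptions, not taken from the article itself.

```python
import torch

torch.manual_seed(0)

# Toy data: a noisy line with true slope 3 and intercept 1 (illustrative values).
x = torch.linspace(-1, 1, 100).unsqueeze(1)
y = 3.0 * x + 1.0 + 0.1 * torch.randn_like(x)

model = torch.nn.Linear(1, 1)

# SGD exposes every parameter the post discusses.
# Note: nesterov=True requires momentum > 0 and dampening == 0 in torch.
opt = torch.optim.SGD(
    model.parameters(),
    lr=0.1,
    momentum=0.9,       # velocity term that smooths successive updates
    dampening=0.0,      # scales the gradient's contribution to the velocity
    weight_decay=1e-4,  # L2 penalty pulling weights toward zero
    nesterov=True,      # "look-ahead" variant of momentum
)

loss_fn = torch.nn.MSELoss()
for step in range(200):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

# After training, the fitted slope and intercept approach the true values.
print(model.weight.item(), model.bias.item())
```

Rerunning this loop while sweeping one parameter at a time (e.g. momentum in {0.0, 0.5, 0.9}) and plotting the loss per step reproduces the kind of comparison the post's visualizations make.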