Intro to Scientific Computing
Stochastic gradient descent (SGD) is an optimization algorithm that minimizes an objective function by iteratively updating parameters in the direction opposite the gradient of the loss. Unlike standard (batch) gradient descent, which computes the gradient over the entire dataset, SGD updates the parameters using a single data point at each iteration. Each update is therefore much cheaper, which speeds up training on large datasets, and the noise it introduces can help the iterates escape shallow local minima.
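To make the update rule concrete, here is a minimal sketch of one-sample SGD for a least-squares linear model, following the standard rule w ← w − η ∇ℓᵢ(w) where ℓᵢ is the loss on the single sampled point. The function name, learning rate, epoch count, and synthetic data below are illustrative assumptions, not part of the original definition.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=10, seed=0):
    """Fit y ~ X @ w + b with squared-error loss using one-sample SGD updates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):   # visit the data points in a random order
            pred = X[i] @ w + b
            err = pred - y[i]          # derivative of 0.5 * err**2 w.r.t. pred
            w -= lr * err * X[i]       # update from a single data point's gradient
            b -= lr * err
    return w, b

# Usage on synthetic data (hypothetical example)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 3.0
w, b = sgd_linear_regression(X, y, lr=0.05, epochs=20)
```

Because each update uses only one point, the parameter trajectory is noisy; in practice the learning rate is often decayed over epochs so the iterates settle near a minimum.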