Stochastic gradient descent (SGD) is an optimization algorithm used to minimize the loss function in machine learning and statistical modeling by iteratively adjusting model parameters in the direction opposite the gradient of the loss. Unlike standard (batch) gradient descent, which computes the gradient over the entire dataset, SGD updates parameters using a single randomly selected sample (or a small random subset, in the mini-batch variant). Each update is therefore much cheaper on large datasets, and although the steps are noisier, training often makes faster overall progress toward a good solution.
congrats on reading the definition of stochastic gradient descent. now let's actually learn it.
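As a minimal sketch of the idea, here is single-sample SGD fitting a one-variable linear regression with squared-error loss. The update rule is the one from the definition, parameter ← parameter − learning_rate × gradient; the synthetic data, learning rate, and epoch count are illustrative choices, not part of any standard setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 2 plus a little noise (illustrative, not real data)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.1, size=200)

w, b = 0.0, 0.0          # model parameters, initialized at zero
learning_rate = 0.1      # step size, often written as eta
n_epochs = 20

for epoch in range(n_epochs):
    # Visit the samples in a fresh random order each epoch --
    # this random selection is the "stochastic" part of SGD
    for i in rng.permutation(len(X)):
        x_i, y_i = X[i, 0], y[i]
        error = (w * x_i + b) - y_i
        # Gradient of the squared error for this single sample
        grad_w = 2 * error * x_i
        grad_b = 2 * error
        # Update rule: parameter <- parameter - learning_rate * gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(f"learned w={w:.2f}, b={b:.2f}  (true values: 3, 2)")
```

Notice that each update touches only one sample instead of all 200, which is exactly why SGD scales to datasets where a full-batch gradient would be too expensive to compute at every step.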