Stochastic gradient descent (SGD) is an optimization algorithm for minimizing the loss function of machine learning models, particularly neural networks. Instead of computing the gradient over the entire dataset, it updates the model's weights iteratively using a single randomly chosen example or, more commonly, a small mini-batch. Each update is therefore much cheaper than a full-batch step, which often leads to faster convergence in practice, and the noise in the gradient estimates can help the optimizer escape shallow local minima, making SGD a popular choice for training complex models.
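
To make the update rule concrete, here is a minimal sketch of mini-batch SGD applied to linear regression with a mean squared error loss, using NumPy. The function name, parameter names, and synthetic data are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def sgd_linear_regression(X, y, learning_rate=0.01, batch_size=32, epochs=100, seed=0):
    """Fit a linear model y ≈ X @ w + b by mini-batch SGD on the MSE loss."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0

    for _ in range(epochs):
        # Shuffle once per epoch so each mini-batch is a random subset of the data.
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            X_batch, y_batch = X[idx], y[idx]

            # Gradient of the MSE loss computed on the mini-batch only,
            # rather than on the full dataset.
            errors = X_batch @ w + b - y_batch
            grad_w = 2.0 * X_batch.T @ errors / len(idx)
            grad_b = 2.0 * errors.mean()

            # Update step: move the parameters against the gradient.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b

    return w, b


if __name__ == "__main__":
    # Synthetic data: y = 3*x0 - 2*x1 + 1 plus a little noise.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(1000, 2))
    y = X @ np.array([3.0, -2.0]) + 1.0 + rng.normal(scale=0.1, size=1000)
    w, b = sgd_linear_regression(X, y, learning_rate=0.05, epochs=50)
    print("learned weights:", w, "bias:", b)  # should be close to [3, -2] and 1
```

The key design point is that each parameter update touches only `batch_size` rows of the data, so the per-step cost stays constant as the dataset grows; the shuffled mini-batches are what make the gradient estimates stochastic.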