Machine learning algorithms are the backbone of modern forecasting—whether you're predicting stock prices, classifying customer behavior, or modeling time-dependent phenomena. You're being tested not just on what these algorithms do, but on when to apply them, why they work, and how they compare to one another. The exam expects you to understand the trade-offs: bias vs. variance, interpretability vs. accuracy, computational cost vs. predictive power.
Don't just memorize algorithm names and definitions. Know what problem type each algorithm solves (regression vs. classification vs. time series), understand the underlying mathematical intuition, and recognize when one approach outperforms another. If you can explain why random forests reduce overfitting compared to single decision trees, or when logistic regression beats a neural network, you're thinking like a data scientist—and that's exactly what the exam rewards.
Regression-based algorithms model continuous relationships between variables. They assume some functional form connecting inputs to outputs and estimate parameters to minimize prediction error.
Compare: Linear Regression vs. Logistic Regression—both estimate coefficients for predictor variables, but linear regression predicts continuous values while logistic regression predicts probabilities for categorical outcomes. If an FRQ asks about predicting "whether" something happens, logistic is your answer; if it asks "how much," think linear.
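If you want to see the "how much" vs. "whether" distinction in code, here's a minimal sketch. The library (scikit-learn) and the synthetic data are assumptions for illustration, not part of the guide.

```python
# Minimal sketch: "how much" (linear) vs. "whether" (logistic).
# Assumes scikit-learn is available; the data here is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # three predictor variables

# Continuous target: "how much" -> linear regression
y_amount = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=200)
lin = LinearRegression().fit(X, y_amount)
print(lin.predict(X[:2]))                # continuous predictions

# Binary target: "whether" -> logistic regression
y_event = (y_amount > 0).astype(int)
log = LogisticRegression().fit(X, y_event)
print(log.predict_proba(X[:2]))          # class probabilities, not raw values
```

Both models estimate one coefficient per predictor; the difference is what those coefficients feed into (a raw prediction vs. a probability through the logistic function).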
Tree-based algorithms partition the feature space into regions using sequential decision rules. They're intuitive because they mirror human decision-making: "If X, then Y."
Compare: Random Forests vs. Gradient Boosting—both are ensemble tree methods, but random forests build trees independently (parallel) while gradient boosting builds them sequentially (each correcting prior errors). Gradient boosting typically achieves higher accuracy but requires more careful tuning and is more prone to overfitting.
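A minimal sketch of the parallel vs. sequential contrast, assuming scikit-learn and a synthetic classification problem (neither is specified in the guide):

```python
# Minimal sketch contrasting the two ensemble strategies.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Random forest: many trees grown independently on bootstrap samples,
# predictions averaged -> variance reduction.
rf = RandomForestClassifier(n_estimators=200, random_state=0)

# Gradient boosting: shallow trees added sequentially, each fit to the
# errors of the ensemble so far -> often higher accuracy, but needs tuning
# (learning_rate, n_estimators, max_depth) to avoid overfitting.
gb = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05,
                                max_depth=3, random_state=0)

for name, model in [("random forest", rf), ("gradient boosting", gb)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```

The hyperparameter values shown are illustrative defaults, not recommendations for any particular dataset.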
Instance-based and probabilistic algorithms classify data points based on their proximity to other points or their likelihood under probabilistic assumptions. They rely on measuring similarity or computing conditional probabilities.
Compare: KNN vs. Naive Bayes—both are simple, interpretable classifiers, but KNN is instance-based (stores all training data) while Naive Bayes is model-based (stores probability estimates). KNN struggles with high dimensions; Naive Bayes handles them well but assumes feature independence.
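Here's a minimal sketch of the instance-based vs. model-based distinction, again assuming scikit-learn and synthetic data:

```python
# Minimal sketch: instance-based (KNN) vs. model-based (Naive Bayes).
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# KNN keeps the entire training set and classifies by majority vote among
# the k nearest points -- distances become less informative as the number
# of dimensions grows.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

# Gaussian Naive Bayes keeps only per-class means/variances for each
# feature and assumes features are conditionally independent given the class.
nb = GaussianNB().fit(X_tr, y_tr)

print("KNN accuracy:", knn.score(X_te, y_te))
print("Naive Bayes accuracy:", nb.score(X_te, y_te))
```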
Margin-based and neural network algorithms find complex decision boundaries in high-dimensional spaces. They're powerful but often sacrifice interpretability for predictive accuracy.
Compare: SVM vs. Neural Networks—both can model non-linear relationships, but SVMs work well with smaller datasets and clearer margins while neural networks excel with massive datasets and complex patterns. SVMs are more interpretable; neural networks are more flexible but act as "black boxes."
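As a rough sketch of the comparison, the snippet below fits a kernel SVM and a small feed-forward network on the same non-linear toy problem. The dataset, architecture, and hyperparameters are all assumptions chosen for illustration:

```python
# Minimal sketch: kernel SVM vs. a small neural network on the same data.
from sklearn.datasets import make_moons
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=400, noise=0.25, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

# RBF-kernel SVM: finds a maximum-margin boundary; the support vectors it
# retains indicate which points define that boundary.
svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)

# Small feed-forward neural network: flexible, but its learned weights are
# hard to interpret directly ("black box").
mlp = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000,
                    random_state=2).fit(X_tr, y_tr)

print("SVM accuracy:", svm.score(X_te, y_te))
print("MLP accuracy:", mlp.score(X_te, y_te))
```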
Time series algorithms specifically handle temporally ordered data where past values inform future predictions. They decompose patterns into trend, seasonality, and residual components.
Compare: ARIMA vs. Machine Learning approaches—ARIMA explicitly models temporal structure and works well with univariate series, while ML methods (random forests, neural networks) can incorporate multiple features but may miss temporal dependencies without feature engineering. For pure time series forecasting, ARIMA often provides a strong baseline.
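A minimal sketch of an ARIMA baseline is below. It assumes statsmodels is available, uses a synthetic trend-plus-noise series, and picks an illustrative (p, d, q) order; none of these choices come from the guide.

```python
# Minimal sketch of an ARIMA baseline for a univariate series.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
t = np.arange(120)
series = pd.Series(0.5 * t + rng.normal(scale=2.0, size=t.size))

# ARIMA(1, 1, 1): one autoregressive term, first differencing to remove
# the trend, and one moving-average term.
model = ARIMA(series, order=(1, 1, 1)).fit()
forecast = model.forecast(steps=12)       # predict the next 12 periods
print(forecast.head())
```

In practice you would pick the order from ACF/PACF plots or an information criterion rather than fixing it by hand, and switch to SARIMA when the series has a seasonal component.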
| Concept | Best Examples |
|---|---|
| Continuous outcome prediction | Linear Regression, Neural Networks, KNN (regression mode) |
| Binary/multiclass classification | Logistic Regression, SVM, Naive Bayes, KNN |
| Ensemble methods (variance reduction) | Random Forests, Gradient Boosting (XGBoost) |
| High interpretability | Linear Regression, Logistic Regression, Decision Trees |
| Text and high-dimensional data | Naive Bayes, SVM with kernels |
| Time-dependent forecasting | ARIMA, SARIMA |
| Non-linear pattern capture | Neural Networks, SVM (kernel), Decision Trees |
| Small dataset performance | Naive Bayes, Logistic Regression, SVM |
Which two algorithms are both ensemble methods, and what distinguishes how they combine individual models (parallel vs. sequential)?
If you need to predict whether a customer will churn (yes/no) and explain to stakeholders exactly which factors drove the prediction, which algorithm would you choose and why?
Compare and contrast KNN and Naive Bayes: What assumptions does each make, and in what data scenarios would you prefer one over the other?
A dataset shows strong weekly seasonality in sales data. Which algorithm family is specifically designed for this, and what parameter would you adjust to capture the seasonal pattern?
An FRQ asks you to recommend an algorithm for a dataset with 50 features, 500 observations, and a binary outcome. Defend your choice by explaining why high-complexity models like neural networks might underperform here.