Non-linear models: Types vs Applications
Types of non-linear models
Non-linear models describe relationships between variables that don't follow a straight line. They're the right tool when the rate of change between your independent and dependent variables isn't constant, which turns out to be extremely common in real-world data.
Exponential models apply when the rate of change of the dependent variable is proportional to its current value. The general form is:

y = a · b^x

where a is the initial value, b is the growth or decay factor, and x is the independent variable.
- Exponential growth (b > 1): The rate of change increases over time. Think population growth or compound interest. If b = 1.03, the quantity grows by 3% per unit of time.
- Exponential decay (0 < b < 1): The rate of change decreases over time. Radioactive decay and drug elimination from the body follow this pattern.
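The growth and decay cases above can be sketched directly from the general form; the function name and numbers here are illustrative, not from the text:

```python
def exponential(a, b, x):
    """Evaluate the exponential form y = a * b**x."""
    return a * b ** x

# Growth: b = 1.03 means the quantity grows by 3% per unit of time.
after_ten = exponential(100, 1.03, 10)

# Decay: b = 0.5 halves the quantity each period.
print(exponential(100, 0.5, 3))  # 100 * 0.5**3 = 12.5
```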
Logarithmic models are the inverse of exponential models. They capture situations where the dependent variable changes quickly at first, then levels off as the independent variable increases. The form is:

y = a + b · ln(x)

where a is the y-intercept and b is the slope. A classic example: the relationship between body mass and metabolic rate in animals, where metabolic rate rises steeply for small animals but flattens out for larger ones.
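A quick sketch (with made-up values of a and b) shows the "fast at first, then levels off" behavior: equal multiplicative steps in x add the same fixed amount to y, so gains shrink relative to x:

```python
import math

def log_model(a, b, x):
    """Evaluate the logarithmic form y = a + b * ln(x)."""
    return a + b * math.log(x)

# Each tenfold increase in x adds the same amount (b * ln 10) to y.
step1 = log_model(2, 3, 10) - log_model(2, 3, 1)    # x: 1 -> 10
step2 = log_model(2, 3, 100) - log_model(2, 3, 10)  # x: 10 -> 100
```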
Polynomial models handle curvilinear relationships using a polynomial equation of degree n:

y = a₀ + a₁x + a₂x² + … + aₙxⁿ
- Quadratic models (degree 2) describe relationships with a single turning point, like the trajectory of a thrown object or profit as a function of production level.
- Higher-degree polynomials can capture more complex curves, but they come with a real tradeoff: they're prone to overfitting and become harder to interpret as the degree increases.
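The quadratic case with its single turning point can be sketched as follows; the coefficients are illustrative:

```python
def quadratic(a, b, c, x):
    """Evaluate y = a*x**2 + b*x + c (a degree-2 polynomial)."""
    return a * x ** 2 + b * x + c

# The single turning point sits at x = -b / (2*a): a maximum when a < 0
# (e.g. peak profit), a minimum when a > 0.
a, b, c = -1, 4, 0
vertex_x = -b / (2 * a)               # 2.0
peak = quadratic(a, b, c, vertex_x)   # 4.0, the maximum of this curve
```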
Applying non-linear models to real-world data
Fitting a non-linear model to data involves three main steps:
- Select the model type. Look at the observed pattern in your data (scatter plots help) and consider what the underlying theory suggests. Does the relationship look like it accelerates? Levels off? Has a peak?
- Estimate the parameters. Methods like least squares regression or maximum likelihood estimation find parameter values that minimize the difference between observed and predicted values.
- Interpret the results in context. This is where the model type matters:
- In an exponential model, the growth factor b tells you the rate of change. A population model with b = 1.05 means a 5% increase per unit of time.
- In a logarithmic model, the slope b represents the change in y associated with a one-unit increase in ln(x). The practical meaning depends on your specific variables.
- In a polynomial model, each coefficient captures the effect of x at a different order. For a quadratic, the coefficient of the x² term determines the direction and steepness of the curvature.
Once fitted, the model can generate predictions for new values of the independent variable. Just be cautious about extrapolation: predictions outside the range of your original data can be unreliable, especially with polynomials.
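The parameter-estimation step can be sketched for the exponential model using a common trick: taking logs of y = a · b^x gives ln y = ln a + x · ln b, a straight line that ordinary least squares can fit. This is one simple approach among several (general non-linear least squares is another); the data below is made up and noise-free:

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * b**x by ordinary least squares on (x, ln y)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    slope = (sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_l - slope * mean_x
    return math.exp(intercept), math.exp(slope)  # (a, b)

xs = [0, 1, 2, 3, 4]
ys = [2 * 1.5 ** x for x in xs]         # generated from y = 2 * 1.5**x
a_hat, b_hat = fit_exponential(xs, ys)  # recovers a = 2, b = 1.5
```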

Logistic regression for binary outcomes
Properties of logistic regression
Logistic regression predicts the probability of a binary outcome (success/failure, yes/no, present/absent) based on one or more predictor variables. Unlike ordinary regression, it doesn't predict a continuous value. Instead, it maps a linear combination of predictors onto a probability between 0 and 1 using the logistic function:

p = 1 / (1 + e^−(β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ))

Here, p is the predicted probability, β₀ is the intercept, β₁ through βₙ are the coefficients for predictor variables x₁ through xₙ, and e is the base of the natural logarithm.
The S-shaped (sigmoid) curve of this function is what keeps predicted probabilities bounded between 0 and 1, which a standard linear model can't guarantee.
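The bounding behavior is easy to see in a small sketch of the logistic function (coefficient values here are arbitrary illustrations):

```python
import math

def predict_probability(intercept, coefs, xs):
    """Map a linear combination of predictors onto (0, 1) via the logistic function."""
    z = intercept + sum(c * x for c, x in zip(coefs, xs))
    return 1 / (1 + math.exp(-z))

p_low = predict_probability(-4, [0.5], [1])   # strongly negative z -> p near 0
p_mid = predict_probability(0, [0.5], [0])    # z = 0 -> p = 0.5 exactly
p_high = predict_probability(4, [0.5], [1])   # strongly positive z -> p near 1
```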
Parameter estimation uses maximum likelihood estimation (MLE), which finds the coefficient values that make the observed data most probable given the model. This differs from ordinary least squares used in standard linear regression.
Interpreting coefficients in logistic regression centers on the odds ratio. For a one-unit increase in a predictor variable (holding all others constant):
- An odds ratio greater than 1 means the odds of the outcome increase.
- An odds ratio less than 1 means the odds decrease.
- An odds ratio of exactly 1 means no effect.
For example, if a predictor has an odds ratio of 2.5, a one-unit increase in that predictor multiplies the odds of the outcome by 2.5.
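The odds ratio is the exponential of the fitted coefficient, so the 2.5 example above corresponds to a coefficient of ln(2.5) ≈ 0.916. A minimal sketch (generic illustration, not a fitted model):

```python
import math

def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in a predictor: e**coefficient."""
    return math.exp(coefficient)

or_example = odds_ratio(math.log(2.5))  # 2.5

# Effects are multiplicative on the odds scale: two one-unit increases
# multiply the odds by 2.5 twice, i.e. 6.25 overall.
```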

Applications of logistic regression
Logistic regression works with both categorical and continuous predictors, making it versatile across many fields:
- Medical diagnosis: Predicting whether a disease is present or absent based on patient characteristics and test results.
- Marketing: Estimating the likelihood a customer will purchase a product based on demographic and behavioral data.
- Credit risk assessment: Predicting the probability of loan default based on an applicant's financial profile.
The framework also extends beyond binary outcomes:
- Multinomial logistic regression handles outcomes with more than two unordered categories (e.g., choosing among three brands).
- Ordinal logistic regression handles ordered categories (e.g., rating something as low, medium, or high).
These extensions modify the link function and how coefficients are interpreted, but the core logic remains the same.
Goodness-of-fit and predictive power of non-linear models
Evaluating goodness-of-fit
Goodness-of-fit tells you how well your model captures the underlying pattern in the data and how much variability it explains.
Coefficient of determination (R²) is the most familiar measure. It represents the proportion of variance in the dependent variable explained by the model. However, use it with caution for non-linear models. The standard R² doesn't always have the same clean interpretation it has in linear regression, and some software packages compute it differently for non-linear fits.
Residual analysis is often more informative. Residuals are the differences between observed and predicted values. For a well-fitting model:
- Residuals should be randomly scattered around zero with no systematic pattern.
- Plotting residuals against predicted values (or against the independent variable) can reveal problems:
- Fanning patterns suggest heteroscedasticity (non-constant variance).
- Curved patterns suggest the model hasn't fully captured the non-linearity.
- Isolated extreme points may be outliers or influential observations that disproportionately affect the fit.
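The basic computation behind a residual plot is simple; the observed and predicted values below are made up for illustration:

```python
# Residuals are observed minus predicted; a well-fitting model leaves them
# centered near zero with no systematic pattern.
observed  = [2.1, 3.9, 6.2, 7.8, 10.1]
predicted = [2.0, 4.0, 6.0, 8.0, 10.0]
residuals = [o - p for o, p in zip(observed, predicted)]
mean_residual = sum(residuals) / len(residuals)

# In practice you would plot residuals against predicted values and look for
# fanning (heteroscedasticity) or curvature (missed non-linearity).
```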
Assessing predictive power
A model can fit the training data well but still predict poorly on new data. Cross-validation techniques test this directly:
- Split the data into training and testing sets.
- Fit the model on the training set.
- Evaluate performance on the held-out testing set.
Common approaches include k-fold cross-validation (splitting data into subsets and rotating which one is the test set) and leave-one-out cross-validation (each observation takes a turn as the test set).
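The index bookkeeping behind k-fold cross-validation can be sketched as below (contiguous folds, no shuffling, purely illustrative):

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    # Distribute n observations across k folds as evenly as possible.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    indices, start = list(range(n)), 0
    for size in sizes:
        test = indices[start:start + size]             # this fold is held out
        train = indices[:start] + indices[start + size:]  # the rest trains
        start += size
        yield train, test

splits = list(k_fold_splits(10, 5))  # 5 folds of 2 observations each
```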
Prediction error metrics for continuous outcomes include:
- Mean squared error (MSE): Average of squared residuals on the test set.
- Root mean squared error (RMSE): Square root of MSE, in the same units as the dependent variable.
- Mean absolute error (MAE): Average of absolute residuals, less sensitive to large errors than MSE.
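The three metrics above, sketched on a tiny made-up test set with one large error, show how MSE penalizes that error more heavily than MAE:

```python
import math

def mse(obs, pred):
    """Mean squared error: average of squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean squared error: same units as the dependent variable."""
    return math.sqrt(mse(obs, pred))

def mae(obs, pred):
    """Mean absolute error: less sensitive to large errors than MSE."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

obs, pred = [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]  # one residual of size 2
# MSE = 4/3, MAE = 2/3: squaring inflates the single large error.
```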
For logistic regression specifically, the area under the ROC curve (AUC-ROC) measures how well the model discriminates between the two outcome classes. An AUC of 0.5 means the model is no better than random guessing; an AUC of 1.0 means perfect discrimination.
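One way to compute AUC-ROC is as the probability that a randomly chosen positive case gets a higher score than a randomly chosen negative case, with ties counting half. A pure-Python sketch on made-up labels and scores:

```python
def auc_roc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three of the four positive/negative pairs are ranked correctly -> AUC 0.75.
auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```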
When comparing multiple non-linear models, predictive power matters, but so do interpretability, parsimony (simpler models are preferred when performance is similar), and theoretical justification. The best-fitting model isn't always the most useful one if it can't be explained or doesn't align with domain knowledge.