Survival and hazard functions are the core tools actuaries use to model how long people live and how risk changes over time. The survival function tells you the probability someone is still alive at time $t$, while the hazard function captures the instantaneous risk of death at that moment. Together, they form the mathematical backbone for building mortality tables, pricing life insurance, and valuing annuities.
Definition of survival function
The survival function $S(t)$ gives the probability that an individual survives beyond time $t$. If $T$ is the random variable representing the time of death (or failure), then:

$$S(t) = P(T > t)$$
You can think of it as answering the question: "What's the chance this person is still alive after $t$ years?"
Relationship to the distribution function
The survival function is simply the complement of the cumulative distribution function $F(t)$:

$$S(t) = 1 - F(t)$$

where $F(t) = P(T \le t)$ is the probability of death occurring at or before time $t$. The probability density function $f(t)$ connects to both through differentiation:

$$f(t) = \frac{dF(t)}{dt} = -\frac{dS(t)}{dt}$$

This means the density function equals the negative derivative of the survival function. Knowing any one of $S(t)$, $F(t)$, or $f(t)$ lets you derive the other two.
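These identities can be sketched numerically. A minimal check, assuming a hypothetical exponential lifetime with rate 0.05 (the names `S`, `F`, `f` are illustrative only):

```python
import math

lam = 0.05  # hypothetical constant rate, for illustration only

def S(t):   # survival function: exp(-lam * t)
    return math.exp(-lam * t)

def F(t):   # cumulative distribution: the complement of S
    return 1.0 - S(t)

def f(t):   # density: lam * exp(-lam * t)
    return lam * math.exp(-lam * t)

t = 10.0
print(S(t) + F(t))  # S and F always sum to 1

# f(t) should match the negative derivative of S, estimated numerically:
dt = 1e-6
numeric_deriv = -(S(t + dt) - S(t - dt)) / (2 * dt)
print(abs(numeric_deriv - f(t)))  # ≈ 0
```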
Properties of survival function
Three properties define a valid survival function. If any of these fail, you don't have a legitimate model.
Non-increasing
$S(t)$ never increases over time. For any $t_1 < t_2$, we have $S(t_1) \ge S(t_2)$. This makes intuitive sense: the probability of surviving past 80 years can't be higher than the probability of surviving past 70 years.
Right-continuous
The survival function is right-continuous at every point:

$$\lim_{u \to t^{+}} S(u) = S(t)$$
This technical property ensures the function is well-defined, particularly when dealing with discrete jumps in the distribution (such as in empirical survival data).
Boundary conditions
- At time zero: $S(0) = 1$. Everyone is alive at the start.
- As time goes to infinity: $\lim_{t \to \infty} S(t) = 0$. Eventually, everyone dies.
These two conditions anchor the survival function between 1 and 0.
Definition of hazard function
The hazard function $h(t)$ measures the instantaneous rate of death at time $t$, given survival up to that point. Unlike $S(t)$, which gives a probability, $h(t)$ is a rate and can exceed 1.
Instantaneous failure rate
Formally, the hazard function is defined as:

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$$
In words: take a tiny interval starting at $t$, find the conditional probability of dying in that interval (given you've survived to $t$), and divide by the interval width. The limit gives you the instantaneous failure rate.
An equivalent and often more useful formula is:

$$h(t) = \frac{f(t)}{S(t)}$$

This ratio of the density to the survival function is frequently the easiest way to compute $h(t)$ in practice.
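The two definitions agree, which can be checked numerically. A minimal sketch, assuming a hypothetical model $S(t) = e^{-0.01 t^2}$, whose true hazard is $h(t) = 0.02t$:

```python
import math

# Hypothetical model: S(t) = exp(-0.01 t^2), so f(t) = 0.02 t exp(-0.01 t^2)
# and the true hazard is h(t) = 0.02 t.
def S(t):
    return math.exp(-0.01 * t * t)

def f(t):
    return 0.02 * t * math.exp(-0.01 * t * t)

def hazard_from_limit(t, dt=1e-6):
    # The defining limit: P(die in [t, t + dt) | alive at t) / dt.
    return (S(t) - S(t + dt)) / (S(t) * dt)

def hazard_from_ratio(t):
    # The equivalent ratio form: f(t) / S(t).
    return f(t) / S(t)

print(hazard_from_limit(5.0))   # ≈ 0.1
print(hazard_from_ratio(5.0))   # = 0.02 * 5 = 0.1
```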
Relationship to survival function
The hazard and survival functions are linked by:

$$h(t) = -\frac{d}{dt} \ln S(t)$$

Integrating both sides from 0 to $t$ gives the cumulative hazard function:

$$H(t) = \int_0^t h(u)\,du = -\ln S(t)$$

And inverting that relationship recovers the survival function:

$$S(t) = e^{-H(t)} = \exp\!\left(-\int_0^t h(u)\,du\right)$$

These relationships are central. If you know any one of $S(t)$, $f(t)$, $h(t)$, or $H(t)$, you can derive all the others.
Properties of hazard function
Non-negativity
$h(t) \ge 0$ for all $t$. Since $h(t) = f(t)/S(t)$ and both the density and the survival function are non-negative, the hazard rate must be too.
Cumulative hazard function
The cumulative hazard $H(t) = \int_0^t h(u)\,du$ accumulates risk over time. It's non-decreasing (risk can only pile up, never reverse) and satisfies:

$$H(0) = 0, \qquad \lim_{t \to \infty} H(t) = \infty$$

The second condition follows from the requirement that $S(t) \to 0$ as $t \to \infty$.
Relationships between functions
Here's a summary of how to move between the key functions. Knowing one determines all the others:
| Starting from | To get $S(t)$ | To get $f(t)$ | To get $h(t)$ |
|---|---|---|---|
| $S(t)$ | — | $f(t) = -S'(t)$ | $h(t) = -S'(t)/S(t)$ |
| $f(t)$ | $S(t) = \int_t^\infty f(u)\,du$ | — | $h(t) = f(t) \big/ \int_t^\infty f(u)\,du$ |
| $h(t)$ | $S(t) = \exp\!\left(-\int_0^t h(u)\,du\right)$ | $f(t) = h(t)\exp\!\left(-\int_0^t h(u)\,du\right)$ | — |

The key derivation steps:
- Hazard from survival: Differentiate $-\ln S(t)$ with respect to $t$.
- Survival from hazard: Integrate $h(u)$ from 0 to $t$, negate, and exponentiate.
- Density from survival: Take the negative derivative of $S(t)$.
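The survival-from-hazard step can be sketched numerically. Assuming a hypothetical linear hazard $h(t) = 0.02t$, the closed form is $S(t) = e^{-0.01 t^2}$, and numerical integration recovers it:

```python
import math

def h(t):
    # Hypothetical linear hazard; true survival is S(t) = exp(-0.01 t^2).
    return 0.02 * t

def survival_from_hazard(t, steps=10_000):
    # Trapezoidal rule for H(t) = integral of h from 0 to t, then exponentiate.
    du = t / steps
    H = sum(0.5 * (h(i * du) + h((i + 1) * du)) * du for i in range(steps))
    return math.exp(-H)

recovered = survival_from_hazard(5.0)
closed_form = math.exp(-0.01 * 5.0 ** 2)
print(abs(recovered - closed_form))  # ≈ 0
```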
Common survival distributions
Exponential distribution
The simplest survival model. It has a constant hazard rate:
- Hazard: $h(t) = \lambda$ for all $t$, where $\lambda > 0$
- Survival: $S(t) = e^{-\lambda t}$
- Mean lifetime: $E[T] = 1/\lambda$
The exponential distribution has the memoryless property: the probability of surviving an additional $t$ years doesn't depend on how long you've already survived. This makes it unrealistic for human mortality (older people clearly face higher risk), but it's useful as a baseline model and in contexts where aging effects are negligible.
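The memoryless property is easy to verify numerically (hypothetical rate $\lambda = 0.05$, so the mean lifetime is 20 years):

```python
import math

lam = 0.05   # hypothetical constant hazard rate; mean lifetime = 1/lam = 20

def S(t):
    return math.exp(-lam * t)

# Memoryless: P(T > s + t | T > s) = S(s + t) / S(s) = S(t), for any s.
s, t = 30.0, 10.0
conditional = S(s + t) / S(s)
print(conditional, S(t))  # both equal exp(-0.5)
```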
Weibull distribution
The Weibull generalizes the exponential by allowing the hazard to change over time:
- Hazard: $h(t) = \dfrac{k}{\lambda}\left(\dfrac{t}{\lambda}\right)^{k-1}$
- Survival: $S(t) = \exp\!\left(-\left(\dfrac{t}{\lambda}\right)^{k}\right)$

where $\lambda > 0$ is the scale parameter and $k > 0$ is the shape parameter. The shape parameter controls the hazard behavior:
- $k < 1$: decreasing hazard (e.g., infant mortality, early component failure)
- $k = 1$: constant hazard (reduces to exponential)
- $k > 1$: increasing hazard (e.g., wear-out, aging)
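A small sketch of how the shape parameter drives the hazard's direction (illustrative parameter values):

```python
def weibull_hazard(t, lam, k):
    # h(t) = (k / lam) * (t / lam) ** (k - 1)
    return (k / lam) * (t / lam) ** (k - 1)

lam = 10.0   # illustrative scale parameter
# Shape < 1: hazard falls with t; shape = 1: flat; shape > 1: hazard rises.
falling = weibull_hazard(1.0, lam, 0.5) > weibull_hazard(2.0, lam, 0.5)
flat = weibull_hazard(1.0, lam, 1.0) == weibull_hazard(2.0, lam, 1.0)
rising = weibull_hazard(1.0, lam, 2.0) < weibull_hazard(2.0, lam, 2.0)
print(falling, flat, rising)
```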
Gompertz-Makeham distribution
This is the go-to model for adult human mortality. The hazard function is:

$$h(x) = A + B c^{x}$$

where $A \ge 0$ represents age-independent (accidental) mortality, $B > 0$ is the baseline level of age-dependent mortality, and $c > 1$ controls how fast mortality increases with age. The corresponding survival function is:

$$S(x) = \exp\!\left(-Ax - \frac{B}{\ln c}\left(c^{x} - 1\right)\right)$$

When $A = 0$, this reduces to the pure Gompertz model. The exponential growth term $Bc^{x}$ captures the empirical observation that human mortality roughly doubles every 7-8 years in adulthood.
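A minimal sketch with illustrative (not calibrated) parameters, showing the age-dependent part of the hazard doubling every $\ln 2 / \ln c$ years:

```python
import math

# Illustrative Gompertz-Makeham parameters (not from a real mortality table):
A, B, c = 0.0005, 0.00003, 1.1

def hazard(x):
    return A + B * c ** x

def survival(x):
    # Closed form: S(x) = exp(-A x - B (c^x - 1) / ln c)
    return math.exp(-A * x - B * (c ** x - 1) / math.log(c))

# The age-dependent term B c^x doubles every ln 2 / ln c years:
doubling_years = math.log(2) / math.log(c)
ratio = (hazard(60 + doubling_years) - A) / (hazard(60) - A)
print(round(doubling_years, 1), round(ratio, 3))  # 7.3 2.0
```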
Estimating survival functions
Kaplan-Meier estimator
The Kaplan-Meier (product-limit) estimator is the standard nonparametric method for estimating $S(t)$ from observed data:

$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$

where:
- $t_1 < t_2 < \cdots$ = observed failure times (in order)
- $d_i$ = number of deaths at time $t_i$
- $n_i$ = number of individuals still at risk just before $t_i$
The estimator produces a step function that drops at each observed death time. Its major strength is that it handles censored data (individuals who leave the study or are still alive at the end) by removing them from the risk set at the time of censoring rather than treating them as deaths.
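A minimal sketch of the product-limit computation on hypothetical toy data (not production code; real work would use a library such as `lifelines` or R's `survival`):

```python
def kaplan_meier(times, events):
    # Returns (t_i, S-hat(t_i)) at each observed death time.
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths:
            s *= 1.0 - deaths / n_at_risk   # multiply in (1 - d_i / n_i)
            curve.append((t, s))
        # Deaths and censorings at t both leave the risk set afterwards.
        removed = sum(1 for tt, _ in data if tt == t)
        n_at_risk -= removed
        i += removed
    return curve

times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 1, 0, 0]   # 1 = death, 0 = censored
km = kaplan_meier(times, events)
print(km)
```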
(Figure: example Kaplan-Meier survival curve.)
Nelson-Aalen estimator
The Nelson-Aalen estimator targets the cumulative hazard function instead:

$$\hat{H}(t) = \sum_{t_i \le t} \frac{d_i}{n_i}$$

You can then estimate the survival function as $\hat{S}(t) = e^{-\hat{H}(t)}$. For large samples, the Kaplan-Meier and Nelson-Aalen-based survival estimates are very similar, but the Nelson-Aalen estimator can be more stable in small samples.
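A matching sketch for the Nelson-Aalen estimator on hypothetical toy data, converting the cumulative hazard back to a survival estimate:

```python
import math

def nelson_aalen(times, events):
    # Returns (t_i, H-hat(t_i)) at each observed death time.
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    H = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths:
            H += deaths / n_at_risk        # add d_i / n_i
            curve.append((t, H))
        removed = sum(1 for tt, _ in data if tt == t)
        n_at_risk -= removed               # drop deaths and censorings alike
        i += removed
    return curve

times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 1, 0, 0]   # 1 = death, 0 = censored
na = nelson_aalen(times, events)
surv_at_last_death = math.exp(-na[-1][1])   # survival via S = exp(-H)
print(na, surv_at_last_death)
```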
Confidence intervals
Confidence intervals quantify uncertainty in the estimated survival probabilities. Greenwood's formula estimates the variance of the Kaplan-Meier estimator:

$$\widehat{\operatorname{Var}}\!\left[\hat{S}(t)\right] = \hat{S}(t)^{2} \sum_{t_i \le t} \frac{d_i}{n_i (n_i - d_i)}$$

A common refinement is the log-log transformation, which constructs intervals on the scale of $\ln(-\ln \hat{S}(t))$ and then transforms back. This approach keeps the confidence limits within the valid range of $[0, 1]$.
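A sketch of Greenwood's formula on hypothetical risk-set counts $(n_i, d_i)$, producing a plain (untransformed) 95% interval:

```python
import math

# Hypothetical risk-set sizes n_i and death counts d_i at each death time:
steps = [(7, 1), (6, 1), (4, 1), (3, 1)]

s_hat = 1.0      # Kaplan-Meier estimate
gw_sum = 0.0     # Greenwood accumulator: sum of d_i / (n_i (n_i - d_i))
for n, d in steps:
    s_hat *= 1.0 - d / n
    gw_sum += d / (n * (n - d))

se = math.sqrt(s_hat ** 2 * gw_sum)
# Plain 95% interval; the log-log version would instead transform
# log(-log S) and back, guaranteeing limits inside [0, 1].
lower, upper = s_hat - 1.96 * se, s_hat + 1.96 * se
print(round(s_hat, 4), round(lower, 4), round(upper, 4))
```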
Comparing survival functions
Log-rank test
The log-rank test is the most widely used nonparametric test for comparing survival curves between two or more groups. It tests the null hypothesis that all groups have identical survival functions.
At each observed death time, the test compares the observed number of deaths in each group to the expected number (calculated assuming no difference between groups). The test statistic follows a chi-square distribution with degrees of freedom equal to the number of groups minus one. The log-rank test gives equal weight to all time points, making it most powerful when hazard ratios are roughly constant over time.
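A two-group log-rank sketch on hypothetical toy data; `O`, `E`, and `V` accumulate the observed deaths, expected deaths, and hypergeometric variance for group 0:

```python
# Each record: (time, event indicator, group). Hypothetical toy data.
data = [(2, 1, 0), (4, 1, 0), (6, 0, 0), (7, 1, 0), (9, 0, 0),
        (3, 1, 1), (5, 1, 1), (8, 1, 1), (10, 1, 1), (12, 0, 1)]

death_times = sorted({t for t, e, _ in data if e == 1})
O = E = V = 0.0
for t in death_times:
    groups_at_risk = [g for tt, _, g in data if tt >= t]
    n = len(groups_at_risk)                        # total at risk just before t
    n1 = sum(1 for g in groups_at_risk if g == 0)  # group-0 members at risk
    d = sum(1 for tt, e, _ in data if tt == t and e == 1)       # deaths at t
    d1 = sum(1 for tt, e, g in data if tt == t and e == 1 and g == 0)
    O += d1                      # observed group-0 deaths
    E += d * n1 / n              # expected deaths under the null
    if n > 1:                    # hypergeometric variance contribution
        V += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

chi2 = (O - E) ** 2 / V          # compare to chi-square, 1 df (3.84 at 5%)
print(O, round(E, 3), round(chi2, 3))
```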
Wilcoxon (Breslow) test
The Wilcoxon test works like the log-rank test but weights each time point by the number of individuals at risk, $n_i$. This gives more weight to earlier time points, when the risk set is larger. Use the Wilcoxon test when you believe differences between groups are more pronounced early on. Its test statistic also follows a chi-square distribution under the null hypothesis.
Cox proportional hazards model
The Cox model is a semi-parametric regression approach that relates covariates to the hazard function:

$$h(t \mid \mathbf{x}) = h_0(t) \exp(\boldsymbol{\beta}^{\top} \mathbf{x})$$

where $h_0(t)$ is an unspecified baseline hazard and $\boldsymbol{\beta}$ is the vector of regression coefficients. The key assumption is proportional hazards: the ratio of hazard functions for any two individuals is constant over time.

The exponentiated coefficients $e^{\beta_j}$ are interpreted as hazard ratios. For example, $e^{\beta_j} = 1.5$ means a one-unit increase in covariate $x_j$ is associated with a 50% increase in the hazard of death. The Cox model is powerful because it estimates covariate effects without requiring you to specify the shape of the baseline hazard.
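The hazard-ratio reading can be illustrated directly (illustrative coefficient, not a fitted model):

```python
import math

beta = math.log(1.5)   # illustrative coefficient, not a fitted estimate

def relative_hazard(x):
    # h(t | x) / h0(t) = exp(beta * x); the baseline hazard h0(t) cancels
    # whenever two individuals are compared, so the ratio is time-constant.
    return math.exp(beta * x)

# A one-unit increase in x multiplies the hazard by exp(beta) = 1.5:
ratio = relative_hazard(2.0) / relative_hazard(1.0)
print(ratio)  # 1.5
```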
Applications in actuarial science
Life insurance pricing
Actuaries build mortality tables from estimated survival functions, then use those tables to price life insurance. The premium for a policy reflects the expected present value of the death benefit, weighted by the probability of death in each future year and discounted at an appropriate interest rate. More refined survival models (incorporating age, sex, health status, and smoking status) lead to more accurate risk classification and fairer premiums.
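A minimal pricing sketch with hypothetical one-year death probabilities and a 4% interest rate: the expected present value of a 3-year term insurance paying 1 at the end of the year of death.

```python
# Hypothetical one-year death probabilities q_{x+k} and 4% interest.
q = [0.010, 0.012, 0.014]   # P(death during year k+1 | alive at its start)
v = 1 / 1.04                # annual discount factor

epv = 0.0
p_alive = 1.0               # probability of surviving to the start of the year
for k, qk in enumerate(q):
    # Benefit of 1 paid at time k+1 if death occurs in year k+1.
    epv += v ** (k + 1) * p_alive * qk
    p_alive *= 1 - qk

print(round(epv, 5))
```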
Annuity valuation
An annuity pays a stream of income, often until the annuitant dies. Valuing an annuity requires estimating how long the annuitant will survive. The expected present value of an annuity equals the sum of each future payment multiplied by the probability of the annuitant being alive at that payment date, discounted back to the present. Longer expected survival means higher annuity values, which is why annuity pricing is so sensitive to the choice of survival model.
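A matching sketch for a 3-year annuity-due of 1 per year while alive (hypothetical mortality rates and 4% interest):

```python
q = [0.010, 0.012, 0.014]   # hypothetical one-year death probabilities
v = 1 / 1.04                # annual discount factor

epv = 0.0
p_alive = 1.0
for k in range(len(q)):
    epv += v ** k * p_alive   # payment of 1 at time k if alive
    p_alive *= 1 - q[k]

print(round(epv, 5))
```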
Pension plan funding
Pension plans promise future retirement benefits, and actuaries must ensure there are enough assets to cover those promises. Survival functions project how many retirees will be alive to collect benefits in each future year. Underestimating longevity leads to underfunding; overestimating it leads to unnecessarily high contributions. Regular actuarial valuations reassess mortality assumptions and adjust contribution rates to keep the plan solvent.