Credibility theory blends individual risk experience with broader collective data to produce better estimates of future outcomes. The Bühlmann and Bühlmann-Straub models are the workhorses of empirical credibility: they give you a principled way to calculate how much weight individual experience deserves relative to the group average, based on data volume and variability.
Credibility theory overview
Credibility theory answers a practical question: when you have claims data from one specific risk and data from a larger pool, how do you combine them? Too much reliance on individual data is noisy; too much reliance on the collective ignores real differences between risks. Credibility theory finds the sweet spot.
Bayesian vs empirical approaches
- Bayesian credibility starts with a prior distribution for the risk parameter and updates it with observed data to get a posterior estimate. This requires you to specify the prior (often using conjugate priors for tractability).
- Empirical credibility skips the prior distribution entirely. Instead, it estimates the needed structural parameters directly from the observed data using techniques like method of moments.
- The Bühlmann and Bühlmann-Straub models are empirical credibility models. You don't need to assume a specific prior; you just need enough data to estimate the key quantities.
Full vs partial credibility
- Full credibility assigns 100% weight to individual experience. This happens when the volume and stability of the individual data are large enough that the collective adds no useful information.
- Partial credibility assigns a credibility factor $Z$ between 0 and 1. The individual experience gets weight $Z$, and the collective gets weight $1 - Z$.
- The size of $Z$ depends on how much individual data you have and how variable it is relative to the collective.
Credibility premium formula
The credibility premium is a weighted average of individual and collective experience:
$$P_c = Z\,\bar{X} + (1 - Z)\,\mu$$
- $P_c$ = credibility premium
- $Z$ = credibility factor (between 0 and 1)
- $\bar{X}$ = individual risk experience (observed mean)
- $\mu$ = collective premium (grand mean or manual rate)
The entire challenge of credibility modeling comes down to determining the right value of $Z$. That's what the Bühlmann and Bühlmann-Straub models do.
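The weighted average itself is trivial to compute. A minimal sketch (the function name and numbers are illustrative, not from the text):

```python
def credibility_premium(z: float, x_bar: float, mu: float) -> float:
    """Blend individual experience x_bar with the collective premium mu,
    using credibility factor z in [0, 1]."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility factor must lie in [0, 1]")
    return z * x_bar + (1.0 - z) * mu

# A risk with observed mean 120 and a collective rate of 100, at Z = 0.3:
premium = credibility_premium(0.3, 120.0, 100.0)  # 0.3*120 + 0.7*100 = 106.0
```

Everything that follows is about choosing `z` in a principled way.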
Bühlmann model
The Bühlmann model provides a clean framework for deriving the optimal credibility factor when all observations for a given risk carry equal weight. It minimizes expected squared error between the credibility estimate and the true (unknown) hypothetical mean.
Model assumptions
Three key assumptions drive the model:
- Each risk $i$ has a parameter $\theta_i$ drawn independently from some population distribution $\pi(\theta)$. You don't need to know the form of $\pi$.
- Given $\theta_i$, the observations $X_{i1}, \dots, X_{in}$ are independent and identically distributed with conditional mean $\mu(\theta_i) = E[X_{ij} \mid \theta_i]$ and conditional variance $\sigma^2(\theta_i) = \operatorname{Var}(X_{ij} \mid \theta_i)$.
- Different risks are independent of each other. Knowing the claims of risk $i$ tells you nothing about risk $j$.
Observed vs hypothetical means
- The observed mean for risk $i$ is $\bar{X}_i = \frac{1}{n}\sum_{j=1}^{n} X_{ij}$. This is your best direct estimate from the data, but it's noisy when $n$ is small.
- The hypothetical mean $\mu(\theta_i)$ is what you'd expect if you knew the risk's true parameter. This is the quantity you're trying to estimate.
- The credibility approach estimates $\mu(\theta_i)$ by blending $\bar{X}_i$ with the overall population mean $\mu = E[\mu(\theta)]$.
Derivation of credibility factor
The optimal credibility factor minimizes $E\big[\big(\mu(\theta_i) - Z\bar{X}_i - (1 - Z)\mu\big)^2\big]$. Solving this optimization yields:
$$Z = \frac{n}{n + k}, \qquad \text{where } k = \frac{EPV}{VHM}$$
Two structural parameters control everything:
- EPV (Expected Process Variance) = $E[\sigma^2(\theta)]$. This is the average variability within a single risk. High EPV means individual observations are noisy, which pushes $Z$ down.
- VHM (Variance of Hypothetical Means) = $\operatorname{Var}(\mu(\theta))$. This measures how much the true means differ across risks. High VHM means risks are genuinely different from each other, which pushes $Z$ up.
The ratio $k = EPV/VHM$ captures the signal-to-noise tradeoff. When $k$ is small (low noise relative to real differences), you need fewer observations to reach high credibility.
Estimating model parameters
Since you don't know EPV and VHM, you estimate them from the data. Suppose there are $r$ risks with $n$ observations each:
- Estimate EPV using the average within-risk variance:
$$\widehat{EPV} = \frac{1}{r} \sum_{i=1}^{r} \frac{1}{n-1} \sum_{j=1}^{n} \left( X_{ij} - \bar{X}_i \right)^2$$
This pools the squared deviations of each observation from its own risk's mean.
- Estimate VHM using the between-risk variance of observed means, with an adjustment:
$$\widehat{VHM} = \frac{1}{r-1} \sum_{i=1}^{r} \left( \bar{X}_i - \bar{X} \right)^2 - \frac{\widehat{EPV}}{n}$$
where $\bar{X}$ is the grand mean across all risks and $n$ is the average number of observations per risk.
The subtraction of $\widehat{EPV}/n$ is critical. Without it, you'd overestimate VHM because the spread of observed means includes both genuine differences between risks and sampling noise. If this estimate comes out negative, you set $\widehat{VHM} = 0$, which means $Z = 0$ (no credibility to individual experience).
- Compute $\hat{k} = \widehat{EPV}/\widehat{VHM}$ and then $Z = \frac{n}{n + \hat{k}}$ for each risk.
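The estimation steps above can be sketched in a few lines, assuming equal numbers of observations per risk (the function name is illustrative):

```python
from statistics import mean, variance

def buhlmann_parameters(data):
    """Estimate EPV, VHM, k and Z from a list of equal-length observation
    lists, one inner list per risk. A sketch of the moment estimators."""
    r = len(data)          # number of risks
    n = len(data[0])       # observations per risk (assumed equal)
    risk_means = [mean(x) for x in data]
    # EPV: average within-risk sample variance
    epv = mean(variance(x) for x in data)
    # VHM: between-risk variance of observed means, minus the noise term
    vhm = variance(risk_means) - epv / n
    if vhm <= 0:
        return epv, 0.0, float("inf"), 0.0  # no credibility to individual data
    k = epv / vhm
    z = n / (n + k)
    return epv, vhm, k, z
```

With two risks observed twice each, `buhlmann_parameters([[1, 3], [5, 7]])` gives EPV = 2, VHM = 7, and Z = 0.875: the risks differ much more than their internal noise, so individual experience earns high credibility.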
Bühlmann premium calculation
Once you have $Z$, the credibility premium for risk $i$ is:
$$P_i = Z\,\bar{X}_i + (1 - Z)\,\bar{X}$$
As $n$ grows, $Z \to 1$ and the premium converges to the individual experience. With very little data, $Z \approx 0$ and the premium stays close to the grand mean.
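The limiting behavior is easy to see by tabulating $Z = n/(n+k)$ for a fixed $k$ (here $k = 5$, an arbitrary illustrative value):

```python
k = 5.0  # illustrative EPV/VHM ratio, chosen for the example
zs = {n: n / (n + k) for n in (1, 5, 25, 125)}
for n, z in zs.items():
    print(f"n = {n:4d}  Z = {z:.3f}")
# n=1 gives Z about 0.167; n=125 gives Z about 0.962
```

Credibility rises steadily with data volume but approaches 1 only asymptotically.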

Bühlmann-Straub model
The Bühlmann-Straub model generalizes the Bühlmann model to handle unequal exposures. In practice, risks rarely contribute equal amounts of data. One policyholder might insure 50 employees while another insures 5,000. The Bühlmann-Straub model accounts for this by weighting observations by their exposure or volume.
Model vs Bühlmann assumptions
The Bühlmann-Straub model keeps the same structural assumptions as Bühlmann, with one key modification:
- Each observation has a known risk volume (exposure) $m_{ij}$
- The conditional variance is inversely proportional to volume: $\operatorname{Var}(X_{ij} \mid \theta_i) = \sigma^2(\theta_i) / m_{ij}$
This makes intuitive sense: a claim rate based on 5,000 policies is more stable than one based on 50 policies.
Varying risk volumes
The risk volume $m_{ij}$ represents the size of the exposure for observation $j$ of risk $i$. Common examples include number of policies, total payroll, or earned premium.
- Total volume for risk $i$: $m_i = \sum_{j=1}^{n_i} m_{ij}$
- Volume-weighted mean for risk $i$: $\bar{X}_i = \frac{1}{m_i} \sum_{j=1}^{n_i} m_{ij} X_{ij}$
Larger volumes produce more stable means, so the model naturally gives them more influence.
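The volume-weighted mean is a plain weighted average. A short sketch with made-up exposures:

```python
def weighted_mean(volumes, values):
    """Volume-weighted mean: sum(m_ij * X_ij) / sum(m_ij)."""
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum(m * x for m, x in zip(volumes, values)) / total

# Two periods with very different exposures (illustrative numbers):
volumes = [50, 5000]     # e.g. policy counts
rates = [0.40, 0.10]     # observed claim rates per period
blended = weighted_mean(volumes, rates)  # dominated by the 5000-policy period
```

Here the blended rate lands near 0.103, close to the stable high-volume period rather than the noisy 50-policy one.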
Derivation of credibility factor
The credibility factor now uses total volume instead of observation count:
$$Z_i = \frac{m_i}{m_i + k}$$
where $k = EPV/VHM$, exactly as in the Bühlmann model.
The only difference from the Bühlmann formula is that $n$ is replaced by $m_i$. A risk with large total exposure gets high credibility even if it has few observation periods, because each period carries substantial volume.
Estimating model parameters
The estimation formulas parallel the Bühlmann model but incorporate volumes:
- Estimate EPV:
$$\widehat{EPV} = \frac{\sum_{i=1}^{r} \sum_{j=1}^{n_i} m_{ij} \left( X_{ij} - \bar{X}_i \right)^2}{\sum_{i=1}^{r} (n_i - 1)}$$
This is a volume-weighted pooled variance within each risk.
- Estimate VHM:
$$\widehat{VHM} = \frac{\sum_{i=1}^{r} m_i \left( \bar{X}_i - \bar{X} \right)^2 - (r-1)\,\widehat{EPV}}{m - \frac{1}{m} \sum_{i=1}^{r} m_i^2}$$
where $m = \sum_i m_i$ and $\bar{X}$ is the volume-weighted grand mean. A simplified version that's sometimes presented treats all volumes as equal, in which case the formula reduces to the Bühlmann estimator.
The denominator adjustment accounts for the fact that risks with different volumes contribute unequally to the between-risk variance. As with the Bühlmann model, if this estimate is negative, set $\widehat{VHM} = 0$.
- Compute $\hat{k} = \widehat{EPV}/\widehat{VHM}$ and then $Z_i = \frac{m_i}{m_i + \hat{k}}$ for each risk.
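The full Bühlmann-Straub estimation pipeline can be sketched as follows (a sketch of the standard moment estimators; the function name is illustrative):

```python
def buhlmann_straub(volumes, values):
    """Estimate EPV, VHM and per-risk Z for the Buhlmann-Straub model.

    volumes[i][j], values[i][j]: exposure and observed value for risk i,
    period j. Risks may have different numbers of periods."""
    r = len(values)
    m_i = [sum(v) for v in volumes]          # total volume per risk
    m = sum(m_i)                             # portfolio total volume
    x_i = [sum(w * x for w, x in zip(v, xs)) / mi
           for v, xs, mi in zip(volumes, values, m_i)]
    x_bar = sum(mi * xi for mi, xi in zip(m_i, x_i)) / m  # weighted grand mean
    # EPV: volume-weighted within-risk sum of squares over total (n_i - 1)
    num = sum(w * (x - xi) ** 2
              for v, xs, xi in zip(volumes, values, x_i)
              for w, x in zip(v, xs))
    epv = num / sum(len(v) - 1 for v in volumes)
    # VHM: between-risk sum of squares, bias-adjusted numerator and denominator
    between = sum(mi * (xi - x_bar) ** 2 for mi, xi in zip(m_i, x_i))
    vhm = (between - (r - 1) * epv) / (m - sum(mi ** 2 for mi in m_i) / m)
    if vhm <= 0:
        return epv, 0.0, [0.0] * r           # no credibility to individual data
    k = epv / vhm
    return epv, vhm, [mi / (mi + k) for mi in m_i]
```

Setting every volume to 1 reproduces the Bühlmann results, consistent with the special-case relationship noted below in the model comparison.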
Bühlmann-Straub premium calculation
The premium formula is identical in form to the Bühlmann model:
$$P_i = Z_i\,\bar{X}_i + (1 - Z_i)\,\bar{X}$$
The difference is that $\bar{X}_i$ is now the volume-weighted mean, $Z_i$ uses $m_i$ instead of $n$, and the grand mean $\bar{X}$ is also volume-weighted across all risks.
Comparison of models
Bühlmann vs Bühlmann-Straub
| Feature | Bühlmann | Bühlmann-Straub |
|---|---|---|
| Exposure handling | Equal weight per observation | Weights by risk volume |
| Credibility factor | $Z = \frac{n}{n+k}$ | $Z_i = \frac{m_i}{m_i+k}$ |
| Observed mean | Simple average | Volume-weighted average |
| Best used when | Risks have similar exposures | Risks have varying exposures |
The Bühlmann model is actually a special case of Bühlmann-Straub where all volumes $m_{ij} = 1$.
Advantages of Bühlmann-Straub
- Handles the heterogeneity in risk sizes that's typical of real insurance portfolios
- Produces more accurate premiums by appropriately weighting stable, high-volume experience
- Naturally adapts credibility to exposure size without requiring separate adjustments

Limitations of Bühlmann-Straub
- Requires reliable exposure data for every observation, which isn't always available
- Assumes variance is inversely proportional to volume ($\operatorname{Var}(X_{ij} \mid \theta_i) = \sigma^2(\theta_i)/m_{ij}$). This inverse relationship may not hold if, for example, large risks have systematically different claim patterns
- Small risks with extreme claims can still distort parameter estimates, particularly $\widehat{VHM}$
- Both models assume the credibility premium is a linear function of the data, which may miss nonlinear patterns
Applications of credibility models
Experience rating in insurance
Credibility models are the foundation of experience rating. The credibility premium adjusts the manual (class) rate based on a policyholder's own claims history. Good experience pulls the premium below the class rate; poor experience pushes it above. The credibility factor $Z$ controls how aggressively the adjustment responds to individual data.
Prospective premium calculation
For setting next year's premium, the credibility estimate combines past individual experience with expected future collective trends. The credibility factor determines how much the individual's history matters versus the broader class expectation. This keeps premiums responsive to real risk differences while avoiding wild swings from random claim fluctuations.
Estimating claim frequency and severity
Credibility models apply to both components of pure premium:
- Claim frequency: Observed claim counts for a risk are blended with the collective average frequency. This helps identify genuinely high-frequency risks versus those that had a bad year by chance.
- Claim severity: Observed average claim amounts are blended with the collective average severity. This is particularly useful when individual claim sizes are highly variable and a single large claim could distort the picture.
Advanced topics in credibility
These extensions address limitations of the basic Bühlmann and Bühlmann-Straub models.
Hierarchical credibility models
Real portfolios often have nested structure: individual risks within subclasses within broader classes. Hierarchical models (such as the Jewell model) estimate credibility at each level of the hierarchy. A risk's premium reflects its own experience, its subclass experience, and the portfolio-wide experience, each weighted by the appropriate credibility factor at that level.
Regression-based credibility models
The Hachemeister regression model incorporates covariates (risk characteristics like age, location, or industry) directly into the credibility framework. Instead of estimating a single mean for each risk, it estimates regression coefficients that capture how risk factors relate to expected claims. Generalized linear mixed models (GLMMs) extend this further by combining credibility concepts with modern regression techniques.
Nonparametric credibility models
These relax the distributional assumptions of Bühlmann-type models. Techniques like kernel smoothing or empirical Bayes methods estimate credibility premiums without assuming specific parametric forms for the claim distribution. This provides more flexibility when claim distributions are skewed, heavy-tailed, or otherwise poorly captured by standard assumptions.
Bayesian credibility models
Bayesian approaches specify prior distributions for risk parameters and update them with observed data to produce posterior distributions. They allow incorporation of expert judgment and external information beyond the data at hand. Hierarchical Bayesian models combine the multi-level structure of hierarchical credibility with the full probabilistic framework of Bayesian inference, providing both point estimates and uncertainty quantification.