📊Bayesian Statistics Unit 4 – Likelihood functions

Likelihood functions are a cornerstone of statistical inference, quantifying how well different parameter values explain observed data. They bridge the gap between theoretical models and empirical observations, enabling us to estimate parameters, compare hypotheses, and update our beliefs. In Bayesian statistics, likelihood functions play a crucial role in combining prior knowledge with new data. They form the basis for updating prior distributions to posterior distributions, allowing for a principled approach to incorporating evidence and quantifying uncertainty in our statistical analyses.

What's a Likelihood Function?

  • Quantifies the plausibility of parameter values given observed data
  • Denoted as $L(\theta|x)$, where $\theta$ represents the parameter(s) and $x$ represents the observed data
  • Differs from probability as it treats the parameter(s) as variable(s) and the data as fixed
  • Proportional to the probability of observing the data given the parameter values, $P(x|\theta)$
  • Plays a crucial role in parameter estimation and model selection
    • Used to determine the most plausible parameter values that explain the observed data
    • Allows for comparing the relative support for different models or hypotheses
  • Essential concept in Bayesian inference for updating prior beliefs about parameters
  • Can be used to construct confidence intervals and hypothesis tests
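
To make the "data fixed, parameter varying" idea concrete, here is a minimal sketch that evaluates a binomial likelihood at a few candidate values of the success probability. The data (7 successes in 10 trials), the candidate values, and the function name `binomial_likelihood` are illustrative assumptions, not part of the original notes.

```python
from math import comb

def binomial_likelihood(p, k, n):
    """P(k successes in n trials | p), read as a function of p with k and n fixed."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical observed data: 7 successes in 10 trials (fixed once observed).
k, n = 7, 10

# Evaluate the likelihood at a few candidate parameter values.
candidates = [0.3, 0.5, 0.7, 0.9]
likelihoods = {p: binomial_likelihood(p, k, n) for p in candidates}

for p, value in likelihoods.items():
    print(f"L(p={p} | k={k}, n={n}) = {value:.4f}")

# The candidate with the highest likelihood is the most plausible of the four.
best = max(likelihoods, key=likelihoods.get)
print("Most plausible candidate:", best)
```

Among these four candidates the likelihood is largest at p = 0.7, which matches the observed success fraction 7/10.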

Why Likelihood Matters in Bayesian Stats

  • Fundamental component of Bayesian inference alongside the prior distribution
  • Allows updating prior beliefs about parameters based on observed data
  • Combines with the prior distribution to form the posterior distribution
    • Posterior distribution represents the updated beliefs about the parameters after considering the data
    • Obtained by multiplying the likelihood function and the prior distribution, then normalizing so the result integrates to one
  • Enables a principled approach to incorporate prior knowledge and data into parameter estimation
  • Provides a framework for model comparison and selection
    • Bayes factors compare the marginal likelihood of the data under different models
    • Allows for assessing the relative evidence for competing hypotheses
  • Crucial for making probabilistic statements and quantifying uncertainty in Bayesian analysis
  • Facilitates the integration of multiple sources of information (data, prior beliefs, expert knowledge)
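
The prior-times-likelihood update described above can be sketched on a grid of parameter values. This is only an illustration under stated assumptions: binomial data (7 successes in 10 trials), a Beta(2, 2) prior, and a hypothetical helper `grid_posterior`. None of these come from the original notes.

```python
def grid_posterior(k, n, a=2.0, b=2.0, grid_size=2000):
    """Posterior density for p on a midpoint grid: likelihood * prior, then normalize."""
    width = 1.0 / grid_size
    grid = [(i + 0.5) * width for i in range(grid_size)]
    # Binomial likelihood up to a constant (the binomial coefficient cancels on normalizing).
    likelihood = [p**k * (1 - p)**(n - k) for p in grid]
    # Beta(a, b) prior up to a constant (assumed prior, chosen for illustration).
    prior = [p**(a - 1) * (1 - p)**(b - 1) for p in grid]
    unnormalized = [lik * pri for lik, pri in zip(likelihood, prior)]
    total = sum(unnormalized) * width
    density = [u / total for u in unnormalized]
    return grid, density

# Hypothetical data: 7 successes in 10 trials.
k, n = 7, 10
grid, density = grid_posterior(k, n)
width = grid[1] - grid[0]

posterior_mean = sum(p * d for p, d in zip(grid, density)) * width
print(f"Posterior mean of p: {posterior_mean:.4f}")
```

For this conjugate pair the exact posterior is Beta(9, 5), whose mean is 9/14, so the grid result can be checked against a known answer.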

Building Likelihood Functions

  • Specify the probability distribution that generates the observed data given the parameters
  • Determine the functional form of the likelihood based on the assumed data generating process
    • Commonly used distributions include Gaussian, Binomial, Poisson, and Exponential
    • Choice depends on the nature of the data and the underlying scientific context
  • Treat the observed data as fixed and express the likelihood as a function of the parameters
  • Incorporate any relevant assumptions or constraints on the parameters
  • Consider the independence structure of the data points
    • Independent and identically distributed (iid) data leads to the likelihood being a product of individual probabilities or densities
    • Dependent data requires more complex likelihood formulations (time series, spatial models)
  • Note that the likelihood need not be normalized: as a function of the parameters it does not have to integrate to one, and multiplicative constants that do not depend on the parameters can be dropped
  • Verify that the likelihood function is mathematically valid and computationally tractable
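
As a sketch of the iid case above, the following builds a Gaussian likelihood both as a product of densities and as a sum of log-densities. The data values, the parameter values, and the function names are assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Density of a single observation under Normal(mu, sigma2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def likelihood_product(data, mu, sigma2):
    """iid likelihood: product of the individual densities."""
    result = 1.0
    for x in data:
        result *= gaussian_pdf(x, mu, sigma2)
    return result

def log_likelihood_sum(data, mu, sigma2):
    """Same quantity on the log scale: the product becomes a sum."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)

# Hypothetical measurements, treated as fixed.
data = [4.9, 5.3, 5.1, 4.7, 5.0]

product_form = likelihood_product(data, mu=5.0, sigma2=0.04)
sum_form = log_likelihood_sum(data, mu=5.0, sigma2=0.04)
print(math.log(product_form), sum_form)  # the two values agree up to rounding
```

The log form is usually preferred in practice because the raw product underflows to zero for large samples.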

Common Likelihood Distributions

  • Gaussian (Normal) likelihood
    • Used for continuous data that is symmetric and unimodal
    • Parameterized by the mean $\mu$ and variance $\sigma^2$
    • Likelihood function: $L(\mu, \sigma^2|x) \propto \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)$
  • Binomial likelihood
    • Used for binary or success/failure data with a fixed number of trials
    • Parameterized by the success probability $p$, with the number of trials $n$ fixed by the design
    • Likelihood function for binary outcomes $x_i \in \{0, 1\}$: $L(p|x) \propto p^{\sum x_i} (1-p)^{n-\sum x_i}$
  • Poisson likelihood
    • Used for count data or rare events occurring in a fixed interval or space
    • Parameterized by the rate parameter $\lambda$
    • Likelihood function: $L(\lambda|x) \propto \prod_{i=1}^n \frac{\lambda^{x_i} e^{-\lambda}}{x_i!}$
  • Exponential likelihood
    • Used for positive continuous data with a constant hazard rate
    • Parameterized by the rate parameter $\lambda$
    • Likelihood function: $L(\lambda|x) \propto \lambda^n \exp\left(-\lambda \sum_{i=1}^n x_i\right)$
  • Other common likelihood distributions include Gamma, Beta, Multinomial, and Dirichlet
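
The discrete and rate-based likelihoods listed above translate directly into log-likelihood functions. The sketch below is one possible way to write them. The function names and the three small data sets are hypothetical.

```python
import math

def bernoulli_log_likelihood(p, x):
    """x is a list of 0/1 outcomes; log of p^s (1-p)^(n-s) with s = sum(x)."""
    s, n = sum(x), len(x)
    return s * math.log(p) + (n - s) * math.log(1 - p)

def poisson_log_likelihood(lam, x):
    """Log of prod lam^x_i e^(-lam) / x_i!; the x_i! terms do not depend on lam."""
    n = len(x)
    return sum(x) * math.log(lam) - n * lam - sum(math.lgamma(xi + 1) for xi in x)

def exponential_log_likelihood(lam, x):
    """Log of lam^n exp(-lam * sum(x))."""
    return len(x) * math.log(lam) - lam * sum(x)

# Hypothetical data sets, one per model.
coin = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]   # binary outcomes
counts = [2, 0, 3, 1, 4]                # event counts per interval
waits = [0.5, 1.2, 0.3, 2.0]            # waiting times

print(bernoulli_log_likelihood(0.7, coin))
print(poisson_log_likelihood(2.0, counts))
print(exponential_log_likelihood(1.0, waits))
```

Each function depends on the data only through a simple summary (the sum and the sample size), which is why these likelihoods are easy to work with by hand.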

Maximum Likelihood Estimation

  • Method for estimating the parameters of a statistical model by maximizing the likelihood function
  • Finds the parameter values that make the observed data most probable under the assumed model
  • Denoted as $\hat{\theta}_{MLE} = \arg\max_{\theta} L(\theta|x)$
  • Computed by solving the likelihood equations obtained by setting the derivatives of the log-likelihood to zero
    • Log-likelihood is often used for mathematical convenience and numerical stability
    • Maximum of the log-likelihood corresponds to the maximum of the likelihood function
  • Provides point estimates of the parameters without incorporating prior information
  • Asymptotically unbiased, consistent, and efficient under certain regularity conditions
  • Can be used as a starting point for Bayesian inference by combining with prior distributions
  • Limitations include potential overfitting, sensitivity to model misspecification, and the fact that a point estimate alone carries no measure of uncertainty
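
As a worked instance of solving the likelihood equation, take the exponential model: the log-likelihood is n log(λ) − λ Σx_i, its derivative n/λ − Σx_i is zero at λ = n / Σx_i, and that is the MLE. The sketch below compares this closed form with a simple numerical maximizer. The data and the helper `maximize_unimodal` are assumptions for illustration.

```python
import math

def exponential_log_likelihood(lam, data):
    """Log-likelihood of an exponential model with rate lam."""
    return len(data) * math.log(lam) - lam * sum(data)

def maximize_unimodal(f, low, high, iterations=200):
    """Ternary search for the maximizer of a unimodal function on [low, high]."""
    for _ in range(iterations):
        m1 = low + (high - low) / 3
        m2 = high - (high - low) / 3
        if f(m1) < f(m2):
            low = m1
        else:
            high = m2
    return (low + high) / 2

# Hypothetical waiting times.
data = [0.5, 1.2, 0.3, 2.0]

# Closed form from setting the derivative of the log-likelihood to zero.
closed_form = len(data) / sum(data)

# Numerical maximization of the same log-likelihood.
numeric = maximize_unimodal(
    lambda lam: exponential_log_likelihood(lam, data), 1e-6, 10.0
)

print(closed_form, numeric)
```

Ternary search works here because the exponential log-likelihood is concave in λ. For models without that property a general-purpose optimizer would be the safer choice.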

Likelihood vs. Probability

  • Likelihood and probability are related but distinct concepts
  • Probability quantifies the plausibility of an event or outcome before it occurs
    • Probability is a function of the data given fixed parameter values, $P(x|\theta)$
    • Probabilities sum or integrate to one over the sample space of possible outcomes
  • Likelihood quantifies the plausibility of parameter values given observed data
    • Likelihood is a function of the parameters given fixed observed data, $L(\theta|x)$
    • Likelihoods do not necessarily sum or integrate to one over the parameter space
  • Likelihood is not a probability distribution over the parameters
    • As a function of the parameters it is not normalized, so it is not a probability density for them
    • Cannot be directly interpreted as the probability of the parameters being true
  • Likelihood ratios compare the relative plausibility of different parameter values
    • Ratios of likelihoods are meaningful and interpretable
    • Allows for assessing the relative support for different hypotheses or models
  • Likelihood principle states that all relevant information for inference is contained in the likelihood function
    • Implies that inferences should depend on the data only through the likelihood of what was actually observed, not on outcomes that could have occurred but did not
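
The contrast above can be checked numerically. The sketch uses one binomial formula in two ways: summed over all possible data for a fixed parameter, and integrated over the parameter for fixed data. The values n = 10, k = 7, and p = 0.7 are illustrative assumptions.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(k successes in n trials | p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10

# Fix the parameter, vary the data: a probability distribution over k.
total_over_data = sum(binomial_pmf(k, n, 0.7) for k in range(n + 1))

# Fix the data (k = 7), vary the parameter: a likelihood function of p.
# Midpoint rule for the integral over p in [0, 1].
grid_size = 10_000
width = 1.0 / grid_size
area_over_parameter = sum(
    binomial_pmf(7, n, (i + 0.5) * width) for i in range(grid_size)
) * width

# Likelihood ratio comparing p = 0.7 with p = 0.5 for the same data.
ratio = binomial_pmf(7, n, 0.7) / binomial_pmf(7, n, 0.5)

print(total_over_data, area_over_parameter, ratio)
```

The sum over data is 1, while the area under the likelihood is 1/(n + 1) = 1/11. The ratio of roughly 2.28 is still meaningful, because any constant factor cancels in a ratio.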

Likelihood in Bayesian Inference

  • Likelihood function is a key component of Bayesian inference
  • Combines with the prior distribution $p(\theta)$ to form the posterior distribution $p(\theta|x)$
    • Posterior distribution represents the updated beliefs about the parameters after observing the data
    • Obtained through Bayes' theorem: $p(\theta|x) \propto L(\theta|x) \times p(\theta)$
  • Likelihood function determines how the data influences the posterior distribution
    • A sharply peaked likelihood (informative data) concentrates the posterior around the most plausible parameter values
    • A flat likelihood (uninformative data) leaves the posterior more diffuse and closer to the prior
  • Bayesian inference incorporates prior knowledge and uncertainty about the parameters
    • Prior distribution represents the initial beliefs before observing the data
    • Likelihood function updates the prior beliefs based on the observed data
  • Posterior distribution is used for parameter estimation, prediction, and decision making
    • Point estimates can be obtained using summary statistics (posterior mean, median, mode)
    • Interval estimates provide a range of plausible parameter values (credible intervals)
  • Bayesian model comparison and selection rely on the likelihood function
    • Bayes factors compare the marginal likelihood of the data under different models
    • Marginal likelihood integrates the likelihood over the prior distribution of the parameters
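
A Bayes factor can be sketched for a small example by computing the two marginal likelihoods directly. The models here are assumptions chosen for illustration: M0 fixes the success probability at 0.5, and M1 places a Uniform(0, 1) prior on it. The data (7 successes in 10 trials) and the function names are also hypothetical.

```python
from math import comb

def likelihood(p, k, n):
    """Binomial likelihood of p for k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def marginal_likelihood_uniform_prior(k, n, grid_size=10_000):
    """Integrate likelihood * prior over p in [0, 1] with prior density 1 (midpoint rule)."""
    width = 1.0 / grid_size
    return sum(likelihood((i + 0.5) * width, k, n) for i in range(grid_size)) * width

# Hypothetical data: 7 successes in 10 trials.
k, n = 7, 10

# M0 has no free parameter, so its marginal likelihood is the likelihood at p = 0.5.
m0 = likelihood(0.5, k, n)

# M1 averages the likelihood over the uniform prior.
m1 = marginal_likelihood_uniform_prior(k, n)

bayes_factor_01 = m0 / m1
print(f"m0 = {m0:.4f}, m1 = {m1:.4f}, BF01 = {bayes_factor_01:.3f}")
```

With these numbers the Bayes factor is about 1.29 in favor of M0, which is very weak evidence either way. Under a uniform prior the marginal likelihood is exactly 1/(n + 1), which gives a check on the numerical integral.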

Practical Applications

  • Parameter estimation in various fields (economics, biology, engineering)
    • Estimating the parameters of regression models, time series models, or machine learning algorithms
    • Determining the most plausible values of physical constants or biological rates
  • Hypothesis testing and model selection
    • Comparing the relative support for different scientific hypotheses or theories
    • Selecting the best model among a set of competing models based on their likelihood
  • Bayesian inference in decision making and uncertainty quantification
    • Updating prior beliefs about parameters or hypotheses based on observed data
    • Incorporating expert knowledge and prior information into the analysis
  • Likelihood-based methods for handling missing data or measurement errors
    • Expectation-Maximization (EM) algorithm for parameter estimation with incomplete data
    • Likelihood-based imputation methods for handling missing values
  • Likelihood-free inference methods for complex models
    • Approximate Bayesian Computation (ABC) for models with intractable likelihoods
    • Synthetic likelihood for models where the likelihood function is difficult to evaluate
  • Applications in various domains such as finance, marketing, epidemiology, and genetics
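
To give a feel for likelihood-free inference, here is a toy ABC rejection sampler. It is only a sketch: the binomial model used here has a perfectly tractable likelihood, so ABC is unnecessary in practice, but that makes the result easy to check. The Uniform(0, 1) prior, the data, the seed, and the function name `abc_rejection` are all assumptions.

```python
import random

def abc_rejection(observed_successes, n_trials, n_draws, seed=0):
    """ABC rejection sampler for a success probability with a Uniform(0, 1) prior."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw a candidate parameter from the prior
        # Simulate a data set from the model without evaluating the likelihood.
        simulated = sum(rng.random() < p for _ in range(n_trials))
        # Keep the candidate only if the simulated summary matches the observed one.
        if simulated == observed_successes:
            accepted.append(p)
    return accepted

# Hypothetical data: 7 successes in 10 trials.
samples = abc_rejection(observed_successes=7, n_trials=10, n_draws=20_000)
abc_mean = sum(samples) / len(samples)

print(f"Accepted {len(samples)} draws, approximate posterior mean {abc_mean:.3f}")
```

Because the match on the summary statistic is exact, the accepted draws follow the true posterior, Beta(8, 4), whose mean is 2/3. Real ABC applications usually accept draws within a tolerance, which makes the posterior approximate.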


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
