🎳Intro to Econometrics Unit 4 Review

4.2 Best linear unbiased estimator (BLUE)

Written by the Fiveable Content Team • Last updated August 2025

Best Linear Unbiased Estimator (BLUE) is a key concept in econometrics. It describes estimators that are unbiased and have the smallest variance among all linear unbiased estimators, making them the most efficient for parameter estimation.

The Gauss-Markov theorem states that under certain assumptions, the ordinary least squares (OLS) estimator is BLUE. This theorem provides a foundation for linear regression analysis and helps ensure reliable parameter estimates in econometric models.

Properties of BLUE

  • Best Linear Unbiased Estimator (BLUE) is a central concept in econometrics that describes the optimal properties of an estimator
  • BLUE estimators are desirable because they have the smallest variance among all linear unbiased estimators, making them the most efficient

Gauss-Markov theorem

  • States that under certain assumptions, the ordinary least squares (OLS) estimator is BLUE
  • Provides a set of sufficient conditions for an estimator to be BLUE
  • Ensures that the OLS estimator has the lowest variance among all linear unbiased estimators

Unbiasedness

  • An estimator is unbiased if its expected value is equal to the true value of the parameter being estimated
  • Unbiasedness is a desirable property because it ensures that the estimator, on average, correctly estimates the parameter
  • Can be mathematically expressed as $E[\hat{\beta}] = \beta$, where $\hat{\beta}$ is the estimator and $\beta$ is the true parameter value
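Unbiasedness can be illustrated with a small Monte Carlo sketch: simulate many samples from a known model, estimate the slope by OLS in each, and check that the estimates average out to the true value. The model $y = 2 + 3x + u$ and all sample sizes below are illustrative choices, not anything prescribed by the theorem.

```python
import numpy as np

# Monte Carlo sketch of unbiasedness: average the OLS slope estimate
# over many simulated samples and compare it with the true slope.
rng = np.random.default_rng(0)
true_beta = 3.0
n, reps = 100, 2000

estimates = []
for _ in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)                  # E[u|x] = 0 by construction
    y = 2.0 + true_beta * x + u
    X = np.column_stack([np.ones(n), x])    # add intercept column
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    estimates.append(beta_hat[1])

print(np.mean(estimates))  # close to 3.0, i.e. E[beta_hat] ≈ beta
```

Any single estimate deviates from 3.0, but the average across replications does not drift systematically, which is exactly what $E[\hat{\beta}] = \beta$ asserts.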

Minimum variance

  • BLUE estimators have the smallest variance among all linear unbiased estimators
  • Minimum variance implies that the estimator is the most precise and efficient among the class of linear unbiased estimators
  • Leads to narrower confidence intervals and more accurate hypothesis testing

Linear in parameters

  • BLUE estimators are linear functions of the observed data
  • A linear estimator can be written as a linear combination of the observed values of the dependent variable, with weights that depend only on the regressors
  • Allows for straightforward computation and interpretation of the estimates
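The linearity property can be made concrete: the OLS estimator is $\hat{\beta} = (X'X)^{-1}X'y = Ay$, where the weight matrix $A$ depends only on the regressors. The sketch below, with illustrative simulated data, verifies that this linear combination matches a library least-squares solve.

```python
import numpy as np

# Sketch: the OLS estimator is linear in the observed y, i.e.
# beta_hat = A @ y with A = (X'X)^{-1} X' fixed given the regressors.
rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
A = np.linalg.inv(X.T @ X) @ X.T       # weight matrix, depends only on X
beta_hat = A @ y                       # a linear combination of the y's

# Same answer as a library least-squares solve:
assert np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])
```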

Deriving BLUE

  • Several methods can be used to derive BLUE estimators, depending on the assumptions and the available information
  • These methods aim to find estimators that satisfy the properties of BLUE, such as unbiasedness and minimum variance

Method of moments

  • A general approach to estimating parameters by equating sample moments to population moments
  • Involves setting up a system of equations based on the moment conditions and solving for the parameters
  • Can be used to derive BLUE estimators when the population moments are known or can be estimated from the sample
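For the linear model, the method of moments imposes the sample analogues of the population conditions $E[u] = 0$ and $E[xu] = 0$ and solves for the coefficients. A minimal sketch with illustrative data shows that solving these moment equations reproduces the OLS estimates:

```python
import numpy as np

# Sketch of the method of moments for the linear model: impose the
# sample analogues of E[u] = 0 and E[x*u] = 0 and solve for b.
rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)
y = 0.5 + 1.5 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
# Sample moment conditions: (1/n) X'(y - Xb) = 0  =>  X'X b = X'y
b_mom = np.linalg.solve(X.T @ X, X.T @ y)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(b_mom, b_ols)
```

The moment equations here are exactly the OLS normal equations, which is why the two estimators coincide in this model.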

Ordinary least squares

  • A widely used method for estimating the parameters in a linear regression model
  • Minimizes the sum of squared residuals between the observed and predicted values
  • Under the Gauss-Markov assumptions, the OLS estimator is BLUE
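The minimization property can be checked directly: at the OLS solution, perturbing the coefficients in any direction strictly increases the sum of squared residuals. The data below are an illustrative simulation.

```python
import numpy as np

# Sketch: OLS picks the coefficients that minimize the sum of squared
# residuals (SSR). Perturbing the OLS solution only increases the SSR.
rng = np.random.default_rng(3)
n = 100
x = rng.normal(size=n)
y = -1.0 + 0.8 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

def ssr(b):
    resid = y - X @ b
    return resid @ resid

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
for delta in [np.array([0.1, 0.0]), np.array([0.0, 0.1]),
              np.array([-0.05, 0.05])]:
    assert ssr(b_ols + delta) > ssr(b_ols)
```

Because the SSR is strictly convex in the coefficients when the regressors are not collinear, any nonzero perturbation raises it, confirming the minimum.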

Maximum likelihood estimation

  • An estimation method that finds the parameter values that maximize the likelihood function
  • The likelihood function represents the joint probability of observing the sample data given the parameter values
  • Under standard regularity conditions, maximum likelihood estimators are consistent and asymptotically efficient; with normally distributed errors, the MLE of the regression coefficients coincides with the OLS estimator
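With normal errors, maximizing the likelihood of the linear model recovers the OLS coefficients. The sketch below (requires scipy; the optimizer, starting values, and data are illustrative) minimizes the negative log-likelihood over the coefficients and the log error scale and compares the result with OLS.

```python
import numpy as np
from scipy.optimize import minimize

# Sketch: with normally distributed errors, maximizing the likelihood
# of the linear model gives the same coefficients as OLS.
rng = np.random.default_rng(4)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

def neg_loglik(params):
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)           # keep sigma positive
    resid = y - b0 - b1 * x
    # Constant terms of the normal log-likelihood are dropped
    return n * np.log(sigma) + np.sum(resid**2) / (2 * sigma**2)

res = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], method="Nelder-Mead")
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
assert np.allclose(res.x[:2], b_ols, atol=1e-2)
```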

Assumptions for BLUE

  • For an estimator to be BLUE, certain assumptions must be satisfied
  • These assumptions ensure that the Gauss-Markov theorem holds and that the estimator has the desired properties

Linearity in parameters

  • The regression model must be linear in the parameters (coefficients); the regressors themselves may enter nonlinearly (logs, squares)
  • Implies that the dependent variable can be written as a linear combination of the regressors plus an additive error term
  • Allows for the use of linear estimation methods, such as OLS

Random sampling

  • The sample data must be obtained through random sampling from the population
  • Random sampling ensures that the observations are independent and identically distributed (i.i.d.)
  • Helps to avoid selection bias and ensures that the sample is representative of the population

No perfect collinearity

  • The independent variables in the regression model must not be perfectly correlated with each other
  • Perfect collinearity occurs when one independent variable is an exact linear combination of the others
  • Perfect collinearity leads to the inability to estimate the parameters uniquely
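A quick numerical sketch of the problem: if one regressor is an exact linear function of the others, the design matrix is rank deficient and the normal equations $X'X\,b = X'y$ have no unique solution. The data below are illustrative.

```python
import numpy as np

# Sketch: x2 is an exact linear function of x1 and the constant, so the
# columns of X are linearly dependent and X'X is (numerically) singular.
rng = np.random.default_rng(5)
n = 30
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + 1.0                      # exact linear combination
X = np.column_stack([np.ones(n), x1, x2])

print(np.linalg.matrix_rank(X))          # 2, not 3: columns are dependent
print(np.linalg.cond(X.T @ X) > 1e12)    # normal equations ill-conditioned
```

Any attempt to invert $X'X$ here fails (or produces meaningless values), which is the computational face of "the parameters cannot be estimated uniquely".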

Zero conditional mean

  • The error term must have a zero conditional mean, given the values of the independent variables
  • Mathematically, $E[u|X] = 0$, where $u$ is the error term and $X$ represents the independent variables
  • Ensures that the error term is not systematically related to the independent variables, avoiding bias in the estimates

Homoskedasticity

  • The error term must have constant variance across all observations
  • Homoskedasticity implies that the spread of the errors is the same for all values of the independent variables
  • Violation of homoskedasticity (heteroskedasticity) can lead to inefficient estimates and invalid standard errors

Violations of BLUE assumptions

  • When the assumptions for BLUE are violated, the properties of the estimator may no longer hold
  • Violations can lead to biased, inconsistent, or inefficient estimates, affecting the reliability of the results

Consequences of violations

  • Biased estimates: If the zero conditional mean assumption is violated (e.g., omitted variable bias), the estimator may be biased
  • Inconsistent estimates: If the random sampling assumption is violated (e.g., non-random sample selection), the estimator may not converge to the true value as the sample size increases
  • Inefficient estimates: If the homoskedasticity assumption is violated (heteroskedasticity), the estimator may not have the minimum variance among linear unbiased estimators
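Omitted variable bias, mentioned above, can be simulated directly: the true model includes a regressor $z$ correlated with $x$, but the short regression of $y$ on $x$ alone leaves $z$ in the error term, so the zero conditional mean assumption fails. All coefficients below are illustrative.

```python
import numpy as np

# Sketch of omitted variable bias: the true model is
# y = 1 + 2x + 1.5z + u with z = 0.8x + noise, but we regress y on x
# alone. The omitted z is correlated with x, biasing the slope.
rng = np.random.default_rng(6)
n, reps = 500, 1000
slopes = []
for _ in range(reps):
    x = rng.normal(size=n)
    z = 0.8 * x + rng.normal(size=n)     # z correlated with x
    y = 1.0 + 2.0 * x + 1.5 * z + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x]) # z omitted
    slopes.append(np.linalg.lstsq(X, y, rcond=None)[0][1])

print(np.mean(slopes))  # near 2 + 1.5 * 0.8 = 3.2, not the true 2
```

The average slope lands near $2 + 1.5 \times 0.8 = 3.2$, matching the standard omitted-variable-bias formula: the bias equals the omitted coefficient times the regression of $z$ on $x$.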

Detecting violations

  • Residual plots: Plotting the residuals against the fitted values or independent variables can reveal patterns that suggest violations of assumptions (heteroskedasticity, non-linearity)
  • Statistical tests: Various tests can be used to detect specific violations, such as the Breusch-Pagan test for heteroskedasticity or the Durbin-Watson test for autocorrelation
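The idea behind the Breusch-Pagan test can be hand-rolled in a few lines: regress the squared OLS residuals on the regressors and compute the LM statistic $n R^2$, which is large when the error variance depends on the regressors. This is a sketch on illustrative simulated data, not a substitute for a library implementation.

```python
import numpy as np

# Hand-rolled sketch of the Breusch-Pagan idea: regress squared OLS
# residuals on X; LM = n * R^2 is approximately chi-squared under
# homoskedasticity and large when the variance depends on x.
rng = np.random.default_rng(7)
n = 500
x = rng.uniform(0.5, 3.0, size=n)
u = rng.normal(size=n) * x                   # heteroskedastic errors
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(n), x])

resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
u2 = resid**2
g = np.linalg.lstsq(X, u2, rcond=None)[0]    # auxiliary regression
fitted = X @ g
r2 = 1 - np.sum((u2 - fitted)**2) / np.sum((u2 - u2.mean())**2)
lm = n * r2
print(lm)  # far above the chi2(1) 5% critical value of about 3.84
```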

Correcting for violations

  • Robust standard errors: When heteroskedasticity is present, using robust standard errors (e.g., White's heteroskedasticity-consistent standard errors) can provide valid inference
  • Transformations: Applying transformations to the variables (e.g., logarithmic, square root) can sometimes address non-linearity or heteroskedasticity
  • Instrumental variables: When the zero conditional mean assumption is violated due to endogeneity, instrumental variable estimation (e.g., two-stage least squares) can be used to obtain consistent estimates
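White's heteroskedasticity-consistent (HC0) standard errors replace the classical estimate $s^2(X'X)^{-1}$ with the sandwich $(X'X)^{-1} X'\,\mathrm{diag}(e_i^2)\,X\,(X'X)^{-1}$. A minimal sketch on illustrative heteroskedastic data:

```python
import numpy as np

# Sketch of White's HC0 robust standard errors via the sandwich formula,
# compared with the classical homoskedastic formula.
rng = np.random.default_rng(8)
n = 500
x = rng.uniform(0.5, 3.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * x   # heteroskedastic errors
X = np.column_stack([np.ones(n), x])

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat

# Classical (homoskedastic) variance estimate
s2 = e @ e / (n - 2)
se_classic = np.sqrt(np.diag(s2 * XtX_inv))

# HC0 sandwich estimate: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
meat = X.T @ (X * e[:, None]**2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
print(se_classic, se_robust)   # robust SEs differ from classical ones
```

When the errors really are homoskedastic, the two sets of standard errors are close; under heteroskedasticity, only the robust ones support valid inference.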

BLUE vs biased estimators

  • While BLUE estimators are desirable, there may be situations where biased estimators are preferred or necessary
  • The choice between BLUE and biased estimators depends on various factors, such as the nature of the data, the purpose of the analysis, and the trade-offs involved

Bias-variance tradeoff

  • Biased estimators may have lower variance than unbiased estimators, leading to a trade-off between bias and variance
  • In some cases, accepting a small amount of bias in exchange for a significant reduction in variance can result in better overall performance
  • Ridge regression and LASSO are examples of biased estimators that can outperform OLS in the presence of multicollinearity or high-dimensional data
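Ridge regression has the closed form $(X'X + \lambda I)^{-1}X'y$, which shrinks the coefficients toward zero: the deliberate bias buys a variance reduction that can pay off under near-collinearity. The penalty $\lambda = 1$ and the data below are illustrative.

```python
import numpy as np

# Sketch of ridge regression: the penalty lambda * I stabilizes the
# nearly singular X'X and shrinks the coefficients toward zero.
rng = np.random.default_rng(9)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)     # nearly collinear regressors
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

lam = 1.0
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)
print(b_ols, b_ridge)   # ridge coefficients are pulled toward zero
```

The ridge solution always has a (weakly) smaller norm than the OLS solution; here the near-collinearity makes the OLS coefficients erratic while the ridge coefficients stay moderate.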

Asymptotic properties

  • Asymptotic properties describe the behavior of estimators as the sample size approaches infinity
  • Consistency: An estimator is consistent if it converges in probability to the true parameter value as the sample size increases
  • Asymptotic normality: An estimator is asymptotically normal if its distribution converges to a normal distribution as the sample size increases
  • BLUE estimators are typically consistent and asymptotically normal under appropriate assumptions

Finite sample properties

  • Finite sample properties describe the behavior of estimators for a given sample size
  • Unbiasedness: An estimator is unbiased if its expected value is equal to the true parameter value for any sample size
  • Minimum variance: an estimator is best in the BLUE sense if it has the smallest variance among all linear unbiased estimators for a given sample size
  • BLUE estimators have optimal finite sample properties, but biased estimators may be preferred in some cases due to the bias-variance tradeoff

Applications of BLUE

  • BLUE estimators are widely used in various econometric models and applications
  • They provide a foundation for estimation and inference in linear regression analysis

Simple linear regression

  • Simple linear regression models the relationship between a dependent variable and a single independent variable
  • The OLS estimator is BLUE for simple linear regression under the Gauss-Markov assumptions
  • Allows for the estimation and interpretation of the intercept and slope coefficients

Multiple linear regression

  • Multiple linear regression extends simple linear regression to include multiple independent variables
  • The OLS estimator remains BLUE for multiple linear regression under the Gauss-Markov assumptions
  • Enables the estimation of the partial effects of each independent variable while controlling for the others

Generalized least squares

  • Generalized least squares (GLS) is an extension of OLS that allows for non-spherical error terms (heteroskedasticity or autocorrelation)
  • GLS transforms the model to satisfy the Gauss-Markov assumptions and then applies OLS to the transformed model
  • Under certain conditions, the GLS estimator is BLUE and provides more efficient estimates than OLS in the presence of non-spherical errors
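With a known diagonal error covariance, GLS reduces to weighted least squares: dividing each observation by its error standard deviation makes the transformed model homoskedastic, after which OLS applies. The sketch below uses illustrative data with known heteroskedasticity and checks the closed form $(X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$.

```python
import numpy as np

# Sketch of GLS with a known diagonal error covariance (weighted least
# squares): rescale by 1/sigma_i, then apply OLS to the transformed model.
rng = np.random.default_rng(10)
n = 400
x = rng.uniform(0.5, 3.0, size=n)
sigma = x                                # known heteroskedasticity
y = 1.0 + 2.0 * x + rng.normal(size=n) * sigma
X = np.column_stack([np.ones(n), x])

w = 1.0 / sigma                          # transformation weights
Xw, yw = X * w[:, None], y * w
b_gls = np.linalg.lstsq(Xw, yw, rcond=None)[0]

# Equivalent closed form: (X' Omega^{-1} X)^{-1} X' Omega^{-1} y
Om_inv = np.diag(1.0 / sigma**2)
b_closed = np.linalg.solve(X.T @ Om_inv @ X, X.T @ Om_inv @ y)
assert np.allclose(b_gls, b_closed)
```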