Statistics in industrial engineering helps you make sense of data and drive decisions. Descriptive stats summarize what you see in your data, while inferential stats let you draw conclusions about larger populations from samples.
Central tendency, dispersion, probability distributions, hypothesis testing, and confidence intervals form the core toolkit here. Together, they let you test claims, estimate parameters, and make informed choices across quality control, reliability, and process improvement.
Central Tendency and Dispersion
Measures of Central Tendency
The three main measures each capture a different sense of what's "typical" in a dataset:
- Mean (arithmetic): Sum all values and divide by the number of observations. This is the most common measure, but it's sensitive to outliers. A weighted mean adjusts for the relative importance of each value, which is useful when some data points matter more than others.
- Median: The middle value when data is sorted in order. If you have an even number of observations, it's the average of the two middle values. The median is more robust to outliers than the mean.
- Mode: The most frequently occurring value. Particularly useful for categorical data or when you want to know the most common outcome.
In industrial engineering, you'll use these constantly. For example, mean cycle time tells you average process speed, while median is better when a few extreme delays would skew the average.
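As a quick sketch using only Python's standard library (the cycle-time data is hypothetical), here is how a single extreme delay inflates the mean while leaving the median nearly untouched:

```python
import statistics

# Hypothetical cycle times (seconds); the last value is an extreme delay.
cycle_times = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2, 45.0]

mean_ct = statistics.mean(cycle_times)      # pulled upward by the outlier
median_ct = statistics.median(cycle_times)  # robust to the outlier

print(f"mean:   {mean_ct:.2f} s")    # 16.77 s, inflated by the 45 s delay
print(f"median: {median_ct:.2f} s")  # 12.10 s, closer to a typical cycle
```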
Measures of Dispersion
Dispersion measures tell you how spread out your data is, which is just as important as knowing the center.
- Range: Maximum value minus minimum value. Simple but easily distorted by a single outlier.
- Variance: The average of the squared deviations from the mean (for a sample, divide by n − 1 rather than n to correct for bias). Squaring ensures negative and positive deviations don't cancel out.
- Standard deviation: The square root of variance. It brings the units back to the same scale as your original data, making it more interpretable.
- Interquartile range (IQR): The difference between the 75th percentile (Q3) and the 25th percentile (Q1). Like the median, it's resistant to outliers.
- Coefficient of variation (CV): Standard deviation divided by the mean, expressed as a percentage. CV lets you compare variability across datasets with different units or scales. For instance, you can compare the variability of cycle times (in seconds) to the variability of defect counts (in units) using CV.
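All of these measures can be computed with Python's standard library. The shift data below is hypothetical, and note that `statistics.quantiles` uses the "exclusive" method by default, so IQR values may differ slightly from other software:

```python
import statistics

# Hypothetical cycle times (seconds) for one shift.
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2, 12.6, 11.7]

rng = max(data) - min(data)                  # range
var = statistics.variance(data)              # sample variance (n - 1 divisor)
sd = statistics.stdev(data)                  # sample standard deviation
q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles (exclusive method)
iqr = q3 - q1                                # interquartile range
cv = sd / statistics.mean(data) * 100        # coefficient of variation, %

print(f"range={rng:.2f}  stdev={sd:.3f}  IQR={iqr:.3f}  CV={cv:.1f}%")
```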
Data Distribution Characteristics
Beyond center and spread, the shape of your data matters.
- Skewness measures asymmetry. Positive skew means the tail extends to the right (a few unusually large values), while negative skew means the tail extends to the left.
- Kurtosis measures how heavy or light the tails are compared to a normal distribution. High kurtosis means more data in the tails (more extreme values), and low kurtosis means lighter tails.
Box plots display the median, quartiles, and potential outliers in a compact visual. Histograms show the frequency distribution across value ranges. Both are go-to tools for spotting outliers or anomalies in production data before you run any formal tests.
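As an illustration, moment-based skewness can be computed by hand in a few lines. The repair-time data is hypothetical, and the population formula is used for simplicity, so values will differ slightly from tools that apply a sample-size correction:

```python
import statistics

def skewness(xs):
    """Moment-based (population) skewness: mean cubed deviation / stdev^3."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    n = len(xs)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

# Hypothetical repair times (hours) with a few unusually large values.
repair_times = [1.0, 1.2, 1.1, 0.9, 1.3, 1.0, 5.0, 6.5]
print(skewness(repair_times))  # positive, i.e. a right (positive) skew
```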
Probability Distributions for Modeling
Probability distributions are mathematical models that describe how likely different outcomes are. Choosing the right distribution is critical for accurate modeling in IE.
Discrete Probability Distributions
These apply when outcomes are countable (whole numbers).
- Binomial: Models the number of successes in a fixed number of independent trials, each with the same probability of success. Example: the number of defective items in a batch of 50, where each item has a 2% defect rate.
- Poisson: Models the number of events occurring in a fixed interval of time or space, when events happen independently at a constant average rate. Example: customer arrivals per hour at a service counter.
- Geometric: Models the number of trials needed until the first success. Example: how many attempts until a machine repair succeeds.
- Hypergeometric: Like the binomial, but for sampling without replacement from a finite population. Example: selecting 10 items from a lot of 100 and counting how many are defective.
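The binomial and Poisson probability mass functions are simple enough to compute directly. A minimal sketch using the batch-of-50 example above (the 4-per-hour arrival rate is an assumed figure):

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): k successes in n independent trials, success prob p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k): k events in an interval with average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Probability of zero defectives in a batch of 50 at a 2% defect rate.
print(binomial_pmf(0, 50, 0.02))  # 0.98**50, about 0.364

# Probability of exactly 3 arrivals in an hour at an average of 4/hour.
print(poisson_pmf(3, 4))          # about 0.195
```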

Continuous Probability Distributions
These apply when the variable can take any value within a range.
- Normal: The classic bell curve, defined by its mean (μ) and standard deviation (σ). Many natural and industrial measurements approximate this shape.
- Exponential: Models the time between independent events. Example: time between successive machine failures. It's memoryless, meaning the probability of failure doesn't depend on how long the machine has already been running.
- Weibull: A flexible distribution used heavily in reliability analysis and product lifetime modeling. By adjusting its shape parameter, it can model increasing, decreasing, or constant failure rates.
- Lognormal: Models data that results from the product of many small, independent factors. Example: particle size distributions in manufacturing.
- Uniform: Every value in a given range is equally likely. Used in random number generation for simulations.
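Python's `random` module can sample from several of these distributions directly. The mean-time-between-failures and lifetime figures below are assumptions for illustration:

```python
import random
import statistics

random.seed(42)  # reproducible illustration

# Time between failures: exponential with mean 100 hours (rate 1/100).
gaps = [random.expovariate(1 / 100) for _ in range(100_000)]
print(statistics.mean(gaps))  # close to 100, the theoretical mean

# Product lifetimes: Weibull with scale 1000 hours and shape 2.0 > 1,
# i.e. a wear-out pattern (increasing failure rate).
lifetimes = [random.weibullvariate(1000, 2.0) for _ in range(5)]
print(lifetimes)
```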
Application and Analysis
The Central Limit Theorem (CLT) is one of the most important results in statistics: regardless of the population's distribution, the sampling distribution of the sample mean approaches a normal distribution as sample size increases. This is why so many inferential methods rely on the normal distribution.
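A small simulation makes the CLT concrete: draw repeated samples from a uniform population (nothing like a bell curve) and watch the sample means form one. The sample size and replication count are arbitrary choices:

```python
import random
import statistics

random.seed(1)

# Population: uniform on [0, 10]. Take 10,000 samples of size 30 each.
sample_means = [
    statistics.mean(random.uniform(0, 10) for _ in range(30))
    for _ in range(10_000)
]

# CLT: the means cluster near the population mean (5.0) with spread
# sigma / sqrt(n) = (10 / sqrt(12)) / sqrt(30), about 0.527.
print(statistics.mean(sample_means))   # close to 5.0
print(statistics.stdev(sample_means))  # close to 0.53
```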
To check whether your data fits a particular distribution:
- Create a probability plot (e.g., a normal probability plot). If the data points fall roughly along a straight line, the distribution is a reasonable fit.
- Run a goodness-of-fit test such as the Chi-square test or the Kolmogorov-Smirnov test to formally assess the fit.
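The chi-square goodness-of-fit statistic itself is a one-liner. A sketch with hypothetical defect counts across four machines, under the null hypothesis that defects are spread evenly:

```python
# Observed defect counts per machine vs. the equal-share expectation.
observed = [18, 22, 25, 15]
expected = [20, 20, 20, 20]  # 80 defects split evenly across 4 machines

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 3))  # 2.9; below the 3-df critical value of 7.815
                         # at alpha = 0.05, so the fit is not rejected
```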
These distribution models are essential for reliability analysis, inventory management, simulation modeling, queuing theory, and setting control limits in statistical process control.
Hypothesis Testing for Decisions
Hypothesis testing is a structured method for deciding whether sample data provides enough evidence to support a claim about a population.
Fundamentals of Hypothesis Testing
Every hypothesis test follows the same basic framework:
- State the hypotheses. The null hypothesis (H₀) represents the status quo or "no effect." The alternative hypothesis (H₁) represents the claim you're investigating.
- Choose a significance level (α). This is the probability of a Type I error (rejecting H₀ when it's actually true). Common choices are 0.05 or 0.01.
- Collect data and compute a test statistic. The specific statistic depends on the test you're using.
- Find the p-value. The p-value is the probability of observing results at least as extreme as yours, assuming H₀ is true.
- Make a decision. If the p-value is less than α, reject H₀. Otherwise, you fail to reject it. (Note: "fail to reject" is not the same as "accept.")
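Here is the full cycle as a minimal sketch: a one-sample, two-tailed z-test against a 12.0 s cycle-time target. The data and the assumed known process standard deviation (0.4 s) are illustrative; with an unknown standard deviation and a small sample you would use a t-test instead:

```python
import math
from statistics import NormalDist, mean

target, sigma = 12.0, 0.4  # H0: mu = 12.0 s; sigma assumed known
sample = [12.3, 12.1, 12.4, 12.2, 12.5, 12.3, 12.2, 12.4, 12.1, 12.3]

# Test statistic and two-tailed p-value from the standard normal CDF.
z = (mean(sample) - target) / (sigma / math.sqrt(len(sample)))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```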
Two types of errors to keep straight:
- Type I error (α): Rejecting a true H₀. Think of it as a false alarm.
- Type II error (β): Failing to reject a false H₀. Think of it as a missed detection.
A one-tailed test checks for an effect in one specific direction (e.g., "the new process is faster"). A two-tailed test checks for any difference in either direction.
Common Hypothesis Tests
- T-tests: Compare means. One-sample (is this mean different from a target?), two-sample (are these two group means different?), or paired (before vs. after on the same subjects).
- Chi-square tests: Analyze categorical data. Used for goodness-of-fit (does data match an expected distribution?) and tests of independence (are two categorical variables related?).
- ANOVA (Analysis of Variance): Compares means across three or more groups. One-way ANOVA has one factor; two-way ANOVA examines two factors and their interaction.
- F-test: Compares variances of two populations.
- Nonparametric tests: Used when data violates assumptions of parametric tests (e.g., non-normal data). Examples include the Mann-Whitney U test and the Kruskal-Wallis test.
- Regression analysis: Tests relationships between variables, such as whether machine speed affects defect rate.

Test Power and Sample Size
The power of a test (1 − β) is the probability of correctly rejecting a false H₀. Higher power means you're more likely to detect a real effect. Three factors influence power:
- Sample size: Larger samples increase power and reduce the margin of error.
- Effect size: The magnitude of the difference you're trying to detect. Larger effects are easier to find.
- Significance level (α): Increasing α raises power but also raises the risk of Type I error. There's always a trade-off.
Before collecting data, you should calculate the required sample size to achieve a desired power level (commonly 0.80 or higher). This prevents wasting resources on a study too small to detect meaningful differences.
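Under a normal approximation, the required sample size for a two-sided test follows directly from the α and power quantiles. A sketch (the 0.5 s shift and 1.0 s standard deviation are assumed figures):

```python
import math
from statistics import NormalDist

def sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n for a two-sided z-test to detect a mean shift of
    delta, given sigma (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for power = 0.80
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Observations needed to detect a 0.5 s shift when sigma = 1.0 s:
print(sample_size(delta=0.5, sigma=1.0))  # 32
```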
Confidence Intervals for Estimation
A confidence interval gives you a range of plausible values for a population parameter, rather than just a single point estimate.
Confidence Interval Basics
A 95% confidence interval means that if you repeated the sampling process many times, about 95% of the resulting intervals would contain the true population parameter. It does not mean there's a 95% probability the true value is in this specific interval.
The margin of error is half the width of the interval. It represents the maximum expected difference between your point estimate and the true parameter. Three things affect interval width:
- Sample size: Larger samples produce narrower (more precise) intervals.
- Data variability: More variability in the data widens the interval.
- Confidence level: Higher confidence (e.g., 99% vs. 95%) produces wider intervals. You gain confidence at the cost of precision.
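A large-sample interval for a mean is short to write with `statistics.NormalDist`. The fill weights below are hypothetical, and for small samples the t-distribution should replace the normal quantile:

```python
import math
from statistics import NormalDist, mean, stdev

def mean_ci(data, confidence=0.95):
    """Large-sample (normal-based) confidence interval for the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    moe = z * stdev(data) / math.sqrt(len(data))  # margin of error
    return mean(data) - moe, mean(data) + moe

# Hypothetical fill weights (grams) from a bottling line.
weights = [501.2, 499.8, 500.5, 500.1, 499.6, 500.9, 500.3, 499.9,
           500.7, 500.2, 499.5, 500.4]
low, high = mean_ci(weights)
print(f"95% CI for mean fill weight: ({low:.2f}, {high:.2f}) g")
```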
Types of Confidence Intervals
- For means: Use the t-distribution for small samples (or when population standard deviation is unknown) and the normal distribution for large samples.
- For proportions: Based on the normal approximation to the binomial distribution. Example: estimating the proportion of defective items in production.
- For differences: Intervals for the difference between two means or two proportions, used when comparing groups.
- For variance and standard deviation: Based on the chi-square distribution.
- Tolerance intervals: A different concept. These contain a specified proportion of the population with a given confidence. Useful in quality engineering when you need to know the range that captures, say, 99% of all product measurements.
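For the proportion case, the normal-approximation (Wald) interval is a short function. The 14-defectives-in-400 figure is a made-up example, and the approximation is only reasonable when both np and n(1 − p) are around 10 or more:

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Wald (normal-approximation) interval for a proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    moe = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - moe), min(1.0, p + moe)

# 14 defectives observed in a sample of 400 items.
low, high = proportion_ci(14, 400)
print(f"95% CI for defect rate: ({low:.4f}, {high:.4f})")
```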
Applications in Industrial Engineering
- Process capability: Estimate capability indices (Cp, Cpk) to assess whether a process can consistently meet specifications.
- Reliability: Predict failure rates and product lifetimes with quantified uncertainty.
- Process improvement: Make decisions about whether changes to a process produced a real improvement.
- Quality control: Determine how many items to inspect to achieve a desired precision in your estimates.
There's a direct link between confidence intervals and hypothesis testing: if a 95% confidence interval for a parameter does not contain the hypothesized value, you would reject H₀ at the 0.05 significance level. They're two sides of the same coin.