13.3 Measures of Position

3 min read · June 18, 2024

Z-scores, quartiles, and percentiles help us understand where data points fall within a distribution. These measures of position allow us to compare values across different datasets and identify unusual observations.

Outliers, those data points that fall far from the rest, can be detected using the interquartile range (IQR) method. Understanding the shape of a distribution through skewness and kurtosis provides insights into data patterns and potential anomalies.

Measures of Position

Z-scores for relative position

  • Calculate the number of standard deviations an observation is from the mean
    • Positive value indicates observation is above mean (1.5 means 1.5 standard deviations above)
    • Negative value indicates observation is below mean (-0.8 means 0.8 standard deviations below)
  • Use formula $z = \frac{x - \mu}{\sigma}$
    • $x$ represents individual value being analyzed
    • $\mu$ represents population mean or average
    • $\sigma$ represents population standard deviation, a measure of dispersion
  • Interpret z-scores to understand relative position
    • Z-score of 2 means observation is 2 standard deviations above mean (unusually high)
    • Z-score of -1.5 means observation is 1.5 standard deviations below mean (somewhat low)
    • Z-score of 0 means observation is equal to mean (typical or average)
  • Z-scores are particularly useful when data follows a normal distribution
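
The z-score formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `z_score` helper and the exam scores are hypothetical, and the data are treated as a whole population so that the population standard deviation applies.

```python
from statistics import fmean, pstdev


def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical exam scores, treated as the entire population.
scores = [60, 70, 80, 90, 100]
mu = fmean(scores)      # population mean
sigma = pstdev(scores)  # population standard deviation

z_high = z_score(100, mu, sigma)  # positive: observation is above the mean
z_mean = z_score(80, mu, sigma)   # zero: observation equals the mean
z_low = z_score(60, mu, sigma)    # negative: observation is below the mean
```

Because the result is expressed in standard-deviation units, z-scores computed this way for two different datasets can be compared directly even when the original scales differ.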

Quartiles and percentiles for distribution

  • Divide dataset into four equal parts using quartiles
    • Q1 is 25th percentile, separating lowest 25% of data
    • Q2 is 50th percentile or median, middle value in dataset
    • Q3 is 75th percentile, separating highest 25% of data
  • Use percentiles to indicate percentage of observations below a certain value
    • 60th percentile is value below which 60% of observations fall (above the median)
    • 10th percentile is value below which only 10% of observations fall (very low)
  • Calculate quartiles and percentiles by first arranging data in ascending order
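    • Find position of Q1 using formula $\frac{n+1}{4}$, where $n$ is number of observations (Q2 is at $\frac{2(n+1)}{4}$, Q3 at $\frac{3(n+1)}{4}$, and the $p$th percentile at $\frac{p}{100}(n+1)$)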
    • Interpolate between two closest values if position is not a whole number (e.g., 3.5)
  • Interpret quartiles and percentiles to understand distribution
    • Value between Q1 and Q3 is among middle 50% of data (typical)
    • Value at 95th percentile is higher than about 95% of observations (very high)
  • Visualize quartiles and potential outliers using a box plot
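
The sort, locate, and interpolate steps above can be sketched as follows. This is one possible reading of the method, assuming the $\frac{p}{100}(n+1)$ position rule. Other textbooks and software use slightly different conventions and may give different answers. The `percentile` helper and the sample data are hypothetical.

```python
def percentile(data, p):
    """p-th percentile (0 < p < 100) using the position rule p/100 * (n + 1)."""
    values = sorted(data)            # step 1: arrange in ascending order
    n = len(values)
    if n == 0:
        raise ValueError("data must not be empty")
    pos = p / 100 * (n + 1)          # step 2: 1-based position in the sorted list
    pos = min(max(pos, 1), n)        # keep the position inside the data
    lower = int(pos)                 # rank of the value just at or below pos
    frac = pos - lower               # how far pos sits toward the next value
    if lower == n:
        return float(values[-1])
    # step 3: interpolate between the two closest values
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])


# Hypothetical dataset with n = 10 observations.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]
q1 = percentile(data, 25)  # position 2.75
q2 = percentile(data, 50)  # position 5.5, the median
q3 = percentile(data, 75)  # position 8.25
```

For this dataset the positions are not whole numbers, so each quartile is interpolated: Q2 lands halfway between the 5th and 6th sorted values (7 and 8), and Q3 a quarter of the way from the 8th to the 9th (11 and 12). To the best of my knowledge, Python's `statistics.quantiles` with its default `method="exclusive"` follows the same position rule.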

Outliers using interquartile range

  • Identify observations significantly different from rest of data as outliers
  • Calculate interquartile range (IQR) as difference between Q3 and Q1
    • $IQR = Q3 - Q1$ measures spread of middle 50% of data
  • Use IQR method to identify outliers
    • Lower fence calculated as $Q1 - 1.5 \times IQR$, observations below are outliers
    • Upper fence calculated as $Q3 + 1.5 \times IQR$, observations above are outliers
    • Fences create boundaries for identifying unusually low or high values
  • Rely on IQR method for robust outlier detection
    • Not influenced by extreme values like mean and standard deviation
    • Helps identify potential data entry errors (typos) or unusual observations (anomalies)
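
A short sketch of the IQR method described above, assuming quartiles are taken from Python's `statistics.quantiles` (other quartile conventions can shift the fences slightly). The helper names `iqr_fences` and `find_outliers` and the sample data are hypothetical.

```python
from statistics import quantiles


def iqr_fences(data, k=1.5):
    """Return (lower fence, upper fence) as Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(data, n=4)  # three cut points: Q1, Q2, Q3
    iqr = q3 - q1                     # spread of the middle 50% of the data
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, k=1.5):
    """Return observations that fall outside the fences."""
    lower, upper = iqr_fences(data, k)
    return [x for x in data if x < lower or x > upper]


# Hypothetical dataset: 40 looks suspicious next to the other values.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 40]
lower_fence, upper_fence = iqr_fences(data)
outliers = find_outliers(data)  # values beyond either fence
```

Here Q1 is 4 and Q3 is 11.25, so the IQR is 7.25 and the fences sit at -6.875 and 22.125. Only the value 40 falls outside them. Note that the extreme value barely moves the quartiles, which is why the fences stay useful even when the outlier is present.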

Distribution Characteristics

  • Analyze the shape of data distribution using various measures
  • Skewness measures the asymmetry of the distribution
    • Positive skew indicates a longer tail on the right side
    • Negative skew indicates a longer tail on the left side
  • Kurtosis measures the "tailedness" of the distribution
    • Higher kurtosis indicates heavier tails and a sharper peak
    • Lower kurtosis indicates lighter tails and a flatter peak
  • Visualize distribution shape using a histogram
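
Skewness and kurtosis can be estimated from the central moments of the data. The sketch below uses the simple population (moment-based) formulas, which is an assumption: statistical packages often apply sample-size corrections and will report slightly different numbers. Kurtosis is reported here as excess kurtosis, that is, relative to the value of 3 for a normal distribution. The helper names and datasets are hypothetical.

```python
from statistics import fmean


def central_moment(data, k):
    """Average of (x - mean)**k over the data."""
    mu = fmean(data)
    return fmean([(x - mu) ** k for x in data])


def skewness(data):
    """Third standardized moment: > 0 right tail longer, < 0 left tail longer."""
    m2 = central_moment(data, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return central_moment(data, 3) / m2 ** 1.5


def excess_kurtosis(data):
    """Fourth standardized moment minus 3 (the normal distribution's value)."""
    m2 = central_moment(data, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return central_moment(data, 4) / m2 ** 2 - 3


symmetric = [1, 2, 3, 4, 5]        # evenly spread, no skew
right_skewed = [1, 2, 2, 3, 10]    # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]

skew_sym = skewness(symmetric)
skew_right = skewness(right_skewed)
skew_left = skewness(left_skewed)
kurt_sym = excess_kurtosis(symmetric)  # negative: lighter tails than normal
```

The sign of the skewness matches the direction of the longer tail, and the evenly spread dataset has negative excess kurtosis because it has no extreme values at all.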

Key Terms to Review (29)

Arithmetic mean: The arithmetic mean is the sum of a set of numbers divided by the count of numbers in the set. It is commonly used to find the central tendency of data in finance, such as average returns.
Box Plot: A box plot, also known as a box-and-whisker plot, is a graphical representation that displays the distribution of a dataset using five key statistical measures: the minimum value, the first quartile (Q1), the median, the third quartile (Q3), and the maximum value. This visual tool provides a concise summary of the central tendency, spread, and skewness of a dataset, making it particularly useful for understanding and comparing the characteristics of different data distributions.
Histogram: A histogram is a graphical display of data using bars of different heights. It represents the frequency distribution of numerical data, where each bar groups numbers into specific ranges.
Histogram: A histogram is a graphical representation of the distribution of numerical data. It displays the frequency or count of data points within specified intervals or bins, providing a visual summary of the underlying data's characteristics.
Interquartile Range: The interquartile range (IQR) is a measure of statistical dispersion that represents the range of values between the first and third quartiles of a data set. It is a useful tool for analyzing the spread or variability of a distribution, showing how widely the middle half of the data is dispersed.
Interquartile range (IQR): Interquartile Range (IQR) measures the spread of the middle 50% of data points in a dataset. It is calculated by subtracting the first quartile (Q1) from the third quartile (Q3).
IQR: The interquartile range (IQR) is a measure of statistical dispersion that represents the middle 50% of a dataset. It is calculated as the difference between the 75th and 25th percentiles, providing a robust measure of the spread of a distribution that is less affected by outliers compared to the range.
Kurtosis: Kurtosis is a statistical measure that describes the shape of a probability distribution. It quantifies the peakedness or flatness of a distribution relative to a normal distribution. Kurtosis provides information about the tails of a distribution, indicating whether the tails contain more or less data than expected for a normal distribution.
Lower Fence: The lower fence, in the context of measures of position, is the lower boundary used to identify outliers in a dataset. It is calculated as Q1 - 1.5 × IQR, and observations that fall below it are flagged as unusually low values.
Mean: The mean, also known as the arithmetic average, is a measure of central tendency that represents the typical or central value in a dataset. It is calculated by summing up all the values in the dataset and dividing by the total number of data points.
Median: The median is the middle value in a data set when the numbers are arranged in ascending or descending order. In finance, it is used to find the central tendency of a dataset and mitigate the impact of outliers.
Median: The median is the middle value in a set of data when the values are arranged in numerical order. It represents the central tendency of a distribution and is a measure of central location that is often used to describe the typical or central value in a dataset.
Normal distribution: A normal distribution is a bell-shaped curve where most of the data points cluster around the mean, and probabilities for values taper off symmetrically towards both extremes. It is characterized by its mean and standard deviation.
Normal Distribution: The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetrical and bell-shaped. It is one of the most important and widely used probability distributions in statistics, with applications across various fields.
Outlier: An outlier is an observation or data point that lies an abnormal distance from other values in a data set. It is a data point that is significantly different from the rest of the data, often standing out as being much larger or smaller than the majority of the data points.
Outliers: Outliers are data points significantly different from others in a dataset. They can affect measures of center and overall statistical analysis.
Percentile: A percentile is a statistical measure that indicates the relative position of a value within a distribution of values. It represents the percentage of values in a dataset that fall below a given value.
Q1: Q1 is a measure of position that represents the value below which 25% of the data falls. It is the first quartile of a dataset and is used to describe the distribution and spread of data.
Q2: Q2 is a measure of position that represents the second quartile or the median of a dataset. It is the value that separates the lower 50% of the data from the upper 50%, dividing the data into two equal halves.
Q3: Q3, or the third quartile, is a measure of position in statistics and one of the three values that divide a set of data into four equal parts. It represents the value below which 75% of the data in the set falls.
Quartile: A quartile is a statistical measure that divides a dataset into four equal parts. Quartiles are used to describe the distribution of a dataset and provide information about its central tendency and dispersion.
Quartiles: Quartiles are values that divide a data set into four equal parts. They help in understanding the distribution and spread of the data.
Skewness: Skewness is a measure of the asymmetry or lack of symmetry in the distribution of a dataset. It quantifies the degree and direction of a dataset's deviation from a normal, symmetric distribution.
Standard deviation: Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a set of data values. It is used to assess the risk and volatility of an investment's returns in finance.
Standard Deviation: Standard deviation is a statistical measure that quantifies the amount of variation or dispersion of a set of data values around the mean or average. It provides a way to understand how spread out a group of numbers is from the central tendency.
Upper Fence: The upper fence, in the context of measures of position, is a statistical concept that defines the upper boundary of the normal range for a dataset. It is calculated as Q3 + 1.5 × IQR and is used to identify outliers or extreme values within the distribution.
Z-score: A z-score measures how many standard deviations a data point is from the mean. It helps determine the position of a value within a distribution.
Z-Score: A z-score is a standardized measure that expresses a data point's relationship to the mean of a dataset in terms of standard deviations. It is a fundamental concept in statistics that provides insight into the position and relative standing of a value within a distribution.
Z-value: A z-value, also known as a z-score, measures the number of standard deviations a data point is from the mean of a dataset. It is used to standardize scores on different scales and compare them directly.