Distribution shapes and skewness are crucial concepts in statistics. They help us understand how data is spread out and where most values are concentrated, which is essential for choosing the right analysis methods.

Symmetrical distributions have two sides that mirror each other, while skewed ones have a longer tail on one side. This affects where the mean, median, and mode fall, and so how we interpret and use these measures of central tendency in real-world scenarios.

Distribution Shapes and Skewness

Symmetrical vs skewed distributions

  • Symmetrical distributions have left and right sides that are mirror images of each other, as in the bell-shaped curve of the normal distribution
  • Right-skewed distributions have a tail extending further to the right with most data points concentrated on the left (income distribution)
  • Left-skewed distributions have a tail extending further to the left with most data points concentrated on the right (age distribution in a retirement community)
  • Kurtosis measures the "tailedness" of the distribution, indicating whether data are heavy-tailed or light-tailed relative to a normal distribution
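
The shape descriptions above can be checked numerically. The following is a minimal sketch using the moment-based (population) formulas for skewness and excess kurtosis; the helper names and the three small datasets are made up for illustration, and statistical packages may apply slightly different sample corrections.

```python
from statistics import fmean

def central_moment(data, k):
    """k-th central moment: the average of (x - mean) ** k."""
    m = fmean(data)
    return fmean([(x - m) ** k for x in data])

def skewness(data):
    """Moment-based skewness: m3 / m2 ** 1.5 (0 for a symmetrical dataset)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (0 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Illustrative datasets (assumptions, not real measurements)
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skewed = [1, 1, 2, 2, 3, 4, 15]        # one large value stretches the right tail
left_skewed = [-x for x in right_skewed]     # mirror image, so the tail is on the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, round(skewness(data), 3), round(excess_kurtosis(data), 3))
```

A positive skewness value indicates a longer right tail, a negative value a longer left tail, and a value near zero a roughly symmetrical shape.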

Positions of mean, median and mode

  • In symmetrical distributions with a single peak, mean = median = mode, all located at the center of the distribution
  • In right-skewed distributions:
    1. Mode located at the peak of the distribution, furthest to the left
    2. Median located between the mode and the mean
    3. Mean located furthest to the right, pulled by extreme values in the right tail (outliers)
  • In left-skewed distributions:
    1. Mode located at the peak of the distribution, furthest to the right
    2. Median located between the mode and the mean
    3. Mean located furthest to the left, pulled by extreme values in the left tail (outliers)
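
To make these orderings concrete, here is a small sketch using Python's standard statistics module. The two datasets are invented for illustration; the left-skewed one is simply the mirror image of the right-skewed one.

```python
from statistics import mean, median, mode

def central_measures(data):
    """Return (mode, median, mean) for a dataset."""
    return mode(data), median(data), mean(data)

# Illustrative data (an assumption): most values are small, a few are large
right_skewed = [2, 3, 3, 3, 4, 4, 5, 6, 8, 12]
# Mirror image of the data above, so the long tail is on the left
left_skewed = [13 - x for x in right_skewed]

r_mode, r_median, r_mean = central_measures(right_skewed)
l_mode, l_median, l_mean = central_measures(left_skewed)

print("right-skewed:", r_mode, r_median, r_mean)  # mode < median < mean
print("left-skewed: ", l_mode, l_median, l_mean)  # mean < median < mode
```

In both cases the median sits between the mode and the mean, and the mean lies on the side of the longer tail.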

Impact of Skewness on Measures of Central Tendency

Impact of skewness on central tendency

  • In right-skewed distributions, mean > median because extreme values pull the mean towards the right tail; the median is the more robust measure of central tendency (household income)
  • In left-skewed distributions, mean < median because extreme values pull the mean towards the left tail; the median is again the more robust measure of central tendency (age at death in a country with high life expectancy)
  • In symmetrical distributions with a single peak, mean, median and mode are equal because there is no skew to pull them apart, so all three are suitable for representing the center of the distribution (height distribution)
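
The robustness of the median can be seen by adding a single extreme value to a small dataset. This is a minimal sketch; the income figures (in thousands) and the helper name center_shift are hypothetical.

```python
from statistics import mean, median

def center_shift(data, extreme_value):
    """Return how far the mean and the median move when one extreme value is added."""
    extended = list(data) + [extreme_value]
    return mean(extended) - mean(data), median(extended) - median(data)

# Hypothetical household incomes in thousands (illustrative only)
incomes = [30, 35, 40, 45, 50]

# Add one very high income to create a long right tail
mean_shift, median_shift = center_shift(incomes, 1000)

print("mean moved by:  ", mean_shift)
print("median moved by:", median_shift)
```

The mean jumps by a large amount while the median barely moves, which is why the median is usually preferred for reporting a typical value in skewed data.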

Descriptive Statistics and Distribution Characteristics

  • Central tendency measures (mean, median, mode) provide information about the typical or average value in a dataset
  • Dispersion measures (such as standard deviation) quantify the spread or variability of data points around the central tendency
  • The probability density function describes the relative likelihood of different values occurring in a continuous probability distribution
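
As a rough illustration of how central tendency, dispersion, and the probability density function fit together, the sketch below summarizes two made-up datasets with the same mean but different spread, then models each with a normal distribution. Treating the data as normal is an assumption made purely for illustration.

```python
from statistics import NormalDist, fmean, pstdev

# Two illustrative datasets with the same center but different spread
narrow = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]

def fitted_normal(data):
    """Normal distribution with the data's mean and population standard deviation."""
    return NormalDist(mu=fmean(data), sigma=pstdev(data))

narrow_model = fitted_normal(narrow)
wide_model = fitted_normal(wide)

# Same central tendency, different dispersion
print("means:", fmean(narrow), fmean(wide))
print("standard deviations:", pstdev(narrow), pstdev(wide))

# The density at the center is higher when the data are less spread out
print("density at 50:", narrow_model.pdf(50), wide_model.pdf(50))

# For a symmetrical distribution the density is the same on both sides of the mean
print("density at 45 and 55:", narrow_model.pdf(45), narrow_model.pdf(55))
```

Note that a density value is a relative likelihood, not a probability; probabilities for a continuous variable come from the area under the curve over an interval.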

Key Terms to Review (20)

Asymmetrical Distribution: An asymmetrical distribution, also known as a skewed distribution, is a probability distribution where the data points are not evenly distributed around the central tendency. This means the distribution has an uneven shape, with one tail of the distribution being longer or more extended than the other.
Central Tendency: Central tendency is a statistical measure that describes the central or typical value in a dataset. It provides a way to summarize and understand the overall distribution of data by identifying the value around which the data tends to cluster.
Descriptive Statistics: Descriptive statistics is the branch of statistics that involves the collection, organization, analysis, and presentation of data in a meaningful way. It provides a summary of the key characteristics and patterns within a dataset, allowing researchers to gain a better understanding of the data without making inferences or drawing conclusions about the broader population.
Dispersion: Dispersion refers to the extent to which a set of data values are spread out or scattered around a central value, such as the mean or median. It measures the variability or spread of the data, providing insights into the distribution and characteristics of the dataset.
Kurtosis: Kurtosis is a statistical measure that describes the shape of a probability distribution. It is often described as the peakedness or flatness of a distribution relative to a normal distribution, but it is driven mainly by the tails: it indicates whether they contain more or fewer extreme values than a normal distribution would.
Left-Tailed: Left-tailed refers to a statistical distribution where the majority of the data points are concentrated on the right side of the distribution, with a longer or heavier tail extending to the left side of the graph. This term is particularly relevant in the context of skewness and the comparison of two population means with unknown standard deviations.
Mean: The mean, also known as the arithmetic mean or average, is a measure of central tendency that represents the central or typical value in a dataset. It is calculated by summing all the values in the dataset and dividing by the total number of values. The mean is a widely used statistic that provides information about the location or central tendency of a distribution.
Median: The median is a measure of the central tendency of a dataset, representing the middle value when the data is arranged in numerical order. It is a key statistical concept that provides information about the location and distribution of data points.
Mode: The mode is the value that appears most frequently in a dataset. It is a measure of the central tendency, or the typical value, in a distribution of data. The mode is one of the three primary measures of central tendency, along with the mean and median.
Negative Skew: Negative skew refers to a distribution of data where the tail on the left side of the probability density function is longer or fatter than the right side. This results in the mean being less than the median, which is less than the mode, indicating an asymmetrical distribution skewed towards lower values.
Normal Distribution: The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetrical and bell-shaped. It is a fundamental concept in statistics and probability theory, with widespread applications across various fields, including the topics covered in this course.
Outliers: Outliers are data points that lie an abnormal distance from other values in a dataset. They are observations that are markedly different from the rest of the data, often due to measurement errors, experimental conditions, or natural variability within the population.
Positive Skew: Positive skew refers to a distribution where the tail on the right side of the probability density function is longer or fatter than the left side. This indicates that the majority of the values in the distribution are clustered towards the left, with a long right tail of higher values.
Probability Density Function: The probability density function (PDF) is a mathematical function that describes the relative likelihood of a continuous random variable taking on a particular value. It provides a way to quantify the probability distribution of a continuous random variable.
Right-Tailed: Right-tailed refers to a probability distribution where the tail of the distribution extends more to the right side of the graph, indicating a skewed distribution with a longer right tail. This term is particularly relevant in the context of skewness and hypothesis testing involving population means.
Skewness: Skewness is a measure of the asymmetry or lack of symmetry in the distribution of a dataset. It describes the degree and direction of a dataset's departure from a normal, symmetrical distribution.
Symmetrical Distribution: A symmetrical distribution is a probability distribution where the data is evenly distributed around the central value, so the left and right sides mirror each other. The bell-shaped normal curve is the most familiar example, and when such a distribution has a single peak, the mean, median, and mode are all equal. This type of distribution is important in the context of understanding skewness and the relationships between the measures of central tendency.
Tail: In statistics, the term 'tail' refers to the extreme ends of a probability distribution or dataset. The tails of a distribution represent the values that are furthest away from the central tendency, such as the mean or median.
x̄ (x-bar): x̄ is the symbol used to represent the sample mean, which is the average value of a set of observations or data points. The sample mean is a measure of central tendency that provides an estimate of the population mean, and it is a fundamental concept in statistical analysis.
μ (Mu): μ, or mu, is a Greek letter that represents the population mean or average in statistical analysis. It is a fundamental concept that is crucial in understanding various statistical topics, including measures of central tendency, probability distributions, and hypothesis testing.
© 2024 Fiveable Inc. All rights reserved.