Fiveable

Calculus IV Unit 12 Review


12.2 Probability and expected values


Written by the Fiveable Content Team • Last updated August 2025

Probability and expected values connect calculus with real-world modeling of uncertainty. By extending integration to two dimensions, you can compute probabilities, averages, and measures of spread for pairs of continuous random variables.

This topic covers joint and marginal probability density functions, conditional probability, and expected values/variances, all computed through double integrals.

Joint and Marginal Probability Density Functions

Defining Joint and Marginal Probability Density Functions

A joint probability density function $f(x,y)$ describes how probability is distributed across all possible pairs of values for two continuous random variables $X$ and $Y$. Think of it as a surface over the $xy$-plane whose "volume" under any region gives the probability of landing in that region.

Every valid joint pdf must satisfy two properties:

  • Non-negativity: $f(x,y) \geq 0$ for all $x$ and $y$
  • Total probability equals 1: $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\, dx\, dy = 1$

A marginal probability density function isolates one variable by "integrating out" the other. If you only care about $X$, you sum up (integrate) all the contributions from every possible $y$:

$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\, dy$$

Similarly, the marginal pdf of $Y$ is:

$$f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\, dx$$

The marginal pdf reduces a two-variable problem back to a single-variable one, which is useful when you need the distribution of just one of the two variables.
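To see a marginal computation in action, here is a minimal Python sketch; the density and helper name are illustrative, not from the guide's exercises. For $f(x,y) = 24xy$ on the triangle $0 \leq x \leq 1$, $0 \leq y \leq 1-x$ (the constant 24 makes the total integral 1), integrating out $y$ gives the closed form $f_X(x) = \int_0^{1-x} 24xy\, dy = 12x(1-x)^2$, which a midpoint Riemann sum reproduces:

```python
def marginal_fX(x, n=1000):
    """Approximate f_X(x) = integral of f(x, y) over y in [0, 1 - x],
    for the illustrative joint density f(x, y) = 24*x*y, using a
    midpoint Riemann sum with n subintervals."""
    ymax = 1.0 - x
    h = ymax / n
    return sum(24 * x * ((j + 0.5) * h) for j in range(n)) * h

x = 0.3
approx = marginal_fX(x)
exact = 12 * x * (1 - x) ** 2   # closed form from integrating out y
print(approx, exact)            # the two values agree
```

Because the integrand is linear in $y$, the midpoint rule here is exact up to floating-point rounding.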

Calculating Probabilities Using Double Integrals

To find the probability that $(X,Y)$ lands in some region $R$ of the $xy$-plane, integrate the joint pdf over that region:

$$P((X,Y) \in R) = \iint_R f(x,y)\, dA$$

Here $dA$ is the area element ($dx\,dy$ or $dy\,dx$, depending on your order of integration), and $R$ is whatever region the problem specifies.

Example: Suppose $f(x,y) = 24xy$ on the triangular region where $0 \leq x \leq 1$ and $0 \leq y \leq 1 - x$ (and $f = 0$ elsewhere). To verify this is a valid pdf, check that the total integral equals 1:

$$\int_0^1 \int_0^{1-x} 24xy\, dy\, dx = \int_0^1 24x \cdot \frac{(1-x)^2}{2}\, dx = \int_0^1 12x(1-x)^2\, dx = 1$$

To find the probability that, say, $X \leq 1/2$ and $Y \leq 1/2$, you'd integrate $24xy$ over the appropriate sub-region of the triangle.
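Region probabilities like this are easy to check numerically. The sketch below is plain Python; the helper name is illustrative, and the density $f(x,y) = 24xy$ on the triangle $0 \leq x \leq 1$, $0 \leq y \leq 1-x$ is used as the example (the constant 24 makes the total integral 1). A midpoint Riemann sum approximates the double integral, and an indicator restricts the integrand to a sub-region:

```python
def triangle_integral(g, n=400):
    """Midpoint-rule approximation of the double integral of g(x, y)
    over the triangle 0 <= x <= 1, 0 <= y <= 1 - x."""
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n            # midpoint of the i-th x strip
        hy = (1.0 - x) / n           # y step size for this strip
        for j in range(n):
            y = (j + 0.5) * hy
            total += g(x, y) * hy
    return total / n                 # multiply by dx = 1/n

f = lambda x, y: 24 * x * y

# Total probability over the support: analytically 1.
print(triangle_integral(f))

# P(X <= 1/2 and Y <= 1/2): restrict the integrand with an indicator.
# The square [0, 1/2] x [0, 1/2] lies inside the triangle, so this is
# analytically 24 * (1/8) * (1/8) = 3/8.
print(triangle_integral(lambda x, y: f(x, y) if x <= 0.5 and y <= 0.5 else 0.0))
```

The same helper works for any region: just zero out the integrand wherever the region's conditions fail.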

Conditional Probability


Definition and Formula

Conditional probability answers the question: given that $X$ takes a particular value $x$, how is $Y$ distributed?

The conditional pdf of $Y$ given $X = x$ is:

$$f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)}, \quad \text{provided } f_X(x) > 0$$

You're dividing the joint density by the marginal density of the known variable. This "slices" the joint distribution at a fixed $x$ and rescales it so it integrates to 1 over $y$.

The symmetric version gives the conditional pdf of $X$ given $Y = y$:

$$f_{X|Y}(x|y) = \frac{f(x,y)}{f_Y(y)}$$

Calculating Conditional Probabilities

Once you have the conditional pdf, computing a conditional probability is a single-variable integral.

To find $P(c \leq Y \leq d \mid X = x)$:

  1. Compute the marginal $f_X(x)$ by integrating $f(x,y)$ over all $y$.
  2. Form the conditional pdf: $f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)}$.
  3. Integrate over the desired range of $y$:

$$P(c \leq Y \leq d \mid X = x) = \int_c^d f_{Y|X}(y|x)\, dy$$

The same steps apply (with roles swapped) for $P(a \leq X \leq b \mid Y = y)$.
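The three steps can be sketched in plain Python for an illustrative density $f(x,y) = 24xy$ on the triangle $0 \leq x \leq 1$, $0 \leq y \leq 1-x$ (a valid pdf; the function name is made up for this sketch). Its marginal is $f_X(x) = 12x(1-x)^2$, so the conditional pdf works out to $f_{Y|X}(y|x) = 2y/(1-x)^2$ on $0 \leq y \leq 1-x$:

```python
def conditional_prob(c, d, x, n=1000):
    """P(c <= Y <= d | X = x) for the illustrative joint density
    f(x, y) = 24*x*y on the triangle 0 <= x <= 1, 0 <= y <= 1 - x.
    Follows the three steps: marginal, conditional pdf, integrate."""
    # Step 1: marginal f_X(x), found by integrating 24*x*y over [0, 1 - x].
    fx = 12 * x * (1 - x) ** 2
    # Step 2: conditional pdf f_{Y|X}(y|x) = f(x, y) / f_X(x).
    cond = lambda y: 24 * x * y / fx
    # Step 3: midpoint-rule integral of the conditional pdf over [c, d].
    h = (d - c) / n
    return sum(cond(c + (j + 0.5) * h) for j in range(n)) * h

# Over its full range [0, 1 - x], the conditional pdf integrates to 1:
print(conditional_prob(0.0, 0.7, x=0.3))
# P(Y <= 0.35 | X = 0.3) has closed form (0.35 / 0.7)**2 = 0.25:
print(conditional_prob(0.0, 0.35, x=0.3))
```

Note how step 1 does all the real work: once $f_X(x)$ is known, the rest is a single-variable integral.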

Expected Value and Variance


Expected Value

The expected value (mean) of a random variable tells you the "center of mass" of its distribution.

For a single variable with pdf $f(x)$:

$$E(X) = \int_{-\infty}^{\infty} x\, f(x)\, dx$$

When you're working with a joint pdf $f(x,y)$ and want the expected value of $X$, you integrate $x$ weighted by the joint density over the entire plane:

$$E(X) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x\, f(x,y)\, dx\, dy$$

This is equivalent to computing $E(X) = \int_{-\infty}^{\infty} x\, f_X(x)\, dx$ using the marginal, but sometimes it's easier to work directly with the joint pdf.

More generally, for any function $g(X,Y)$:

$$E(g(X,Y)) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x,y)\, f(x,y)\, dx\, dy$$

This formula is the workhorse for computing variances, covariances, and other quantities.
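To make the workhorse formula concrete, here is a small Python sketch (the helper name is illustrative) that approximates $E(g(X,Y))$ with a midpoint Riemann sum for an example density $f(x,y) = 24xy$ on the triangle $0 \leq x \leq 1$, $0 \leq y \leq 1-x$, a valid pdf whose total integral is 1:

```python
def expect(g, n=400):
    """E[g(X, Y)] = double integral of g(x, y) * f(x, y) over the support,
    approximated by a midpoint Riemann sum; f(x, y) = 24*x*y on the
    triangle 0 <= x <= 1, 0 <= y <= 1 - x (illustrative example density)."""
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n            # midpoint of the i-th x strip
        hy = (1.0 - x) / n           # y step size for this strip
        for j in range(n):
            y = (j + 0.5) * hy
            total += g(x, y) * 24 * x * y * hy
    return total / n                 # multiply by dx = 1/n

print(expect(lambda x, y: x))        # E(X); analytically 2/5 here
print(expect(lambda x, y: x * y))    # E(XY); analytically 2/15 here
```

Swapping in different choices of `g` gives every quantity in this section: $g(x,y) = x^2$ for $E(X^2)$, $g(x,y) = xy$ for $E(XY)$, and so on.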

Variance

Variance measures how spread out a distribution is around its mean $\mu_X = E(X)$:

$$\text{Var}(X) = E\!\left((X - \mu_X)^2\right) = \int_{-\infty}^{\infty} (x - \mu_X)^2\, f(x)\, dx$$

The computational shortcut is often easier to use in practice:

$$\text{Var}(X) = E(X^2) - \left(E(X)\right)^2$$

where $E(X^2) = \int_{-\infty}^{\infty} x^2\, f(x)\, dx$.

Steps to compute variance:

  1. Find $E(X)$.
  2. Find $E(X^2)$ using the same integration technique but with $x^2$ in the integrand.
  3. Subtract: $\text{Var}(X) = E(X^2) - (E(X))^2$.

The standard deviation $\sigma_X = \sqrt{\text{Var}(X)}$ puts the spread back in the same units as $X$.
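The three steps can be run numerically. Assume, for illustration, a random variable with marginal pdf $f_X(x) = 12x(1-x)^2$ on $[0,1]$ (the marginal of the triangle density $24xy$); the exact answers are $E(X) = 2/5$, $E(X^2) = 1/5$, $\text{Var}(X) = 1/25$, and $\sigma_X = 1/5$:

```python
import math

def moment(k, n=2000):
    """E(X^k) via a midpoint Riemann sum, using the illustrative
    marginal pdf f_X(x) = 12*x*(1 - x)**2 on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x ** k * 12 * x * (1 - x) ** 2
    return total * h

ex = moment(1)          # Step 1: E(X); analytically 2/5
ex2 = moment(2)         # Step 2: E(X^2); analytically 1/5
var = ex2 - ex ** 2     # Step 3: Var(X) = E(X^2) - E(X)^2; analytically 1/25
sd = math.sqrt(var)     # standard deviation; analytically 1/5
print(ex, ex2, var, sd)
```

Only the integrand changes between steps 1 and 2, which is exactly what the shortcut formula promises.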

Covariance and Correlation

Covariance

Covariance quantifies how two random variables move together. If large values of $X$ tend to occur with large values of $Y$, the covariance is positive; if large $X$ pairs with small $Y$, it's negative.

$$\text{Cov}(X,Y) = E\!\left((X - \mu_X)(Y - \mu_Y)\right) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x,y)\, dx\, dy$$

The computational shortcut:

$$\text{Cov}(X,Y) = E(XY) - E(X)\,E(Y)$$

where $E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f(x,y)\, dx\, dy$.

  • Positive covariance: $X$ and $Y$ tend to increase together.
  • Negative covariance: one tends to increase as the other decreases.
  • Zero covariance: no linear relationship (but a nonlinear relationship could still exist).
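For concreteness, here is the shortcut worked for an illustrative density $f(x,y) = 24xy$ on the triangle $0 \leq x \leq 1$, $0 \leq y \leq 1-x$, for which $E(X) = E(Y) = 2/5$ and $E(XY) = 2/15$:

$$\text{Cov}(X,Y) = E(XY) - E(X)\,E(Y) = \frac{2}{15} - \frac{2}{5} \cdot \frac{2}{5} = \frac{10}{75} - \frac{12}{75} = -\frac{2}{75}$$

The negative sign matches intuition: on the triangle $x + y \leq 1$, so a large $X$ forces $Y$ to be small.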

Correlation Coefficient

Covariance depends on the scale of $X$ and $Y$, which makes its magnitude hard to interpret. The correlation coefficient normalizes covariance to a dimensionless number:

$$\rho_{XY} = \frac{\text{Cov}(X,Y)}{\sigma_X\, \sigma_Y}$$

Properties of $\rho_{XY}$:

  • Always satisfies $-1 \leq \rho_{XY} \leq 1$
  • $\rho_{XY} = 1$: perfect positive linear relationship
  • $\rho_{XY} = -1$: perfect negative linear relationship
  • $\rho_{XY} = 0$: no linear relationship (same caveat as covariance about nonlinear dependence)
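As a capstone, a short Python sketch can estimate $\rho_{XY}$ end to end with midpoint Riemann sums. The density $f(x,y) = 24xy$ on the triangle $0 \leq x \leq 1$, $0 \leq y \leq 1-x$ is an illustrative example (a valid pdf), and the helper name is made up for this sketch:

```python
import math

def expect(g, n=400):
    """E[g(X, Y)] under the illustrative density f(x, y) = 24*x*y on the
    triangle 0 <= x <= 1, 0 <= y <= 1 - x, via a midpoint Riemann sum."""
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n
        hy = (1.0 - x) / n
        for j in range(n):
            y = (j + 0.5) * hy
            total += g(x, y) * 24 * x * y * hy
    return total / n

mu_x = expect(lambda x, y: x)                              # E(X)
mu_y = expect(lambda x, y: y)                              # E(Y)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))         # Cov(X, Y)
sigma_x = math.sqrt(expect(lambda x, y: (x - mu_x) ** 2))  # std dev of X
sigma_y = math.sqrt(expect(lambda x, y: (y - mu_y) ** 2))  # std dev of Y
rho = cov / (sigma_x * sigma_y)
print(rho)   # analytically -2/3 for this density; always within [-1, 1]
```

For this density $\rho_{XY} = -2/3$: a strong (but not perfect) negative linear association, driven entirely by the constraint on the support.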