Continuous Distributions
Uniform, Exponential, and Normal — why probability for continuous variables is area under a curve, why the density height is not a probability, and how to compute interval probabilities and quantiles with scipy.stats.
Wait times, salaries, temperatures, response latencies, heights — these don't come in whole-number counts. They're continuous: between any two possible values there's always another. That one fact forces a different way of thinking about probability than the discrete world of Discrete Distributions, and it trips up almost everyone the first time.
Here's the shift. For a count, "what's the probability of exactly 3 tickets?" is a sensible question with a real answer. For a continuous quantity, "what's the probability the wait is exactly 5.000000... minutes?" has the answer zero — there are infinitely many possible values, so no single one carries any probability on its own. Instead, probability for continuous variables lives in intervals, and it's computed as area under a curve. Internalize that and the rest of this page falls into place.
The big idea: probability is area under the PDF
A continuous distribution is described by a probability density function (PDF) — a smooth curve. The curve itself is not probability. Probability is the area under the curve between two values:
P(a < X < b) = area under the PDF from a to b
The total area under any PDF is exactly 1 (something must happen). A single point has zero width, so it has zero area — which is why P(X = exact value) = 0 for continuous variables. You always ask about ranges.
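The area claims above can be checked numerically. The following is a minimal sketch, assuming an Exponential distribution with mean 5 as a stand-in (it is introduced later on this page) and using scipy.integrate.quad for the integration; the specific numbers are illustrative only.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Assumed stand-in distribution: Exponential with mean 5 (minutes).
d = stats.expon(scale=5)

# Total area under the PDF over the whole support [0, inf).
total_area, _ = quad(d.pdf, 0, np.inf)

# P(2 < X < 8) as a numerically integrated area under the PDF.
area_2_8, _ = quad(d.pdf, 2, 8)

# Shrinking the interval around 5 drives the area toward 0.
shrinking = [quad(d.pdf, 5 - w, 5 + w)[0] for w in (1.0, 0.1, 0.001)]

print(total_area)  # close to 1
print(area_2_8)    # about 0.468
print(shrinking)   # each entry is smaller than the one before
```

As the interval narrows toward a single point, its area shrinks toward zero, which is the numerical face of P(X = exact value) = 0.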
Misconception #1: the PDF height is a probability
The height of a PDF is a density, not a probability. It can even
be greater than 1. For a Uniform(0, 0.5), the density is 2
everywhere on that interval. That is a perfectly valid PDF, because what
must equal 1 is the total area (2 × 0.5 = 1), not the height. Never read a
y-axis density value as "the probability of that x." Probability is
always an area, never a height.
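A quick way to see this in scipy is sketched below, using the Uniform(0, 0.5) from the callout. The evaluation point 0.25 is an arbitrary choice for illustration.

```python
from scipy import stats

# Uniform(0, 0.5): left edge 0, width 0.5.
u = stats.uniform(loc=0, scale=0.5)

height = u.pdf(0.25)                       # density, not a probability
total_area = u.cdf(0.5) - u.cdf(0.0)       # area over the whole interval
left_half = u.cdf(0.25) - u.cdf(0.0)       # area over the left half

print(height)      # 2.0, a density greater than 1
print(total_area)  # 1.0
print(left_half)   # 0.5
```

The density is 2, yet every probability computed from it stays between 0 and 1, because probabilities come from areas.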
Discrete PMF vs continuous PDF
The contrast with the discrete world is worth pinning down, because the mental model is genuinely different.
Why P(X = x) = 0 isn't a paradox
"The probability of any exact value is zero" sounds like it means nothing can happen — but some value always occurs. The resolution: with infinitely many possibilities, probability spreads across intervals, not points. It's like asking which exact real number a dart lands on along a line — the dart lands somewhere, yet the chance of any pre-named exact point is zero. So we ask "within 1mm of here?" (an interval) instead.
Uniform: every value equally likely
The Uniform distribution spreads probability evenly over an interval [a, b] — a flat PDF. It's the simplest continuous distribution and the natural model when you have no reason to favor any value over another in a range: a random timestamp within an hour, a spawn point along a line, the raw output of a random-number generator.
In scipy the convention is loc = a (the left edge) and
scale = b - a (the width), so Uniform(0, 10) is
stats.uniform(loc=0, scale=10). This loc/scale convention runs
through almost every scipy distribution — learn it once here.
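The loc/scale convention is easy to get wrong when the interval does not start at 0. Here is a small sketch using a hypothetical Uniform on [2, 5], including the common slip of passing the two endpoints positionally.

```python
from scipy import stats

# Uniform on [2, 5]: loc is the left edge, scale is the width (5 - 2 = 3).
u = stats.uniform(loc=2, scale=3)

left_edge_cdf = u.cdf(2)   # 0.0 at the left edge
right_edge_cdf = u.cdf(5)  # 1.0 at the right edge
midpoint = u.mean()        # 3.5, the middle of [2, 5]

# Pitfall: positional arguments are (loc, scale), so this is Uniform(2, 7).
wrong = stats.uniform(2, 5)
wrong_cdf_at_5 = wrong.cdf(5)  # 0.6, because 5 is 3/5 of the way across [2, 7]

print(left_edge_cdf, right_edge_cdf, midpoint, wrong_cdf_at_5)
```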
Interval probability = cdf(b) − cdf(a)
The cumulative distribution function (CDF) gives P(X ≤ x) — the
area to the left of x. So the area between a and b is just
cdf(b) - cdf(a). This one identity computes every interval
probability for every continuous distribution. You'll use it
constantly.
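As a concrete instance of the identity, here is a sketch with the Uniform(0, 10) mentioned earlier. The endpoints 2, 5, and 7 are arbitrary illustrative values; sf is scipy's survival function, which returns 1 - cdf.

```python
from scipy import stats

u = stats.uniform(loc=0, scale=10)  # Uniform(0, 10)

# P(2 < X < 5): area between 2 and 5.
p_2_to_5 = u.cdf(5) - u.cdf(2)

# P(X > 7): right-tail area, via the survival function.
p_above_7 = u.sf(7)

print(p_2_to_5)   # about 0.3 (3 units out of a width of 10)
print(p_above_7)  # about 0.3
```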
Exponential: waiting time until the next event
The Exponential distribution models the time between independent events that happen at a constant rate — the continuous partner of the Poisson from Discrete Distributions. If support tickets arrive Poisson-style at some rate, the gap between consecutive tickets is Exponential. Real uses:
- Wait times: time until the next customer, request, or arrival.
- Time-to-failure: how long a component lasts before breaking.
- Inter-event gaps: seconds between clicks, days between incidents.
In scipy, parameterize by scale = mean = 1/rate. So a process with a mean
wait of 5 minutes (a rate of 0.2 events per minute) is stats.expon(scale=5).
(There's also a loc shift, but you'll almost always leave it at 0.)
The exponential is right-skewed: short waits are most likely, but there's a long tail of occasional long waits. Its defining quirk is memorylessness — having already waited 10 minutes tells you nothing about how much longer you'll wait. That's a strong assumption, and a reason it sometimes doesn't fit (a machine that wears out gets more likely to fail over time, violating memorylessness).
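Both properties can be checked directly. This is a minimal sketch assuming a mean wait of 5 minutes; the 10-minute and 5-minute figures are illustrative choices.

```python
from scipy import stats

d = stats.expon(scale=5)  # mean wait of 5 minutes

# Memorylessness: P(X > 15 | X > 10) should equal P(X > 5).
p_fresh = d.sf(5)
p_after_waiting_10 = d.sf(15) / d.sf(10)

# Right skew: the median sits below the mean.
median_wait = d.median()
mean_wait = d.mean()

print(p_fresh, p_after_waiting_10)  # both about 0.368
print(median_wait, mean_wait)       # about 3.47 versus 5.0
```

The conditional probability is computed as a ratio of tail areas, which is just the definition of conditional probability applied to "still waiting after 10 minutes".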
Exponential vs Poisson: gaps vs counts
They describe the same stream of events from two angles. Poisson (discrete) counts how many events land in a fixed interval. Exponential (continuous) measures how long until the next one. Same rate parameter underneath; choose based on whether you're counting events or timing the gaps between them.
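One place the two views must agree: "no events in the next t minutes" (a Poisson count of 0) is the same event as "the next gap is longer than t" (an Exponential tail). The sketch below assumes a hypothetical rate of 0.2 events per minute and a 10-minute window.

```python
from scipy import stats

rate = 0.2   # assumed: events per minute
t = 10.0     # assumed: window length in minutes

# Poisson view: zero events in a window of length t (mean count = rate * t).
p_zero_events = stats.poisson.pmf(0, mu=rate * t)

# Exponential view: the wait until the next event exceeds t (scale = 1 / rate).
p_gap_longer = stats.expon(scale=1 / rate).sf(t)

print(p_zero_events, p_gap_longer)  # both about 0.135
```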
Normal: the bell curve (a first look)
The Normal (Gaussian) distribution is the famous symmetric bell
curve, defined by a mean μ (center) and standard deviation σ
(spread): stats.norm(loc=mu, scale=sigma). It models quantities that
pile up around a center with symmetric tails — measurement errors,
aggregated noise, many natural and human measurements. It's so central
that it gets its own page next, The Normal Distribution, where we
cover the 68–95–99.7 rule and z-scores. Here we just meet it as another
continuous distribution and practice the same area-based questions.
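The same area-based questions work unchanged for the Normal. Below is a short sketch with hypothetical values (mean 170, standard deviation 10, loosely "heights in cm"); none of these numbers come from the text.

```python
from scipy import stats

d = stats.norm(loc=170, scale=10)  # hypothetical mu and sigma

# Symmetry: half the area lies on each side of the mean.
p_below_mean = d.cdf(170)

# Symmetric tails: area below mu - 5 matches area above mu + 5.
left_tail = d.cdf(165)
right_tail = d.sf(175)

# Interval probability, same identity as before.
p_160_to_185 = d.cdf(185) - d.cdf(160)

print(p_below_mean)           # 0.5
print(left_tail, right_tail)  # equal by symmetry
print(p_160_to_185)           # about 0.775
```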
Quantiles: inverting the question with ppf
So far we've gone value → probability with cdf. Often you need the
reverse: probability → value. "What wait time are 90% of waits below?"
"What score puts you in the top 5%?" That's the percent-point function
(ppf), the inverse of the CDF, also called the quantile function.
ppf(0.9) returns the value x such that P(X ≤ x) = 0.9 — the 90th
percentile. It's the tool for setting thresholds, SLAs, and cutoffs.
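The two questions above can be sketched as follows. The distributions are assumptions chosen for illustration: waits as Exponential with mean 5 minutes, and scores as Normal with mean 100 and standard deviation 15.

```python
from scipy import stats

# "What wait time are 90% of waits below?"
waits = stats.expon(scale=5)       # assumed mean wait of 5 minutes
p90_wait = waits.ppf(0.90)

# "What score puts you in the top 5%?" is the 95th percentile.
scores = stats.norm(loc=100, scale=15)  # assumed score distribution
top5_cutoff = scores.ppf(0.95)

print(p90_wait)     # about 11.51 minutes
print(top5_cutoff)  # about 124.67
```

Note how far the 90th percentile of the Exponential sits above its mean of 5: the long right tail pushes high percentiles out.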
cdf and ppf are inverses
cdf takes a value and returns a probability (area to the left). ppf
takes a probability and returns the value. Up to floating-point rounding,
d.ppf(d.cdf(x)) equals x and d.cdf(d.ppf(p)) equals p. Use cdf/sf (sf is
the right tail, 1 − cdf) for "what's the chance?" and ppf/isf (isf is the
inverse of sf) for "what's the cutoff?".
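A round-trip check makes the inverse relationship concrete. This sketch assumes a Normal with mean 100 and standard deviation 15 and compares with a tolerance rather than ==, since the results are floating-point numbers.

```python
import math
from scipy import stats

d = stats.norm(loc=100, scale=15)  # assumed distribution for illustration

x, p = 120.0, 0.9

x_back = d.ppf(d.cdf(x))  # value -> probability -> value
p_back = d.cdf(d.ppf(p))  # probability -> value -> probability

# isf inverts the right tail: isf(q) matches ppf(1 - q).
cutoff_from_ppf = d.ppf(0.95)
cutoff_from_isf = d.isf(0.05)

print(math.isclose(x_back, x, rel_tol=1e-9))
print(math.isclose(p_back, p, rel_tol=1e-9))
print(math.isclose(cutoff_from_ppf, cutoff_from_isf, rel_tol=1e-9))
```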
Adult resting heart rate is modeled as Normal with mean 70 bpm and standard deviation 8 bpm.
Using scipy.stats.norm, compute the probability that a randomly chosen person's heart rate is between 60 and 80 bpm, P(60 < X < 80).
- Build a frozen distribution d = stats.norm(loc=70, scale=8).
- An interval probability is cdf(upper) - cdf(lower).
- Store the answer as a plain Python float in p_between.
Customer wait times follow an Exponential distribution with a mean of 4 minutes.
You want to publish a service-level promise: "90% of customers are served within x minutes." Find that x, the value such that P(X <= x) = 0.90, where X is the wait time.
- Build d = stats.expon(scale=4) (the scale is the mean).
- Use the percent-point function: d.ppf(0.90).
- Store the answer as a plain Python float in cutoff.
Reading a density plot correctly
One more guard against the height-is-probability trap. When you plot a PDF, the y-axis is density, and you should read relative heights and areas, never absolute heights as probabilities.
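One way to make "heights versus areas" concrete: over a narrow interval, the area is roughly height × width, so the height only becomes a probability once it is multiplied by a width. The sketch below assumes a Normal centered at 50 with standard deviation 0.5, chosen purely for illustration.

```python
from scipy import stats

d = stats.norm(loc=50, scale=0.5)  # assumed narrow bell centered at 50

peak_height = d.pdf(50)            # a density, about 0.80, not a probability

# Exact probability of a narrow band around the peak.
width = 0.2
p_band = d.cdf(50 + width / 2) - d.cdf(50 - width / 2)

# Height x width approximation of that same area.
approx_band = peak_height * width

# Relative heights are meaningful: values near 50 are far denser than near 51.
height_ratio = d.pdf(50) / d.pdf(51)

print(peak_height)
print(p_band, approx_band)  # close to each other, both about 0.16
print(height_ratio)         # about 7.4
```

The peak height of roughly 0.80 is not "an 80% chance of 50"; the chance of landing within 0.1 of 50 is only about 16%.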
Check your understanding
For a continuous random variable X, what is P(X = 7.0) — the probability it equals exactly 7.0?
It equals the height of the PDF at 7.0
It is 0, because a single exact point has zero width and therefore zero area under the PDF
It is small but positive, roughly the PDF height times a tiny number
It cannot be computed without more information
You plot the PDF of a Uniform(0, 0.5) distribution and notice the density is 2 across the whole interval. What does this tell you?
The plot is wrong, because a probability can never exceed 1
The probability of landing in that interval is 2
The density height (2) is fine because what must equal 1 is the area: 2 × 0.5 = 1
The distribution is invalid and should be renormalized
Wait times are Exponential with mean 5 minutes (d = stats.expon(scale=5)). Which expression gives the probability the wait is between 2 and 8 minutes?
d.pdf(8) - d.pdf(2)
d.cdf(8) - d.cdf(2)
d.cdf(8) + d.cdf(2)
d.ppf(8) - d.ppf(2)
You need the value below which 95% of a normal variable's outcomes fall (the 95th percentile). Which scipy method do you use?
d.cdf(0.95)
d.ppf(0.95)
d.sf(0.95)
d.pdf(0.95)
A teammate looks at a density plot, sees the curve peak at a height of 0.6 over the value 50, and concludes "there's a 60% chance the value is 50." What's the precise problem?
They should have read 50, not 0.6, as the probability
Nothing — that's a correct reading of a density plot
The height should have been read off the x-axis instead
A PDF height is a density, not a probability, and the chance of any single exact value is 0; probability is the area over an interval
Key takeaways
- Continuous variables use a PDF; probability is area under it, and the total area is 1.
- The PDF height is a density, not a probability — it can exceed 1. P(X = exact value) = 0; always ask about intervals.
- cdf(b) − cdf(a) gives any interval probability P(a < X < b); sf gives the right tail; ppf inverts the CDF to find percentiles and thresholds.
- Uniform = flat density over a range; Exponential = waiting time between events at a constant rate (right-skewed, memoryless); Normal = the symmetric bell curve, detailed next.
- The scipy loc/scale convention and the frozen-distribution pattern carry over from the discrete page and into Working with Distributions.