LearnMathora

Probability & statistics · 02 · Data without lies · 8 min

Center, spread & honest summaries

A thousand numbers are unreadable; a summary is unavoidable. The craft of statistics starts with summarizing without distorting — knowing what the mean hides, what the median resists, and what spread reveals.

Build the intuition

Mean vs median: the billionaire test

Nine people earn $50k; a billionaire with a $1B annual income walks into the room. The mean salary rockets past $100M while the median stays at $50k. Means follow the money; medians follow the middle person. Skewed data (incomes, house prices, wait times) is median territory; “average” headlines deserve suspicion.
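The billionaire test is easy to check by hand, but a few lines of Python make it concrete. This is a minimal sketch using the standard `statistics` module; the $1B annual income for the newcomer is an assumption made for illustration, and the variable names are ours.

```python
from statistics import mean, median

before = [50_000] * 9               # nine people earning $50k
after = before + [1_000_000_000]    # assumed: the newcomer earns $1B a year

for label, room in (("before", before), ("after", after)):
    print(f"{label}: mean ${mean(room):,.0f}, median ${median(room):,.0f}")

# How many people in the room earn less than the "average" salary?
below_mean = sum(s < mean(after) for s in after)
print(f"{below_mean} of {len(after)} earn less than the mean")
```

The mean jumps from $50,000 to $100,045,000 while the median does not move, and nine of the ten people now earn less than the mean.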

Spread is half the story

Two cities can share a mean temperature of 15°C — one ranging 5–25°, the other −20–50°. Same center, utterly different lives. Standard deviation (σ) measures typical distance from the mean: small σ, clustered and predictable; large σ, scattered and volatile.

$$\sigma = \sqrt{\tfrac{1}{n}\sum (x_i - \bar{x})^2}$$
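To see the formula at work, here is a small sketch that applies it to two made-up sets of temperature readings sharing a mean of 15°C. The readings and the function name `population_sd` are hypothetical; the standard library's `statistics.pstdev` computes the same quantity.

```python
from math import sqrt

def population_sd(values):
    """Square root of the mean squared deviation from the mean."""
    m = sum(values) / len(values)
    return sqrt(sum((x - m) ** 2 for x in values) / len(values))

mild = [5, 10, 15, 20, 25]        # hypothetical readings in °C, mean 15
extreme = [-20, 0, 15, 30, 50]    # hypothetical readings in °C, mean 15

for name, temps in (("mild", mild), ("extreme", extreme)):
    print(f"{name}: mean {sum(temps) / len(temps):.0f}, sd {population_sd(temps):.1f}")
```

Both lists have the same center, yet the second has a standard deviation more than three times larger (about 24 versus about 7).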

Distribution: the full portrait

Beyond two numbers lies the shape: pile the data into a histogram and look. Symmetric bell? One lonely peak or two (two populations mixed)? A long tail of extremes? Many wrong conclusions die at the moment someone actually plots the data.
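Plotting does not need a charting library to be useful. The sketch below builds a rough text histogram from hypothetical height data that mixes two groups; the data, the bin width, and the helper name `text_histogram` are all assumptions for illustration.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Return one line per bin: the bin's lower edge and a bar of '#' marks."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    edge = min(counts)
    while edge <= max(counts):
        lines.append(f"{edge:>4} | {'#' * counts.get(edge, 0)}")
        edge += bin_width
    return lines

# Hypothetical heights in cm, drawn from two mixed groups
heights = [150, 152, 153, 155, 155, 156, 158, 161,
           172, 174, 175, 176, 176, 178, 179, 183]

print("\n".join(text_histogram(heights, 5)))
```

The bars show two peaks with an empty bin between them, right around where the overall mean sits. A single summary number would have pointed at a height almost nobody has.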

See it move

Interactive: The bell curve
Mean 0, spread σ = 1. Within ±1σ of the mean lives 68.3% of everything. (±1σ ≈ 68%, ±2σ ≈ 95% — the most useful rule of thumb in statistics.)

μ slides the center; σ stretches the spread. Two dials describe the whole population — when the shape is a bell.
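The 68/95 rule of thumb can be checked numerically. This sketch uses `NormalDist` from Python's standard `statistics` module (Python 3.8+); the helper name `share_within` is ours.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal population within k standard deviations of the mean."""
    d = NormalDist(mu, sigma)
    return d.cdf(mu + k * sigma) - d.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"±{k}σ: {share_within(k):.1%}")

# Moving the center or stretching the spread leaves the shares unchanged.
print(f"±2σ with μ = 10, σ = 3: {share_within(2, mu=10, sigma=3):.1%}")
```

The shares come out near 68%, 95%, and 99.7%, and they stay the same whatever μ and σ are. That holds only for a bell-shaped (normal) population.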

A worked example

Which commute is better?

  1. Route A: mean 30 min, σ = 2. Route B: mean 28 min, σ = 12.

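  2. B is faster on average, but its spread puts 50+ minute trips well within reach: roughly 1 trip in 30 if commute times are approximately bell-shaped.

  3. With a 35-minute deadline, A almost never fails (under 1% of trips on the same assumption); B fails more than a quarter of the time. The mean said B; the spread said A, and the spread was right.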
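The comparison above can be put into rough numbers. The sketch below assumes each route's travel time is approximately normal, which is our simplification (real commute times tend to have a longer right tail), and uses `NormalDist` from the standard library.

```python
from statistics import NormalDist

route_a = NormalDist(mu=30, sigma=2)    # Route A: mean 30 min, σ = 2
route_b = NormalDist(mu=28, sigma=12)   # Route B: mean 28 min, σ = 12
deadline = 35                           # minutes

# Probability of taking longer than the deadline on each route
late_a = 1 - route_a.cdf(deadline)
late_b = 1 - route_b.cdf(deadline)
# Probability of a 50+ minute trip on Route B
over_50_b = 1 - route_b.cdf(50)

print(f"Route A late: {late_a:.1%}")
print(f"Route B late: {late_b:.1%}")
print(f"Route B over 50 min: {over_50_b:.1%}")
```

Under that assumption, Route A misses the deadline on well under 1% of trips, Route B on more than a quarter of them, and about 1 Route B trip in 30 runs past 50 minutes.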

Out in the world

Manufacturing lives on σ

A bolt factory's mean diameter can be perfect while variance quietly produces failures. Quality control is variance control — “six sigma” is literally a promise about standard deviations. Consistency, not averages, is what you can build bridges on.
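A quick sketch shows how a perfectly centered process can still fail. The 10.00 mm target, the ±0.05 mm tolerance, and the σ values are hypothetical, and diameters are assumed to be normally distributed around the target.

```python
from statistics import NormalDist

def defect_rate(sigma, target=10.0, tolerance=0.05):
    """Share of bolts outside target ± tolerance when the mean is exactly on target."""
    d = NormalDist(target, sigma)
    return d.cdf(target - tolerance) + (1 - d.cdf(target + tolerance))

for sigma in (0.05, 0.025, 0.0125):
    print(f"sigma = {sigma} mm -> {defect_rate(sigma):.4%} out of tolerance")
```

The mean never moves in this example. Halving σ twice takes the out-of-tolerance share from nearly a third of the bolts to well under one in ten thousand.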

Common confusion, cleared

The average person earns the average salary.

With skewed data most people sit below the mean — a few giants drag it up. The median is where the middle person actually stands.

More data automatically beats better data.

A biased mountain loses to an honest hill: polling a million gym members about exercise tells you about gym members. How data was gathered outranks how much.
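The mountain-versus-hill claim can be tried in a small simulation. Everything in this sketch is invented for illustration: a population where 20% are gym members, made-up exercise-hour distributions, and a fixed random seed. The sample sizes are also scaled down from the million in the text to keep it quick.

```python
import random
from statistics import fmean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 20% are gym members, who exercise more on average
population = []
for _ in range(200_000):
    member = rng.random() < 0.2
    hours = rng.gauss(5.0, 1.5) if member else rng.gauss(1.5, 1.0)
    population.append((member, max(hours, 0.0)))  # weekly hours, never negative

true_mean = fmean([h for _, h in population])
gym_hours = [h for m, h in population if m]                  # huge, but members only
small_random = [h for _, h in rng.sample(population, 500)]   # small, but representative

print(f"true mean:         {true_mean:.2f} h/week")
print(f"gym poll (n={len(gym_hours)}): {fmean(gym_hours):.2f} h/week")
print(f"random poll (n=500): {fmean(small_random):.2f} h/week")
```

The gym poll is dozens of times larger and still lands far from the truth, because it measures gym members rather than the population. The small random poll lands close.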

Recap

  • Mean follows the money; median follows the middle; check which fits.
  • Spread (σ) tells you what to expect around the center — always ask for it.
  • Plot the shape before trusting any summary.