The starting point
What Is Variation in Data?
Suppose someone asks: "What is the average height of high school students?" You would not measure just one student and call it done. You would measure several people and compute the average — because every individual is different. That difference is variation.
Fig.01 — Heights of five students and their deviations (mean = 168.4 cm)
| Student | Height (cm) | Deviation (value − mean) | Deviation² |
|---|---|---|---|
| A | 170 | +1.6 | 2.56 |
| B | 165 | −3.4 | 11.56 |
| C | 160 | −8.4 | 70.56 |
| D | 172 | +3.6 | 12.96 |
| E | 175 | +6.6 | 43.56 |
| Total | Mean = 168.4 cm | Sum = 0 (always) | Sum = 141.2 |
Deviation = individual value − mean. The deviations always sum to zero — which is why simple averaging does not work.
The deviation for each individual is their personal value minus the overall mean. Standard deviation is something close to "the average of these deviations" — but there is a problem: if you add the deviations directly, they always cancel out to zero. A fix is needed (explained below).
Why the simple average of deviations is always zero
By definition, the mean is the balance point of all values. When you subtract it from every data point and add the results, the positives and negatives cancel exactly — the sum is always zero, for any dataset. This is a mathematical certainty, not a coincidence. It means "sum all deviations and divide by n" can never describe spread — it always yields zero.
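The cancellation can be written out in one line. This is a short sketch using the symbols that appear in the formulas of this article (xᵢ for each value, x̄ for the mean, n for the number of data points):

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0,
\qquad \text{because } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
```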
Practical value
What Can You Do with Variation?
① Compare groups with a single number
Two classes can share the same mean height yet have completely different characters depending on their spread.
Fig.02 — Same mean, very different spread
Both classes have mean 169 cm. Class 1 is a tight cluster; Class 2 spans a wide range. The mean alone does not distinguish them — standard deviation does.
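To make the contrast concrete, here is a minimal Python sketch. The two lists of heights are hypothetical values invented for illustration (they are not read from Fig.02); they were chosen only so that both means come out to 169 cm.

```python
from statistics import mean, pstdev

# Hypothetical heights in cm, invented for illustration (not taken from Fig.02).
class_1 = [168, 169, 169, 169, 170]   # tight cluster
class_2 = [155, 162, 169, 176, 183]   # wide range

for name, heights in (("Class 1", class_1), ("Class 2", class_2)):
    # pstdev divides by n, so it describes exactly the data it is given.
    print(f"{name}: mean = {mean(heights)} cm, sd = {pstdev(heights):.2f} cm")
```

Both classes report a mean of 169 cm, while the standard deviations come out at roughly 0.63 cm and 9.90 cm.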
② Decide whether a difference in means is real
Class A has a mean height of 170 cm; Class B has 171 cm. Is that 1 cm gap meaningful?
- If spread is small (e.g., σ = 0.5 cm) → the 1 cm gap is twice the spread, so it likely reflects a real difference.
- If spread is large (e.g., σ = 10 cm) → the 1 cm gap is one tenth of the spread, so it cannot be told apart from noise.
Standard deviation converts that judgment from intuition into a number-backed decision.
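One simple way to put a number on that comparison is to express the gap as a multiple of σ. The helper below, `gap_in_sigma_units`, is a hypothetical name used only for this sketch. It is a rough yardstick, not a formal significance test, which would also take the sample sizes into account.

```python
def gap_in_sigma_units(mean_a: float, mean_b: float, sigma: float) -> float:
    """Return the gap between two means as a multiple of the spread sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return abs(mean_a - mean_b) / sigma

small_spread = gap_in_sigma_units(170, 171, 0.5)   # gap is 2.0 sigma
large_spread = gap_in_sigma_units(170, 171, 10)    # gap is 0.1 sigma
print(small_spread, large_spread)
```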
Same idea, different domain
Connection to Surface Roughness Ra
The logic of standard deviation — "collect all deviations from a mean and express them as one number" — applies to anything that has "unevenness" or "scatter." Surface texture is one such thing.
Think of a machined surface as a series of high and low points. Each point's distance from the mean plane is a deviation. Aggregate those deviations and you get a number that describes how rough the surface is.
Surface roughness Ra (JIS B 0601)
Ra, the arithmetic mean deviation of a profile, is the average of the absolute distances between each profile point and the mean line, measured over a sampling length. It is a close relative of standard deviation: the same "average deviation from a reference" concept, but using absolute values instead of squares, so the result is already in the original unit and no square root is needed. Whenever you wonder how to turn a visual observation of "unevenness" into a number, the standard deviation idea is the place to start.
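The arithmetic behind that comparison can be sketched in a few lines of Python. The profile heights below are hypothetical numbers invented for illustration, and the sketch ignores the filtering and sampling-length rules that a real roughness measurement follows; it only shows the "absolute value versus square" difference.

```python
from math import sqrt
from statistics import mean

# Hypothetical profile heights in micrometres, invented for illustration.
profile = [0.8, -0.4, 1.2, -1.0, 0.2, -0.8]

mean_line = mean(profile)
deviations = [z - mean_line for z in profile]

# Ra-style value: average of the absolute deviations from the mean line.
ra = mean([abs(d) for d in deviations])
# Standard-deviation-style value: root of the average squared deviation.
rms = sqrt(mean([d * d for d in deviations]))

print(f"Ra ≈ {ra:.3f} µm, RMS ≈ {rms:.3f} µm")
```

For this profile the two numbers are about 0.733 µm and 0.808 µm. The squared version is never smaller than the absolute-value version, because squaring gives large deviations more weight.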
Step-by-step derivation
Building the Formula — Why This Equation?
"If averaging deviations would describe spread, why not just do that?" Because the sum is always zero. Here is the fix, one step at a time.
Step 1: Calculate each deviation
Subtract the mean from each individual value.
deviation = individual value − mean
Step 2: Square each deviation to eliminate negatives
Summing raw deviations gives zero because positives and negatives cancel. Squaring makes every deviation non-negative. (Taking absolute values would also work, but squaring is standard because it is mathematically tractable in later analysis.)
deviation² = (individual value − mean)²
Step 3: Sum all squared deviations → Sum of Squares
Add up all the squared deviations. This total is called the sum of squares (SS). For the five heights in the example (170, 165, 160, 172, 175 cm; mean 168.4 cm): 2.56 + 11.56 + 70.56 + 12.96 + 43.56 = 141.2.
SS = Σ(xᵢ − x̄)²
Step 4: Divide by n → Variance
Dividing the sum of squares by the number of data points gives the variance, the average squared deviation. For the example: 141.2 ÷ 5 = 28.24 cm². Variance is a statistically powerful quantity, but its unit is squared (cm², mm², etc.).
variance = SS ÷ n = 28.24 cm²
Step 5: Take the square root → Standard Deviation
The square root undoes the squaring from Step 2, restoring the original unit. √28.24 ≈ 5.31 cm. This is the standard deviation.
σ = √variance = √28.24 ≈ 5.31 cm
Standard deviation formula (dividing by n)
σ = √[ Σ(xᵢ − x̄)² ÷ n ]
xᵢ : each individual data value · x̄ : mean · n : number of data points · Σ : sum over all data points
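The five steps can be traced in a short Python sketch. It uses the five student heights from Fig.01 and computes the mean from the data rather than assuming it; the variable names are illustrative choices, not part of any standard.

```python
from math import sqrt
from statistics import pstdev

heights = [170, 165, 160, 172, 175]  # cm, the five students in Fig.01
n = len(heights)

mean_height = sum(heights) / n                      # 168.4
deviations = [x - mean_height for x in heights]     # Step 1: deviation
squared = [d ** 2 for d in deviations]              # Step 2: square
sum_of_squares = sum(squared)                       # Step 3: sum of squares
variance = sum_of_squares / n                       # Step 4: divide by n
sigma = sqrt(variance)                              # Step 5: square root

print(f"mean = {mean_height:.1f} cm, SS = {sum_of_squares:.1f}")
print(f"variance = {variance:.2f} cm², sigma = {sigma:.2f} cm")

# Cross-check against the standard library's divide-by-n standard deviation.
assert abs(sigma - pstdev(heights)) < 1e-9
```

Run as written, this reports a mean of 168.4 cm, a sum of squares of 141.2, a variance of 28.24 cm², and a standard deviation of about 5.31 cm.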
Fig.03 — The calculation chain
Five steps: deviation → square → sum of squares → variance → standard deviation.
Estimating the population
Unbiased Standard Deviation (n−1)
In real data analysis, you almost never measure the entire population. You measure a sample and estimate the properties of the whole. You cannot measure every high school student in the country — you measure a few hundred and infer from them.
Fig.04 — Population vs. sample
Dividing a sample's sum of squares by n gives a spread estimate that runs smaller, on average, than the true population σ. Dividing by n−1 instead of n corrects for this underestimation.
The reason is that the deviations are measured from the sample mean x̄, which is computed from the same data and therefore sits closer to the sample points than the true population mean does. The resulting sum of squares comes out too small on average. To correct this, the denominator is changed from n to n−1. Making the denominator smaller makes the result larger, pushing the estimate closer to the true value.
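The underestimation can be seen in a small simulation. The sketch below assumes a hypothetical normally distributed population (mean 169 cm, σ = 6 cm, so variance 36 cm²) and repeatedly draws samples of five; these population values and the trial count are arbitrary choices for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SIGMA = 169.0, 6.0   # hypothetical population, variance = 36
N, TRIALS = 5, 20_000

div_n, div_n_minus_1 = [], []
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    ss = sum((x - x_bar) ** 2 for x in sample)   # sum of squares around x_bar
    div_n.append(ss / N)
    div_n_minus_1.append(ss / (N - 1))

avg_div_n = sum(div_n) / TRIALS
avg_div_n_minus_1 = sum(div_n_minus_1) / TRIALS
print(f"true variance       : {TRUE_SIGMA ** 2:.1f}")
print(f"average of SS/n     : {avg_div_n:.1f}")
print(f"average of SS/(n-1) : {avg_div_n_minus_1:.1f}")
```

With samples of five, the divide-by-n average should land near 36 × 4/5 = 28.8, while the divide-by-(n−1) average should land near the true variance of 36.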
Population standard deviation (÷ n)
σ = √[ Σ(xᵢ−x̄)² ÷ n ]
Use when:
- Describing the data you have (not estimating a broader population)
- You measured the entire population (100% inspection)
Excel: STDEV.P
Sample standard deviation (÷ n−1)
s = √[ Σ(xᵢ−x̄)² ÷ (n−1) ]
Use when:
- Estimating the population from a sample
- Virtually all real-world quality analysis
Excel: STDEV.S
Why "unbiased"?
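Dividing by n produces a variance estimate that is systematically low: it has a downward bias. Dividing by n−1 removes that bias from the variance, which is why the estimator is called unbiased (Japanese: 不偏, fuhen). Strictly speaking, the square root s still runs very slightly low, but it is the standard estimate of σ in practice. The difference between the two divisors is small when n is large, but it matters for the small samples common in quality work.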
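For readers who want the reasoning behind n−1, here is a brief sketch of the standard argument. It assumes the data are independent draws from a population with mean μ and variance σ²; the symbol μ and the expectation E[·] are introduced here and do not appear elsewhere in this article.

```latex
% Split the deviations from the true mean into two parts:
\sum_{i=1}^{n}(x_i-\mu)^2
  = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2

% Take expectations, using E[(x_i-\mu)^2] = \sigma^2
% and E[(\bar{x}-\mu)^2] = \sigma^2 / n:
n\sigma^2 = E\!\left[\sum_{i=1}^{n}(x_i-\bar{x})^2\right] + \sigma^2

% Hence:
E\!\left[\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2\right] = \sigma^2,
\qquad
E\!\left[\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\right] = \frac{n-1}{n}\,\sigma^2 .
```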
Practical tools
Excel Functions
Standard deviation is easy to compute in Excel. Understanding what each function actually does — after building the formula from scratch — makes the choice obvious.
| Function | Formula used | When to use |
|---|---|---|
| STDEV.S | ÷ (n−1) — unbiased | Recommended for almost all practical use. Estimates population from a sample. |
| STDEV.P | ÷ n | Use only when your data IS the entire population (full inspection, descriptive stats only). |
| STDEV | ÷ (n−1) — same as STDEV.S | Legacy function kept for backwards compatibility. Identical behavior to STDEV.S. |
| VAR.S / VAR.P | Returns variance (before √) | When you need the variance itself. Equal to STDEV.S² or STDEV.P². |
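For readers who also work outside Excel, the same four calculations can be sketched with Python's standard `statistics` module. The pairing with the Excel functions in the comments reflects the divisor each one uses; the data are the five heights from Fig.01.

```python
from statistics import pstdev, pvariance, stdev, variance

heights = [170, 165, 160, 172, 175]  # cm, the five students in Fig.01

s = stdev(heights)            # divides by n-1, like STDEV.S
sigma = pstdev(heights)       # divides by n,   like STDEV.P
var_s = variance(heights)     # divides by n-1, like VAR.S
var_p = pvariance(heights)    # divides by n,   like VAR.P

print(f"stdev    = {s:.2f}, pstdev    = {sigma:.2f}")
print(f"variance = {var_s:.2f}, pvariance = {var_p:.2f}")
```

For these five values the n−1 result is about 5.94 cm and the n result about 5.31 cm, a visible gap because the sample is so small.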
When in doubt
Use STDEV.S (or the legacy STDEV). Full inspection is rare in manufacturing; almost every quality dataset is a sample drawn from a broader process or lot. STDEV.S is the right choice by default. The difference between .S and .P is negligible for large samples anyway.
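How quickly the two results converge can be checked directly: for the same data, the n−1 result divided by the n result is always √(n ÷ (n−1)). The function name `s_over_p_ratio` below is a hypothetical helper used only for this sketch.

```python
from math import sqrt

def s_over_p_ratio(n: int) -> float:
    """Ratio of the divide-by-(n-1) result to the divide-by-n result."""
    if n < 2:
        raise ValueError("need at least two data points")
    return sqrt(n / (n - 1))

for n in (5, 30, 1000):
    print(f"n = {n:>4}: STDEV.S / STDEV.P = {s_over_p_ratio(n):.4f}")
```

The ratio is about 1.118 for n = 5, 1.017 for n = 30, and 1.0005 for n = 1000, so the choice of divisor matters mainly for small samples.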
Key takeaways
Summary
Point 01
Deviations always sum to zero
The simple average of deviations is always zero, so it cannot describe spread. Squaring the deviations before averaging, then taking the square root, solves this.
Point 02
Variance → √ → Standard Deviation
Variance is the average squared deviation. Taking the square root restores the original unit. That result is standard deviation.
Point 03
Use n−1 in practice (STDEV.S)
Dividing by n underestimates population spread when the data are a sample. Dividing by n−1 corrects the bias in the variance. Use STDEV.S in Excel for virtually all shop floor analysis.
Point 04
Ra is a cousin of σ
Surface roughness Ra uses the same "average deviation from a reference line" idea. The standard deviation mindset applies anywhere unevenness needs a number.
Memorizing the formula is the starting point, not the destination. Understanding why we square, why we take the square root, and why n−1 corrects for sampling bias opens the door to applying this logic wherever scatter, unevenness, or roughness needs to be quantified. Next time you encounter "variation" — on a chart, on a surface, or in a dataset — you have the tools to turn it into a single, defensible number.