Japanese Quality by Jaw

The Hardest Part of Normal Distribution Is Checking Whether It Actually Is One

Normal distribution appears everywhere in manufacturing and statistical analysis. But the more useful tools it unlocks, the more dangerous it is to skip the check. Here is why histograms are unreliable for this purpose — and how QQ plots give you a trustworthy answer.

01

What Is Normal Distribution — and Why Does It Appear in Manufacturing?

The bell curve and why production creates it

Normal distribution (also called the Gaussian distribution or bell curve) is the natural pattern of variation that appears when you try to make something to a target value.

For example: manufacturing pen shafts to a target diameter of 10 mm produces results like 10.01 mm, 9.98 mm, and so on. Values far from the target — 8 mm or 12 mm — are extremely unlikely. The closer to the target, the higher the probability; the further away, the faster the probability falls. That shape is the normal distribution.

Fig. 01 — Normal distribution: symmetric around the mean μ

[Figure: bell curve centered on the mean μ, with markers at ±1σ and ±3σ. Probability is highest near μ and falls rapidly with distance.]

The bell curve. Most values cluster near the mean; extreme values become rapidly rarer the further out they lie.

Why manufacturing creates normal distributions

The act of aiming at a target value — and having many small, independent sources of variation push results slightly above or below — is precisely the mechanism that generates normal distribution. Conversely, data that deviates from normality is often a signal of something abnormal: tool wear, a material lot change, or a process shift.
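A small simulation can make this mechanism concrete. The sketch below assumes 12 independent error sources, each pushing the diameter up or down by at most 0.005 mm with a flat (uniform) distribution; the number of sources, their size, and the part count are illustrative assumptions, not measured values. Even though no single source is bell-shaped, their sum comes out close to normal.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TARGET_MM = 10.0   # target diameter from the pen-shaft example
N_SOURCES = 12     # assumed number of independent error sources
N_PARTS = 20_000   # assumed number of simulated parts

def simulate_part():
    # Each source nudges the diameter by up to ±0.005 mm, uniformly (not normally).
    return TARGET_MM + sum(random.uniform(-0.005, 0.005) for _ in range(N_SOURCES))

diameters = [simulate_part() for _ in range(N_PARTS)]
mean = statistics.fmean(diameters)
sigma = statistics.pstdev(diameters)

# Share of simulated parts within 1, 2 and 3 sigma of the mean.
within = {
    k: sum(abs(d - mean) <= k * sigma for d in diameters) / N_PARTS
    for k in (1, 2, 3)
}

print(f"mean = {mean:.4f} mm, sigma = {sigma:.4f} mm")
for k, share in within.items():
    print(f"within ±{k} sigma: {share:.1%}")
```

If the printed shares land near 68%, 95% and 99.7%, the simulated process is behaving like a normal distribution. Replacing one of the small sources with a single large one (a worn tool, for instance) is a quick way to watch the pattern break.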

02

σ and Probability — What the Numbers Mean

The foundation of control charts and Cp/Cpk

The power of normal distribution lies in this: once you know the mean (μ) and standard deviation (σ), you can calculate exactly what proportion of data falls within any given range. This is the foundation on which control charts and process capability indices (Cp, Cpk) are built.

Range      Data within this range   Outside (defect rate estimate)
μ ± 1σ     ~68.3%                   ~31.7%
μ ± 2σ     ~95.4%                   ~4.6%
μ ± 3σ     ~99.7%                   ~0.3% (3 per 1,000)
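The percentages in this table do not have to be memorized; they can be recomputed from the standard normal distribution. A minimal sketch using Python's standard library (the helper name `share_within` is just for illustration):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal population lying within mu ± k*sigma."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    inside = share_within(k)
    print(f"mu ± {k} sigma: inside {inside:.2%}, outside {1 - inside:.2%}")
```

Run at higher precision, the ±3σ row comes out at about 99.73% inside and 0.27% outside, which is where the rounded "0.3%" comes from.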

Connection to control charts

Control chart upper and lower control limits (UCL / LCL) are typically set at the center line ± 3σ. If the process is stable and normally distributed, a point outside those limits should occur only about 0.3% of the time by chance, so it is treated as a statistical signal that something has changed in the process.
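As a rough illustration of the ±3σ rule, the sketch below computes limits from a hypothetical set of baseline diameters and flags new points that fall outside them. All measurements are made up for the example, and σ is taken as the plain sample standard deviation of the baseline; practical control charts usually estimate σ from subgroup ranges or moving ranges instead, so treat this as a simplification.

```python
import statistics

# Hypothetical baseline diameters (mm) from a period when the process was stable.
baseline = [10.01, 9.98, 10.00, 10.02, 9.99, 10.01, 9.97, 10.00, 10.03, 9.99]

center = statistics.fmean(baseline)
sigma = statistics.stdev(baseline)   # simplification: sample standard deviation
ucl = center + 3 * sigma             # upper control limit
lcl = center - 3 * sigma             # lower control limit

def out_of_control(values):
    """Return the values that fall outside the ±3 sigma control limits."""
    return [v for v in values if v > ucl or v < lcl]

new_points = [10.00, 10.02, 9.98, 10.09, 10.01]   # hypothetical new measurements

print(f"CL = {center:.3f}, UCL = {ucl:.3f}, LCL = {lcl:.3f}")
print("signals:", out_of_control(new_points))
```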

03

Standard Normal Distribution — One Ruler for All

How standardization makes any normal distribution comparable

There are infinitely many possible normal distributions — each defined by its own mean and standard deviation. But a single transformation converts any of them into a common reference: the standard normal distribution (mean = 0, σ = 1).

z = (x − μ) ÷ σ
x: measured value  ·  μ: mean  ·  σ: standard deviation  ·  z: standardized value (z-score)
The resulting distribution is the standard normal distribution (mean = 0, σ = 1)

The standard normal distribution has a lookup table, the z-table, that gives the probability associated with any z-score (typically the cumulative share of values at or below it). Whatever the original mean and standard deviation, once a value is standardized you can read the probability directly from that table.
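The same lookup can be done in a few lines of code instead of a printed table. The sketch below reuses the pen-shaft example with assumed values: μ = 10.00 mm, σ = 0.02 mm, and a hypothetical upper limit of 10.03 mm. None of these numbers come from real process data.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize a measurement: z = (x - mu) / sigma."""
    return (x - mu) / sigma

# Assumed process values for the pen-shaft example (illustrative only).
mu, sigma = 10.00, 0.02      # mm
upper_limit = 10.03          # hypothetical upper limit, mm

z = z_score(upper_limit, mu, sigma)
below = NormalDist().cdf(z)  # cumulative probability P(Z <= z), as in a z-table

print(f"z = {z:.2f}")
print(f"share at or below {upper_limit} mm: {below:.4f}")
print(f"share above {upper_limit} mm: {1 - below:.4f}")
```

Here `NormalDist().cdf(z)` plays the role of the z-table: it returns the cumulative probability up to the given z-score.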

Why normal distribution is called the "universal tool"

Know μ and σ → standardize → look up the z-table → get the probability. This chain is why normal distribution underlies process management, inspection, and experimental design. But the entire chain only works if the data is actually normally distributed — which is exactly what needs checking.

04

The Histogram Trap — Why Visual Inspection Is Dangerous

Two fundamental problems with using histograms to check normality

The most common approach to checking normality is drawing a histogram and eyeballing it. This is unreliable — and in some situations, actively misleading.

Problem 01
The shape changes with bin width
The same dataset can look perfectly bell-shaped with one bin width and clearly skewed with another. "Looks normal" can flip to "looks skewed" just by changing the class interval — with no change to the underlying data.
Problem 02
Skewness and kurtosis cannot be judged by eye
Among other properties, a normal distribution has skewness = 0 and kurtosis = 3. Judging by eye whether real data departs meaningfully from these values is unreliable, even for experts. A distribution that "looks like a hill" can be significantly non-normal by these measures.
[Figure: skewed distribution. Long tail on the right → skewness ≠ 0]
[Figure: peaked distribution. Too sharp a peak → kurtosis ≠ 3]
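Both problems can be reproduced on synthetic data. The sketch below is an illustration under assumed values: it generates a normal-like sample and a right-tailed sample (the lognormal parameters are arbitrary), prints a text histogram of the right-tailed sample at two bin widths, and then computes skewness and kurtosis as the third and fourth standardized moments. The helper names are hypothetical.

```python
import math
import random
import statistics
from collections import Counter

def bin_counts(data, width):
    """Count values per bin; bin k covers [k*width, (k+1)*width)."""
    return Counter(math.floor(x / width) for x in data)

def print_histogram(data, width):
    """Print a text histogram with bars scaled to at most 40 characters."""
    counts = bin_counts(data, width)
    tallest = max(counts.values())
    print(f"bin width = {width} mm")
    for k in range(min(counts), max(counts) + 1):
        bar = "#" * round(40 * counts[k] / tallest)
        print(f"  {k * width:7.3f} | {bar}")

def skewness(data):
    """Third standardized moment (0 for a normal distribution)."""
    m, s = statistics.fmean(data), statistics.pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

def kurtosis(data):
    """Fourth standardized moment (3 for a normal distribution)."""
    m, s = statistics.fmean(data), statistics.pstdev(data)
    return sum(((x - m) / s) ** 4 for x in data) / len(data)

random.seed(3)
normal_like = [random.gauss(10.0, 0.02) for _ in range(2000)]
# Lognormal noise on top of a floor value gives a long right tail.
right_tailed = [9.97 + random.lognormvariate(-3.5, 0.5) for _ in range(2000)]

print_histogram(right_tailed, 0.005)  # fine bins
print_histogram(right_tailed, 0.03)   # coarse bins: same data, different silhouette

for name, data in (("normal-like", normal_like), ("right-tailed", right_tailed)):
    print(f"{name}: skewness = {skewness(data):+.2f}, kurtosis = {kurtosis(data):.2f}")
```

The two histograms are drawn from identical data, so any difference in their outline comes from the bin width alone. The skewness and kurtosis figures do not depend on binning at all, which is why they are a firmer basis than the picture.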

A common floor mistake

"I drew a histogram and it looks like a hill — normal distribution confirmed." This logic is flawed. A hill-shaped histogram can hide significant departures from normality in its skewness or kurtosis. Calculating control charts or process capability indices on that assumption leads to systematically wrong conclusions.

05

Why QQ Plots Work

A reliable alternative to histogram eyeballing

A QQ plot (quantile-quantile plot; the version that compares against a normal distribution is also called a normal probability plot) is significantly more reliable than histogram inspection for assessing normality.

The principle: plot the theoretical quantiles of a normal distribution on the horizontal axis against the actual quantiles of your data on the vertical axis. If the data follows a normal distribution, the points fall close to a straight line.

[Figure: normal data → straight line. Horizontal axis: theoretical quantile; vertical axis: sample quantile. Points follow the line → normal distribution ✓]
[Figure: non-normal data → curve. Same axes. Points curve away from the line → non-normal ✗]
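The construction of a QQ plot is simple enough to sketch by hand. The example below builds the (theoretical quantile, sample quantile) pairs for two synthetic samples and summarizes how straight each set of points is with a correlation coefficient. The plotting position (i − 0.5)/n is one common convention among several, the sample parameters are arbitrary, and the function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def qq_points(data):
    """Return (theoretical quantile, sample quantile) pairs for a normal QQ plot."""
    n = len(data)
    ordered = sorted(data)
    std_normal = NormalDist()  # mean 0, sigma 1
    # Plotting position (i - 0.5) / n is an assumed convention; others exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def straightness(points):
    """Pearson correlation of the QQ points; closer to 1 means closer to a line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(4)
normal_sample = [random.gauss(10.0, 0.02) for _ in range(200)]
skewed_sample = [9.97 + random.expovariate(50.0) for _ in range(200)]

r_normal = straightness(qq_points(normal_sample))
r_skewed = straightness(qq_points(skewed_sample))
print(f"normal-like sample: r = {r_normal:.4f}")
print(f"right-tailed sample: r = {r_skewed:.4f}")
```

Feeding the pairs from `qq_points` into any charting tool gives the plot itself. In practice, statistics packages draw it directly; SciPy's `scipy.stats.probplot` is one such function.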

Combining the QQ plot with a normality test:

After visual inspection with the QQ plot, add a normality test such as the Shapiro-Wilk test for a more objective judgment. A p-value below 0.05 is evidence, at the 5% significance level, that the data is not normally distributed. A larger p-value does not prove normality; it only means the test found no evidence against it. Use both together: the plot for visual understanding, the test for the number.
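For readers working in Python, one way to run the test is SciPy's `scipy.stats.shapiro`, which returns the test statistic and the p-value. The sketch below assumes SciPy is installed and uses two synthetic samples with arbitrary parameters; the wrapper `shapiro_p` is a hypothetical helper for the example.

```python
import random
from scipy import stats  # third-party: assumes SciPy is installed

random.seed(5)
normal_sample = [random.gauss(10.0, 0.02) for _ in range(200)]
skewed_sample = [9.97 + random.expovariate(50.0) for _ in range(200)]

def shapiro_p(data):
    """p-value of the Shapiro-Wilk test; the null hypothesis is normality."""
    statistic, p_value = stats.shapiro(data)
    return p_value

for name, data in (("normal-like", normal_sample), ("right-tailed", skewed_sample)):
    p = shapiro_p(data)
    verdict = "evidence against normality" if p < 0.05 else "no evidence against normality"
    print(f"{name}: p = {p:.4f} -> {verdict}")
```

The 0.05 threshold is the conventional choice mentioned above, not a law. The result is best read alongside the QQ plot rather than on its own.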

Avoid
Histogram visual inspection only
Shape varies with bin width. Subtle departures in skewness and kurtosis are easy to miss. False confidence is common.
Recommended
QQ plot + normality test
Straight-line fit assessed visually; p-value gives objective confirmation. Both are available in free statistical software.

06

Summary

Three things to take away
Point 01
Why normal distribution is powerful

Knowing μ and σ gives you the probability for any range. This single property underlies control charts, Cp/Cpk, and most of statistical process control.

Point 02
What ±3σ means

About 99.7% of normally distributed data falls within μ ± 3σ. A point outside this range is a "0.3% event" and is treated as an alarm signal in process management.

Point 03
How to check normality

Histograms are unreliable. Use a QQ plot to check visually, and a normality test (e.g. Shapiro-Wilk) for an objective p-value. Use both together.

Normal distribution is powerful precisely because it unlocks so many analytical tools — which is exactly why checking the assumption matters most. Using powerful methods on a false premise leads to wrong conclusions. QQ plots are available in any free statistics package. Make normality verification a standard step before applying any normal-distribution-based analysis.

Jaw

Based in Shiga Prefecture, Japan. 36 years in quality management and precision measurement at an automotive parts manufacturer — specializing in CMM measurement and surface roughness measurement of cylinder blocks and crankshafts. Currently supporting the floor as a manager and mentoring the next generation. This blog shares practical measurement, quality, and statistics knowledge from real manufacturing experience.
