MSA / Gauge R&R: Is Your Measurement System Trustworthy?

The Problem MSA Solves

Imagine two inspectors measuring the same shaft with the same micrometer. Inspector A measures 10.02mm. Inspector B measures 10.05mm. Which one is right? And more importantly: if your specification is 10.00 ± 0.05mm, could either of them be accepting a non-conforming part or rejecting a conforming one?

This scenario plays out on shop floors constantly, and most of the time it is invisible. Measurement System Analysis (MSA) — and specifically the Gauge Repeatability and Reproducibility study (Gauge R&R) — makes this variation visible and quantifiable.

The fundamental insight of MSA is this: when you measure a part, you observe a number. That number is not the true dimension. It is the true dimension plus measurement error. MSA separates how much of what you observe is real and how much is noise generated by the measurement system.

Two Components of Measurement Error

A Gauge R&R study quantifies two sources of measurement error, which together make up the gauge repeatability and reproducibility (GRR):

Repeatability (EV — Equipment Variation) is the variation you see when the same operator measures the same part multiple times with the same gauge under the same conditions. It represents the fundamental resolution and consistency of the gauge itself. High repeatability variation means the gauge gives different readings on the same part — the instrument is the problem.

Reproducibility (AV — Appraiser Variation) is the variation you see when different operators measure the same part with the same gauge. It represents differences in technique, feel, fixturing, and reading between people. High reproducibility variation means the people using the gauge are introducing inconsistency — training or procedure is the problem.

Gauge R&R Variance Equation

σ²_Total = σ²_Part + σ²_GRR
σ²_GRR = σ²_Repeatability + σ²_Reproducibility

%GRR = (σ_GRR / σ_Total) × 100
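
A minimal worked sketch of the equations above. The variance components are hypothetical values in mm², chosen only to illustrate the arithmetic, and the function name `percent_grr` is not from any library. Note that variances add; standard deviations do not.

```python
import math

def percent_grr(var_repeatability, var_reproducibility, var_part):
    """Return (%GRR, sigma_GRR, sigma_Total) from variance components."""
    var_grr = var_repeatability + var_reproducibility  # sigma^2_GRR
    var_total = var_part + var_grr                     # sigma^2_Total
    sigma_grr = math.sqrt(var_grr)
    sigma_total = math.sqrt(var_total)
    return 100.0 * sigma_grr / sigma_total, sigma_grr, sigma_total

# Hypothetical variance components (mm^2), for illustration only.
pct, sigma_grr, sigma_total = percent_grr(
    var_repeatability=0.000009,
    var_reproducibility=0.000016,
    var_part=0.000975,
)
print(f"sigma_GRR = {sigma_grr:.4f} mm")
print(f"sigma_Total = {sigma_total:.4f} mm")
print(f"%GRR = {pct:.1f}%")
```

With these assumed inputs the measurement system accounts for 2.5% of the total variance but about 15.8% of the total standard deviation, which is why it matters whether a percentage is quoted on variances or on standard deviations.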

How a Gauge R&R Study Works

The standard AIAG Gauge R&R study (often called a crossed design) uses:

10 parts selected to represent the full range of part-to-part variation expected in production. Not 10 parts from the same batch, but 10 parts that span the range the process actually produces.

2–3 appraisers who normally use this gauge in production. They should not be selected for skill — they should represent the normal range of people who do this measurement.

2–3 trials each: each appraiser measures each part multiple times, in random order, without seeing the previous reading. The randomisation is critical — if appraisers know their previous reading they will unconsciously repeat it, artificially deflating the repeatability variation.
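
One possible way to build the randomised run order described above, sketched under stated assumptions: the function name `run_order`, the appraiser labels, and the scheme itself (reshuffling the part order for every appraiser in every trial) are illustrative choices, not a prescribed procedure. The study coordinator would keep this list and hide the part identities from the appraisers.

```python
import random

def run_order(n_parts=10, appraisers=("A", "B", "C"), n_trials=2, seed=None):
    """Return (trial, appraiser, part) tuples with the part order
    shuffled independently for each appraiser in each trial."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    plan = []
    for trial in range(1, n_trials + 1):
        for appraiser in appraisers:
            parts = list(range(1, n_parts + 1))
            rng.shuffle(parts)
            plan.extend((trial, appraiser, part) for part in parts)
    return plan

plan = run_order(seed=1)
for trial, appraiser, part in plan[:5]:
    print(f"trial {trial}, appraiser {appraiser}, part {part}")
```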

With three appraisers, the study generates 60–90 measurements (10 parts × 3 appraisers × 2–3 trials); with two appraisers, 40–60. The repeatability and reproducibility components are then calculated from these readings using ANOVA or the range method.
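
The ANOVA calculation can be sketched as follows for a balanced crossed design. This is a simplified illustration, not a full implementation of the AIAG procedure: it always keeps the part × appraiser interaction term (some procedures pool it with repeatability when it is not significant), it counts the interaction as part of reproducibility, and it clamps negative variance estimates to zero. The function name `grr_anova` and the tiny data set are hypothetical.

```python
from math import sqrt
from statistics import mean

def grr_anova(data):
    """Estimate variance components from a balanced crossed study.

    data[i][j] is the list of repeated readings of part i by appraiser j.
    """
    p, o, r = len(data), len(data[0]), len(data[0][0])
    grand = mean(x for part in data for cell in part for x in cell)
    part_means = [mean(x for cell in part for x in cell) for part in data]
    oper_means = [mean(x for part in data for x in part[j]) for j in range(o)]
    cell_means = [[mean(cell) for cell in part] for part in data]

    # Sums of squares for the two-way layout with interaction.
    ss_part = o * r * sum((m - grand) ** 2 for m in part_means)
    ss_oper = p * r * sum((m - grand) ** 2 for m in oper_means)
    ss_cells = r * sum((m - grand) ** 2 for row in cell_means for m in row)
    ss_inter = ss_cells - ss_part - ss_oper
    ss_error = sum((x - cell_means[i][j]) ** 2
                   for i in range(p) for j in range(o) for x in data[i][j])

    # Mean squares.
    ms_part = ss_part / (p - 1)
    ms_oper = ss_oper / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Variance components; negative estimates are clamped to zero.
    var_repeat = ms_error
    var_inter = max(0.0, (ms_inter - ms_error) / r)
    var_oper = max(0.0, (ms_oper - ms_inter) / (p * r))
    var_part = max(0.0, (ms_part - ms_inter) / (o * r))
    var_reprod = var_oper + var_inter
    var_grr = var_repeat + var_reprod
    return {
        "repeatability": var_repeat,
        "reproducibility": var_reprod,
        "grr": var_grr,
        "part": var_part,
        "total": var_grr + var_part,
    }

# Hypothetical readings in mm: 3 parts x 2 appraisers x 2 trials.
# A real study would use the 10-part layout described above.
readings = [
    [[10.01, 10.02], [10.03, 10.03]],
    [[10.10, 10.11], [10.12, 10.13]],
    [[10.20, 10.19], [10.22, 10.21]],
]
vc = grr_anova(readings)
pct_grr = 100.0 * sqrt(vc["grr"] / vc["total"])
print(f"%GRR = {pct_grr:.1f}%")
```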

Interpreting the Results — the %GRR Acceptance Criteria

%GRR	Verdict	Interpretation
< 10%	Acceptable	Measurement system is acceptable for production use. The gauge's standard deviation is less than 10% of the total standard deviation you observe (under 1% of the total variance).
10% – 30%	Conditionally acceptable	May be acceptable depending on importance of the characteristic, cost of gauge improvement, and customer/standard requirements. Investigate and document the decision.
> 30%	Unacceptable	The measurement system contributes too much variation. Decisions based on this gauge are unreliable. The gauge must be improved before it can be used for process control or product acceptance.
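
The table above maps directly onto a small lookup. The function name `grr_verdict` is hypothetical. The boundary handling (exactly 10% and exactly 30% both fall in the middle band) follows the table as written.

```python
def grr_verdict(percent_grr):
    """Classify a %GRR value using the three bands in the table."""
    if percent_grr < 0:
        raise ValueError("%GRR cannot be negative")
    if percent_grr < 10.0:
        return "Acceptable"
    if percent_grr <= 30.0:
        return "Conditionally acceptable"
    return "Unacceptable"

for value in (4.2, 15.8, 42.0):
    print(f"{value:5.1f}% -> {grr_verdict(value)}")
```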

Note: %GRR is typically expressed as a percentage of either the total study variation or the process tolerance (called %P/T — precision-to-tolerance ratio). Both calculations provide information — %GRR relative to study variation tells you about the measurement system's ability to detect differences between parts; %P/T tells you how much of the specification the gauge consumes.
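
A sketch of the %P/T calculation, assuming the common convention that the gauge spread is taken as 6 × σ_GRR and compared with the full tolerance width (USL − LSL). The multiplier is an assumption: some references use 5.15 instead of 6, so check which one your customer or standard expects. The σ_GRR value below is hypothetical; the specification is the 10.00 ± 0.05 mm shaft from the opening example.

```python
def percent_pt(sigma_grr, lsl, usl, k=6.0):
    """Precision-to-tolerance ratio as a percentage.

    k is the assumed spread multiplier (6.0 here; 5.15 in some references).
    """
    tolerance = usl - lsl
    if tolerance <= 0:
        raise ValueError("USL must be greater than LSL")
    return 100.0 * k * sigma_grr / tolerance

# Hypothetical sigma_GRR of 0.005 mm against a 10.00 +/- 0.05 mm specification.
pt = percent_pt(sigma_grr=0.005, lsl=9.95, usl=10.05)
print(f"%P/T = {pt:.1f}%")
```

Under these assumptions the gauge spread consumes roughly 30% of the tolerance band, even though the same σ_GRR could look much better or worse when expressed against the study variation.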

What a Bad Result Tells You to Do

A %GRR above 30% is not a failure — it is information. The pattern of which component is high tells you where to focus:

High repeatability variation (EV dominates): The gauge itself is the problem. Investigate whether the gauge is appropriate for this tolerance (resolution sufficient?), whether it is worn or damaged, whether fixturing is inconsistent, or whether the part surface condition is affecting measurement.

High reproducibility variation (AV dominates): The way people use the gauge is the problem. Investigate whether there is a written procedure for how to use this gauge, whether training has been given, whether the gauge is ergonomically consistent to use, and whether different appraisers are interpreting the measurement differently.
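
To see which component dominates, compare each component's share of the GRR variance (shares are computed on variances because those are what add up). A minimal sketch; the function name `error_breakdown` and the input values are hypothetical.

```python
def error_breakdown(var_repeatability, var_reproducibility):
    """Return (EV share %, AV share %, where to focus)."""
    var_grr = var_repeatability + var_reproducibility
    if var_grr <= 0:
        raise ValueError("GRR variance must be positive")
    ev_share = 100.0 * var_repeatability / var_grr
    av_share = 100.0 - ev_share
    if ev_share > av_share:
        focus = "gauge (EV dominates)"
    elif av_share > ev_share:
        focus = "appraisers and procedure (AV dominates)"
    else:
        focus = "both (EV and AV are equal)"
    return ev_share, av_share, focus

# Hypothetical variance components in mm^2.
ev, av, focus = error_breakdown(0.000009, 0.000016)
print(f"EV {ev:.0f}%, AV {av:.0f}% -> focus on {focus}")
```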

The most common root cause of high reproducibility variation I have seen in 36 years is the absence of a work instruction. Three operators measuring the same shaft diameter — one holds the micrometer in their left hand, one in their right, one uses a V-block fixture. Three different results from one drawing callout.

Number of Distinct Categories — the Hidden Metric

Beyond %GRR, the Gauge R&R study produces a metric called the Number of Distinct Categories (NDC), sometimes called the discrimination ratio. NDC tells you how many groups within the part variation the measurement system can reliably distinguish.

AIAG requires NDC ≥ 5 for a measurement system to be used in process control. If NDC = 2, your gauge can only distinguish "large" from "small" — effectively giving you the resolution of a go/no-go gauge even if it displays numbers. Using it for SPC charts is meaningless.

Number of Distinct Categories

NDC = √2 × (σ_Part / σ_GRR) — round down to nearest integer
Requirement: NDC ≥ 5 for control chart use
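
The NDC formula above in code form, as a small sketch. The function name `ndc` and the standard deviations are hypothetical illustration values in mm.

```python
import math

def ndc(sigma_part, sigma_grr):
    """Number of distinct categories: sqrt(2) * sigma_part / sigma_grr, rounded down."""
    if sigma_grr <= 0:
        raise ValueError("sigma_grr must be positive")
    return math.floor(math.sqrt(2) * sigma_part / sigma_grr)

# Hypothetical standard deviations in mm.
categories = ndc(sigma_part=0.031225, sigma_grr=0.005)
print(f"NDC = {categories}, usable for control charts: {categories >= 5}")
```

Because σ²_Total = σ²_Part + σ²_GRR, the NDC ≥ 5 requirement corresponds to a %GRR (on study variation) of roughly 27% or less, so the two criteria are linked rather than independent.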

Bias, Linearity, and Stability — the Other MSA Studies

Gauge R&R is the most commonly performed MSA study, but it is not the only one. A complete MSA program includes:

Bias: the difference between the average of repeated measurements on a reference standard and the standard's accepted reference value. A biased gauge consistently reads high or low: it can be precise (repeatable) and still lack trueness, because every reading is offset from the reference.

Linearity: whether bias is consistent across the measurement range. A gauge may be unbiased at 10mm but biased at 50mm. For gauges used across their full range, linearity must be verified.

Stability: whether the gauge performance changes over time. A stability study measures the same reference part at regular intervals over weeks or months to detect drift.
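
Bias and linearity can be sketched with a few lines of arithmetic. This is a simplified illustration: it computes bias at each reference value and fits an ordinary least-squares line through the biases, but it omits the significance tests a formal study would include. The function names and the readings are hypothetical. A slope near zero suggests the bias is consistent across the range.

```python
from statistics import mean

def bias(readings, reference_value):
    """Average reading minus the accepted reference value."""
    return mean(readings) - reference_value

def linearity_fit(reference_values, biases):
    """Least-squares line: bias = intercept + slope * reference value."""
    x_bar, y_bar = mean(reference_values), mean(biases)
    sxx = sum((x - x_bar) ** 2 for x in reference_values)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(reference_values, biases))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical repeated readings (mm) on three reference standards.
study = {
    10.0: [10.00, 10.01, 9.99],
    30.0: [30.01, 30.02, 30.00],
    50.0: [50.02, 50.03, 50.01],
}
refs = list(study)
biases = [bias(readings, ref) for ref, readings in study.items()]
slope, intercept = linearity_fit(refs, biases)
for ref, b in zip(refs, biases):
    print(f"reference {ref:.1f} mm: bias {b:+.3f} mm")
print(f"bias changes by {slope:+.5f} mm per mm of reference value")
```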

For most production gauging applications, Gauge R&R + bias check is sufficient. Linearity and stability studies are typically required for CMM systems, laboratory instruments, and instruments used for product acceptance decisions where measurement uncertainty directly affects conformance.

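Measurement System Analysis (MSA) / Gauge Repeatability & Reproducibility (Gauge R&R)

A method that separates measured variation into part variation and measurement system variation (repeatability and reproducibility) and quantifies each. The AIAG MSA manual, 4th edition, defines the standard procedure. A %GRR (gauge R&R percentage) below 10% passes; above 30% fails. In the automotive industry (IATF 16949) it is mandatory for the measuring equipment in the control plan. Unless NDC ≥ 5, the gauge is not suitable for use with SPC control charts.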

— ◇ —

The Question That Changes Everything

The most important question in measurement is not "what does the gauge read?" It is "how much does this gauge reading matter?" A gauge with 25% GRR used to measure a characteristic with a 2 mm tolerance on a low-risk cosmetic part is a different problem from the same gauge measuring a 0.05 mm tolerance on a safety-critical aerospace component.

MSA forces that conversation. Before accepting a measurement system, someone must ask what decisions will be made based on it, what happens if those decisions are wrong, and whether the gauge is fit for that level of risk. Every quality system that skips MSA is making decisions with unknown amounts of measurement error built into them — and attributing process variation to the process when some of it belongs to the ruler.
