Japanese Quality by Jaw

MSA / Gauge R&R — Is Your Measurement System Trustworthy?

Every quality decision — accept, reject, adjust, investigate — depends on a measurement. If the measurement system is adding more variation than the process itself, every decision based on it is suspect. Gauge R&R quantifies exactly how much the gauge and the operator are lying to you.

[Figure: Total observed variation, decomposed. Total observed variation (100%) splits into part variation (70% in the illustration: the parts really differ) and Gauge R&R (30%). Gauge R&R splits further into repeatability (EV, equipment) and reproducibility (AV, appraiser). %GRR bands: <10% acceptable, 10–30% conditional, >30% unacceptable.]

    The Problem MSA Solves

    Imagine two inspectors measuring the same shaft with the same micrometer. Inspector A measures 10.02mm. Inspector B measures 10.05mm. Which one is right? And more importantly: if your specification is 10.00 ± 0.05mm, could either of them be accepting a non-conforming part or rejecting a conforming one?

    This scenario plays out on shop floors constantly, and most of the time it is invisible. Measurement System Analysis (MSA) — and specifically the Gauge Repeatability and Reproducibility study (Gauge R&R) — makes this variation visible and quantifiable.

    The fundamental insight of MSA is this: when you measure a part, you observe a number. That number is not the true dimension. It is the true dimension plus measurement error. MSA separates how much of what you observe is real and how much is noise generated by the measurement system.

    Two Components of Measurement Error

    A Gauge R&R study focuses on two sources of measurement error, which together make up the Gauge R&R (GRR) variation:

    Repeatability (EV — Equipment Variation) is the variation you see when the same operator measures the same part multiple times with the same gauge under the same conditions. It represents the fundamental resolution and consistency of the gauge itself. High repeatability variation means the gauge gives different readings on the same part — the instrument is the problem.

    Reproducibility (AV — Appraiser Variation) is the variation you see when different operators measure the same part with the same gauge. It represents differences in technique, feel, fixturing, and reading between people. High reproducibility variation means the people using the gauge are introducing inconsistency — training or procedure is the problem.

    Gauge R&R Variance Equation
    σ²_Total = σ²_Part + σ²_GRR
    σ²_GRR = σ²_Repeatability + σ²_Reproducibility

    %GRR = (σ_GRR / σ_Total) × 100
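
A minimal sketch of the arithmetic behind these equations. The function name `grr_summary` and the variance values are hypothetical, chosen only so the numbers come out round.

```python
import math

def grr_summary(var_part, var_repeat, var_reprod):
    """Combine variance components as in the equations above (inputs are variances)."""
    var_grr = var_repeat + var_reprod      # sigma^2_GRR
    var_total = var_part + var_grr         # sigma^2_Total
    return {
        "var_grr": var_grr,
        "var_total": var_total,
        # %GRR is a ratio of standard deviations, hence the square root
        "pct_grr": 100 * math.sqrt(var_grr / var_total),
        # share of the total variance that comes from the measurement system
        "pct_contribution": 100 * var_grr / var_total,
    }

# Illustrative variances in mm^2 (not from a real study)
summary = grr_summary(var_part=0.0091, var_repeat=0.0006, var_reprod=0.0003)
print(round(summary["pct_grr"], 1), round(summary["pct_contribution"], 1))
```

With these inputs %GRR comes out at about 30% while the measurement system accounts for about 9% of the total variance. Standard deviations do not add, so percentages computed from them do not sum to 100; only the variance shares do.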

    How a Gauge R&R Study Works

    The standard AIAG Gauge R&R study (often called a crossed design) uses:

    10 parts selected to represent the full range of part-to-part variation expected in production. Not 10 parts from the same batch, but 10 parts that span the range the process actually produces.

    2–3 appraisers who normally use this gauge in production. They should not be selected for skill — they should represent the normal range of people who do this measurement.

    2–3 trials each: each appraiser measures each part multiple times, in random order, without seeing the previous reading. The randomisation is critical — if appraisers know their previous reading they will unconsciously repeat it, artificially deflating the repeatability variation.
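
One way to produce the randomised measurement order described above is to shuffle the parts separately for every appraiser and every trial. This is a sketch only: `build_run_order`, the appraiser labels, and the seed are hypothetical, and a real study sheet would also hide earlier readings from the appraiser.

```python
import random

def build_run_order(parts, appraisers, trials, seed=None):
    """Return (trial, appraiser, part) tuples with parts shuffled per appraiser and trial."""
    rng = random.Random(seed)
    plan = []
    for trial in range(1, trials + 1):
        for appraiser in appraisers:
            order = list(parts)
            rng.shuffle(order)  # a fresh random part order for this appraiser and trial
            plan.extend((trial, appraiser, part) for part in order)
    return plan

plan = build_run_order(parts=range(1, 11), appraisers=["A", "B", "C"], trials=2, seed=1)
for row in plan[:5]:
    print(row)
```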

    The study generates 40–90 measurements (10 parts × 2–3 appraisers × 2–3 trials), from which the repeatability and reproducibility components are calculated using ANOVA or the range method.
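
The ANOVA route can be sketched with NumPy as a two-way crossed random-effects model with interaction. This is a simplified illustration, not the full AIAG procedure: it does not pool a non-significant interaction, it clips negative variance estimates to zero, and it counts the part × appraiser interaction as reproducibility. The function name `gauge_rr_anova` and the simulated data are assumptions.

```python
import numpy as np

def gauge_rr_anova(x):
    """Variance components from an array of shape (parts, appraisers, trials)."""
    x = np.asarray(x, dtype=float)
    p, o, r = x.shape
    grand = x.mean()
    part_means = x.mean(axis=(1, 2))
    oper_means = x.mean(axis=(0, 2))
    cell_means = x.mean(axis=2)

    # Sums of squares for parts, appraisers, interaction, and within-cell error
    ss_part = o * r * np.sum((part_means - grand) ** 2)
    ss_oper = p * r * np.sum((oper_means - grand) ** 2)
    ss_cells = r * np.sum((cell_means - grand) ** 2)
    ss_inter = ss_cells - ss_part - ss_oper
    ss_error = np.sum((x - cell_means[:, :, None]) ** 2)

    # Mean squares
    ms_part = ss_part / (p - 1)
    ms_oper = ss_oper / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Variance components; negative estimates are clipped to zero
    var_repeat = ms_error
    var_inter = max(0.0, (ms_inter - ms_error) / r)
    var_oper = max(0.0, (ms_oper - ms_inter) / (p * r))
    var_part = max(0.0, (ms_part - ms_inter) / (o * r))
    var_reprod = var_oper + var_inter
    var_grr = var_repeat + var_reprod
    var_total = var_part + var_grr
    return {
        "repeatability": var_repeat,
        "reproducibility": var_reprod,
        "grr": var_grr,
        "part": var_part,
        "total": var_total,
        "pct_grr": 100 * np.sqrt(var_grr / var_total),
    }

# Simulated study: 10 parts, 3 appraisers, 3 trials (values are made up)
rng = np.random.default_rng(0)
data = (
    rng.normal(10.0, 0.02, size=(10, 1, 1))    # true part-to-part differences
    + rng.normal(0.0, 0.003, size=(1, 3, 1))   # appraiser offsets
    + rng.normal(0.0, 0.004, size=(10, 3, 3))  # repeatability noise
)
result = gauge_rr_anova(data)
print({name: float(value) for name, value in result.items()})
```

With only three appraisers, the appraiser component rests on two degrees of freedom, so its estimate is noisy from one study to the next.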

    Interpreting the Results — the %GRR Acceptance Criteria

    %GRR       | Verdict                  | Interpretation
    < 10%      | Acceptable               | Measurement system is acceptable for production use. The gauge introduces less than 10% of the total variation you observe.
    10% – 30%  | Conditionally acceptable | May be acceptable depending on importance of the characteristic, cost of gauge improvement, and customer/standard requirements. Investigate and document the decision.
    > 30%      | Unacceptable             | The measurement system contributes too much variation. Decisions based on this gauge are unreliable. The gauge must be improved before it can be used for process control or product acceptance.

    Note: %GRR is typically expressed as a percentage of either the total study variation or the process tolerance (called %P/T — precision-to-tolerance ratio). Both calculations provide information — %GRR relative to study variation tells you about the measurement system's ability to detect differences between parts; %P/T tells you how much of the specification the gauge consumes.
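
The two percentages in the note can be compared side by side. A sketch under stated assumptions: the function names are hypothetical, the spread multiplier of 6 standard deviations is an assumption (some references use 5.15), and applying the same 10%/30% bands to %P/T is a common convention rather than something this article states.

```python
def pct_of_study_variation(sigma_grr, sigma_total):
    """%GRR relative to total study variation (ratio of standard deviations)."""
    return 100 * sigma_grr / sigma_total

def pct_of_tolerance(sigma_grr, usl, lsl, k=6.0):
    """%P/T: gauge spread (k standard deviations, assumed k=6) over the tolerance width."""
    return 100 * k * sigma_grr / (usl - lsl)

def verdict(pct):
    """Map a percentage onto the acceptance bands in the table above."""
    if pct < 10:
        return "acceptable"
    if pct <= 30:
        return "conditionally acceptable"
    return "unacceptable"

# Illustrative values in mm for the 10.00 ± 0.05 mm shaft
grr_pct = pct_of_study_variation(sigma_grr=0.002, sigma_total=0.010)
pt_pct = pct_of_tolerance(sigma_grr=0.002, usl=10.05, lsl=9.95)
print(round(grr_pct, 1), verdict(grr_pct))
print(round(pt_pct, 1), verdict(pt_pct))
```

The two figures answer different questions, so they can disagree: a tight process measured against a wide tolerance may show a poor %GRR and a comfortable %P/T.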

    What a Bad Result Tells You to Do

    A %GRR above 30% is not a failure — it is information. The pattern of which component is high tells you where to focus:

    High repeatability (EV dominates): The gauge itself is the problem. Investigate whether the gauge is appropriate for this tolerance (resolution sufficient?), whether it is worn or damaged, whether fixturing is inconsistent, or whether the part surface condition is affecting measurement.

    High reproducibility (AV dominates): The operators are the problem. Investigate whether there is a written procedure for how to use this gauge, whether training has been given, whether the gauge is ergonomically consistent to use, and whether different appraisers are interpreting the measurement differently.

    The most common root cause of high reproducibility variation I have seen in 36 years is the absence of a work instruction. Three operators measuring the same shaft diameter — one holds the micrometer in their left hand, one in their right, one uses a V-block fixture. Three different results from one drawing callout.
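
The EV-versus-AV comparison can be made explicit by splitting the GRR variance into its two shares. The helper name `dominant_source` and the example variances are hypothetical.

```python
def dominant_source(var_repeat, var_reprod):
    """Return the EV and AV shares of GRR variance (percent) and where to look first."""
    var_grr = var_repeat + var_reprod
    ev_share = 100 * var_repeat / var_grr
    av_share = 100 * var_reprod / var_grr
    if ev_share > av_share:
        focus = "gauge (repeatability, EV)"
    elif av_share > ev_share:
        focus = "appraisers (reproducibility, AV)"
    else:
        focus = "both"
    return ev_share, av_share, focus

# Illustrative variance components (mm^2), not from a real study
ev_share, av_share, focus = dominant_source(var_repeat=0.0002, var_reprod=0.0007)
print(round(ev_share, 1), round(av_share, 1), focus)
```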

    Number of Distinct Categories — the Hidden Metric

    Beyond %GRR, the Gauge R&R study produces a metric called the Number of Distinct Categories (NDC), sometimes called the discrimination ratio. NDC tells you how many groups within the part variation the measurement system can reliably distinguish.

    AIAG requires NDC ≥ 5 for a measurement system to be used in process control. If NDC = 2, your gauge can only distinguish "large" from "small" — effectively giving you the resolution of a go/no-go gauge even if it displays numbers. Using it for SPC charts is meaningless.

    Number of Distinct Categories
    NDC = √2 × (σ_Part / σ_GRR) — round down to nearest integer
    Requirement: NDC ≥ 5 for control chart use
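
The NDC formula is short enough to check by hand. A minimal sketch, with the hypothetical helper names `ndc` and `ndc_from_pct_grr`; the second relies on σ²_Total = σ²_Part + σ²_GRR to convert a %GRR figure into the part-to-GRR ratio.

```python
import math

def ndc(sigma_part, sigma_grr):
    """Number of distinct categories: sqrt(2) * sigma_part / sigma_grr, rounded down."""
    return math.floor(math.sqrt(2) * sigma_part / sigma_grr)

def ndc_from_pct_grr(pct_grr):
    """NDC implied by a %GRR value, assuming total variance = part variance + GRR variance."""
    g = pct_grr / 100
    return math.floor(math.sqrt(2) * math.sqrt(1 - g * g) / g)

print(ndc(sigma_part=0.0095, sigma_grr=0.003))
print(ndc_from_pct_grr(10), ndc_from_pct_grr(30))
```

Under that assumption, a gauge sitting exactly at 30% GRR yields an NDC of 4, just below the NDC ≥ 5 requirement, so the two criteria are not interchangeable at the upper end of the conditional band.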

    Bias, Linearity, and Stability — the Other MSA Studies

    Gauge R&R is the most commonly performed MSA study, but it is not the only one. A complete MSA program includes:

    Bias (偏り): the difference between the average of repeated measurements on a reference standard and that standard's reference value. A biased gauge consistently reads high or low: it can be precise (repeatable) and still be inaccurate in the sense of trueness.

    Linearity: whether bias is consistent across the measurement range. A gauge may be unbiased at 10mm but biased at 50mm. For gauges used across their full range, linearity must be verified.

    Stability: whether the gauge performance changes over time. A stability study measures the same reference part at regular intervals over weeks or months to detect drift.

    For most production gauging applications, Gauge R&R + bias check is sufficient. Linearity and stability studies are typically required for CMM systems, laboratory instruments, and instruments used for product acceptance decisions where measurement uncertainty directly affects conformance.
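
Bias and linearity reduce to simple arithmetic once reference values are available. A sketch using NumPy, in which the function names and all readings are hypothetical: bias is the mean reading minus the reference value, and linearity is summarised here by a straight-line fit of bias against reference size.

```python
import numpy as np

def bias(readings, reference):
    """Average of repeated readings on a reference standard minus its reference value."""
    return float(np.mean(readings) - reference)

def linearity(references, biases):
    """Least-squares line through bias vs. reference value; returns (slope, intercept)."""
    slope, intercept = np.polyfit(references, biases, 1)
    return float(slope), float(intercept)

# Hypothetical repeated readings (mm) on a 10.00 mm reference standard
b10 = bias([10.01, 10.02, 10.03], reference=10.00)

# Hypothetical biases observed at five reference sizes across the gauge range
slope, intercept = linearity([10, 20, 30, 40, 50], [0.001, 0.003, 0.004, 0.007, 0.009])
print(round(b10, 3), round(slope, 5), round(intercept, 4))
```

A slope near zero suggests the bias is roughly constant across the range; a clearly non-zero slope means the bias grows or shrinks with part size. A stability study could reuse `bias` on the same reference part at each check interval and track the values over time.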

    Measurement System Analysis / Gauge Repeatability & Reproducibility

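    A method that separates the variation in measured values into part variation and measurement-system variation (repeatability and reproducibility) and quantifies each. The AIAG MSA manual, 4th edition, defines the standard procedure. A %GRR (Gauge R&R percentage) below 10% passes; above 30% fails. In the automotive industry (IATF 16949) it is required as part of the control plan for measuring instruments. Unless NDC ≥ 5, using the gauge for SPC control charts is inappropriate.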

    — ◇ —

    The Question That Changes Everything

    The most important question in measurement is not "what does the gauge read?" It is "how much does this gauge reading matter?" A gauge with 25% GRR used to measure a characteristic with a 2mm tolerance on a low-risk cosmetic part is a different problem from the same gauge measuring a 0.05mm tolerance on a safety-critical aerospace component.

    MSA forces that conversation. Before accepting a measurement system, someone must ask what decisions will be made based on it, what happens if those decisions are wrong, and whether the gauge is fit for that level of risk. Every quality system that skips MSA is making decisions with unknown amounts of measurement error built into them — and attributing process variation to the process when some of it belongs to the ruler.

    — ◇ —