FMEA Step by Step


Every engineer has experienced it: a problem surfaces in production — or worse, at the customer — and someone says "we should have thought of this earlier." FMEA is the discipline of thinking of it earlier. It is not a bureaucratic form. It is a structured, team-based conversation about everything that could go wrong, and what you plan to do about it before it does.

FMEA comes in two main forms: Design FMEA (DFMEA), which focuses on potential failures in the product design, and Process FMEA (PFMEA), which focuses on the manufacturing process. Both follow the same logic. This article walks through the PFMEA, which is more commonly encountered by production quality engineers.

What FMEA Is — and What It Is Not

FMEA is a risk assessment tool. It asks three questions for every potential failure: How severe is the effect if it happens? How likely is it to occur? How likely are existing controls to detect it before it reaches the customer? The answers are scored and multiplied to produce a Risk Priority Number (RPN), which guides where to focus improvement effort.

故障モード影響解析

FMEA — Failure Mode and Effects Analysis

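A method for identifying potential failure modes in a product or process in advance, rating the severity of their effects, their frequency of occurrence, and the difficulty of detecting them, and then prioritizing the resulting risks. There are two types, Design FMEA and Process FMEA, and the method is used to prevent problems during the design and development stage.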

FMEA is not a checklist of things that have already gone wrong. It is a forward-looking exercise whose goal is to identify risks that have not yet caused problems and to close them before they do. An FMEA completed after the product is in full production has missed most of its value.

Step 1 — Define the Scope and Assemble the Team

Before filling in a single row, define what you are analyzing. For a PFMEA, this means identifying which process steps are in scope — typically using the Process Flow Diagram as the input. Each process step becomes a section of the FMEA.

FMEA is a team activity, not a desk exercise. The team should include the process engineer, quality engineer, design engineer (for DFMEA), manufacturing supervisor, and — when available — the customer's application engineer. Each person brings a different failure vocabulary. A quality engineer sees inspection escapes; a process engineer sees machine variation; a supervisor sees what happens at shift change. You need all of these perspectives.

"An FMEA written by one person is a list of one person's worries. An FMEA written by a team is a map of the process's real risks." — Common observation in automotive supplier development

Step 2 — List Functions and Failure Modes

For each process step, first write down its function — what it is supposed to achieve. "Drill mounting hole to Ø12.0 ±0.05 mm at specified location." This sounds obvious, but stating the function explicitly forces you to think about what success looks like before asking what failure looks like.

Then list every failure mode — the way in which the function could fail to be achieved. Failure modes are not effects; they are the specific physical deviation. For a drilling operation, failure modes might include: hole diameter oversized, hole diameter undersized, hole position shifted, hole not drilled (missed), hole drilled at wrong angle. Each function typically has multiple failure modes.
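One way to keep functions and failure modes separate is to store them as distinct fields of a record. The following is a minimal Python sketch using the drilling example above. The class name `ProcessStep` and its field names are hypothetical, chosen for illustration, and are not taken from any FMEA standard or form.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessStep:
    """One PFMEA process step: its function and the ways that function can fail."""
    name: str
    function: str
    failure_modes: list[str] = field(default_factory=list)


# Drilling example from the text; the step name is an illustrative label.
drilling = ProcessStep(
    name="Drill mounting hole",
    function="Drill mounting hole to Ø12.0 ±0.05 mm at specified location",
    failure_modes=[
        "hole diameter oversized",
        "hole diameter undersized",
        "hole position shifted",
        "hole not drilled (missed)",
        "hole drilled at wrong angle",
    ],
)

# Each failure mode later gets its own effects, causes, and controls.
for mode in drilling.failure_modes:
    print(f"{drilling.name}: {mode}")
```

Keeping the function as its own field makes it harder to write a failure mode without first stating what success looks like.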

故障モード

Failure Mode

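The specific condition in which a process or part can no longer perform its intended function. It refers to the form a physical or functional defect takes, such as "oversized dimension," "position shift," or "crack." It is recorded separately from the effect and the cause.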

Step 3 — Identify Effects and Rate Severity (S)

For each failure mode, list the effect on the customer — both the immediate next operation (internal customer) and the end user (external customer). Then rate the severity of the worst-case effect on a 1–10 scale. Severity ratings are defined in AIAG's FMEA manual and customer-specific requirements, but the key anchors are:

Severity	Criteria
9–10	Safety or regulatory non-compliance. May affect vehicle operation without warning.
7–8	Loss of primary function. Customer very dissatisfied. 100% scrap or sort required.
5–6	Degraded performance. Customer experiences discomfort or annoyance. Rework required.
3–4	Minor performance impact. Customer notices but is not significantly inconvenienced.
1–2	No discernible effect or negligible impact on fit, form, or function.

Severity is the one rating you cannot reduce through process improvements — only a design change can lower severity. If the effect of a failure would harm a person or violate a regulation, severity stays at 9 or 10 regardless of what you do to the process.
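The severity anchors in the table above can be expressed as a simple lookup. This is a sketch only: the band descriptions are shortened from the table, the function names are hypothetical, and real ratings should come from the applicable manual and customer-specific requirements.

```python
def severity_band(severity: int) -> str:
    """Return the criteria band for a 1-10 severity rating, following the table above."""
    if not 1 <= severity <= 10:
        raise ValueError("severity must be an integer from 1 to 10")
    if severity >= 9:
        return "Safety or regulatory non-compliance"
    if severity >= 7:
        return "Loss of primary function"
    if severity >= 5:
        return "Degraded performance"
    if severity >= 3:
        return "Minor performance impact"
    return "No discernible effect"


def is_safety_or_regulatory(severity: int) -> bool:
    """True for ratings of 9 or 10, which process improvements cannot lower."""
    return severity >= 9


print(severity_band(6))             # Degraded performance
print(is_safety_or_regulatory(9))   # True
```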

Step 4 — List Causes and Rate Occurrence (O)

For each failure mode, identify the potential causes: the specific mechanisms that would produce the failure. "Tool wear" is a cause of "hole diameter oversized." "Fixture locating pin broken" is a cause of "hole position shifted." Be specific: vague causes like "operator error" produce vague corrective actions.

Rate the occurrence of each cause on a 1–10 scale based on historical data, field data, or engineering judgment. A rating of 10 means the failure cause occurs almost inevitably; a rating of 1 means it is extremely unlikely based on comparable processes. Where actual process data exists, use it — occurrence ratings based on real Cpk values are far more meaningful than team guesses.
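To illustrate how a Cpk value can inform an occurrence rating, the sketch below estimates a defect rate from Cpk and maps it to a 1–10 rating. It assumes a stable, normally distributed process and counts only the tail beyond the nearer specification limit. The threshold table is hypothetical and exists only to make the example run; substitute the occurrence table from your manual or customer requirements.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def cpk_to_ppm(cpk: float) -> float:
    """Estimate defective parts per million beyond the nearer spec limit.

    Assumes a stable, normally distributed process. Only the nearer limit
    is counted, so this is a lower bound on the total defect rate.
    """
    return normal_cdf(-3.0 * cpk) * 1_000_000


# Illustrative thresholds only: (ppm ceiling, occurrence rating).
# These values are assumptions, not taken from any published table.
HYPOTHETICAL_OCCURRENCE_TABLE = [
    (1, 1), (10, 2), (100, 3), (500, 4), (2_000, 5),
    (10_000, 6), (20_000, 7), (50_000, 8), (100_000, 9),
]


def occurrence_from_cpk(cpk: float) -> int:
    """Map a Cpk value to an occurrence rating using the hypothetical table."""
    ppm = cpk_to_ppm(cpk)
    for ceiling_ppm, rating in HYPOTHETICAL_OCCURRENCE_TABLE:
        if ppm <= ceiling_ppm:
            return rating
    return 10


print(round(cpk_to_ppm(1.0)))    # 1350
print(occurrence_from_cpk(1.0))  # 5 under the assumed thresholds
```

The conversion from Cpk to a tail probability is standard normal-distribution arithmetic; the step from defect rate to rating is where the customer-specific table matters.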

発生頻度

Occurrence (Hassei Hindo)

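The probability that the cause of a failure will actually occur, rated on a 1–10 scale. Ratings are based on past defect records, the process capability index (Cpk), and data from similar processes. Changing the process design or introducing mistake-proofing (poka-yoke) is an effective way to lower occurrence.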

Step 5 — List Controls and Rate Detection (D)

For each cause and failure mode, list the current process controls — what is already in place to prevent the cause from occurring, or to detect the failure mode before it reaches the next operation or customer. Controls include SPC, in-process gauging, visual inspection, end-of-line functional testing, poka-yoke devices, and operator self-checks.

Rate detection on a 1–10 scale — but note that detection is scored inversely: a rating of 1 means the control almost certainly catches the failure; a rating of 10 means there is no detection mechanism at all. A process with 100% automated inspection rates very differently from one relying on periodic manual sampling.

Step 6 — Calculate RPN and Prioritize Actions

The Risk Priority Number is calculated as: RPN = Severity × Occurrence × Detection. The scale runs from 1 to 1,000. High RPN values flag where action is most needed.

However, RPN alone should not drive prioritization blindly. A failure mode with Severity = 9, Occurrence = 2, Detection = 2 produces RPN = 36, which looks low even though the severity sits in the safety or regulatory band. Both the AIAG FMEA 4th edition and the newer AIAG-VDA FMEA handbook emphasize that any item with a Severity of 9 or 10 must receive action regardless of RPN. Similarly, an Occurrence rating of 8 or higher indicates a prevention control that is fundamentally broken.
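The calculation and the severity override can be sketched in a few lines of Python. The function names `rpn` and `must_act` are hypothetical; the arithmetic and the 9-or-10 rule follow the text above.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: Severity x Occurrence x Detection, each rated 1-10."""
    ratings = (
        ("severity", severity),
        ("occurrence", occurrence),
        ("detection", detection),
    )
    for name, value in ratings:
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


def must_act(severity: int) -> bool:
    """Severity 9 or 10 calls for action no matter how low the RPN is."""
    return severity >= 9


# The example from the text: a low RPN that still demands action.
s, o, d = 9, 2, 2
print(rpn(s, o, d))  # 36
print(must_act(s))   # True
```

Checking severity separately from the product keeps a high-severity item from being hidden by low occurrence and detection scores.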

RPN Range	Typical Response
≥ 200	Immediate action required. Assign owner and due date. Escalate if not resolved.
100–199	Priority action. Include in improvement plan within current or next quarter.
50–99	Moderate risk. Document and monitor. Address when resources allow.
< 50	Low risk. Accept or address opportunistically. Review periodically.
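The response bands in the table above translate directly into a lookup. This sketch uses a hypothetical function name and shortened response labels; the thresholds are the ones in the table, which individual organizations may set differently.

```python
def typical_response(rpn_value: int) -> str:
    """Return the typical response band for an RPN, following the table above."""
    if not 1 <= rpn_value <= 1000:
        raise ValueError("RPN must be between 1 and 1000")
    if rpn_value >= 200:
        return "Immediate action required"
    if rpn_value >= 100:
        return "Priority action"
    if rpn_value >= 50:
        return "Moderate risk"
    return "Low risk"


for value in (240, 140, 60, 36):
    print(value, typical_response(value))
```

Note that an RPN of 36 lands in the lowest band here, which is why the band alone should not decide the response for items rated 9 or 10 on severity.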

Step 7 — Define Recommended Actions and Reassess

For each high-risk item, define recommended actions with a responsible person and a target completion date. Actions fall into three categories: those that reduce occurrence (process controls, poka-yoke, design changes), those that reduce the detection rating (better gauging, automated inspection, tighter sampling), and those that reduce severity (design changes only).

After actions are completed, reassess the Severity, Occurrence, and Detection ratings and recalculate the RPN. This "after" column is what distinguishes a working FMEA from a static document. If the recalculated RPN shows meaningful improvement, the action was effective. If not, a different approach is needed.
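The before-and-after comparison can be sketched as follows. The `Rating` class, the `rpn_reduction` helper, and the example ratings are all hypothetical; the scenario assumes an action such as automated gauging that improves detection from 6 to 2 while severity and occurrence stay unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One set of S, O, D ratings for a failure mode and cause."""
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def rpn_reduction(before: Rating, after: Rating) -> float:
    """Fractional RPN reduction achieved by the completed action."""
    return (before.rpn - after.rpn) / before.rpn


# Hypothetical ratings before and after the action is completed.
before = Rating(severity=7, occurrence=4, detection=6)
after = Rating(severity=7, occurrence=4, detection=2)

print(before.rpn, after.rpn)                  # 168 56
print(f"{rpn_reduction(before, after):.0%}")  # 67%
```

What counts as "meaningful improvement" is a judgment for the team; the sketch only makes the change visible.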

リスク優先数

RPN — Risk Priority Number (Risuku Yūsen-sū)

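A risk score calculated as the product of Severity (S), Occurrence (O), and Detection (D), with a maximum of 1,000. Higher values call for higher-priority action, but as a rule, when Severity is 9 or 10, action must be taken regardless of how high or low the RPN is.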

Keeping the FMEA Current

Like the Control Plan, the FMEA only retains its value if it is kept up to date. It should be reviewed — and updated if necessary — whenever a process change is made, a new failure mode is discovered in production, a customer complaint is received, or a new machine or tooling is introduced. In well-run quality systems, the FMEA, Process Flow Diagram, and Control Plan are reviewed together as a triad, since a change to any one document affects the others.

The shift from the AIAG FMEA 4th edition to the newer AIAG-VDA handbook (first published in 2019) introduced a revised scoring approach and replaced RPN-based prioritization with an Action Priority (AP) system of High, Medium, and Low that gives greater weight to severity. Whether your organization uses the old or new system, the underlying discipline is the same: think ahead, score honestly, act on what you find, and keep the document alive.