
Failure Mode and Effects Analysis (FMEA) is a structured, proactive risk assessment methodology. It systematically identifies the potential ways a product design or production process can fail, evaluates the severity of each failure's effect, estimates how often each failure is likely to occur, and assesses how well current controls can detect the failure before it reaches the customer. FMEA originated in the United States military in the late 1940s under MIL-P-1629 and was adopted by NASA and the aerospace industry in the 1960s. It became the foundational quality risk tool of the automotive industry through AIAG standardization in the 1980s and has since spread across manufacturing sectors as the primary method for preventing defects before they enter production, rather than discovering them after they have generated scrap, rework, warranty claims, or recalls. [Total Quality Management: Principles and Manufacturing Application] establishes the management philosophy and organizational conditions within which FMEA operates as a prevention tool.
The defining characteristic of FMEA is its proactive orientation. Where root cause analysis tools such as the 5 Whys and fishbone diagrams are applied after a failure has occurred to identify what caused it, FMEA is applied before production begins to identify what could fail and prioritize prevention investment before any failure occurs. An FMEA conducted rigorously on a new production process identifies which failure modes carry the highest risk and directs engineering resources toward eliminating or controlling those risks before the first production unit is made, avoiding the scrap, rework, and warranty costs that the [Cost of Poor Quality: Calculation and Reduction Framework] quantifies as 10 to 30 percent of annual revenue in typical manufacturing organizations. Manufacturing organizations that conduct FMEA reactively, after customer complaints or production disruptions have already occurred, are using the tool at a fraction of its designed value. [Right First Time Manufacturing: Principles and Implementation] describes the organizational culture within which proactive FMEA use is the designed standard rather than the exception.
DFMEA vs PFMEA: The Two Primary FMEA Types
FMEA applies to two distinct domains in manufacturing, each with different scope, different participants, and different outputs. Understanding the difference between Design FMEA and Process FMEA is a prerequisite to applying the right type at the right stage of the product and process lifecycle.
Design FMEA (DFMEA)
Design FMEA analyzes a product design to identify failure modes that could prevent the product from performing its intended function under expected use conditions. DFMEA is conducted during the product development phase, before the design is finalized, when changes can still be made to the design itself without incurring tooling, production, or field service costs.
DFMEA participants are primarily design engineers, materials engineers, and reliability engineers who understand the product's functional requirements, operating environment, and interface conditions. The failure modes analyzed in a DFMEA are design-related: incorrect tolerances, material selection errors, interface incompatibilities, and functional performance gaps under worst-case operating conditions.
The output of a DFMEA is a prioritized list of design risk actions that address high-priority failure modes before the design is released for production tooling and process development.
Process FMEA (PFMEA)
Process FMEA analyzes a production process to identify failure modes that could prevent the process from producing conforming output consistently. PFMEA is conducted during process development, before full-scale production begins, when process changes are still feasible without disrupting live production.
PFMEA participants are primarily process engineers, quality engineers, production supervisors, and operators who understand the process steps, the sources of process variation, and the quality controls currently in place. The failure modes analyzed in a PFMEA are process-related: machine capability gaps, fixturing errors, operator error modes, material variation effects, and measurement system inadequacies.
The output of a PFMEA is a prioritized list of process risk actions, including poka-yoke devices, control plan updates, inspection frequency changes, and process capability studies, that address high-priority process failure modes before full-scale production begins.
Key Insight: DFMEA prevents design-driven defects before the design is released. PFMEA prevents process-driven defects before production begins. Applying either type after failures occur is reactive quality management, not FMEA.
The SOD Scoring System: Severity, Occurrence, and Detection
The core analytical mechanism of FMEA is the Severity-Occurrence-Detection (SOD) scoring system. Each identified failure mode is scored on three dimensions using a 1 to 10 scale standardized in the AIAG and VDA FMEA guidelines. Understanding what each dimension measures and how the scores interact is essential for using FMEA as a genuine prioritization tool rather than a documentation exercise.
Severity (S)
Severity rates the seriousness of the effect if the failure mode occurs. The scale runs from 1 (no discernible effect on product or process) to 10 (hazardous effect that may occur without warning, creating a safety or regulatory compliance failure). AIAG guidance directs the team to address severity ratings of 9 or 10 first, confirming that the risk is covered by existing controls or recommended actions whatever the occurrence and detection scores. A failure mode with a severity of 9 that has low occurrence and excellent detection still requires that review because the consequence of the failure, if it occurs and escapes, is unacceptable.
Severity cannot be reduced through better detection or lower occurrence. It can only be reduced by changing the design or process so that the failure mode's effect on the customer or production system is less severe.
Occurrence (O)
Occurrence rates how frequently the failure cause is expected to occur. The scale runs from 1 (failure is unlikely, with historical data showing near-zero occurrence) to 10 (failure is persistent, occurring on a significant proportion of production units). Occurrence is reduced by process design changes, material changes, or process capability improvements that address the root cause of the failure mode rather than by adding inspection to detect it after it occurs.
Detection (D)
Detection rates how effectively current controls would detect the failure mode or its cause before it reaches the customer or the next process step. The scale is inverted: a detection rating of 1 indicates that the current controls will almost certainly detect the failure, while a rating of 10 indicates that current controls have no ability to detect the failure. Detection is improved by adding or upgrading in-process controls, poka-yoke devices, or inspection checkpoints that identify the failure mode before it advances downstream.
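The three scales can be captured in a small data type. The following is a minimal Python sketch, not part of any handbook: the class name `SODScores`, the helper `detection_is_weak`, and the threshold of 7 are illustrative assumptions. It enforces the 1 to 10 range and makes the inverted direction of the detection scale explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SODScores:
    """Severity, occurrence, and detection ratings for one failure chain."""

    severity: int    # 1 = no discernible effect, 10 = hazardous without warning
    occurrence: int  # 1 = failure unlikely, 10 = failure persistent
    detection: int   # inverted: 1 = almost certain to detect, 10 = no ability to detect

    def __post_init__(self) -> None:
        # Reject anything outside the 1 to 10 integer scale.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 10:
                raise ValueError(f"{name} must be an integer from 1 to 10, got {value!r}")


def detection_is_weak(scores: SODScores, threshold: int = 7) -> bool:
    """Return True when the detection rating signals poor controls.

    A HIGH detection number means WEAK detection. The default threshold
    of 7 is an illustrative assumption, not a value from the handbook.
    """
    return scores.detection >= threshold
```

Because detection is inverted, `detection_is_weak(SODScores(5, 5, 9))` is true while `detection_is_weak(SODScores(5, 5, 1))` is false, which is the opposite of how a high severity or occurrence score would be read.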
Key Insight: A severity rating of 9 or 10 requires action regardless of occurrence and detection scores. High severity failure modes cannot be managed by detection or low probability alone.
RPN vs Action Priority: Understanding the Current Standard
For several decades, FMEA prioritization relied on the Risk Priority Number (RPN), calculated by multiplying severity, occurrence, and detection scores:
RPN = Severity × Occurrence × Detection
The RPN ranges from 1 to 1,000. Higher RPNs indicate higher risk and higher priority for corrective action. The RPN system was the industry standard through the AIAG 4th Edition FMEA handbook and remains widely used in manufacturing organizations that have not yet adopted the updated standard.
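As a minimal sketch of the calculation, assuming each score has already been assigned on the 1 to 10 scale, the RPN is a single multiplication. The function name `risk_priority_number` and the example scores are illustrative only.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Multiply the three 1 to 10 scores into an RPN between 1 and 1,000."""
    for name, value in (
        ("severity", severity),
        ("occurrence", occurrence),
        ("detection", detection),
    ):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


# Hypothetical failure mode scored S=7, O=4, D=6.
print(risk_priority_number(7, 4, 6))  # 168
```

The lowest possible result is 1 (all scores 1) and the highest is 1,000 (all scores 10), which matches the range stated above.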
The AIAG VDA FMEA Handbook (2019) introduced the Action Priority (AP) system as the replacement for RPN-based prioritization. The AP system uses a logic-based lookup table that assigns High, Medium, or Low priority based on the combination of severity, occurrence, and detection scores rather than their product.
The key improvement of the AP system is that it keeps the RPN's mathematical flaw from producing incorrect prioritization. The RPN flaw: a failure mode with Severity 10, Occurrence 1, Detection 1 produces an RPN of 10 and appears as low priority despite having a catastrophic consequence if it occurs. The AP table weights severity first, then occurrence, then detection. A failure mode with Severity 9 or 10 is therefore rated High across most occurrence and detection combinations, with lower ratings appearing only toward the low-occurrence end of the table, so potential safety and regulatory failures receive attention that a small RPN product would hide.
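The RPN flaw can be seen numerically. The sketch below uses two hypothetical failure modes: one with the Severity 10, Occurrence 1, Detection 1 scores from the example above, and one cosmetic issue with middling scores. The severity-first ordering shown for contrast is only an illustration of ranking by severity before the other scores. It is not the AP lookup table, which should be taken from the AIAG VDA FMEA Handbook itself.

```python
# (description, severity, occurrence, detection) -- all entries are hypothetical
failure_modes = [
    ("Brake line fitting cracks", 10, 1, 1),
    ("Label applied off-center", 3, 5, 5),
]


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: the plain product of the three scores."""
    return severity * occurrence * detection


# Ranking by RPN puts the cosmetic issue (75) ahead of the safety issue (10).
by_rpn = sorted(failure_modes, key=lambda fm: rpn(fm[1], fm[2], fm[3]), reverse=True)

# Ranking by severity first, then occurrence, then detection reverses the order.
by_severity_first = sorted(
    failure_modes, key=lambda fm: (fm[1], fm[2], fm[3]), reverse=True
)

for name, s, o, d in by_rpn:
    print(f"RPN {rpn(s, o, d):>4}  S={s:<2}  {name}")
```

Sorted by RPN, the label defect outranks the cracked fitting even though only the fitting carries a Severity 10 consequence.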
Manufacturing organizations working to automotive quality standards should use AP. Organizations in other sectors that have not yet adopted the AIAG VDA handbook may continue to use RPN while understanding its mathematical limitations.
Key Insight: RPN can mathematically assign low priority to failure modes with catastrophic severity. The Action Priority system addresses this by weighting severity ahead of occurrence and detection instead of multiplying the three scores.
The Seven-Step AIAG VDA FMEA Process
The AIAG VDA FMEA Handbook formalizes FMEA execution into a seven-step process that provides a structured sequence from planning through documentation. Each step produces a specific output that becomes the input for the next.
Step 1: Planning and Preparation. Define the scope of the FMEA, assemble the cross-functional team, establish the FMEA boundary (what is included and what is outside scope), and gather relevant reference documents including design drawings, process flow diagrams, and historical quality data.
Step 2: Structure Analysis. Break down the product or process into its component elements. For PFMEA, this means mapping each process step in sequence. For DFMEA, this means decomposing the product design into its functional systems and subsystems. The structure analysis creates the framework against which failure modes are identified in Step 3.
Step 3: Function Analysis. Define the intended function of each element identified in Step 2. Each process step or design element must have a clear function statement that describes what it is supposed to accomplish. Function analysis creates the reference against which failure modes are defined: a failure mode is any way the element could fail to perform its intended function.
Step 4: Failure Analysis. Identify potential failure modes for each element and function. For each failure mode, identify the potential effect on the customer or downstream process and the potential cause or mechanism that could produce the failure. The failure chain (cause, failure mode, effect) is the core analytical unit of the FMEA.
Step 5: Risk Analysis. Score each failure chain on severity, occurrence, and detection using the 1 to 10 scales. Assign Action Priority ratings based on the SOD combination. Identify current prevention and detection controls for each failure mode and document them alongside the scores.
Step 6: Optimization. Develop and assign recommended actions for High and Medium Action Priority failure modes. Actions must specify who is responsible, what the action is, and by when it will be completed. After actions are implemented, rescore the affected failure chains to confirm that the risk level has been reduced to acceptable levels.
Step 7: Results and Documentation. Document the completed FMEA including all failure chains, original scores, recommended actions, implemented actions, and post-action scores. The FMEA is a living document that should be updated when process changes occur, new failure modes are identified from production experience, or customer requirements change.
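The records produced by Steps 4 through 7 can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the names `FailureChain`, `RecommendedAction`, `rescore`, and `risk_reduced` are hypothetical, the example station and scores are invented, and "risk reduced" is approximated as no score getting worse and at least one improving, which is a simpler test than a full Action Priority reassessment.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple

Scores = Tuple[int, int, int]  # (severity, occurrence, detection), each 1 to 10


@dataclass
class RecommendedAction:
    """Step 6: what the action is, who is responsible, and by when."""
    description: str
    owner: str
    due: date
    completed: bool = False


@dataclass
class FailureChain:
    """Step 4 failure chain (cause, failure mode, effect) with its scores."""
    element: str
    function: str
    failure_mode: str
    effect: str
    cause: str
    original_scores: Scores
    actions: List[RecommendedAction] = field(default_factory=list)
    post_action_scores: Optional[Scores] = None

    def rescore(self, new_scores: Scores) -> None:
        """Record post-action scores once every assigned action is complete."""
        if not self.actions or not all(a.completed for a in self.actions):
            raise ValueError("rescore only after all recommended actions are implemented")
        if not all(1 <= s <= 10 for s in new_scores):
            raise ValueError("scores must be between 1 and 10")
        self.post_action_scores = new_scores

    def risk_reduced(self) -> bool:
        """True when no score got worse and at least one improved (a simple proxy)."""
        if self.post_action_scores is None:
            return False
        pairs = list(zip(self.original_scores, self.post_action_scores))
        return all(new <= old for old, new in pairs) and any(new < old for old, new in pairs)


# Hypothetical PFMEA entry for a torque station.
chain = FailureChain(
    element="Wheel nut tightening station",
    function="Tighten wheel nuts to specified torque",
    failure_mode="Nut under-torqued",
    effect="Wheel loosens in service",
    cause="Torque tool out of calibration",
    original_scores=(9, 4, 6),
)
chain.actions.append(
    RecommendedAction("Add torque-monitoring interlock", "Process engineer", date(2025, 1, 31))
)
chain.actions[0].completed = True
chain.rescore((9, 2, 2))  # severity unchanged: only occurrence and detection improved
print(chain.risk_reduced())  # True
```

Keeping the original and post-action scores side by side on the same record is what lets the FMEA serve as a living document: a later production finding can be appended as a new chain or a new action without losing the earlier assessment.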
[FMEA Workshop: Risk Assessment and Prioritization Guide] covers how to facilitate the FMEA process effectively with a cross-functional team, including brainstorming techniques, scoring calibration, and action planning facilitation.
Key Insight: FMEA is a living document, not a one-time exercise. Failure modes discovered in production that were not identified in the original FMEA represent gaps in the analysis that must be added and addressed.
FMEA Common Failures and How to Prevent Them
Manufacturing organizations that report FMEA producing paperwork without preventing failures are typically experiencing one of four recurring failure modes in their FMEA process.
Conducting FMEA after production begins. FMEA conducted reactively, after process design is complete and production has started, can still produce valuable improvements but loses the primary benefit: the ability to change the process design before tooling, fixtures, and production routines are established. The earlier FMEA is conducted in the process development timeline, the lower the cost of acting on its findings.
Scoring without calibration. Teams that assign severity, occurrence, and detection scores without reference to the AIAG VDA standard rating tables produce scores that reflect team opinion rather than standardized risk assessment. Scores that are not calibrated to a common reference cannot be compared across failure modes, across products, or across time. [Measurement System Analysis: Validating Gauge Reliability in Manufacturing] covers measurement system consistency principles that apply equally to quantitative measurement and to scoring systems like FMEA.
No action on high-priority failure modes. An FMEA that identifies High Action Priority failure modes and assigns no corrective actions before production begins has documented risk without managing it. The FMEA becomes a liability rather than a prevention tool if high-priority failure modes are known and unaddressed when a failure occurs.
Treating FMEA as a one-time activity. Production experience continuously reveals failure modes that were not anticipated in the original FMEA. Customer complaints, non-conformance reports, and warranty data are inputs to FMEA updates that keep the document current as a risk management tool.
Key Insight: An FMEA that identifies High Action Priority failure modes and assigns no corrective actions is a documented record of known risk without risk management.
Within the Lean System
Connection to Lean Principles
FMEA operationalizes the lean pursuit of perfection by applying structured analysis to identify and eliminate quality risk before it generates waste. Every failure mode that FMEA identifies and prevents from entering production eliminates the defect waste, rework waste, and inspection waste that the failure would have generated. Total Quality Management establishes the management philosophy within which FMEA operates as a prevention tool, connecting process-level risk analysis to the organization-wide quality system.
Connection to Lean Tools
FMEA outputs directly inform the control plan and in-process quality check requirements that are embedded in [Quality at the Source: Building Quality Into the Production Process]. Failure modes in the PFMEA with high detection ratings, meaning current controls are unlikely to catch them, identify where poka-yoke devices are needed, connecting FMEA risk analysis to [Poka-Yoke: Error Proofing Methods in Manufacturing] device design and selection. The failure modes identified in FMEA that are not prevented by process design become the inputs for [Non-Conformance Reports: Managing Quality Deviations in Manufacturing] when they occur in production, closing the loop between proactive risk identification and reactive quality management.
Connection to Continuous Improvement
FMEA connects to the continuous improvement cycle through its living document status. Production experience that reveals new failure modes feeds back into the FMEA as an update, which triggers new action priority assessment and countermeasure development. [CAPA Systems in Manufacturing: Corrective and Preventive Action Explained] is the operational system through which FMEA-identified failure modes that occur in production are investigated, corrected, and prevented from recurring, making FMEA and CAPA complementary tools in the same quality management cycle.
Frequently Asked Questions
What is FMEA in manufacturing? Failure Mode and Effects Analysis (FMEA) is a structured proactive risk assessment methodology that identifies potential failure modes in a product design or production process, evaluates the severity of each failure's effect, estimates occurrence likelihood, and assesses detection capability. Originating in the US military in the 1940s and standardized for manufacturing through AIAG, FMEA directs prevention investment toward the highest-risk failure modes before they produce defects, customer complaints, or recalls.
What is the difference between DFMEA and PFMEA? Design FMEA (DFMEA) analyzes product designs during development to identify failure modes that could prevent the product from performing its intended function under use conditions. Process FMEA (PFMEA) analyzes production processes during process development to identify failure modes that could prevent consistent production of conforming output. DFMEA participants are primarily design engineers. PFMEA participants are process engineers, quality engineers, and production personnel. Both are conducted before their respective phases are complete, when changes are still feasible.
How is FMEA RPN calculated? The Risk Priority Number is calculated by multiplying three scores: Severity (S) x Occurrence (O) x Detection (D). Each score uses a 1 to 10 scale standardized in the AIAG VDA FMEA Handbook. The RPN ranges from 1 to 1,000. The 2019 AIAG VDA handbook introduced the Action Priority system as the preferred replacement for RPN, using a logic-based lookup table that prevents high-severity failure modes from receiving low priority due to the RPN's mathematical limitations.
What is Action Priority in FMEA and how does it differ from RPN? Action Priority assigns High, Medium, or Low priority to FMEA failure modes using a logic-based table that considers the combination of severity, occurrence, and detection scores. Unlike RPN, which is the product of three numbers, AP weights severity first, so failure modes with severity ratings of 9 or 10 receive High priority across most occurrence and detection combinations. This limits the mathematical flaw of RPN, where a high-severity failure mode can receive a low priority number simply because its occurrence and detection scores are small.
When should FMEA be updated in manufacturing? FMEA should be updated when process changes are made that could introduce new failure modes or change the risk level of existing ones, when production experience reveals failure modes not anticipated in the original analysis, when customer complaints or warranty data identify failures not covered in the FMEA, and on a defined periodic review cycle to confirm that the document reflects current process conditions and control plan status. FMEA is a living document, not a one-time project deliverable.








