Excellent PLC Co.,Ltd

PLC and DCS professional supplier

The Controller That Never Started: Boot-Stage Flash Check Failure in Honeywell 10012/1/2


Incident summary prepared by Laura McKenzie – Control Systems Reliability Team


Incident Overview

At 02:17 local time, a scheduled power restoration was completed following planned electrical maintenance.

One controller failed to return to service.

The affected unit was a Honeywell 10012/1/2 CPU module.


Observed Condition

  • CPU powered on normally

  • Status LEDs remained static

  • No transition to RUN state

  • No communication established with peer nodes

The system did not crash.
It simply never finished starting.


Initial Assumptions

The usual assumptions were made:

  • Power instability during restoration

  • Firmware mismatch

  • Incomplete startup sequence

None of these was confirmed as the cause.

Voltage levels were correct, and no firmware changes had been applied recently.


Boot Process Behavior

During startup, the 10012/1/2 performs a flash integrity check:

  • Firmware image validation

  • Configuration block verification

  • Bootloader checksum comparison

A failure at any of these checks prevents the bootloader from transferring execution to the runtime kernel.

In this case, the process stopped silently.
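
The gating logic described above can be sketched in a few lines. This is only an illustration: the checksum algorithm used by the 10012/1/2 bootloader is not documented here, so CRC-32 is an assumption, and `FlashRegion`, `first_failed_region`, and `boot` are hypothetical names.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlashRegion:
    name: str
    data: bytes
    expected_crc: int  # checksum recorded when the region was written


def first_failed_region(regions: List[FlashRegion]) -> Optional[str]:
    """Return the name of the first region whose CRC-32 does not match, else None."""
    for region in regions:
        if zlib.crc32(region.data) != region.expected_crc:
            return region.name
    return None


def boot(regions: List[FlashRegion]) -> str:
    """Transfer to the runtime only if every region validates."""
    return "RUN" if first_failed_region(regions) is None else "HALTED"


# Illustrative contents only; real images are binary and far larger.
firmware, config, loader = b"firmware-image", b"configuration-block", b"bootloader"
regions = [
    FlashRegion("firmware", firmware, zlib.crc32(firmware)),
    FlashRegion("configuration", config, zlib.crc32(config)),
    FlashRegion("bootloader", loader, zlib.crc32(loader)),
]
print(boot(regions))  # RUN

# Flip one bit in the configuration block to simulate an unreadable cell.
corrupted = bytes([config[0] ^ 0x01]) + config[1:]
regions[1] = FlashRegion("configuration", corrupted, zlib.crc32(config))
print(boot(regions), first_failed_region(regions))  # HALTED configuration
```

A single mismatching region is enough to keep the sketch in the halted state, which mirrors the all-or-nothing behavior seen in this incident.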


Why No Explicit Error Was Reported

The bootloader operates before:

  • Full diagnostics

  • Communication services

  • Event logging

If flash validation fails early, there is no channel available to report the reason.

From the outside, the CPU appears “alive but frozen.”
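
The ordering problem can be made concrete with a toy model. The stage names and their exact order are assumptions for illustration; the only fact taken from the incident is that the flash check runs before diagnostics, communication, and event logging.

```python
# Hypothetical boot order; only "flash check comes first" is taken from the incident.
BOOT_ORDER = [
    "flash_integrity_check",
    "full_diagnostics",
    "communication_services",
    "event_logging",
    "runtime_kernel",
]

# Services that could tell an operator why startup stopped.
REPORTING_SERVICES = {"full_diagnostics", "communication_services", "event_logging"}


def reporting_channels_at_failure(failed_stage: str) -> list:
    """List the reporting services already started when failed_stage fails."""
    started = BOOT_ORDER[:BOOT_ORDER.index(failed_stage)]
    return [stage for stage in started if stage in REPORTING_SERVICES]


print(reporting_channels_at_failure("flash_integrity_check"))  # []
print(reporting_channels_at_failure("runtime_kernel"))
```

In this model a failure in the first stage leaves an empty list of reporting channels, which is why the module could only signal its state through static LEDs.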


Root Cause Determination

Post-removal analysis showed:

  • One flash memory sector intermittently unreadable

  • Checksum results inconsistent between power cycles

  • No physical damage visible

The flash device had degraded just enough to pass the check on some power cycles and fail unpredictably on others.
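
One way to expose this kind of intermittent fault on the bench is to read the same sector many times and compare each result against a known-good checksum. The sketch below assumes CRC-32 and uses a simulated reader; `classify_sector` and `make_flaky_reader` are hypothetical helpers, not part of any Honeywell tool.

```python
import zlib
from typing import Callable


def classify_sector(read_sector: Callable[[], bytes], expected_crc: int,
                    attempts: int = 20) -> str:
    """Read a sector repeatedly and classify it as healthy, marginal, or failed."""
    passes = sum(
        1 for _ in range(attempts) if zlib.crc32(read_sector()) == expected_crc
    )
    if passes == attempts:
        return "healthy"
    if passes == 0:
        return "failed"
    return "marginal"  # some reads pass, some fail


def make_flaky_reader(good: bytes, bad_every: int) -> Callable[[], bytes]:
    """Simulated reader: returns corrupted data on every bad_every-th read."""
    calls = 0

    def read() -> bytes:
        nonlocal calls
        calls += 1
        if calls % bad_every == 0:
            return bytes([good[0] ^ 0xFF]) + good[1:]
        return good

    return read


sector = b"sector-contents"
expected = zlib.crc32(sector)
print(classify_sector(lambda: sector, expected))                 # healthy
print(classify_sector(make_flaky_reader(sector, 5), expected))   # marginal
print(classify_sector(make_flaky_reader(sector, 1), expected))   # failed
```

A single read would miss the marginal case most of the time, which is consistent with a module that started normally on earlier power cycles.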


Why Power Cycling Made It Worse

Repeated power cycles increased stress on the already degraded flash device:

  • Marginal sectors failed more frequently

  • Validation timing varied

  • Boot success probability dropped to zero

Once the failure became consistent, recovery was no longer possible.
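
A simplified model shows how quickly boot success collapses as a marginal sector gets worse. It assumes each read of the sector succeeds independently with the same probability and that a boot needs every read to succeed; both assumptions, and the figure of 50 reads per boot, are illustrative and not measured values from this unit.

```python
def boot_success_probability(read_success: float, reads_per_boot: int) -> float:
    """Chance that every read of the marginal sector succeeds during one boot."""
    return read_success ** reads_per_boot


# Illustrative per-read success rates, from nearly healthy to badly degraded.
for p in (0.999, 0.99, 0.9, 0.5):
    chance = boot_success_probability(p, reads_per_boot=50)
    print(f"per-read success {p:.3f} -> boot success {chance:.4f}")
```

Under these assumptions, even a small drop in per-read reliability compounds across one boot, so a sector that is only slightly worse after each power cycle can move a module from "usually starts" to "never starts".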


Recovery Actions

  • CPU module replaced

  • Firmware and application restored from validated backup

  • Startup verified under controlled power conditions

The system returned to normal operation without further anomalies.
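
The report does not state how the backup was validated. A common approach, shown here as an assumption, is to record a SHA-256 digest when the backup is taken and compare it again before restoring; `sha256_hex` and `backup_is_valid` are hypothetical helper names.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a backup image."""
    return hashlib.sha256(data).hexdigest()


def backup_is_valid(image: bytes, recorded_digest: str) -> bool:
    """True when the image still matches the digest recorded at backup time."""
    return sha256_hex(image) == recorded_digest.strip().lower()


# Illustrative data; a real backup would be read from the archive file.
image = b"firmware-and-application-backup"
recorded = sha256_hex(image)  # stored alongside the backup when it was made

print(backup_is_valid(image, recorded))            # True
print(backup_is_valid(image + b"\x00", recorded))  # False
```

Checking the digest before the restore keeps a damaged backup from being written to the replacement module.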


Preventive Measures Implemented

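  • Replacement scheduled once CPU service life exceeds a defined threshold (pseudocode):

    (* Schedule replacement when the CPU passes its service-life threshold *)
    IF CPU_Service_Life > Defined_Threshold THEN
        Schedule_Replacement();
    END_IF;
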
  • Controlled power restoration procedures

  • Flash health considered during lifecycle reviews

  • Cold-start testing added to maintenance routines
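
The lifecycle review and cold-start testing can feed one replacement decision. The sketch below is a minimal illustration: `CpuModuleRecord`, `needs_replacement`, the module tags, and the 10-year threshold are all hypothetical and would need to be replaced with site-specific values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CpuModuleRecord:
    tag: str
    service_years: float
    # One entry per cold-start test: True means the module reached RUN.
    cold_start_results: List[bool] = field(default_factory=list)


def needs_replacement(record: CpuModuleRecord, max_service_years: float) -> bool:
    """Flag a module past its service-life threshold or with any failed cold start."""
    past_threshold = record.service_years > max_service_years
    failed_cold_start = not all(record.cold_start_results)
    return past_threshold or failed_cold_start


MAX_SERVICE_YEARS = 10.0  # hypothetical threshold

modules = [
    CpuModuleRecord("CPU-A", 4.0, [True, True, True]),
    CpuModuleRecord("CPU-B", 12.5, [True, True]),
    CpuModuleRecord("CPU-C", 6.0, [True, False, True]),
]
for module in modules:
    print(module.tag, needs_replacement(module, MAX_SERVICE_YEARS))
```

Treating a single failed cold start as a trigger is deliberately conservative, since this incident showed that an intermittent boot failure can become permanent.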


Key Findings

  1. Flash degradation can block startup without alarms

  2. Boot-stage failures are often invisible to operators

  3. Power events accelerate marginal flash failures

  4. Backup alone does not prevent startup failure


Closing Statement

The Honeywell 10012/1/2 CPU module did not fail under load.

It failed before it could even begin.

In control systems, the most dangerous failures are the ones that happen before the system can explain itself.

Laura McKenzie
