Excellent PLC Co.,Ltd

PLC and DCS professional supplier

02:43 AM and Something Was Off: Night Shift Notes on a Honeywell 10012/1/2 CPU


By Kevin Moore – Shift Control Engineer


Night shifts teach you to trust patterns.

That’s why this one bothered me.


02:43 – The System Was “Running”

No alarms.
No watchdog trips.
No communication loss.

But outputs didn’t behave like they usually did.

A valve closed slower than expected.
A calculated value drifted — just slightly.

Enough to notice. Not enough to panic.


03:05 – Checking the Usual Things

I checked:

  • Field wiring

  • I/O status

  • Network latency

  • Controller load

Everything looked normal.

The Honeywell 10012/1/2 CPU module was in RUN, healthy by every visible metric.

And yet the logic didn’t feel right.


03:27 – Reboot Changed Everything

We scheduled a controlled restart.

After reboot:

  • The drift disappeared

  • Timing returned to normal

  • Outputs behaved as expected

Same hardware.
Same program.

Different behavior.

That’s when I suspected memory.


What We Later Learned About This CPU

The controller lived in a bad place:

  • Shared power bus

  • Frequent short-duration voltage sags

  • Never a full power loss, just dips

Enough to stress flash.
Not enough to trigger obvious failures.


Flash Bit Flips Don’t Announce Themselves

In the 10012/1/2, flash memory holds:

  • Application code

  • Constants

  • Configuration parameters

A single flipped bit doesn’t have to crash the CPU.

It can just change behavior.

Quietly.
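To make "quietly" concrete, here is a minimal sketch of what one inverted bit does to a stored constant. It assumes the constant is held as a 32-bit IEEE 754 float, which I have not confirmed for the 10012/1/2; the function name and the gain value are made up for illustration.

```python
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Return `value`, encoded as a float32, with one bit inverted.

    Bit 0 is the least significant mantissa bit, bit 31 is the sign.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in 0..31")
    # Reinterpret the float32 encoding as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the chosen bit and reinterpret the result as a float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


if __name__ == "__main__":
    gain = 1.25  # hypothetical tuning constant, not taken from the real program
    for bit in (0, 12, 20, 23):
        print(f"bit {bit:2d}: {gain} -> {flip_bit_float32(gain, bit)}")
```

A flip low in the mantissa moves the value by a fraction nobody would notice on a trend. A flip higher up shifts it by a few percent, or halves it. Every one of those results is still a perfectly valid number, so nothing downstream has a reason to complain.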


Why Diagnostics Didn’t Catch It

  • No checksum validation during runtime

  • No memory scrubbing mechanism

  • No alarm threshold for “almost wrong”

From the system’s perspective, the logic was valid.

From reality’s perspective, it wasn’t.
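For contrast, this is roughly what a periodic integrity check looks like: a sketch of the general technique, not something this controller offers. It assumes the memory image can be read out as bytes and that a set of per-region CRCs was recorded while the image was known to be good. The region size and function names are my own.

```python
import zlib

REGION_SIZE = 256  # assumed region size in bytes, chosen only for illustration


def region_crcs(image: bytes, region_size: int = REGION_SIZE) -> list[int]:
    """Compute a CRC-32 for each fixed-size region of the image."""
    return [
        zlib.crc32(image[start:start + region_size])
        for start in range(0, len(image), region_size)
    ]


def scrub(image: bytes, reference_crcs: list[int],
          region_size: int = REGION_SIZE) -> list[int]:
    """Return the start offsets of regions whose CRC no longer matches."""
    current = region_crcs(image, region_size)
    if len(current) != len(reference_crcs):
        raise ValueError("image size does not match the reference")
    return [
        index * region_size
        for index, (now, ref) in enumerate(zip(current, reference_crcs))
        if now != ref
    ]
```

Run on a schedule, a check like this flags the affected region even when only one bit has changed, because CRC-32 detects every single-bit error. It only detects, though. Repairing the image still needs a good copy to restore from.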


How We Confirmed It Later

In daylight, engineering compared:

Online_Image != Offline_Reference

A single constant value differed.

One bit.

That was enough.
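I don't know which tool engineering used for the comparison. As a sketch, assuming both images can be exported as raw bytes of equal length, a byte-by-byte comparison that reports the exact differing bits could look like this (all names are hypothetical):

```python
def diff_images(online: bytes, reference: bytes) -> list[tuple[int, int, int, list[int]]]:
    """Compare two images and list every byte that differs.

    Each entry is (offset, reference_byte, online_byte, changed_bits),
    where changed_bits holds bit positions 0..7 within that byte.
    """
    if len(online) != len(reference):
        raise ValueError("images must be the same length")
    diffs = []
    for offset, (now, ref) in enumerate(zip(online, reference)):
        if now != ref:
            changed = now ^ ref  # XOR leaves a 1 at every differing bit
            bits = [bit for bit in range(8) if (changed >> bit) & 1]
            diffs.append((offset, ref, now, bits))
    return diffs
```

A result with one entry and one changed bit is the signature we saw: the program is intact everywhere except a single position.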


Corrective Actions

  • CPU module replaced

  • Power supply isolated and stabilized

  • UPS added specifically for the controller rack

  • Post-restart verification added to night-shift checklist
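The checklist itself is on paper, but the idea behind the post-restart verification can be sketched in a few lines: compare a handful of readings against a recorded baseline and flag anything outside tolerance. The check names, baselines and tolerances below are invented for illustration and are not our real values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    baseline: float   # value recorded when behavior was known to be good
    tolerance: float  # allowed absolute deviation from the baseline


def failed_checks(checks: list[Check], readings: dict[str, float]) -> list[str]:
    """Return the names of checks that are missing or out of tolerance."""
    failures = []
    for check in checks:
        value = readings.get(check.name)
        # A missing reading counts as a failure rather than a silent pass.
        if value is None or abs(value - check.baseline) > check.tolerance:
            failures.append(check.name)
    return failures


# Hypothetical checks mirroring the two symptoms from that night.
CHECKS = [
    Check("valve_close_time_s", baseline=4.0, tolerance=0.3),
    Check("calculated_value", baseline=50.0, tolerance=0.5),
]
```

The point is to give "almost wrong" a threshold, so that a slow valve or a drifting value becomes a logged failure instead of a feeling.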


What I Took Away From That Night

  1. Flash errors don’t always break systems

  2. Subtle behavior changes matter

  3. Reboots can mask deeper problems

  4. Power quality affects memory integrity


End of Shift

By morning, everything looked fine again.

But I logged it anyway.

Because in control systems,
the most dangerous failures are the ones that almost don’t happen.

Kevin Moore
