Excellent PLC Co.,Ltd

PLC and DCS professional supplier

After the Surge, Nothing Failed — Except the Margins

By Robert Sinclair – Industrial Electronics Reliability Engineer


The most expensive failures are not the ones that break equipment.
They are the ones that quietly erode safety margins.

This case involved a Honeywell 07191/1/1 RS485 communication board that survived a nearby lightning event. Power stayed on. Communication stayed up. No alarms were generated.

Everyone moved on.

They shouldn’t have.


What Happened Before the Symptoms Appeared

  • Severe thunderstorm

  • No direct strike on the control building

  • Multiple inductive loads tripped and recovered

  • RS485 network resumed operation automatically

From a system perspective, everything looked normal.

Weeks later, communication issues began to surface.


Early Symptoms (Often Ignored)

  • Occasional checksum retries

  • Slight increase in response time

  • One device occasionally missing a poll

  • No consistent fault pattern

Nothing dramatic enough to justify replacing hardware.

That’s exactly how latent damage hides.


Why Surge Damage Is Different

A surge doesn’t need to destroy a component to damage it.

In RS485 transceivers, surge stress can cause:

  • Input protection diode leakage

  • Reduced output drive strength

  • Slower edge transitions

  • Increased susceptibility to noise

The 07191/1/1 board was still functional — just operating with reduced electrical margins.


What the Bus Looked Like Electrically

Measured at the far end of the network:

  • Differential voltage lower than historical baseline

  • Signal rise/fall times noticeably slower

  • Noise immunity reduced under load

  • Reflections more pronounced

No single parameter violated limits.
All margins were thinner.
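
One way to put a number on "thinner" is to compare the far-end differential voltage against a commissioning baseline. The sketch below is illustrative only: the two voltage readings are assumed values, not measurements from this case, and the function names are hypothetical. The 200 mV figure is the differential level an RS485 receiver is required to resolve.

```python
# Illustrative sketch: both readings below are assumed, not taken from the case.
RECEIVER_THRESHOLD_V = 0.2  # RS485 receivers must resolve a +/-200 mV differential


def noise_margin(v_diff):
    """Differential amplitude left above the receiver threshold, in volts."""
    return v_diff - RECEIVER_THRESHOLD_V


def margin_lost_fraction(baseline_v, measured_v):
    """Fraction of the baseline noise margin that has disappeared."""
    base = noise_margin(baseline_v)
    if base <= 0:
        raise ValueError("baseline is already at or below the receiver threshold")
    return (base - noise_margin(measured_v)) / base


baseline_v = 2.0  # assumed far-end differential voltage recorded at commissioning
measured_v = 1.1  # assumed far-end reading after the storm
lost = margin_lost_fraction(baseline_v, measured_v)
print(f"{lost:.0%} of the baseline noise margin is gone")
```

With these assumed numbers the bus is still well above the receiver threshold, so nothing "fails" a limit check, yet a large share of the original margin is gone. That only shows up if a baseline was recorded in the first place.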


Why This Is Hard to Diagnose

Standard checks passed:

  • Continuity OK

  • Termination OK

  • Grounding OK

  • Configuration unchanged

But RS485 doesn’t fail at a threshold.
It fails statistically.

Errors increase gradually until the system becomes unreliable.
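
A rough way to see why the failure is statistical: if the noise at the receiver is modeled as zero-mean Gaussian (a simplifying assumption, since real plant noise is often bursty), the chance that a noise sample exceeds the remaining margin follows the Gaussian tail. The noise level and margin values below are assumed for illustration.

```python
import math


def bit_error_probability(margin_v, noise_rms_v):
    """Probability that zero-mean Gaussian noise exceeds the margin in one direction."""
    return 0.5 * math.erfc(margin_v / (noise_rms_v * math.sqrt(2)))


NOISE_RMS_V = 0.15  # assumed RMS noise at the receiver, in volts

# Shrinking margin (assumed values) against a fixed noise level.
for margin_v in (1.8, 1.2, 0.9, 0.6):
    p = bit_error_probability(margin_v, NOISE_RMS_V)
    print(f"margin {margin_v:.1f} V -> bit error probability {p:.1e}")
```

Under this model there is no margin at which errors suddenly begin. The probability rises by orders of magnitude as the margin shrinks, which matches a bus that degrades from "occasional retry" to "unreliable" without ever crossing a hard limit.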


The Telltale Comparison

We temporarily replaced the 07191/1/1 board with an identical spare.

Same wiring.
Same network.
Same configuration.

Result:

  • Signal amplitude increased

  • Edge transitions sharpened

  • Error rate dropped to zero

That confirmed it: the original board had suffered soft damage.


Corrective Strategy

  • Permanently replaced the degraded communication board

  • Installed additional surge protection on RS485 lines

  • Improved cable shielding at building entry points

  • Added post-storm communication quality checks

Recommended monitoring logic:

(* Flag margin degradation when the retry rate exceeds twice the healthy baseline *)
IF Retry_Rate > Baseline * 2 THEN
    Flag_Comms_Margin_Degradation := TRUE;
END_IF;

Don’t wait for total failure. Watch the trend.
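
For sites that log poll results on a supervisory PC rather than in the controller, the same rule can be sketched in Python. This is a minimal sketch under stated assumptions: the class name, window size, and baseline rate are hypothetical, and the poll results are simulated rather than read from a real RS485 master.

```python
from collections import deque


class RetryRateMonitor:
    """Tracks the retry rate over the most recent polls and compares it to a baseline."""

    def __init__(self, baseline_rate, window=1000, factor=2.0):
        self.baseline_rate = baseline_rate  # retries per poll on a healthy bus
        self.factor = factor                # 2.0 mirrors the "Baseline * 2" rule
        self.polls = deque(maxlen=window)   # 1 = poll needed a retry, 0 = clean

    def record(self, retried):
        self.polls.append(1 if retried else 0)

    @property
    def retry_rate(self):
        return sum(self.polls) / len(self.polls) if self.polls else 0.0

    def degraded(self):
        # Judge only a full window so a few early retries cannot trip the flag.
        full = len(self.polls) == self.polls.maxlen
        return full and self.retry_rate > self.baseline_rate * self.factor


monitor = RetryRateMonitor(baseline_rate=0.01, window=200)
for i in range(200):
    monitor.record(retried=(i % 20 == 0))  # simulated: one retry in every 20 polls
print(monitor.retry_rate, monitor.degraded())
```

The rolling window is a design choice: it keeps the flag tied to recent behavior, so a slow upward drift after a storm is caught even when the lifetime average still looks healthy.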


Key Engineering Lessons

  1. Surges don’t have to kill hardware to damage it

  2. Communication boards can degrade invisibly

  3. Error rate trends matter more than alarms

  4. “Still online” does not mean “still healthy”


Final Thought

The Honeywell 07191/1/1 RS485 communication board survived the surge.

But survival is not the same as full recovery.

In communication systems, lost margin is borrowed time — and it always comes due.

Robert Sinclair
