Excellent PLC Co.,Ltd

PLC and DCS professional supplier

The Bus Was Alive. That Was the Problem.

By Laura Bennett – Lead Control Systems Architect


When an RS485 network fails completely, troubleshooting is usually straightforward.
When it fails partially, responsibility becomes blurry.

This case involved a Honeywell 07191/1/1 RS485 communication board in a system where communication never fully stopped — and that made it far more dangerous than a clean failure.


What the System Was Supposed to Be

  • One RS485 master (Honeywell DCS via 07191/1/1)

  • Multiple passive field devices

  • Strict master–slave polling

  • Deterministic timing

That’s the textbook model.

That’s not what we had.


What Actually Happened

During a system expansion, a third-party skid was integrated.
The skid vendor added their own controller — connected to the same RS485 trunk.

No one noticed.

Now the bus had:

  • Two masters

  • Independent polling logic

  • No arbitration

  • No collision detection

And RS485 does not complain about this.
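Why this produces intermittent rather than constant failure can be sketched with a toy timeline. The snippet below is only an illustration: the polling periods, frame length, and start offset are invented numbers, not values from the site, and the names `Frame`, `poll_schedule`, and `collided_frames` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    master: str
    start_ms: int
    end_ms: int


def poll_schedule(master, period_ms, frame_ms, offset_ms, horizon_ms):
    """Request frames one master would send if it polled on a fixed period."""
    frames = []
    t = offset_ms
    while t + frame_ms <= horizon_ms:
        frames.append(Frame(master, t, t + frame_ms))
        t += period_ms
    return frames


def overlaps(a, b):
    """True if the two frames occupy the wire at the same time."""
    return a.start_ms < b.end_ms and b.start_ms < a.end_ms


def collided_frames(own, other):
    """Frames in `own` that overlap at least one frame in `other`."""
    return [f for f in own if any(overlaps(f, g) for g in other)]


# Assumed numbers, for illustration only.
dcs = poll_schedule("DCS", period_ms=100, frame_ms=10, offset_ms=0, horizon_ms=2000)
skid = poll_schedule("SKID", period_ms=130, frame_ms=10, offset_ms=35, horizon_ms=2000)

hit = collided_frames(dcs, skid)
print(f"{len(hit)} of {len(dcs)} DCS polls overlapped a skid poll")
print("collided DCS polls started at (ms):", [f.start_ms for f in hit])
```

With these made-up numbers, only a few DCS polls are hit, and they are spread unevenly across the window. Most polls get through, which is exactly what makes the fault look random.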


Observed Symptoms

  • Data updates stopped intermittently

  • Values froze, then jumped

  • Communication recovered on its own

  • No hardware faults reported

  • No consistent pattern

From the DCS perspective, the 07191/1/1 board was functioning.
From the bus perspective, it was fighting for control.


Why Diagnostics Didn’t Catch It

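RS485 has no concept of “ownership.” It is an electrical standard (TIA/EIA-485): it specifies drivers and receivers, not who may transmit or when. Arbitration, if any, has to come from the protocol on top.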

When two masters transmit:

  • Frames overlap

  • Replies collide

  • Devices respond unpredictably

  • Valid data occasionally slips through

To the communication board:

  • TX succeeded

  • RX activity detected

  • No electrical fault

Nothing violates the rules — because there are very few rules.
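The gap between "the link looks fine" and "the data is fine" can be shown with a small comparison. This is a sketch under assumptions: `PollResult` and its fields are hypothetical and do not represent the 07191/1/1 diagnostics interface, and the sample results are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PollResult:
    tx_ok: bool        # the request was sent without a driver error
    rx_bytes: int      # bytes seen on the line after the request
    reply_valid: bool  # reply had the expected address, length and checksum


def link_level_healthy(results):
    """The kind of check that passed here: TX worked and RX saw activity."""
    return all(r.tx_ok for r in results) and any(r.rx_bytes > 0 for r in results)


def valid_reply_ratio(results):
    """Fraction of polls that produced a complete, valid reply."""
    if not results:
        return 0.0
    return sum(r.reply_valid for r in results) / len(results)


# Invented sample: 6 clean replies, 3 truncated replies, 1 poll with no reply.
results = (
    [PollResult(True, 8, True)] * 6
    + [PollResult(True, 3, False)] * 3
    + [PollResult(True, 0, False)]
)

print("link-level healthy:", link_level_healthy(results))
print("valid reply ratio:", valid_reply_ratio(results))
```

A link-level check stays green for this sample while four polls in ten return nothing usable. Tracking the valid-reply ratio per device is one way to make partial failure visible.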


The Telltale Sign

A protocol capture revealed something subtle:

  • Overlapping request frames

  • Replies truncated mid-frame

  • The same device responding to different masters

  • Timing jitter far beyond normal limits

This was not noise.
This was contention.
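The jitter check is easy to automate once request timestamps are exported from a capture. The following is a minimal sketch, assuming the capture tool can list the start time of each request to one device in milliseconds. The timestamps, nominal period, and tolerance are invented, and the function names are hypothetical.

```python
def interval_deviations(timestamps_ms, nominal_ms):
    """Absolute deviation of each poll-to-poll interval from the nominal period."""
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return [abs(i - nominal_ms) for i in intervals]


def jitter_violations(timestamps_ms, nominal_ms, tolerance_ms):
    """(interval index, deviation) pairs where the deviation exceeds the tolerance."""
    deviations = interval_deviations(timestamps_ms, nominal_ms)
    return [(idx, dev) for idx, dev in enumerate(deviations) if dev > tolerance_ms]


# Invented capture: requests to one device, nominally every 100 ms.
request_times_ms = [0, 100, 201, 260, 400, 500]

for idx, dev in jitter_violations(request_times_ms, nominal_ms=100, tolerance_ms=5):
    print(f"interval {idx}: off by {dev} ms")
```

On a single-master bus, every request to a device comes from one poll loop, so the intervals should sit close to the nominal period. Deviations of tens of milliseconds suggest that something else is also issuing requests.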


Why the 07191/1/1 Looked Guilty

Because it was visible.

  • It was the DCS interface

  • Operators blamed the card

  • Maintenance wanted to replace it

But replacing it would not have solved anything.

The conflict lived on the wire.


Corrective Actions

Architectural correction

  • Removed second master from the RS485 trunk

  • Reconfigured the skid controller as a passive (slave) device

  • Enforced single polling authority

Configuration safeguard

(* Possible second master: we are transmitting while the line is already receiving *)
IF RS485_TX_WHILE_RX_ACTIVE THEN
    Log_Potential_Bus_Conflict();
END_IF;
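The same idea can be expressed outside the controller, for example in a gateway or test harness. This is a sketch only: `BusConflictGuard`, its methods, and the idle threshold are hypothetical, and it assumes the host can timestamp receive activity and knows when it is about to transmit.

```python
import logging

logger = logging.getLogger("rs485.guard")


class BusConflictGuard:
    """Counts transmit attempts made while the line was recently receiving."""

    def __init__(self, min_idle_ms):
        self.min_idle_ms = min_idle_ms  # assumed quiet time required before TX
        self.last_rx_ms = None
        self.conflicts = 0

    def on_rx_activity(self, now_ms):
        """Call whenever bytes are seen on the line."""
        self.last_rx_ms = now_ms

    def before_tx(self, now_ms):
        """Return True if the line has been quiet long enough, else log and count."""
        if self.last_rx_ms is not None and now_ms - self.last_rx_ms < self.min_idle_ms:
            self.conflicts += 1
            logger.warning(
                "Potential bus conflict: TX at %s ms, RX activity at %s ms",
                now_ms, self.last_rx_ms,
            )
            return False
        return True


guard = BusConflictGuard(min_idle_ms=5)
guard.on_rx_activity(now_ms=10)
print(guard.before_tx(now_ms=12))  # line was active 2 ms ago
print(guard.before_tx(now_ms=20))  # line has been quiet for 10 ms
print("conflicts logged:", guard.conflicts)
```

A check like this does not prevent collisions. It only leaves a trail, so the next investigation starts with contention as a suspect instead of the card.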

Operational policy

  • RS485 networks documented as single-master only

  • Any expansion requires communication topology review


Outcome

  • Communication stabilized immediately

  • No further freezes or jumps

  • Same 07191/1/1 board remained in service

  • No hardware replacement required


Responsibility Lesson

This was not:

  • A hardware defect

  • A firmware bug

  • A wiring issue

This was an architectural violation.

And when architecture is ignored, the communication board is usually the first thing blamed.


Final Reflection

The Honeywell 07191/1/1 RS485 communication board behaved exactly as designed.

It transmitted when told to transmit.
It listened when it could.

The real failure was assuming RS485 would enforce discipline.

It doesn’t.

Laura Bennett
