BACnet Diagnostics explanations
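Using Wireshark
For a BACnet/IP network on non-standard ports, filter on the UDP port range 47808-47823 (0xBAC0-0xBACF): use the capture filter "udp portrange 47808-47823" or the display filter "udp.port >= 47808 && udp.port <= 47823".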
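As a small illustration of that port range, the sketch below checks whether a UDP port falls inside 47808-47823. The helper name is_bacnet_port is hypothetical; it is not part of Wireshark or any BACnet library.

```python
# Hypothetical helper; only the port range itself comes from the note above.
BACNET_PORT_MIN = 0xBAC0  # 47808, the standard BACnet/IP port
BACNET_PORT_MAX = 0xBACF  # 47823


def is_bacnet_port(port: int) -> bool:
    """Return True if the UDP port is inside the 47808-47823 range."""
    return BACNET_PORT_MIN <= port <= BACNET_PORT_MAX


if __name__ == "__main__":
    for port in (47808, 47815, 47823, 47824):
        print(port, is_bacnet_port(port))
```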
Detected Circular Networks
Overview
- Circular networks happen when you have two or more routes to the same controller.
- The check looks at the Hop Count value of every packet, so the capture needs to be BACnet/IP.
- If the Hop Count drops below 10, it gets flagged as a Circular Network.
- This will always also trigger the Low Hop Count check.
- It could happen with BACnet MS/TP, but it is rarer.
- Typically, the routes are BACnet/IP and BACnet/Ethernet and both are communicating on both networks.
Hop count refers to the number of devices, usually routers, that a piece of data travels through. Each time a packet moves from one router (or device) to another, say from the router of your home network to the one just outside your county line, that is considered one hop. In BACnet, the Hop Count field starts at 255 and each router decrements it by one, so a value below 10 means the packet has been routed well over 200 times.
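The threshold logic above can be sketched as follows. This is a minimal illustration, assuming the Hop Count values have already been extracted from the capture into a list. The function names are hypothetical, and the starting value of 255 is the BACnet initial Hop Count that each router decrements by one.

```python
from typing import Iterable, List

INITIAL_HOP_COUNT = 255   # BACnet routers decrement this value by one per hop
CIRCULAR_THRESHOLD = 10   # threshold taken from the check described above


def flag_low_hop_counts(hop_counts: Iterable[int],
                        threshold: int = CIRCULAR_THRESHOLD) -> List[int]:
    """Return the indexes of packets whose Hop Count is below the threshold."""
    return [i for i, hops in enumerate(hop_counts) if hops < threshold]


def routers_traversed(hop_count: int) -> int:
    """Number of routers a packet has passed, given its remaining Hop Count."""
    return INITIAL_HOP_COUNT - hop_count
```

For example, a packet seen with a Hop Count of 9 has already been forwarded 246 times, which is far more than any sane routing path would need.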
Why it fails
- Checked both boxes on the manufacturer's controllers.
- Migrated a site from Ethernet to IP.
- Kept the Ethernet network because it is easy to connect to.
How to fix it
- A Circular Network will cause such a traffic storm that you won't be able to reach any of the controllers.
- Unplug one of the problematic controllers, isolate it from the network, and reconfigure it.
- You can then add it back onto the network.
Token Disruptions
Overview
- Indicates the number of Reply-to-Poll-for-Master frames sent in the capture. In a stable MS/TP network, there will be Poll-for-Masters every 50 token round-trips.
- "Destination" is the device that sent the Poll-for-Master.
- The highest numbered device (last device) or the last device before a break in numbering will Poll-for-Master for all numbers up to its max Poll-for-Master limit. If you don't have a set limit, it defaults to 127.
- If you don't have any new devices, there will not be any responses. If you do have new devices, they will respond with "Reply-to-Poll-for-Master".
- If a device doesn't accept the token twice in a row (device offline or token lost because of wiring), the sender will do a Poll-for-Master to identify the next device in line.
- That device will be skipped for 50 round-trips, and then when the token comes to the break in numbers it will Poll-for-Master again. If the device is back online it will "Reply-to-Poll-for-Master".
- Fails if the capture contains one or more Reply-to-Poll-for-Master frames.
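A rough sketch of this check, assuming the MS/TP frames have already been decoded into dictionaries with "type" and "source" keys (a hypothetical layout, not a Visual BACnet or Wireshark format). It reads the fail rule as "at least one reply in the capture", which is also an assumption.

```python
from collections import Counter
from typing import Dict, List


def count_poll_replies(frames: List[Dict[str, object]]) -> Counter:
    """Count Reply-to-Poll-for-Master frames per replying source MAC."""
    return Counter(
        frame["source"]
        for frame in frames
        if frame["type"] == "Reply-To-Poll-For-Master"
    )


def token_disruption_check(frames: List[Dict[str, object]]) -> str:
    """Fail if at least one Reply-to-Poll-for-Master appears in the capture."""
    total = sum(count_poll_replies(frames).values())
    return "fail" if total >= 1 else "pass"
```

The per-source counts point at the devices that keep rejoining the token ring, which is where the wiring or power investigation should start.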
Why it fails
- Token loss due to wiring.
- The device went offline.
- The device is faulty.
- A new device was added.
How to fix it
- Look at the sources to identify the devices replying to the Poll-for-Master.
- Is it a new device you just added? If so, it's OK.
- Check if it's in the Unresponsive Devices list.
- Check the wiring and see if there are any problems that may be causing the device to go on and offline.
Checksum Errors
Overview
- The packet is malformed (it's gibberish).
- On MS/TP, it typically means the packet was clobbered by the network (poor wiring).
- On IP, it can be bad wiring or electrical influence on the wire.
- This is not a BACnet router problem; it's a physical router problem. The actual router itself is failing, or perhaps you have a loose cable or power fluctuations.
Why it fails
- Bad wiring
- Failing devices (overheating, old, internal electronic problems, etc.)
- Power fluctuations
How to fix it
- Use the Source and Destination devices to isolate the path
- Check if other destinations have received "normal" packets from the source
- Look at other destinations with checksum errors to isolate the problem switch or wire
- If it's a switch, replace it
- If it's a wire, test and replace it
Duplicate Networks
Overview
- More than one BACnet router routing traffic to the same network.
- There can only be one router per segment, so each router needs a unique network number for each segment.
- When network numbers get duplicated, you end up with two routers routing traffic to the same network.
Why it fails
- Misconfigurations.
- Merging sites.
- There are multiple vendors on a site and no one to coordinate it all.
- This is strictly a logical network problem, not a physical one.
How to fix it
- Go into the network that's currently online and change the network number.
- Now it will communicate with the correct controllers, just like you want it to.
- The other router will start communicating, and you can reset it too.
- Make sure you always work with the controller that's online. Otherwise you'll have to fight to get controllers online and figure out which router is talking to which controllers.
Duplicate Device ID
Overview
- More than one device on the same BACnet Network with the same BACnet Device Instance (aka Device ID).
- Does not distinguish between different UDP ports, so if they are on different ports, it could be a false fail (have to drill down into frame info).
- Still recommended to give them all unique IDs in case you need to change your applications or reconfigure the site in the future.
Why it fails
- Bad address mapping and planning.
- DIP switch on the controller.
- Software addressing.
- Merging networks (BBMD, etc.).
- Adding controllers from other vendors.
- Factory defaults.
How to fix it
- Look at the Device ID that is duplicated, and at how many there are.
- Drill down, and cross-reference the SNET and SADR with your device map to identify which devices are duplicated.
- Ensure you have a clear Device ID naming convention and map.
- Fix them: Give one a different address or put it offline so you can identify it, then start re-commissioning.
Duplicate BBMD Detection
Overview
- A BACnet Broadcast Management Device (BBMD) sends a unicast message from one BACnet or IP device on a subnet to other subnets.
- When the message gets to the destined subnet, the message is rebroadcast.
- In a duplicate BBMD, multiple devices on the same network are set as BBMDs, and are rebroadcasting the same messages.
Why it fails
- A setup issue, typically on a mixed vendor site.
- The first vendor sets up multiple networks and connects them with BBMDs.
- A site upgrade or change takes place, a new vendor wins the contract, and they add devices to each network.
- To connect the devices, they put a BBMD on each subnet.
- This can continue, resulting in two, three, or more BBMDs on each subnet, and double or triple the traffic. There should only be one BBMD per network.
How to fix it
- Find all the problem vendors using the IP addresses, vendor identifier, and MAC address in Visual BACnet.
- Isolate networks if you don't need integration.
- If you do need integration, assign one vendor and their BBMDs to the task.
- A good rule of thumb is to use one vendor's BBMD, as it's easier to make site-wide changes.
Duplicate Source Address
Overview
- Similar to Duplicate Device ID, but instead looks at the SNET and SADR.
- This fails if more than one device sends an I-Am with the same Source Network and Source Address.
- In Duplicate Device ID, the DIP switch settings between two devices are different, but the Device IDs are the same. In Duplicate Source Address, you have different Device IDs, but the DIP switch settings are the same.
- If they are on the same network and have the same MAC, this will fail (these are used to derive the Source Address).
- More likely to occur in MS/TP, unlikely in IP and Ethernet (possible, but very unlikely).
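The difference between the two duplicate checks can be sketched with two groupings over the same I-Am data. This is a minimal illustration: the IAm record and find_duplicates function are hypothetical names, and the fields are assumed to have been pulled out of the capture already.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class IAm(NamedTuple):
    device_id: int   # BACnet Device Instance
    snet: int        # Source Network
    sadr: str        # Source Address (derived from the MAC)


def find_duplicates(i_ams: Iterable[IAm]):
    """Return (duplicate Device IDs, duplicate source addresses)."""
    by_id: Dict[int, Set[Tuple[int, str]]] = defaultdict(set)
    by_addr: Dict[Tuple[int, str], Set[int]] = defaultdict(set)
    for msg in i_ams:
        by_id[msg.device_id].add((msg.snet, msg.sadr))
        by_addr[(msg.snet, msg.sadr)].add(msg.device_id)
    # Same Device ID announced from more than one address.
    dup_ids = {d: addrs for d, addrs in by_id.items() if len(addrs) > 1}
    # Same SNET/SADR announcing more than one Device ID.
    dup_addrs = {a: ids for a, ids in by_addr.items() if len(ids) > 1}
    return dup_ids, dup_addrs
```

The first result corresponds to Duplicate Device ID (same ID, different addresses) and the second to Duplicate Source Address (same address, different IDs).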
Why it fails
- IP or Ethernet: incorrect factory programming.
- Incorrect MS/TP segment setup.
How to fix it
- On an MS/TP device, change the MAC Address
- On an IP or Ethernet device, move it to a different network or replace the device
Unresponsive Routers
Overview
- Similar to Unresponsive Devices, but here the router that is responsible for a network doesn't reply with I-Am-Router-To-Network (with the network specified).
- Does not take into account a Global Who-Is-Router-To-Network (there is no network specified, so the check can't tell if one is missing).
Why it fails
- The network doesn't exist.
- The router's offline (power or network connection).
- The router's busy (hammered with packets).
- The network number was changed (but the source doesn't know that).
How to fix it
- Reset the source device.
- Get the network number, which you can cross-reference with your site map to figure out which router is responsible for that network.
- See if it's online and stable.
- Check to see if the network number changed.
- Look at how much traffic is going through it (traffic where it's the source and the destination).
Busy Routers
Overview
- Devices and networks can be configured to send a Router-Busy-To-Network message based on certain thresholds.
- When that router is busy, it will send a Router-Busy-To-Network message to any networks trying to talk to it.
- When it is no longer busy, it will send a Router-Available-To-Network message, clearing the previously busy signal.
- The diagnostic check fails if one or more routers send out a Router-Busy-To-Network message.
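The busy/available behaviour above amounts to a small piece of state tracking. The sketch below assumes the network-layer messages have been reduced to (router, message type) pairs in capture order; the function name and data layout are hypothetical.

```python
from typing import Iterable, Set, Tuple


def busy_router_check(messages: Iterable[Tuple[str, str]]):
    """messages: (router, message_type) pairs in capture order.

    Returns (check result, routers still busy at the end of the capture).
    """
    ever_busy: Set[str] = set()
    still_busy: Set[str] = set()
    for router, msg_type in messages:
        if msg_type == "Router-Busy-To-Network":
            ever_busy.add(router)
            still_busy.add(router)
        elif msg_type == "Router-Available-To-Network":
            # Clears the previously announced busy state.
            still_busy.discard(router)
    return ("fail" if ever_busy else "pass"), still_busy
```

Routers left in the second set never announced that they were available again, so they are the first ones worth inspecting.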
Why it fails
- Too much broadcast or directed/unicast traffic.
- The amount depends on the specific vendor, hardware capabilities, and configuration.
- The router is servicing too many networks.
- The router is servicing other things (firmware, etc.)
How to fix it
- Look at how busy the network is, and which devices are sending the most traffic to it.
- If those devices are sending traffic as they should, look at the router.
- See if the router is chewing up its resources.
- See what it's trying to service (events or other networks).
- Check if it's trying to send data, or if a trend log is too big.
Reject-to-Networks
Overview
- Shows the number of networks with at least one Reject-Message-To-Network message in the capture.
- The router responds with a Reject-Message-To-Network if it doesn't like the packet. Fails if there is a Reject-Message-To-Network for one or more networks.
Why it fails
- Device is badly configured.
- Proprietary device is talking on the network.
- Router cannot reach the destination network.
- Non-standard BACnet information is being communicated.
- The router firmware was updated, but changes haven't taken effect.
How to fix it
- Identify the reject reason.
- Identify the source device that is sending the message.
- Ensure the router in the source network and any routers in between the source and the destination network have the latest information.
- Reset the device.
Standard Deviation of Token Round-Trip Time
Overview
- This shows how much the round-trip time of a token is changing.
- A larger number means more fluctuation in round-trip time.
- Large Standard Deviation shows problems that are inconsistent, changing every trip.
- Fail if: the fluctuation is more than 0.5ms.
- Warning if: the fluctuation is more than 0.1ms.
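A minimal sketch of the thresholds above, assuming the round-trip times are already available in milliseconds. Whether the tool uses the population or the sample standard deviation is not stated; this sketch assumes the population form (statistics.pstdev).

```python
from statistics import pstdev
from typing import Sequence

WARN_MS = 0.1   # warning threshold from the check above
FAIL_MS = 0.5   # fail threshold from the check above


def round_trip_std_dev_check(round_trips_ms: Sequence[float]) -> str:
    """Classify the fluctuation of token round-trip times (needs >= 1 value)."""
    spread = pstdev(round_trips_ms)  # assumption: population standard deviation
    if spread > FAIL_MS:
        return "fail"
    if spread > WARN_MS:
        return "warning"
    return "pass"
```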
Why it fails
- Token disruptions, caused by:
- Bad wiring.
- Devices coming on and offline.
- A device that is speaking for a long time inconsistently.
How to fix it
- Check for token disruptions.
- If there are token disruptions, look for offline devices or check wiring.
- If there are no token disruptions, find the pair with the largest standard deviation (sort the Std. Dev. (ms) column).
- Most commonly the destination is not accepting it.
- Could also be that the source is chatty and holding the token too long.
- Check to ensure the destination is online, configured and wired properly.
Average Token Round-Trip Time
Overview
- Amount of time it takes for a token to complete one full trip.
- Shows the path the token took.
- Time is measured from when a source receives the token to when it receives it again.
- Average of all the token round trips for every master in the capture.
- Long average round-trip time shows problems that may be consistent every trip.
- Fails if Average Token Round-Trip time is more than 2000ms (2 seconds).
- Warning if Average Token Round-Trip time is more than 85ms (0.085 seconds).
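The measurement described above can be sketched as follows, assuming that for each master we have the list of timestamps (in milliseconds) at which it received the token. The function names and the dictionary layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List

WARN_MS = 85.0     # warning threshold from the check above
FAIL_MS = 2000.0   # fail threshold from the check above


def round_trips_ms(receive_times_ms: List[float]) -> List[float]:
    """Time between consecutive receptions of the token by one master."""
    return [b - a for a, b in zip(receive_times_ms, receive_times_ms[1:])]


def average_round_trip_check(times_by_master: Dict[int, List[float]]):
    """Average every round trip of every master, then apply the thresholds.

    Raises statistics.StatisticsError if no master completed a round trip.
    """
    trips = [t for times in times_by_master.values()
             for t in round_trips_ms(times)]
    average = mean(trips)
    if average > FAIL_MS:
        status = "fail"
    elif average > WARN_MS:
        status = "warning"
    else:
        status = "pass"
    return average, status
```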
Why it fails
- Problems with wiring.
- There may be a device offline.
- Token disruptions (caused by wiring or devices that are offline).
- Too many masters in the network using the token, holding onto token longer because they have more to say.
- Device(s) that are trying to talk too much.
How to fix it
- Check for token disruptions.
- If there are token disruptions, look for offline devices or check wiring.
- If no token disruptions, sort the drill down to find the device pair with the longest token passing time (sort using Mean (ms))
- Check the configuration of the devices with the longest time with the token.
- Check how busy the device is; it could be overwhelmed
Longest Response Time
Overview
- Indicates the longest time between a request and a corresponding response in the entire capture.
Why it fails
- The router is busy/overloaded
- There are too many hops (too many routers in between the requesting device and the responding device)
- The network is congested
- The destination controller is busy
How to fix it
- Identify the path between the requesting and responding device for each slow response time (using your network drawing)
- Look at those routers in the BACnet browser and figure out what traffic is passing through them
- Check the responding device, see if the device is busy.
Missing ACKs
Overview
- A Confirmed-Request needs an acknowledgement, or an Abort, Reject, or Error. If you don't get any of those, it's a Missing ACK.
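Matching requests to replies can be sketched as below. This is an illustration under assumptions: the Pdu record is hypothetical, and requests are paired with replies by source, destination and invoke ID. The same pass also yields the longest request-to-response time, which is what the Longest Response Time check reports.

```python
from typing import Dict, Iterable, NamedTuple, Optional, Tuple

REPLY_TYPES = {"ACK", "Abort", "Reject", "Error"}


class Pdu(NamedTuple):
    time_s: float
    source: str
    destination: str
    invoke_id: int
    kind: str  # "Confirmed-Request" or one of REPLY_TYPES


def match_requests(pdus: Iterable[Pdu]):
    """Return (longest response time in seconds or None, unanswered requests)."""
    pending: Dict[Tuple[str, str, int], Pdu] = {}
    longest: Optional[float] = None
    for pdu in sorted(pdus, key=lambda p: p.time_s):
        if pdu.kind == "Confirmed-Request":
            pending[(pdu.source, pdu.destination, pdu.invoke_id)] = pdu
        elif pdu.kind in REPLY_TYPES:
            # A reply travels in the opposite direction of its request.
            key = (pdu.destination, pdu.source, pdu.invoke_id)
            request = pending.pop(key, None)
            if request is not None:
                delay = pdu.time_s - request.time_s
                longest = delay if longest is None else max(longest, delay)
    # Whatever is still pending never got an ACK, Abort, Reject, or Error.
    return longest, list(pending.values())
```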
Why it fails
- The destination device is offline.
- The router that it goes through is offline or busy
How to fix it
- Check if the destination device is unresponsive.
- If it's unresponsive, then that's why you have a missing ACK. Otherwise, check what's in between.
- Check unresponsive devices, to see if they're going online and offline.
Global Who-Is Router
Overview
- A router looking for a router to one or more networks.
- It sends out a "Who-Is Router to Network" without specifying the network number, and all routers in the network respond with I-Am Router.
- If the router specifies the network, it's not included in this check.
- Who-Is Router and I-Am Router are both broadcast, so on a large network if they all respond at the same time, it causes a lot of traffic.
- Fails if we see one or more sources sending out more than one Global Who-Is Router.
Why it fails
- The router or controller was offline, came back online, and is trying to figure out if its information is up to date.
- The router is configured to periodically ask for this.
How to fix it
- Identify source device sending the Global Who-Is Router.
- Check to see if router is stable and staying online.
Global Who-Is
Overview
- Devices use the Who-Is service (broadcast) to discover other BACnet devices on the network.
- Usually it's targeted at a specific device, but in some cases devices broadcast a Global Who-Is to the whole network, and all devices respond with I-Ams (broadcast).
- Global Who-Is should never be sent automatically, or should be used very sparingly so as not to flood the network.
- Fails if a single source sends more than one Global Who-Is during a capture.
- Any segmented Who-Is is included as a Global Who-Is.
- This is a way to break up the requests into many smaller requests that look for ranges with the entire BACnet device ID range.
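The fail rule above (a single source sending more than one Global Who-Is) can be sketched as a per-source count. The input is assumed to be the sender of each Global Who-Is already identified in the capture; the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def global_who_is_offenders(sources: Iterable[str],
                            limit: int = 1) -> List[Tuple[str, int]]:
    """sources: the sender of each Global Who-Is seen in the capture.

    Returns the senders that exceeded the limit, busiest first.
    """
    counts = Counter(sources)
    return [(src, n) for src, n in counts.most_common() if n > limit]
```

A non-empty result means the check fails, and the list gives the order in which to look at the sources.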
Why it fails
- Misconfiguration of the BMS (constantly or regularly discovering the network).
- This could be a setting or incorrect default.
- Poor implementation of the discover process.
- Good intentions can cause this problem. Keeping an accurate BACnet device count is crucial to a network, especially with security, but the network can pay for that.
- Manual global discovery during the capture.
- This is a great way to get as much BACnet device information as possible for troubleshooting. Be careful, though, as it can have drastic effects on the network. As the network's size increases, this becomes more problematic.
- General
- Many systems will do this check to make sure that the BACnet device list is up to date. There are other ways to do this, such as a targeted Who-Is, but every system will need to find the controllers on a network.
How to fix it
- Identify the source device causing the problems.
- Determine which device and how often it is sending it.
- Is there a regular frequency? Is it random?
- Is it the same source every time or multiple sources? It could be different servers or vendors.
- Check the configuration on the source device(s) sending the Global Who-Is.
- Is it normal? Can it be turned off or reduced?
- If manually triggered, note the time and the source so you can easily track it throughout the system.
Global Private-Transfers
Overview
- Checks if there's a global broadcast of a proprietary service.
- Warn: 10 sources that send out a global broadcast of proprietary service.
- Fail: 300 sources that send out a global broadcast of proprietary service.
Why it fails
- There are too many devices sending out global broadcast of proprietary service.
How to fix it
- Need to know what they're sending.
- Get a list of the sources.
- Remove unnecessary sources.
BACnet Broadcast Traffic
Overview
- Looks at the overall average rate of Broadcast Traffic over the length of the capture.
- Divides the total number of broadcast packets by the length of the capture.
- Fails if: there is an average of more than 10 pps.
- Warning if: the average is 1 to 9.99 pps.
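The rate calculation above is simple arithmetic: packets divided by capture length. A minimal sketch follows; the function name is hypothetical, and treating a rate of exactly 10 pps as a warning is an assumption, since the thresholds above leave that boundary unspecified.

```python
def broadcast_rate_check(broadcast_packets: int, capture_seconds: float):
    """Return (average broadcast packets per second, check result)."""
    if capture_seconds <= 0:
        raise ValueError("capture length must be positive")
    pps = broadcast_packets / capture_seconds
    if pps > 10:
        status = "fail"
    elif pps >= 1:
        status = "warning"  # assumption: exactly 10 pps counts as a warning
    else:
        status = "pass"
    return pps, status
```

For example, 1,200 broadcast packets in a 60-second capture average 20 pps, which fails the check.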
Why it fails
- Devices were left at their defaults.
- Devices weren't reconfigured as the site scaled up.
- Common things that can broadcast:
- Time
- Events
- Etc.
How to fix it
- Identify the device sending Broadcast Traffic.
- Manually direct the traffic to the appropriate destination devices, rather than broadcasting it.
Unresponsive Devices
Overview
- If a Who-Is is sent to a device, and no I-Am is received, that device is flagged as unresponsive.
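A rough sketch of that rule, assuming the targeted Who-Is requests have been reduced to (requester, target Device ID) pairs and the I-Ams to a list of Device IDs. Both layouts and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def unresponsive_devices(who_is: Iterable[Tuple[str, int]],
                         i_am_device_ids: Iterable[int]) -> Dict[int, Set[str]]:
    """Map each unanswered target Device ID to the requesters asking for it."""
    answered = set(i_am_device_ids)
    missing: Dict[int, Set[str]] = defaultdict(set)
    for requester, target in who_is:
        if target not in answered:
            missing[target].add(requester)
    return dict(missing)
```

Keeping the requesters alongside each unresponsive device makes it easier to find the programming or graphics link that keeps asking for a device that is not there.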
Why it fails
- Bad programming links.
- Bad graphics links (phantom devices).
- Incorrect Data Exchange.
- No power (unplugged, failed, or intermittent power supply).
- No network (not physically plugged in, wire broken, intermittent).
- Loss of communication.
- Too many devices on the same transformer.
- Router above it is offline or overloaded (*Check for Unresponsive Routers first).
How to fix it
- Check to ensure there are no Unresponsive Routers first. If there are, fix them. (This is very common)
- Identify the devices that are unresponsive.
- Check if they are physically there (look at a list or check in person).
- If they are there and are supposed to be there, put them back online.
- If they aren't supposed to be there, trace the source in Visual BACnet, and fix the device that keeps asking for it.
- If it keeps going offline, track how much traffic is being sent to the destination to see if it's being overloaded (use the graphs).
Error Responses
Overview
- This comes from Confirmed-Request.
- When there are problems with the request or with the device that is servicing the request, the response can be either Error, Abort, or Reject.
- For each of those, there will be a list of reasons associated with it.
Why it fails
- Depending on the response (Error, Abort, or Reject), there will be a list of reasons provided.
How to fix it
- Look to the list of reasons provided and proceed accordingly.
Alarm
Overview
- The initiator (source) of the request is the device that asks another device (destination) for a summary of active alarms.
- This is primarily done from the server.
- The alarms belong to the device being queried.
- In Visual BACnet under Alarms, the destinations are the devices with alarm states (which are found in the second table).
- Warning if more than 1 GetAlarmSummary is sent during capture
- Fail if more than 8 GetAlarmSummaries are sent during capture
Why it fails
- Software misconfiguration
- Manually triggered GetAlarmSummary request
- Multiple front ends
How to fix it
- Check configuration to see how often it is asking for the GetAlarmSummary
- Set confirmed notification when alarm gets sent
- Reduce the amount of traffic on the network
- Add more routing
Write-Property Traffic
Overview
- Writing data to another controller. Writing to an object to do an action.
- Checks for excessive writes to any object.
- For security reasons, you shouldn't have a lot of writes.
Why it fails
- Misconfigured programming: devices doing Write-Properties unnecessarily.
- Incorrect data exchange.
- Bad device selection (e.g. device with output doesnât have programming space).
- A configurable device has an output on it.
- Cybersecurity attack.
How to fix it
- Identify who is sending the Write-Property traffic.
- See if you can change the write to a read.
- If not, reduce the amount of Write-Property traffic.
Read-Property Traffic
Overview
- Check for Read-Property service.
- Commonly used for polling (every 30 seconds controller goes to get data from other devices, instead of them sending the data to the controller) and COV.
- Some devices do not support COV, so they would use Read-Property to get the latest value.
Why it fails
- The check fails because there's too much Read-Property Traffic.
How to fix it
- If it's because of polling, you can reduce the rate at which you poll.
- Do you really need this data?
- Consider COV instead of Read-Property.
Confirmed-COV Traffic
Overview
- There is too much Change of Value (COV) notification traffic, usually because the COV thresholds are too sensitive.
- For example, when the lights turn on, there's a response back saying the lights turned on. Or, the temperature fluctuates by a certain amount, and a notification gets sent.
- The recipient has to send back an acknowledgement that it received the Change of Value.
- It's a configuration issue that can create a lot of traffic.
Why it fails
- Configuration issue, where the value has increments that are too small. E.g. Change of Value should be 2° instead of 0.001°.
- Leftover commissioning trend logs.
- Getting Confirmed-COVs for irrelevant changes. E.g. a confirmation on temperature, or lighting. Usually you want a Confirmed COV with outputs, not inputs.
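The effect of the increment can be illustrated with a small simulation. This sketch assumes a notification is sent whenever the value has moved by at least the increment since the last reported value; the function name and the sample readings are made up for illustration.

```python
from typing import Iterable, Optional


def count_cov_notifications(readings: Iterable[float], increment: float) -> int:
    """Count notifications triggered by a series of readings."""
    last_reported: Optional[float] = None
    sent = 0
    for value in readings:
        if last_reported is None:
            last_reported = value  # baseline reading, not counted
        elif abs(value - last_reported) >= increment:
            last_reported = value
            sent += 1
    return sent


if __name__ == "__main__":
    temperatures = [21.0, 21.1, 21.3, 21.2, 22.0, 23.5, 23.4]  # made-up data
    print(count_cov_notifications(temperatures, 0.001))  # tiny increment
    print(count_cov_notifications(temperatures, 2.0))    # coarser increment
```

With these sample readings, an increment of 0.001 reports every one of the six changes, while an increment of 2 reports only one.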
How to fix it
- Find the device that's sending out too many COVs, and reconfigure the device.
- Change the increments.
- Evaluate if Confirmed-COV is needed.
Unconfirmed-COV Traffic
Overview
- There is too much Change of Value (COV) notification traffic, usually because the COV thresholds are too sensitive.
- For example, when the lights turn on, there's a response back saying the lights turned on. Or, the temperature fluctuates by a certain amount, and a notification gets sent.
- No confirmation of receipt (different from Confirmed-COV).
- It's a configuration issue that can create a lot of traffic.
Why it fails
- Configuration issue, where the value has increments that are too small. E.g. Change of Value should be 2° instead of 0.001°.
- Leftover commissioning trend logs.
- Getting Unconfirmed-COVs for irrelevant changes. E.g. a confirmation on temperature, or lighting. Usually you want an Unconfirmed COV with outputs, not inputs.
How to fix it
- Find the device that's sending out too many COVs, and reconfigure the device.
- Change the increments.
- Evaluate if Unconfirmed-COV is needed.
Confirmed-Event Traffic
Overview
- There is too much Confirmed-Event traffic, usually because the event thresholds are too sensitive.
- For example, if temperature triggers an event threshold, Confirmed-Event is triggered and sent.
- The recipient has to send back an acknowledgement that it received the Event.
- It's a configuration issue that can create a lot of traffic.
Why it fails
- Too many Confirmed-Event triggers, or too often. Both will trigger a check failure.
How to fix it
- It's a configuration issue.
- Change the event trigger and time.
- See if that event trigger is really needed, or if it can be turned off.
- Could also change the threshold for the trigger.
Unconfirmed-Event Traffic
Overview
- There is too much Unconfirmed-Event traffic, usually because the event thresholds are too sensitive.
- For example, if temperature triggers an event threshold, Unconfirmed-Event is triggered and sent.
- No confirmation of receipt required.
- It's a configuration issue that can create a lot of traffic.
Why it fails
- Too many Unconfirmed-Event triggers, or too often. Both will trigger a check failure.
How to fix it
- It's a configuration issue.
- Change the event trigger and time.
- See if that event trigger is really needed, or if it can be turned off.
- Could also change the threshold for the trigger.
BACnet Buffer Full Broadcasts
Overview
- Trend logs only.
- The device that is writing to the log is saying that the log is ready to be read, and it hasn't been read.
- Any traffic going into the log either doesn't go in, or it overwrites what was already there. This leads to gaps in data, either at the end of your data, or the front of your data.
- Traffic problems are causing BACnet Buffer Full problems, because the Buffer cannot be emptied. It cannot be emptied because it has not been read.
Why it fails
- The log is not being read.
- The archiver is not online, or somehow cannot reach the controller.
- The controller's limits. Every embedded device has X amount of memory, so you can only put so much data on it.
- There might be traffic issues.
- It could be a configuration issue, if the trend log is too small. Maybe you're collecting only five entries instead of 2,000. That would cause this notification to go off all the time.
- There should be a cleanup service. If there isnât, the Buffer will continually fill up.
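To see why a small trend log triggers the notification so often, the arithmetic can be sketched as below. The 60-second logging interval is an assumed value for illustration only, and the function name is hypothetical.

```python
def seconds_until_full(buffer_entries: int, log_interval_s: float) -> float:
    """Time for an empty trend log to fill when one entry is logged per interval."""
    return buffer_entries * log_interval_s


if __name__ == "__main__":
    interval_s = 60.0  # assumed logging interval, not from the source
    for entries in (5, 2000):
        minutes = seconds_until_full(entries, interval_s) / 60
        print(f"{entries} entries fill in {minutes:.0f} minutes")
```

At that interval a five-entry log fills every 5 minutes, while a 2,000-entry log takes 2,000 minutes (about 33 hours), giving the archiver far more time to read it.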
How to fix it
- Identify which device it was. Look to the source.
- Check the archiver.
- Check your configurations: how much data it's storing, and how fast it's filling up (in Visual BACnet, pps = how often it's been sending out the notification on average).
- Check the router. If the router's full or busy, it might not be able to read the log.