mDNSResponder failing Bonjour Conformance Test

I'm using the following:

  • mDNSResponder 1790.80.10
  • Bonjour Conformance Test (BCT) 1.5.2
  • Linux 6.1.y kernel

I'm testing an AirPlay 2 speaker as part of our self-certification.

When BCT gets into the mDNS tests, mDNSResponder fails the subsequent conflict test with this message:

ERROR 2023-06-12 10:37:29.398711-0500 _sub_conflict 03570: Device did not complete its probing sequence for a new name after a subsequent conflict arose for its previously acquired name.

BCT then retries three times with each retry failing with the same message.

Am I missing something in my software that interacts with the mDNS daemon? Is this a known issue with the POSIX build of mDNSResponder? What can I do to get this test to pass?

Any help would be appreciated.

Ethan

Answered by ewhite-brane in 756570022

Hi,

The first thing to do with this sort of failure is to capture a packet trace of the failure and look at what actually happened. The reason the BCT includes such specific setup instructions is that the test itself isn't particularly "smart". For example, if you run the BCT on an "active" network with multiple devices, many of the tests will always fail. Similarly, the failure that's printed by the BCT isn't always a great description of what actually happened on the network. The BCT is basically built around the "correct" flow and, in many cases, doesn't really try to directly describe EXACTLY why the test failed. The test basically assumes that the ONLY Bonjour traffic that will occur is traffic that's part of the test, so the "extra" packets from other devices will immediately cause failures.

In any case, from past experience, many failures are caused by this sort of issue. For example, if your accessory or the test computer generates an "extra" Bonjour packet, that will then fail many of the tests. With luck, the details of the traffic that actually caused the failure will point toward the underlying issue.

-Kevin Elliott
DTS Engineer, CoreOS/Hardware

Accepted Answer

Taking Kevin's suggestion to inspect the network traffic during the failing test I found mdnsd to be sending out two defensive probes when it should have only been sending one. The change detailed below allows mdnsd to pass the mDNS subsequent conflict test.

--- a/mDNSCore/mDNS.c
+++ b/mDNSCore/mDNS.c
@@ -10959,7 +10959,7 @@ mDNSlocal void mDNSCoreReceiveResponse(mDNS *const m, const DNSMessage *const re
                                         rr->ProbingConflictCount++;
                                         rr->LastConflictPktNum = m->MPktNum;
                                         if (ResponseMCast && (!intf || intf->SupportsUnicastMDNSResponse) &&
-                                            (rr->ProbingConflictCount <= kMaxAllowedMCastProbingConflicts))
+                                            (rr->ProbingConflictCount < kMaxAllowedMCastProbingConflicts))
                                         {
                                             LogMsg("mDNSCoreReceiveResponse: ProbeCount %d; restarting probing after %d-tick pause due to possibly "
                                                 "spurious multicast conflict (%d/%d) via interface %d for %s",
-- 

Two comments here:

  1. Once you're sure that it's a bug (see below), please file a bug on this and post the bug number here.

  2. I would start from the assumption that something "outside" of mDNSResponder is causing the ultimate failure, not that this is simply a bug in mDNSResponder. As I'm sure you've noticed, the RFCs around Bonjour are extremely "intricate", expecting very specific and precise behavior from all the devices on the network. Indeed, the main reason the BCT exists is that small shifts in behavior outside of the expected "boundary" tend to either cause unacceptable network "noise", complete failure, or both. Modifying mDNSResponder requires a very complete understanding of both the underlying specifications AND its full implementation, otherwise you risk introducing new/unexpected problems.

Looking at the specific code you modified, you're basically disabling that entire if statement. ProbingConflictCount is basically always "0" (or higher) and kMaxAllowedMCastProbingConflicts is "1", so your code edits down to:

...
	rr->ProbingConflictCount++;
	//0 + 1 = 1
	...
	
	(rr->ProbingConflictCount < kMaxAllowedMCastProbingConflicts))
	//1 < 1 = false
...

...which means you're basically deleting an entire "if" statement from mDNSResponder. That's not something I would be confident doing, given the complexity involved. Have you enabled the logging in mDNSResponder? Are you sure two probes are coming out of that if statement? And, if so, what are the specifics it's logging? Just looking at the code, I'm curious if there might be an interface-level issue which is causing it to emit multiple probes because it thinks there are different interfaces involved (but that's just a blind guess).
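To make the arithmetic above concrete, here is a minimal C sketch of just the counter comparison. It is not mDNSResponder code: FakeRecord and restarts_probing are hypothetical names, the other conditions in the real if statement (ResponseMCast, the interface check) are assumed to be true, and the counter is assumed to start at 0 with kMaxAllowedMCastProbingConflicts equal to 1, as described above.

```c
#include <stdbool.h>

/* Value quoted in this thread; treated here as an assumption about the build. */
#define kMaxAllowedMCastProbingConflicts 1

/* Hypothetical stand-in holding only the field that matters for this check. */
typedef struct { int ProbingConflictCount; } FakeRecord;

/* Returns true when the "restart probing" branch would be taken for this
   conflict. use_patched selects '<' (the proposed patch) instead of '<='
   (the original code). All other conditions are assumed to hold. */
static bool restarts_probing(FakeRecord *rr, bool use_patched)
{
    rr->ProbingConflictCount++;
    return use_patched
        ? (rr->ProbingConflictCount <  kMaxAllowedMCastProbingConflicts)
        : (rr->ProbingConflictCount <= kMaxAllowedMCastProbingConflicts);
}
```

Under these assumptions, the original comparison takes the restart branch on the first conflict only (1 <= 1) and falls through on the second (2 <= 1 is false), while the patched comparison never takes it (1 < 1 is false on the very first conflict).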

-Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hi Kevin,

I appreciate your help and interest in my problem. I agree the mDNS RFC and its implementation are complex and I am not confident in making changes to mDNSResponder. I have some images (before and after the proposed 'fix') of the mDNS subsequent conflict test. For these tests I disabled mDNS over IPv6 to cut down on the captured packet clutter. The packets were captured on the same PC running the BCT.

First the failing test with BCT output:

START (SUBSEQUENT CONFLICT)
NOTICE  2023-06-15 17:23:46.533867-0500        _sub_conflict 03498: Sending conflicting announcements for Brane-10A51D4ACE28-400.local.
ERROR   2023-06-15 17:23:47.753928-0500        _sub_conflict 03570: Device did not complete its probing sequence for a new name after a subsequent conflict arose for its previously acquired name.

FAILED (SUBSEQUENT CONFLICT)

Packet no. 277 is the conflict. Packets no. 279 and 282 are defensive probes. Note that the test fails when the second defensive probe is received.
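When reading a trace like this one, it can help to separate probes from announcements mechanically rather than by eye. The following is a rough C sketch that classifies a raw mDNS UDP payload from its 12-byte DNS header alone, relying on the convention that a probe is a query carrying its proposed records in the Authority section. The names PktKind, be16 and classify_mdns are hypothetical and are not part of mDNSResponder or the BCT.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { PKT_TOO_SHORT, PKT_PROBE, PKT_QUERY, PKT_RESPONSE } PktKind;

/* Read a big-endian 16-bit field from the DNS header. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Classify an mDNS message using only header fields:
   QR bit set                      -> response (announcements and conflicts)
   query with QDCOUNT and NSCOUNT  -> probe
   any other query                 -> ordinary query */
static PktKind classify_mdns(const uint8_t *payload, size_t len)
{
    if (len < 12) return PKT_TOO_SHORT;           /* DNS header is 12 bytes */
    int is_response  = (payload[2] & 0x80) != 0;  /* QR is the top flag bit */
    uint16_t qdcount = be16(payload + 4);         /* question count */
    uint16_t nscount = be16(payload + 8);         /* authority record count */
    if (is_response) return PKT_RESPONSE;
    if (qdcount > 0 && nscount > 0) return PKT_PROBE;
    return PKT_QUERY;
}
```

This only looks at the header, so it cannot tell which name a probe is for; matching the name against the one under test would still require parsing the question section.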

Second the passing test with BCT output:

START (SUBSEQUENT CONFLICT)
NOTICE  2023-06-20 13:06:37.668160-0500        _sub_conflict 03498: Sending conflicting announcements for Brane-10A51D4ACE28-581.local.
NOTICE  2023-06-20 13:06:38.836215-0500    recv_announcement 03419: Received announcement for Brane-10A51D4ACE28-670.local.
NOTICE  2023-06-20 13:06:38.836316-0500        _sub_conflict 03598: Device acquired new name Brane-10A51D4ACE28-670.local.
NOTICE  2023-06-20 13:06:48.841450-0500        _sub_conflict 03498: Sending conflicting announcements for edesk (128)._brane-link._tcp.local.
NOTICE  2023-06-20 13:06:54.201280-0500    recv_announcement 03419: Received announcement for edesk (210)._brane-link._tcp.local.
NOTICE  2023-06-20 13:06:54.201382-0500        _sub_conflict 03598: Device acquired new name edesk (210)._brane-link._tcp.local.
PASSED (SUBSEQUENT CONFLICT)

Packet no. 404 is the conflict. Packet no. 405 is the defensive probe. Packet 407 is the DUT releasing the record. Note that the DUT running this modified mdnsd passes all the mDNS (ipv4 and ipv6) related tests.

I'm interested in collecting more data to better understand and fix this problem if you have any suggestions.

Regards,

Ethan

I'm doing something similar, running the BCT against an AirPlay 2 device, and I get: "WARNING 2023-10-08 14:04:10.153885-0500 init_probing 00305: Have not received initial probe from device. Listening...". The AirPlay 2 device plays music fine via Bonjour, but the BCT never receives an initial probe from it.
