Reproducible EXC_BAD_ACCESS in NEDNSProxyProvider when using async/await variants of NEAppProxyUDPFlow

Description

I am seeing a consistent crash in an NEDNSProxyProvider on iOS after migrating from completion handlers to the new Swift Concurrency async/await variants of readDatagrams() and writeDatagrams() on NEAppProxyUDPFlow.

The crash occurs inside the Swift Concurrency runtime during task resumption. Specifically, it seems the Task attempts to return to the flow’s internal serial executor (NEFlow queue) after a suspension point, but fails if the flow was invalidated or deallocated by the kernel while the task was suspended.

Error Signature

Thread 4: EXC_BAD_ACCESS (code=1, address=0x28) 
Thread 4 Queue : NEFlow queue (serial)
#0	0x000000018fe919cc in swift::AsyncTask::flagAsAndEnqueueOnExecutor ()
#9	0x00000001ee25c3b8 in _pthread_wqthread ()

Steps

The crash is highly timing-dependent. To reproduce it reliably:

  1. Use an iOS device with Developer Settings enabled.

  2. Go to Settings > Developer > Network Link Conditioner and enable the High Latency DNS profile.

  3. Intercept a DNS query and perform a DoH (DNS-over-HTTPS) request using URLSession.

  4. The first few network requests should trigger the crash.

Minimum Working Example (MWE)

import Foundation
import Network
import NetworkExtension

class DNSProxyProvider: NEDNSProxyProvider {
    override func handleNewFlow(_ flow: NEAppProxyFlow) -> Bool {
        guard let udpFlow = flow as? NEAppProxyUDPFlow else { return false }
        
        Task(priority: .userInitiated) {
            await handleUDPFlow(udpFlow)
        }
        return true
    }
    
    func handleUDPFlow(_ flow: NEAppProxyUDPFlow) async {
        do {
            try await flow.open(withLocalFlowEndpoint: nil)
            
            while !Task.isCancelled {
                // Suspension point 1: Waiting for datagrams
                let (flowData, error) = await flow.readDatagrams()
                if let error { throw error }
                guard let flowData, !flowData.isEmpty else { return }
                
                var responses: [(Data, Network.NWEndpoint)] = []
                for (data, endpoint) in flowData {
                    // Suspension point 2: External DoH resolution
                    let response = try await resolveViaDoH(data)
                    responses.append((response, endpoint))
                }
                
                // Suspension point 3: Writing back to the flow
                // Extension will crash here on task resumption
                try await flow.writeDatagrams(responses)
            }
        } catch {
            flow.closeReadWithError(error)
            flow.closeWriteWithError(error)
        }
    }
    
    // Sends the raw DNS query to a DoH endpoint and returns the raw DNS response.
    private func resolveViaDoH(_ packet: Data) async throws -> Data {
        let url = URL(string: "https://dns.google/dns-query")!
        
        var request = URLRequest(url: url)
        request.httpMethod = "POST"
        request.httpBody = packet
        request.setValue("application/dns-message", forHTTPHeaderField: "Content-Type")
        
        let (data, _) = try await URLSession.shared.data(for: request)
        return data
    }
}

Crash Details & Analysis

The disassembly at the crash point indicates a null dereference of an internal executor pointer (Voucher context):

ldr x20, [TPIDRRO_EL0 + 0x340]
ldr x0, [x20, #0x28]   // x20 is NULL/0x0 here, resulting in address 0x28
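
As a sanity check on this reading, the faulting address in the error signature should equal the base register plus the load offset. The snippet below is only that arithmetic; the variable names are illustrative and the values are taken from the disassembly above.

```swift
// Values taken from the crash report above; names are illustrative only.
let base: UInt64 = 0x0      // x20 after the thread-local load returned NULL
let offset: UInt64 = 0x28   // immediate in `ldr x0, [x20, #0x28]`

// A load through a NULL base faults at exactly the offset.
let faultAddress = base + offset
print("0x" + String(faultAddress, radix: 16))
```

This matches the address=0x28 reported with EXC_BAD_ACCESS, which is why I read it as a NULL base pointer rather than a dangling one.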

It appears that NEAppProxyUDPFlow’s async methods bind the Task to a specific internal executor. When the kernel reclaims the flow memory, the pointer in x20 becomes invalid. Because the Swift runtime is unaware that the NEFlow queue executor has vanished, it attempts to resume on a flow that no longer exists and crashes.

Checking !Task.isCancelled does not prevent this, as the crash happens during the transition into the task body before the cancellation check can even run.
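
To illustrate why the check cannot help, here is a minimal sketch with no NetworkExtension involved. It assumes nothing beyond the standard library: Task.isCancelled is a cooperative flag that is set only when something calls cancel() on the task, and it can only be read by code that is already running inside the task, that is, after resumption.

```swift
import Foundation

// An unstructured task with one suspension point, standing in for the flow loop.
let task = Task { () -> Bool in
    // Suspension point: the line after this runs only once the task is resumed.
    try? await Task.sleep(nanoseconds: 50_000_000)
    // The flag is observable here, but only because resumption succeeded.
    return Task.isCancelled
}

// Cancellation must be requested explicitly; nothing ties it to another object's lifetime.
task.cancel()
let observedCancellation = await task.value
print(observedCancellation)
```

In the MWE nothing calls cancel() when the flow goes away, and even if something did, the crash occurs before control reaches the loop condition.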

Questions

  1. Is this a known issue with the NetworkExtension async bridge?

  2. Why does Task.isCancelled not reflect the deallocation of the underlying NEAppProxyFlow?

  3. Is the only safe workaround to go back to the completion-handler variants?
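
To make question 3 concrete, one candidate workaround would be to keep calling the completion-handler API and bridge it to async manually with a continuation. The sketch below is only an illustration of that pattern: DatagramFlow and FakeFlow are hypothetical stand-ins for NEAppProxyUDPFlow so the code can run off-device, the simplified callback signature is an assumption, and whether this actually avoids the crash is untested.

```swift
import Foundation

// Hypothetical stand-in for the completion-handler surface of NEAppProxyUDPFlow.
protocol DatagramFlow: AnyObject {
    func readDatagrams(completionHandler: @escaping @Sendable ([Data]?, Error?) -> Void)
}

// Manual bridge: the task is suspended and resumed through an explicit
// continuation instead of the generated async variant.
func readDatagrams(from flow: DatagramFlow) async throws -> [Data] {
    try await withCheckedThrowingContinuation { continuation in
        flow.readDatagrams { datagrams, error in
            if let error {
                continuation.resume(throwing: error)
            } else {
                // A nil batch is treated as "no more data" in this sketch.
                continuation.resume(returning: datagrams ?? [])
            }
        }
    }
}

// Hypothetical fake that delivers one datagram from a background queue.
final class FakeFlow: DatagramFlow {
    func readDatagrams(completionHandler: @escaping @Sendable ([Data]?, Error?) -> Void) {
        DispatchQueue.global().async {
            completionHandler([Data([0x01, 0x02])], nil)
        }
    }
}

let batch = try await readDatagrams(from: FakeFlow())
print(batch.count)
```

If a bridge like this still resumes through the same executor, the remaining option would be to avoid Task entirely and stay on completion handlers.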

Please feel free to correct me if I misunderstood anything here. I'll be happy to hear any insights or suggestions :) Thank you!
