# [CRITICAL] Metal RHI Memory Leak - Resource exhaustion vulnerability (CWE-400) - Bug Report

[CRITICAL] Metal API Memory Leak - Heap Memory Never Released to OS (CWE-400)

Security Classification

This issue constitutes a resource exhaustion vulnerability (CWE-400):

| Aspect | Details |
|---|---|
| Type | Uncontrolled Resource Consumption |
| CWE | CWE-400 |
| Vector | Local (any Metal application) |
| Impact | System instability, denial of service |
| User Control | None - no mitigation available |
| Recovery | Requires application restart |

Summary

Metal heap allocations are never released back to macOS, even when the memory is entirely unused. This causes continuous, unbounded memory growth until system instability or crash. The issue affects any application using Metal API heap allocation.

This was discovered in Unreal Engine 5, but it also reproduces in a completely blank UE5 project with zero application code, which suggests this is Metal framework behavior rather than an application-level leak.

Environment

  • OS: macOS Tahoe 26.2
  • Hardware: Apple Silicon M4 Max (also reproduced on M1, M2, M3)
  • API: Metal

Reproduction Steps

  1. Run any Metal application that allocates and deallocates GPU buffers via Metal heaps
  2. Open Activity Monitor and observe the application's memory usage
  3. Let the application run idle (no user interaction required)
  4. Observe memory growing continuously at ~1-2 MB per second
  5. Memory never plateaus or stabilizes
  6. Eventually the system becomes unstable

For testing: Any Unreal Engine 5.4+ project on macOS will reproduce this. Even a blank project with no gameplay code exhibits the leak. (Tested on UE 5.7.1)
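
Steps 2 to 5 can also be measured instead of watched. The following is a minimal sketch, assuming the `ps` command-line tool is available and that resident set size (RSS) is an acceptable stand-in for the figure Activity Monitor shows (the two may differ). The function names `rss_mb`, `sample`, and `growth_rate` are illustrative and not part of any tool mentioned in this report.

```python
import subprocess
import time


def rss_mb(pid):
    """Resident set size of `pid` in MB, read from `ps` (which reports KiB)."""
    out = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip()) / 1024


def sample(pid, seconds=60, interval=1.0):
    """Collect (elapsed_seconds, rss_mb) pairs while the application idles."""
    start, samples = time.monotonic(), []
    while (elapsed := time.monotonic() - start) < seconds:
        samples.append((elapsed, rss_mb(pid)))
        time.sleep(interval)
    return samples


def growth_rate(samples):
    """Least-squares slope of the samples, in MB per second."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("need at least two samples taken at different times")
    return num / den
```

Calling `growth_rate(sample(pid))` with the process ID of the idle application gives a single MB/s figure that can be compared against the ~1-2 MB per second described in step 4.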

Observed Behavior

Memory Analysis

Two reports were captured 86 seconds apart with Unreal's memreport -full command:

| Metric | Report 1 (183s) | Report 2 (269s) | Delta |
|---|---|---|---|
| Process Physical | 4373.64 MB | 4463.39 MB | +89.75 MB |
| Metal Heap Buffer | 7168 MB | 8192 MB | +1024 MB |
| Unused Heap | 3453 MB | 4477 MB | +1024 MB |
| Object Count | 73,840 | 73,840 | 0 (no change) |

Key Finding

Metal Heap grew by exactly 1 GB (1024 MB) while "Unused Heap" grew by the same amount, so the heap memory actually in use stayed at 3715 MB in both reports. This indicates:

  1. Metal is allocating new heap blocks in ~1 GB increments
  2. Previously allocated heap memory becomes "unused" but is never released
  3. The unused memory accumulates indefinitely
  4. No application-level objects are leaking (count remains constant)
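
The arithmetic behind these findings can be checked directly from the two reports. This is a small sketch using only the figures in the table above; the dictionary keys and helper names are illustrative.

```python
# Figures copied from the two memreport snapshots (all sizes in MB).
INTERVAL_S = 269 - 183  # seconds between the two reports

report_1 = {"process_physical": 4373.64, "metal_heap": 7168.0, "unused_heap": 3453.0}
report_2 = {"process_physical": 4463.39, "metal_heap": 8192.0, "unused_heap": 4477.0}


def delta(key):
    """Change in a metric between the two reports, in MB."""
    return report_2[key] - report_1[key]


def rate(key):
    """Average growth of a metric over the interval, in MB per second."""
    return delta(key) / INTERVAL_S


def in_use(report):
    """Heap memory that is still backing live buffers, in MB."""
    return report["metal_heap"] - report["unused_heap"]


def unused_fraction(report):
    """Share of the Metal heap that is reported as unused."""
    return report["unused_heap"] / report["metal_heap"]


for key in report_1:
    print(f"{key}: {delta(key):+.2f} MB, {rate(key):.2f} MB/s")
print(f"in use: {in_use(report_1):.0f} MB -> {in_use(report_2):.0f} MB")
print(f"unused share: {unused_fraction(report_1):.1%} -> {unused_fraction(report_2):.1%}")
```

Process physical memory grows at about 1.04 MB/s over the interval, which is consistent with the rate seen in Activity Monitor, while the heap figure moves in a single 1024 MB step and the in-use portion does not change at all.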

Memory Growth Pattern

  • Continuous growth while idle (no user interaction)
  • Growth rate: approximately 1-2 MB per second
  • No plateau or stabilization occurs
  • Metal allocates new 1 GB heap blocks rather than reusing freed space
  • Eventually leads to system instability and crash

What is NOT Causing This

We verified the following are NOT the source:

  • Application objects - Object count remains constant
  • Application code - Blank project with no code reproduces the issue
  • Texture streaming - Disabling texture streaming had no effect
  • CPU garbage collection - Running GC has no effect (this is GPU memory)

Mitigations Attempted (None Worked)

setPurgeableState

Setting resources to purgeable state before release:

[buffer setPurgeableState:MTLPurgeableStateEmpty];

Result: Metal ignores this hint and does not reclaim heap memory.

Avoiding Heap Pooling

Forcing individual buffer allocations instead of heap-based pooling.

Result: Leak persists - Metal still manages underlying allocations.

Aggressive Buffer Compaction

Attempting to compact/defragment buffers within heaps every frame.

Result: Only moves data between existing heaps. Does NOT release heaps back to OS.

Reducing Pool Sizes

Minimizing all buffer pool sizes to force more frequent reuse.

Result: Slightly slows the leak rate but does not stop it.

Root Cause Analysis

How Metal Heap Allocation Appears to Work

  1. Metal allocates GPU heap blocks in large chunks (~1 GB observed)
  2. Application requests buffers from these heaps
  3. When application releases buffers, memory becomes "unused" within the heap
  4. Metal does NOT release heap blocks back to macOS, even when entirely unused
  5. When fragmentation prevents reuse, Metal allocates new heap blocks
  6. Result: Continuous memory growth with no upper bound
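
The steps above can be illustrated with a toy model. This is a sketch of the behavior as described in this report, not of Metal's actual allocator, whose internals are not known here. `Block`, `ToyHeapPool`, and the first-fit placement are all assumptions made for illustration. The model shows how a request can force a new block even though enough total space is free, and how reserved memory stays high once nothing is live.

```python
BLOCK_SIZE = 1024  # MB, chosen to match the ~1 GB step seen in the reports


class Block:
    """One heap block; holes are the gaps between live buffers."""

    def __init__(self):
        self.live = {}  # buffer id -> (offset, size)

    def find_offset(self, size):
        """First fit: lowest offset where `size` fits, or None if no hole is big enough."""
        cursor = 0
        for offset, length in sorted(self.live.values()):
            if offset - cursor >= size:
                return cursor
            cursor = offset + length
        return cursor if BLOCK_SIZE - cursor >= size else None


class ToyHeapPool:
    """Hypothetical pool that never returns blocks (steps 1-6 above)."""

    def __init__(self):
        self.blocks = []
        self.next_id = 0

    def alloc(self, size):
        assert 0 < size <= BLOCK_SIZE
        for block in self.blocks:
            offset = block.find_offset(size)
            if offset is not None:
                break
        else:
            # Step 5: no existing hole fits, so a new block is reserved.
            block, offset = Block(), 0
            self.blocks.append(block)
        self.next_id += 1
        block.live[self.next_id] = (offset, size)
        return self.next_id

    def free(self, buf_id):
        # Steps 3-4: the space becomes a reusable hole, but the block is kept.
        for block in self.blocks:
            if block.live.pop(buf_id, None) is not None:
                return

    def in_use(self):
        return sum(size for b in self.blocks for _, size in b.live.values())

    def reserved(self):
        return len(self.blocks) * BLOCK_SIZE


pool = ToyHeapPool()
a, b, c, d = (pool.alloc(256) for _ in range(4))  # fills block 0 exactly
pool.free(a)
pool.free(c)           # 512 MB free, but split into two separate 256 MB holes
e = pool.alloc(512)    # no single hole fits, so a second block is reserved
for buf in (b, d, e):
    pool.free(buf)     # nothing is live any more
print(pool.in_use(), pool.reserved())  # in use: 0, reserved: 2048
```

The toy does not reproduce the continuous idle growth reported here; it only shows why reserved memory can exceed live memory when holes cannot be reused and blocks are never returned.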

The Core Problem

There appears to be no Metal API to force heap memory release. The only way to reclaim this memory is to destroy the Metal device entirely, which requires restarting the application.

Expected Behavior

Metal should:

  1. Release unused heaps - When a heap block is entirely unused, release it back to macOS
  2. Respect purgeable hints - Honor setPurgeableState calls from applications
  3. Compact allocations - Defragment heap allocations to reduce fragmentation
  4. Provide control APIs - Allow applications to request heap compaction or release
  5. Enforce limits - Have configurable maximum heap memory consumption
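
Items 1 and 5 can be sketched as a simple policy. This is a hypothetical illustration of the requested behavior, not an existing Metal API. `HeapBlock`, `trim`, `over_limit`, and the `keep_spare` parameter are invented names, and the per-block distribution below is an assumed best case (fully compacted) that happens to sum to the 3715 MB in use from Report 2.

```python
from dataclasses import dataclass


@dataclass
class HeapBlock:
    size_mb: int
    used_mb: int  # memory still backing live buffers


def trim(blocks, keep_spare=1):
    """Drop fully unused blocks, keeping up to `keep_spare` of them warm.

    Returns (kept_blocks, released_mb).
    """
    kept, released_mb, spare = [], 0, 0
    for block in blocks:
        if block.used_mb > 0:
            kept.append(block)       # still in use, cannot be released
        elif spare < keep_spare:
            kept.append(block)       # retain a little headroom for reuse
            spare += 1
        else:
            released_mb += block.size_mb
    return kept, released_mb


def over_limit(blocks, limit_mb):
    """True when reserved heap memory exceeds a configurable cap (item 5)."""
    return sum(b.size_mb for b in blocks) > limit_mb


# Assumed layout: eight 1024 MB blocks with 3715 MB in use, fully compacted.
blocks = [HeapBlock(1024, u) for u in (1024, 1024, 1024, 643, 0, 0, 0, 0)]
kept, released = trim(blocks)
print(len(kept), released)  # 5 blocks kept, 3072 MB released
```

In practice the in-use memory is unlikely to be this neatly packed, which is why item 3 (compaction) matters before item 1 can release much.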

Security Implications

  1. Local Denial of Service - Any Metal application can exhaust system memory, causing instability affecting all running applications
  2. Memory Pressure Attack - Forces other applications to swap to disk, degrading system-wide performance
  3. No Upper Bound - Memory consumption continues until system failure
  4. Unmitigable - End users have no way to prevent or limit the leak
  5. Affects All Metal Apps - Any application using Metal heaps is potentially affected

Impact

  • Applications become unstable after extended use
  • System-wide performance degrades as memory pressure increases
  • Users must periodically restart applications
  • Developers cannot work around this at the application level
  • Long-running applications (games, creative tools, servers) are particularly affected

Request

  1. Investigate Metal heap memory management behavior
  2. Implement heap release when blocks become entirely unused
  3. Honor setPurgeableState hints from applications
  4. Consider providing an API for applications to request heap compaction
  5. Document any intended behavior or workarounds

Additional Notes

This issue has been observed across multiple Unreal Engine versions (5.4, 5.7) and multiple Apple Silicon generations (M1 through M4). The behavior is consistent and reproducible.

The Unreal Engine team has implemented various CVars to attempt mitigation (rhi.Metal.HeapBufferBytesToCompact, rhi.Metal.ResourcePurgeInPool, etc.) but none successfully address the issue because the root cause is at the Metal framework level.


Tested: January 2026
Platform: macOS Tahoe 26.2, Apple Silicon (M1/M2/M3/M4)
