Memory leak with tensorflow-metal

I am trying the new tensorflow-metal release. The good news is that I see a boost in speed when executing my models. The bad news is that whenever I run it, it leaks memory. If I run the prediction 1000 times, I end up with 50 GB of memory usage and eventually run out. The same code works when using stock tensorflow, albeit slower.

I tried using pympler to find a leak, but the output didn't really look different between tensorflow-metal and normal tensorflow. I suspect the leak is in native code, so that wouldn't really help.
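One way to check the native-leak suspicion is to compare the process's resident memory against the Python-level heap while repeating the same call. The sketch below is only an illustration using the standard library: `measure_growth` and `peak_rss_mb` are hypothetical helper names, and the `step` callable is a stand-in that would be replaced by the real prediction call (for example `lambda: model.predict(batch)`). If peak RSS keeps climbing while the `tracemalloc` figure stays flat, the growth is probably happening outside allocations that Python tracks.

```python
import gc
import resource
import sys
import tracemalloc


def peak_rss_mb():
    """Peak resident set size of this process in MiB (stdlib only)."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return rss / divisor


def measure_growth(step, iterations=1000, report_every=100):
    """Call step() repeatedly; return (iteration, peak RSS MiB, Python heap MiB) samples."""
    tracemalloc.start()
    samples = []
    for i in range(1, iterations + 1):
        step()
        if i % report_every == 0:
            gc.collect()  # rule out garbage that is merely uncollected
            py_heap, _ = tracemalloc.get_traced_memory()
            samples.append((i, peak_rss_mb(), py_heap / (1024 * 1024)))
    tracemalloc.stop()
    return samples


if __name__ == "__main__":
    # Hypothetical stand-in workload; swap in the real prediction call here.
    def step():
        sum(range(1000))

    for i, rss, heap in measure_growth(step, iterations=500, report_every=100):
        print(f"iter {i:5d}  peak RSS {rss:8.1f} MiB  Python heap {heap:6.2f} MiB")
```

Note that `ru_maxrss` is a high-water mark, so it can show growth but never shrinkage. Running the same script once with tensorflow-metal installed and once with stock tensorflow would give two curves to compare.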

Known issue? Ideas on how to debug? Open to anything really.

Replies

Hi dandiep! Could you please provide the network or a small repro case so we can investigate? Thanks.

Hi @dandiep, could you please try the latest tensorflow-metal plugin (tensorflow-metal==0.1.2) and see whether the issue still occurs? We have fixed a number of different memory leaks. If the issue still exists, please provide a reproducible case so that we can verify it on our end.

  • When I use tensorflow-metal==0.1.1, there is no memory leak (no out-of-memory error), but 0.1.2 does leak memory.

    MacBook Pro (13-inch, M1, 2020), macOS Big Sur 11.5.2, Python 3.9, TensorFlow 2.5

  • @Shaohan, can you please update to macOS 12.0 and try again on that? It would also be great if you could provide a reproducible test case.
