How do we use the computational power of A17 Pro Neural Engine?

Hi.

The A17 Pro Neural Engine is rated at 35 TOPS of compute. However, many third-party benchmarks and articles suggest that in practice it performs only slightly better than the A16 Bionic.

Some references are:

How do we get the maximum performance out of the A17 Pro Neural Engine? For example, I guess that the ANE on the A17 Pro may expose two logical devices rather than one, so we might need to instantiate two Core ML models simultaneously to use them both.
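As an illustration of that idea only (it is my guess, not a documented behavior of the hardware), here is a minimal sketch that runs two Core ML model instances concurrently via coremltools on a Mac. The model path, input name, and input shape are hypothetical:

```python
# Sketch of the "two concurrent model instances" idea, for illustration only.
# "MyModel.mlpackage", the input name "input", and the input shape are
# hypothetical placeholders.
import threading
import numpy as np
import coremltools as ct

def run(model_path: str, n_iters: int = 100) -> None:
    # Each thread owns its own MLModel instance targeting the Neural Engine.
    model = ct.models.MLModel(model_path, compute_units=ct.ComputeUnit.CPU_AND_NE)
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape
    for _ in range(n_iters):
        model.predict({"input": x})

threads = [threading.Thread(target=run, args=("MyModel.mlpackage",)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```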

I would appreciate any technical hints.

Replies

I have now found the relevant descriptions in the coremltools documentation.

In newer hardware, e.g. iPhone 15 pro (A17 pro), there is increased int8-int8 compute available on NE

Impact on Latency and Compute Unit Considerations https://apple.github.io/coremltools/docs-guides/source/quantization-overview.html#impact-on-latency-and-compute-unit-considerations

Linear 8-Bit Quantization https://apple.github.io/coremltools/docs-guides/source/performance-impact.html#linear-8-bit-quantization
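For reference, a minimal post-training, weight-only 8-bit quantization sketch following the "Linear 8-Bit Quantization" guide (the model path is hypothetical). Note that this compresses weights only; by itself it does not enable the int8-int8 activation compute path discussed below.

```python
# Post-training weight-only 8-bit linear quantization with coremltools.
# "MyModel.mlpackage" is a hypothetical path.
import numpy as np
import coremltools as ct
import coremltools.optimize.coreml as cto

mlmodel = ct.models.MLModel("MyModel.mlpackage")

# Symmetric 8-bit quantization applied to all supported weight tensors.
op_config = cto.OpLinearQuantizerConfig(mode="linear_symmetric", dtype=np.int8)
config = cto.OptimizationConfig(global_config=op_config)

quantized = cto.linear_quantize_weights(mlmodel, config=config)
quantized.save("MyModel_W8.mlpackage")
```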

The key point for A17 Pro is to quantize both weights and activations by per-tensor quantization.

  • Correction.

    Not "by per-tensor quantization", but "by training-time quantization".

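Below is a minimal sketch of training-time (quantization-aware) quantization with coremltools.optimize.torch, which quantizes both weights and activations to 8 bits so the converted model can use the increased int8-int8 compute on the A17 Pro Neural Engine. The toy model, synthetic data, milestone values, and short "training" loop are placeholders, not a tuned recipe.

```python
# Training-time (quantization-aware) W8A8 quantization sketch.
# The model, data, and training loop below are toy placeholders.
import torch
import coremltools as ct
from coremltools.optimize.torch.quantization import (
    LinearQuantizer,
    LinearQuantizerConfig,
    ModuleLinearQuantizerConfig,
)

# Placeholder network standing in for a real model.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 8, 3, padding=1),
)

# Symmetric 8-bit quantization of weights and activations; the milestones
# control when observers and fake quantization are enabled during training.
config = LinearQuantizerConfig(
    global_config=ModuleLinearQuantizerConfig(
        quantization_scheme="symmetric",
        milestones=[0, 10, 30, 20],
    )
)

example_input = torch.rand(1, 3, 32, 32)
quantizer = LinearQuantizer(model, config)
prepared = quantizer.prepare(example_inputs=(example_input,))

# Fine-tune (here: a few synthetic steps), calling quantizer.step() each
# iteration to advance the quantization milestones.
optimizer = torch.optim.SGD(prepared.parameters(), lr=1e-3)
for _ in range(50):
    quantizer.step()
    out = prepared(torch.rand(1, 3, 32, 32))
    loss = out.square().mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Fold the learned quantization parameters into the model, then convert.
finalized = quantizer.finalize()
traced = torch.jit.trace(finalized, example_input)
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=example_input.shape)],
    minimum_deployment_target=ct.target.iOS17,  # activation quantization needs iOS 17+
)
mlmodel.save("QuantizedModel.mlpackage")
```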