How to speed up open source packages on the M1 Max with Accelerate and Metal


Please help me, this is really urgent. Compatibility issues with the M1 Max chip have cost me hundreds of hours.


1. Please show me how to speed up packages built from source downloaded from GitHub, such as NumPy, Pandas, or anything else, by fully using the CPU and GPU. (Python 3.8 and 3.9.)

Can I do it just like this?

Step 1: Download the source from GitHub.


Step 2: Create a file named "site.cfg" in the source directory with the following content:

[accelerate]
libraries = Metal, Accelerate, vecLib


Step 3: In Terminal:

NPY_LAPACK_ORDER=accelerate python3 setup.py build


Step 4: pip3 install . or python3 setup.py install? (I am not sure which of the two to use; my best guess at the full sequence is sketched below.)
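
Put together, my best guess at the whole sequence follows. It assumes a recent NumPy whose build reads NPY_LAPACK_ORDER, uses pip for the final step since pip-tracked installs can be cleanly uninstalled later, and leaves Metal out of site.cfg because I suspect that section only accepts BLAS/LAPACK libraries (please correct me if Metal belongs there):

git clone https://github.com/numpy/numpy.git
cd numpy
git submodule update --init                  # a git checkout of NumPy needs its submodules
printf '[accelerate]\nlibraries = Accelerate, vecLib\n' > site.cfg
NPY_LAPACK_ORDER=accelerate pip3 install .   # build and install in one pip-tracked step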


2. How good is the compatibility of Accelerate and Metal? Do they work with most packages built from source? Any tips, e.g. for https://github.com/microsoft/qlib?
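
For what it's worth, the only generic check I know is after the fact: inspect what a finished build actually linked against (numpy.show_config is standard NumPy; otool ships with the Xcode command line tools):

# an Accelerate-backed NumPy names the Accelerate/vecLib framework here
python3 -c "import numpy; numpy.show_config()"

# for any compiled extension module, otool -L lists the libraries it links
otool -L "$(python3 -c 'import numpy; print(numpy.core._multiarray_umath.__file__)')"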


3. Which GCC should I install? Show me the commands. When I try, errors happen: gcc (version 4.2.1, installed by brew) cannot compile some packages, such as "ecos". Moreover, I cannot compile many packages directly with python3 setup.py install (even without Accelerate). How do I configure GCC, and which version should I use on the M1 Max?
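
(As far as I can tell, "gcc 4.2.1" is almost certainly the system compiler in disguise: /usr/bin/gcc on macOS is Apple clang, which reports a GCC 4.2.1 compatibility version, so it is probably not anything brew installed.) Is the right approach something like the following? This assumes Homebrew's Apple Silicon prefix /opt/homebrew, and the version suffix depends on whatever GCC formula brew currently ships:

brew install gcc                      # genuine GNU GCC; the formula includes g++ as well
export CC=/opt/homebrew/bin/gcc-12    # adjust the -12 suffix to the installed version
export CXX=/opt/homebrew/bin/g++-12
pip3 install --no-binary ecos ecos    # force a from-source build of ecos with these compilers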


4. Sometimes I can build a package with brew, but it is extremely inconvenient, because I need to install packages into a virtual environment (e.g. a conda env) rather than into the base path. What should I do? Can I install brew inside a virtual environment? Or should I just use brew to build the source and then install with pip in the virtual env? Or can I configure brew to install only into a virtual environment? Just show me the commands.
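
Or is the intended division of labour something like the following? ("myenv" is a placeholder name, and the paths assume Homebrew's /opt/homebrew prefix. As far as I know brew cannot live inside a virtual environment, since it manages machine-wide software, but a pip build inside the env can be pointed at brew's libraries.)

brew install cmake openblas            # C-level build dependencies stay with brew
conda activate myenv                   # the Python package itself goes into the env
export CPPFLAGS="-I/opt/homebrew/include"
export LDFLAGS="-L/opt/homebrew/lib"
pip install ecos                       # built and installed inside the active env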

5. To compile, do I also need to install g++? Which version? Show me the commands.
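
(If, as I believe, Homebrew's gcc formula already ships g++, there should be nothing extra to install; this just checks:)

brew list gcc | grep 'bin/g++'        # the g++-NN binaries come from the same formula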


6. Show me how to speed up a Python program with the GPU and parallel computing on Accelerate.
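
From what I have read, Accelerate itself runs on the CPU (its BLAS spreads large operations across cores on its own), while GPU access from Python goes through Metal-backed frameworks such as tensorflow-metal, so perhaps this question splits in two. A minimal CPU-side test would look something like this (matrix sizes are arbitrary):

python3 - <<'EOF'
# A large matmul is handed straight to the linked BLAS (Accelerate, if the
# build above worked), which parallelises it across cores with no code changes.
import time
import numpy as np

a = np.random.rand(4000, 4000)
b = np.random.rand(4000, 4000)
t0 = time.time()
c = a @ b
print(f"4000x4000 matmul: {time.time() - t0:.2f}s")
EOF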


Replies

TL;DR: It seems that, in the case of NumPy (in isolation), you can follow the instructions at https://github.com/conda-forge/numpy-feedstock/issues/253. But the problems with e.g. SciPy run deeper.
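
Concretely, my reading of that issue is that the switch boils down to pinning conda-forge's libblas package to its Accelerate variant and then verifying the result:

# request the Accelerate-backed BLAS variant from conda-forge
conda install -c conda-forge numpy "libblas=*=*accelerate"

# confirm that the installed NumPy actually links against it
python -c "import numpy; numpy.show_config()"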

I agree that the question could be posed more constructively, but I too am curious about these topics.

As a new M1 Max owner, I would love to see the full potential of this awesome hardware exploited in my scientific computing/data analysis workflows. While it's great that NumPy can now benefit from the Accelerate framework if one is happy to compile from source or to install with miniforge using the switch described in the issue linked above, there seem to be deeper problems with SciPy (and, I would expect, scikit-learn too). SciPy dropped support for Accelerate a while back. One of the main technical blockers then (2018) seemed to be that

"The APIs implemented by the LAPACK and BLAS libraries are outdated by about a decade. Currently the LAPACK version is 3.7.1 vs. Accelerate's 3.2.1 from 2009. This is an issue because Scipy cannot make use of recently introduced functionality in LAPACK (e.g. gh-6831, #7500). Internal LAPACK deprecations create extra maintenance efforts across different versions (e.g., #5266)."

It is now 2022 and it is not clear to me whether this remains a major blocker (and whether Apple even plans to implement suitably recent APIs). The ball seems to be mostly in Apple's court as far as I can see, though maybe I am misreading the situation.