Data Science on Apple Silicon: new distros and builds for R, Python, Julia?

This question does not come from a developer working on any of these languages. I am a data scientist working *in* these languages. But I'd like to see some clarity on how these ecosystems will transition from Intel to Apple Silicon.

Intel has recently built tools specifically for Python. R became much more efficient when Revolution (now Microsoft) bundled Intel's Math Kernel Library (and more) into R. R can also be much faster on the Mac with the Accelerate framework (esp. BLAS and LAPACK from vecLib, though these are not the officially supported default for the Mac build).

As we are investing in these platforms (both Apple hardware and our own codebase, not to mention human capital), it would be great to get more advance guidance on what performance we can expect on which front. Data scientists are more than just pro consumers needing an Adobe update for the new architecture (though for Matlab or Stata, the situation is similar), but less than full-blown developers who will use Swift anyway.

Converters from coremltools can save some models (say, scikit-learn under Python) to use in apps. Does this promise any further optimization and support for Python on Apple Silicon?
Answered by DTS Engineer in 613830022
Apple has announced that we'll be submitting patches to enable Python3 to build natively for Apple Silicon. Otherwise we’re unable to comment on any future plans or features.

For R and Julia, you would need to ask the maintainers of those projects as we cannot comment on their behalf.
It looks like the R developers are already testing on Apple Silicon and are confident that they can provide a native R version for it. See the post "Will R Work on Apple Silicon?" on The R Blog for details.
What they are saying is:
"It turns out there is hope that R will work on Apple silicon. A usable Fortran 90 compiler for Apple silicon will hopefully be available relatively soon, since the development version of GFortran already seems to be working (check-all passed for R including reference LAPACK/BLAS) and there is a strong need for such compiler not only for R, but any scientific computing on that platform."

The "there is hope" part tells me that if I have to use R in the near future, or even every day, as in my case, I should not buy an Apple silicon Mac yet.

What do you think?
Hi Guys,

Here are some benchmark results leman and I just did:

forums.macrumors.com/threads/data-science-r-and-spss-26-etc-under-rosetta-2-apple-silicon-m1.2269302/?post=29326680#post-29326680

R under Rosetta 2 is basically 70% faster on average than on my late-2017 MacBook Pro (i5, 16 GB RAM).
The biggest hurdle for Data Science on Apple Silicon is GCC (the GNU Compiler Collection). The compiler hasn’t supported Apple’s ARM architecture (instruction set, calling convention, object format, etc.) since an ancient version of iOS. Work on that is in progress, but as with all open-source efforts, there is “no timeline” since the work is done on a “time-available” basis.

Lack of GCC implies lack of FORTRAN support. The other notable FORTRAN compiler is Intel’s, which is very unlikely to be ported to Apple Silicon. No FORTRAN also means a lot of numerical libraries are being held back (e.g. SciPy, BLAS, LAPACK, etc.).

In any case, you could probably run your data science workloads under Rosetta 2 (i.e. Intel emulation/translation). Geekbench has shown that the M1 processors are faster than many of the Mac portables that came before them, even when running Intel apps. Search the web for “How to Run Legacy Command Line Apps on Apple Silicon” to set up your Terminal sessions to prefer running Intel applications.
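
If it is unclear whether a given Python interpreter is running natively or through Rosetta 2, a quick check along the following lines may help. This is only a sketch: the function names `rosetta_translated` and `describe` are made up for illustration, and it assumes the macOS `sysctl` key `sysctl.proc_translated`, which reports 1 for a translated process and 0 for a native one on Apple Silicon, and is absent on Intel Macs and other systems.

```python
import platform
import subprocess


def rosetta_translated():
    """Return True/False if macOS reports the translation state, else None."""
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        ).stdout.strip()
    except OSError:
        return None  # sysctl itself is not available on this system
    if out == "1":
        return True   # process is being translated by Rosetta 2
    if out == "0":
        return False  # process is running natively on Apple Silicon
    return None       # key not present (Intel Mac or a non-macOS system)


def describe(machine, translated):
    """Turn the two raw signals into a human-readable label (hypothetical helper)."""
    if translated:
        return "x86_64 under Rosetta 2"
    if machine == "arm64":
        return "native arm64"
    return f"native {machine}"


if __name__ == "__main__":
    # platform.machine() reports the architecture the interpreter sees.
    print(describe(platform.machine(), rosetta_translated()))
```

Run from a Terminal session that prefers Intel applications, this should report the Rosetta 2 label; a natively built interpreter should report arm64.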


It would appear that a commercial Fortran compiler for the M1 is available from NAG in Oxford, UK. Is there any good reason it cannot be used in place of LLVM? At least to build the Fortran libraries delivered with R and Python.

