Topic: AMD OpenCL / ROCm

chatcannon

AMD OpenCL / ROCm
« on: October 22, 2021, 01:35:41 pm »
I'm trying to get OpenCL working on a recent AMD graphics card (Navi 14 chipset). I have tried downloading and compiling the Radeon Open Compute (ROCm) framework from https://github.com/RadeonOpenCompute .

There are a number of hardcoded -D__x86_64__ definitions in the CMakeLists.txt. After clearing those out, there are maybe five or six places where functions include a couple of lines of x86 assembly, doing things that ought to be equally possible on POWER9.
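
A minimal sketch of the alternative to forcing -D__x86_64__ from the build system: let the compiler's own predefined macros choose the code path. The names target_arch and kHasX86Cpuid are hypothetical and do not come from the ROCm sources; the macros tested (__x86_64__, __powerpc64__, __aarch64__, __BYTE_ORDER__) are ones GCC and Clang predefine for the matching targets.

```cpp
// Select the architecture from macros the compiler predefines, rather than
// having CMake pass -D__x86_64__ unconditionally.
// target_arch() is a hypothetical helper used only for illustration.
constexpr const char* target_arch() {
#if defined(__x86_64__)
  return "x86_64";
#elif defined(__powerpc64__) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
  return "ppc64le";
#elif defined(__powerpc64__)
  return "ppc64";
#elif defined(__aarch64__)
  return "aarch64";
#else
  return "unknown";
#endif
}

// Feature switch that x86-only code (cpuid, xgetbv, ...) can be guarded with.
#if defined(__x86_64__)
constexpr bool kHasX86Cpuid = true;
#else
constexpr bool kHasX86Cpuid = false;
#endif
```

With guards of this kind, each x86 assembly snippet can sit behind its own #if branch, and a POWER9 branch can be added next to it without touching the build files.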

Does anyone know of any existing efforts to port ROCm or another OpenCL runtime to POWER9?

Failing that, can someone suggest a good place to learn the basics of POWER9 assembly (whatever dialect is used by GCC)?

ClassicHasClass

Re: AMD OpenCL / ROCm
« Reply #1 on: October 22, 2021, 07:58:15 pm »
Can you post a couple links to the specific places in the source you're looking at?

Modern OpenPOWER is a superset of PowerPC, so any 64-bit PowerPC instruction reference will broadly apply. POWER9 is compliant with Power ISA 3.0: https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0

chatcannon

Re: AMD OpenCL / ROCm
« Reply #2 on: October 24, 2021, 09:49:19 am »
There are several short assembly snippets in "os_posix.cpp"; one example is https://github.com/ROCm-Developer-Tools/ROCclr/blob/df870b565cf7f7d6d5fc8dd66aa07cd868874f9b/os/os_posix.cpp#L694

Another assembly snippet is https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime/blob/12d926d06d36fe74876a82f8f8e1ce8ce7902728/amdocl/glibc_functions.cpp#L30 which seems to require the memcpy ABI from a particular version of glibc, with no explanation as to why that particular ABI is needed.

ClassicHasClass

Re: AMD OpenCL / ROCm
« Reply #3 on: October 24, 2021, 06:18:33 pm »
The first one you may not actually need. Note that the ARM version is in fact unimplemented. The return address is always in the link register in ABI-compliant code for both ARM and Power, but in x86(_64) it's on the stack. Similarly, Os::cpuid and Os::xgetbv seem only to compile on x86_64, not ARM. I'd just make sure your logic follows ARM (and have appropriate assertions to enforce this for unimplemented functions).
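
A rough sketch of that advice, assuming GCC or Clang and assuming the linked code only needs the caller's return address (I have not verified what the line at that URL does). __builtin_return_address is a real compiler builtin that reads the link register or the stack as the target ABI requires, so no per-architecture assembly is needed. The function names here are hypothetical, not ROCclr's.

```cpp
#include <cassert>

// Address execution will resume at when this function returns. The builtin
// hides the ABI difference (link register on Power/ARM, stack on x86_64).
__attribute__((noinline)) void* current_return_address() {
  return __builtin_return_address(0);
}

// x86-only query: implemented where the instruction exists, and an asserting
// stub everywhere else so that an accidental call is caught in debug builds.
bool query_cpuid(unsigned leaf, unsigned regs[4]) {
#if defined(__x86_64__)
  __asm__ volatile("cpuid"
                   : "=a"(regs[0]), "=b"(regs[1]), "=c"(regs[2]), "=d"(regs[3])
                   : "a"(leaf), "c"(0u));
  return true;
#else
  (void)leaf;
  (void)regs;
  assert(false && "cpuid is only implemented on x86_64");
  return false;
#endif
}
```

Callers on Power would then need to avoid query_cpuid entirely, which is the same contract the unimplemented ARM path implies.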

The memcpy situation is actually because the semantics changed between glibc versions. Here's an explanation: https://stackoverflow.com/questions/35656696/explanation-of-memcpy-memmove-glibc-2-14-2-2-5#35678441
« Last Edit: October 24, 2021, 06:20:14 pm by ClassicHasClass »

chatcannon

Re: AMD OpenCL / ROCm
« Reply #4 on: October 24, 2021, 11:25:33 pm »
Thanks for the tips. While following the link you gave about memcpy I found this comment: https://stackoverflow.com/questions/8823267/linking-against-older-symbol-version-in-a-so-file?rq=1#comment100378482_8862631

Quote
This doesn't work if you're compiling on an architecture that wasn't built back in 2002 when x86-64 was first added - you'll get an error that the requested versioned symbols are not available.

I guess ppc64le didn't exist in 2002, so the 2.2.5 symbol version does not exist for it.

I will try omitting the .symver assembly and see how it behaves - if everything is being built from source then maybe it will be OK. If that doesn't work then I guess I can try adding a source file with an implementation of a simple non-IFUNC memcpy so that the ROCm library will use that instead of the glibc one.
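
A sketch of both options described above, under stated assumptions. PIN_OLD_MEMCPY is a hypothetical opt-in macro and fallback_memcpy is a hypothetical name; neither appears in the ROCm sources. The .symver directive is only emitted on x86_64 glibc, where the memcpy@GLIBC_2.2.5 version exists, and the byte loop is the simple non-IFUNC copy that could be used elsewhere.

```cpp
#include <cassert>
#include <cstddef>

// Only x86_64 glibc exports memcpy@GLIBC_2.2.5, so never emit the directive
// on other targets. PIN_OLD_MEMCPY is a hypothetical opt-in switch.
#if defined(PIN_OLD_MEMCPY) && defined(__x86_64__) && defined(__GLIBC__)
__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");
#endif

// Plain byte-by-byte copy with no IFUNC dispatch. Like memcpy, the source and
// destination ranges must not overlap.
extern "C" void* fallback_memcpy(void* dest, const void* src, std::size_t n) {
  assert((dest != nullptr && src != nullptr) || n == 0);
  unsigned char* d = static_cast<unsigned char*>(dest);
  const unsigned char* s = static_cast<const unsigned char*>(src);
  for (std::size_t i = 0; i < n; ++i) {
    d[i] = s[i];
  }
  return dest;
}
```

One caveat if the function were ever renamed to memcpy itself: an optimizing compiler may recognize the loop and turn it back into a memcpy call, so GCC would probably need something like -fno-tree-loop-distribute-patterns for that file.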

MPC7500

Re: AMD OpenCL / ROCm
« Reply #5 on: October 28, 2021, 04:17:03 pm »

chatcannon

Re: AMD OpenCL / ROCm
« Reply #6 on: October 29, 2021, 12:04:27 am »
Quote
I assume this is already outdated?
https://wiki.nikhef.nl/grid/AMD_GPU_on_IBM_POWER

Also I found this:
https://www.phoronix.com/scan.php?page=news_item&px=AMD-AOMP-On-POWER

Both those links refer to the AOMP compiler. I don't remember if I specifically tried AOMP, but my experience was that compilers like "llvm-roc" would build without any problems - the issues with x86 assembly or intrinsics are all in the "runtime" parts.

Once I have a full development environment set up to my liking then I will look into this in more detail.

Woof

Re: AMD OpenCL / ROCm
« Reply #7 on: December 28, 2021, 02:54:57 pm »
Any update on this? OpenCL is on my list of things I'd like to have working.

chatcannon

Re: AMD OpenCL / ROCm
« Reply #8 on: January 19, 2022, 12:23:02 am »
Quote
Any update on this? OpenCL is on my list of things I'd like to have working.

It took me a bit longer than I planned to get my development environment set up.

Some of the assembly sections (e.g. saving and restoring the stack pointer) were simple enough. Right now I am trying to work out how to port the assembly sections that handle the floating point exception status. (I haven't yet found any PowerPC assembly documentation that deals with this.)

If I get completely stuck then I will just post a version with broken floating point exception handling that crashes if there is a divide by zero etc.
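
One possible route that avoids assembly altogether, offered as a sketch rather than a drop-in replacement for the ROCm code: the standard <cfenv> functions read, clear, save and restore the floating-point exception flags, and on Power the C library is expected to map them onto the FPSCR. The function name below is hypothetical.

```cpp
#include <cfenv>

// Performs num / den and reports whether the divide-by-zero flag was raised,
// leaving the caller's floating-point environment exactly as it was.
bool division_raises_div_by_zero(double num, double den) {
  std::fenv_t saved;
  std::feholdexcept(&saved);         // save env, clear flags, disable traps
  volatile double n = num, d = den;  // volatile keeps the division at run time
  volatile double q = n / d;
  (void)q;
  const bool raised = std::fetestexcept(FE_DIVBYZERO) != 0;
  std::fesetenv(&saved);             // restore the saved flags and trap state
  return raised;
}
```

For example, division_raises_div_by_zero(1.0, 0.0) should report the flag while division_raises_div_by_zero(1.0, 2.0) should not. GCC does not honour #pragma STDC FENV_ACCESS, which is why the sketch uses volatile to stop the compiler from folding the division.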

Woof

Re: AMD OpenCL / ROCm
« Reply #9 on: January 19, 2022, 01:17:51 am »
Thanks for the update. I'm surprised no-one has a working version, but I guess the OpenCL/CUDA support that does exist on POWER is for the Nvidia A100 and other server-class cards.

ClassicHasClass

Re: AMD OpenCL / ROCm
« Reply #10 on: January 20, 2022, 11:15:23 pm »
What's the specific code you're looking at? Usually stuff with the FPSCR doesn't necessarily need to be ported directly; the Power FPU tends to "do the right thing."

MPC7500

Re: AMD OpenCL / ROCm
« Reply #11 on: May 28, 2022, 05:01:06 pm »

chatcannon

Re: AMD OpenCL / ROCm
« Reply #12 on: March 26, 2023, 05:45:05 am »
Still no progress on my own attempt, unfortunately. AMD are releasing new versions faster than I can merge the little work I have already done.

I found this other page detailing another person's attempts to build ROCm on POWER9, which might be worth looking at.

 https://systems.nic.uoregon.edu/internal-wiki/index.php?title=Rocm_on_power9

chatcannon

Re: AMD OpenCL / ROCm
« Reply #13 on: March 26, 2023, 05:47:47 am »
Quote
Still no progress on my own attempt, unfortunately.

On the plus side, at least I have managed to get POCL (dummy driver to run OpenCL code on the CPU), the hardware-independent parts of OpenCL (e.g. headers, ICD etc.), and PyOpenCL all keyworded for ppc64 on Gentoo.
« Last Edit: March 26, 2023, 12:47:33 pm by chatcannon »

Hasturtium

Re: AMD OpenCL / ROCm
« Reply #14 on: March 26, 2023, 09:39:38 am »
Quote
Still no progress on my own attempt, unfortunately. AMD are releasing new versions faster than I can merge the little work I have already done.

I found this other page detailing another person's attempts to build ROCm on POWER9, which might be worth looking at.

 https://systems.nic.uoregon.edu/internal-wiki/index.php?title=Rocm_on_power9

ROCm is an organizational clusterfuck. Around two years ago I determined the best way to handle it on Ubuntu x86_64 was to sync to their repo, then when a new version was pushed, ppa-purge the repo, remove all folders and files ROCm left behind, then add the repo again and install from scratch. To my knowledge it hasn’t gotten better since, and that’s not even touching on other platforms. I respect the work you have put in, but I am not sure things will improve for AMD beyond this “beat to fit, paint to match” approach to software design.

God, I hope Intel sticks with Arc and gets oneAPI and their compute stack running everywhere. That’d save a lot of headaches.