Raptor Computing Systems Community Forums (BETA)

Third Party Hardware => GPU Compute / Accelerators => Topic started by: Woof on January 31, 2022, 10:05:58 am

Title: AMD GPU at boot
Post by: Woof on January 31, 2022, 10:05:58 am
Before typing I'm already feeling the burning shame of a n00b question, but anyway, here goes...

I finally built my Talos II (dual 18-core), temporarily installed a Radeon RX 580 (I plan on a Radeon Pro W5700 but it currently has a waterblock fitted) and installed Ubuntu 21.10 (server install followed by desktop packages). It's early days and it's mostly working (Firefox 96 has issues, but one thing at a time). So far so good. I've disabled the built-in VGA via the jumper but it still starts up with this and then hands over to the Radeon card. Is this normal?

See the attached image.

Title: Re: AMD GPU at boot
Post by: MPC7500 on January 31, 2022, 10:46:00 am
If you're on System Package v1.00 (https://wiki.raptorcs.com/wiki/Talos_II/Firmware#System_Package_v1.01), the jumper doesn't work. So this is "normal".
In my view it's better anyway to use the AST until Petitboot, then disable it via the kernel command line (https://wiki.raptorcs.com/wiki/Troubleshooting/GPU#I_want_Petitboot_via_AST_but_the_subsequent_Linux_OS_console_on_a_discrete_GPU) so the OS switches to the AMD GPU.
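
After editing the kernel command line, one way to confirm the change took effect is to look at what the running kernel was actually booted with. The sketch below assumes the AST driver is disabled with a `modprobe.blacklist=ast` parameter (check the linked wiki page for the exact parameters it recommends); the function name `module_is_blacklisted` is made up for illustration.

```python
from pathlib import Path


def module_is_blacklisted(cmdline: str, module: str = "ast") -> bool:
    """Return True if `module` is listed in a modprobe.blacklist= entry."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        # The parameter takes a comma-separated list of module names.
        if key == "modprobe.blacklist" and module in value.split(","):
            return True
    return False


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    proc = Path("/proc/cmdline")
    cmdline = proc.read_text() if proc.exists() else ""
    print("ast blacklisted:", module_is_blacklisted(cmdline))
```

Run it from the installed OS (not from Petitboot) after rebooting with the new command line.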
Title: Re: AMD GPU at boot
Post by: Woof on January 31, 2022, 12:10:53 pm
Thanks, I'll check tomorrow (I'm assuming being a brand new mobo it wouldn't have the v1.0 firmware, but in any case I'll look at this).

Where can I find info like this, or does everyone just know and I'm late to the party? The quick start guide (https://wiki.raptorcs.com/wiki/Talos_II_Beginner%27s_Quick_Start_Guide) seems to trail off when it should just be getting started, but I did (just now) find the other guides (https://wiki.raptorcs.com/wiki/Category:Guides) and I'll look through them (in particular adding the GPU firmware (https://wiki.raptorcs.com/wiki/Add_GPU_Firmware_To_BOOTKERNFW)).
Title: Re: AMD GPU at boot
Post by: ClassicHasClass on January 31, 2022, 07:46:13 pm
That's a stonking system you have there. Makes this dual-8 look like a toy. What's your aim for it? Are you using the HSFs, or a custom cooling solution?

The jumper works properly on my two T2s, or at least it seems to. But blacklisting the AST is never a bad idea with a GPU.
Title: Re: AMD GPU at boot
Post by: Woof on February 01, 2022, 01:28:00 am
I currently have the HSFs but I'm working on a water cooling solution for it (which I'll report back on once it's finished). We do custom 3D engines and tooling at work and I was looking into alternatives to our current Threadripper setups for texture processing. It's a highly parallel task that benefits from many threads, currently running best on the 64C/128T 3990X, but I was curious as to what else is out there.

I've not had much time to experiment so far, and it'll be at least a month before I get some quality time with this machine.
Title: Re: AMD GPU at boot
Post by: MauryG5 on February 01, 2022, 07:04:30 am
Unfortunately the problem for us on Power is always the same: on the software side, how well optimized will the tools you need for your work be?! We usually suffer on this front compared to x86 ...
Title: Re: AMD GPU at boot
Post by: Woof on February 01, 2022, 08:26:48 am
I'll be doing some comparison between the 3990X and my dual Power9 running our custom texture compression code. I'll not get around to it for a while though. The code scales quite well with more cores but eventually becomes memory bound. Our 3990X machines here are all overclocked and hand tuned, so I think it'll be a tough one to beat.
Title: Re: AMD GPU at boot
Post by: MPC7500 on February 01, 2022, 09:20:43 am
POWER9 is already aged. It will easily be slaughtered by the Threadripper.
Ah, BTW you can overclock POWER9, too.
Maybe you could provide some benchmarks when you're finished.
Title: Re: AMD GPU at boot
Post by: Woof on February 01, 2022, 09:44:09 am
I'll post real findings, but I found the 3990X doesn't scale linearly for our use. For our highly parallelised task (imagine 65'000 blocks of the same size being processed), going from 64 to 128 threads yields a 37% speed increase. I think we get hit by the four memory channels.
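
To put that figure in perspective, here is a rough sketch of the scaling arithmetic. It assumes "37% speed increase" means the run is 1.37x faster after doubling the resources; the helper name `parallel_efficiency` is only illustrative.

```python
def parallel_efficiency(speedup: float, resource_ratio: float) -> float:
    """Fraction of the ideal (linear) speedup that was actually achieved."""
    return speedup / resource_ratio


if __name__ == "__main__":
    speedup = 1.37           # assumed meaning of "37% speed increase"
    resource_ratio = 128 / 64  # doubling from 64 to 128
    eff = parallel_efficiency(speedup, resource_ratio)
    print(f"achieved {eff:.1%} of linear scaling")
```

Under that reading, the doubling delivers roughly two thirds of what perfectly linear scaling would give, which fits with something other than raw compute (such as memory bandwidth) being the limit.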

Overclocking the Power9 is on my list to look at (I'm assuming the power stages to be the limit here though). OC'ing the 3990X is a complete power hog, and we're able to draw over 800W for the CPU/mobo alone (for the TR Pros we have here with 4x RTX6000 we struggle to keep in the 2kW power limit of the PSU and wall socket).
Title: Re: AMD GPU at boot
Post by: MauryG5 on February 01, 2022, 01:12:51 pm
Woof, do you use Ubuntu on Power like me? There are only a few of us here who do. You have also run into the Firefox problems I have been talking about for some time, such as the bookmark bug and the problems that occur when you use Firefox on certain sites ...
Title: Re: AMD GPU at boot
Post by: Woof on February 01, 2022, 01:59:43 pm
Yes, I installed Ubuntu and tried Firefox. On the first day I only noticed the font issue (easily repro'd by opening the web dev tools, which just show white). Then today, second day, I noticed FF complains on launch about not being able to save the bookmarks.

I wanted to try a few OS variants, and Ubuntu was my go-to since we have it on lots of machines successfully at work (all x64).

I was going to look at Chrome, but perhaps some other Linux too. Ideally I'd like multiple installed (big and little endian, for testing, though more out of habit now since I've not shipped anything on BE for over 10 years), though I wasn't sure whether this would need multiple disks rather than multiple partitions.
Title: Re: AMD GPU at boot
Post by: MauryG5 on February 02, 2022, 01:10:47 am
I personally took three hard disks and installed Debian, Ubuntu and Fedora. I was fine with Fedora until it switched to GNOME 40, which literally sucks: the code hasn't been fully optimized and it was slow, jerky and almost unusable. I then used Ubuntu, which turned out to be the best in terms of fluidity and usability, but unfortunately it has several small bugs that have never been fixed. Besides Firefox, for example, there is another bug where audio CDs are not read: you insert an ordinary CD and it does not load the tracks. Debian 11 has improved a lot, I must say, and now that Chromium 97 has arrived I am starting to use it a little more, and it is good overall. I have compiled kernel 5.15.3 with 4K pages, and with a firmware update I can make the AMD Radeon 5700 XT 50th Anniversary Edition work well. I'd like Ubuntu to fix these bugs, but to tell you the truth I still haven't figured out how to report them directly ...
Title: Re: AMD GPU at boot
Post by: Woof on February 02, 2022, 07:47:13 am
Answering one of my own questions: multi-boot off the same disk is straightforward, I'm running both Void and Ubuntu on different partitions (I just need to get the video bios loaded now).
Title: Re: AMD GPU at boot
Post by: ClassicHasClass on February 02, 2022, 12:30:20 pm
The main problem with Fedora is libgraphene, which is known, and sharkcz has been looking into it. A simple rebuild suffices to patch in place in the meantime if you are affected.
Title: Re: AMD GPU at boot
Post by: MauryG5 on February 02, 2022, 01:03:48 pm
Classic, the problem is that I do not know whether I am capable of making this modification; it depends on how complex the procedure is. However, before removing the Fedora 35 hard disk I had installed libgraphene (I also mentioned it in the Fedora 35 discussion), but unfortunately nothing changed, because obviously we also need the patch you are talking about. I don't know if I can apply this patch, but ...
Title: Re: AMD GPU at boot
Post by: Woof on July 29, 2022, 10:51:22 am
Resurrecting this thread for anyone interested/curious... I mentioned forever ago I'd post some findings comparing the Power9 with other CPUs for my CPU-based number crunching needs.

For my latest work, the single-threaded initial implementation took about half an hour to run on my Xeon desktop, and any experimentation meant really thinking it through before a run.

Breaking the calculations down into chunks and then running on all threads, the slowest Threadripper at work, a 3960X, took 20 seconds, and the fastest, an overclocked water-cooled 3990X, took 8 seconds! The 24 core Xeon took 1m13s.

In comparison my 144-thread (36-core) Power9 takes 19 seconds, but, since the algorithm is broken into 256 chunks, it processes the first 144 chunks followed by the remaining 112, with the processor showing 77% usage partway through, whereas the Threadrippers keep the CPU at 100%. Still, it's a good indication and the machine compares well with a 3960X.
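
The utilisation figure follows from how 256 chunks divide across the available hardware threads. A minimal sketch, assuming every chunk costs about the same and each hardware thread takes one chunk at a time (the `waves` helper is hypothetical, not part of the actual tool):

```python
def waves(chunks: int, threads: int) -> list:
    """Split `chunks` equal-cost jobs into successive waves of at most `threads`."""
    result = []
    remaining = chunks
    while remaining > 0:
        batch = min(remaining, threads)
        result.append(batch)
        remaining -= batch
    return result


if __name__ == "__main__":
    for name, threads in [("POWER9 36c/144t", 144), ("3990X 64c/128t", 128)]:
        per_wave = waves(256, threads)
        usage = [f"{n / threads:.0%}" for n in per_wave]
        print(f"{name}: waves={per_wave} usage={usage}")
```

With 144 threads the second wave only has 112 chunks left, so about 78% of the threads are busy, in line with the roughly 77% observed. With 128 threads, 256 chunks split into two full waves.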

All this said, the software was never optimised for Power (it has some SIMD for Intel), and I've found both Clang and GCC to be quite variable on Power. GCC 10 gave the best results, with Clang 15 the worst (27 seconds; I have a whole Clang/GCC rant for another day).
Title: Re: AMD GPU at boot
Post by: ClassicHasClass on July 29, 2022, 12:39:20 pm
Interesting numbers. I would expect that figure to improve with software refinement.
Title: Re: AMD GPU at boot
Post by: Woof on July 29, 2022, 03:05:52 pm
To add to the numbers, a MacBook Pro with M1 Max runs the same in 47s.

When we finally get some 5995X TRs at work I'll run the same on there (I have a feeling it won't beat the OC'd 3990X).

(If anyone has an OC'd Power or something else esoteric I can share the source - it's a project that'll be open sourced after it's shipped anyway, it's a graphics tool.)
Title: Re: AMD GPU at boot
Post by: MPC7500 on July 29, 2022, 04:45:48 pm
Was this test on POWER only for fun or does your company also develop on POWER? Will there be some optimizations, too?
A chart would be good. The numbers aren't that bad for POWER9 36c/144t against the 64c/128t
Title: Re: AMD GPU at boot
Post by: Woof on July 30, 2022, 05:14:43 am
I'm actually using the Talos II at work, specifically for crunching numbers (lots of RAM, lots of cores).

Short-ish version of the story: I bought a machine partially to test the possibility of rolling them out to our Linux deep learning teams, and partially out of personal interest. The T2s are too finicky to hand out to others, but mine ticks a lot of boxes for me personally. On top of that, workstation CPUs are difficult to get: we wait months for Threadrippers (and supply has totally dried up here now), so I did the little work required to get my tools running on POWER and started using it. I've still not found a good IDE for native dev, so I'm working in Visual Studio on my Windows desktop and then running tasks on the T2.

As for the comparisons with the 64c/128t 3990X, Friday was a slow day ahead of the public holiday here and I thought I'd take a look. On a regular 3990X the code was taking 15s (and on the OC'd one 13s), vs the 19s on my T2 (which given I couldn't keep the cores at 100% is very good), but I spent some time to tune for the specific thread group requirement of the 3990X, taking the timing down to 8s and a very comfortable lead!

As for optimisations, to this code probably not. Once finished it'll run once to precalculate a series of numbers which will go into the end product (a texture tool).

A graph would've been better though.
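
In lieu of a proper graph, a quick text chart of the timings reported in this thread can be produced with a few lines of Python. This is only a sketch: the labels and the `bar_chart` helper are made up for illustration, and the numbers are the wall-clock seconds quoted in the posts above.

```python
# Wall-clock timings in seconds, as reported in this thread.
TIMINGS = {
    "3990X OC, tuned": 8,
    "3990X OC, untuned": 13,
    "3990X stock, untuned": 15,
    "Talos II POWER9": 19,
    "Slowest Threadripper at work": 20,
    "Talos II POWER9, Clang 15": 27,
    "MacBook Pro M1 Max": 47,
    "24-core Xeon": 73,
}


def bar_chart(timings: dict, width: int = 40) -> str:
    """Render timings as horizontal bars, fastest first."""
    longest = max(timings.values())
    label_w = max(len(label) for label in timings)
    lines = []
    for label, secs in sorted(timings.items(), key=lambda kv: kv[1]):
        # Scale each bar against the slowest entry; always draw at least one mark.
        bar = "#" * max(1, round(secs / longest * width))
        lines.append(f"{label:<{label_w}} {bar} {secs}s")
    return "\n".join(lines)


if __name__ == "__main__":
    print(bar_chart(TIMINGS))
```

Shorter bars are better. The runs were not all made with the same tuning, so treat it as a rough picture rather than a controlled benchmark.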
Title: Re: AMD GPU at boot
Post by: MPC7500 on July 30, 2022, 06:51:14 am
You are aware that VSCode is available too, right?
Title: Re: AMD GPU at boot
Post by: DKnoto on July 30, 2022, 06:53:43 am
Apache NetBeans 14 works well. I tested it on AlmaLinux 9 and Fedora 36.
Title: Re: AMD GPU at boot
Post by: Woof on July 30, 2022, 08:15:52 am
I tried the Kate IDE, for which CMake has a generator, but the step debugger didn't work for me (but it could build). I took a look at Eclipse but only for Java (I maintain some legacy Java stuff at work) but I'm a C++ guy.

I didn't think VSCode was up-to-date? And I'll try Netbeans - I forget these Java-based IDEs have native builders.
Title: Re: AMD GPU at boot
Post by: MPC7500 on July 30, 2022, 08:32:38 am
On Void Linux it's up-to-date
https://github.com/void-ppc/void-packages/tree/master/srcpkgs/vscode

CodeBlocks is also a possibility.