Raptor Computing Systems Community Forums (BETA)

OpenPOWER ISA => General Discussion => Topic started by: vmlinuz on November 02, 2021, 01:16:22 am

Title: Learning POWER9 assembly
Post by: vmlinuz on November 02, 2021, 01:16:22 am
Can anyone recommend some good resources for someone who wants to get into POWER9 assembly programming? I have an old book somewhere about 32-bit PowerPC assembly but don't know how much things have changed since the days of the Power Macintosh G4.
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on November 02, 2021, 07:56:00 pm
32-bit assembly is a start but it will only get you part of the way -- you should really start with 64-bit and just think in those terms. Fortunately PowerPC was very forward looking from the beginning and 64-bit instructions were specified even from the days of the 601: my best reference for Power ISA is a very old book called Optimizing PowerPC Code by Gary Kacmarcik, supplementing it with the Power ISA 3.0b documentation and the ppc64le ELF v2 ABI specification. I have a physical copy of OPPCC but there are PDF copies around.
Title: Re: Learning POWER9 assembly
Post by: mparnaudeau on November 08, 2021, 04:33:25 am
I also have a physical copy of Optimizing PowerPC Code, and I also recommend it for getting familiar with the PowerPC instruction set in general, which it describes by category (load-store, integer, FP, branches ...). The book also explains stack frames well and lists the instructions one per page, including the PPC64 instructions, since this clean architecture was specified for both 32-bit and 64-bit very early. The part on optimization is good but rather short, and it may not match the current POWER architecture on some points.

Another good resource I remember is "Ensamblador del PowerPC con Mac OS X". Even though it is in Spanish, this document is more tutorial-oriented.

Searching for "powerpc asm tutorial" also turns up good resources, from IBM and from others.
Title: Re: Learning POWER9 assembly
Post by: vmlinuz on November 08, 2021, 11:34:48 pm
I did some reading of the recommended materials thus far. How in the actual hell is this considered a Reduced Instruction Set?!?!? PowerPC from almost 30 years ago makes the z/Architecture look like a paragon of simplicity and elegance - God only knows what abominations await in POWER9.

Update:

(https://forums.raptorcs.com/index.php?action=dlattach;topic=317.0;attach=334;image)

Are you f**king kidding me
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on November 09, 2021, 01:09:54 pm
No, actually, the mscdfr0-type instructions are the ones I hate. eieio is just fun for jokes though it has practical use as a lightweight barrier. But sheer number of instructions aside, RISC really is now just parlance for load-store. In that sense FISC might be more appropriate: https://news.ycombinator.com/item?id=28601455

ARM is just as bad (in fact worse due to their crazy encodings), and I think RISC-V will eventually metastasize. Even MIPS is getting that way. That said, I'll also add as someone handcoding assembly right now for the Firefox JIT that it was so nice to finally have instructions for GPR<->FPR moves in VSX plus a lot more rounding-type instructions rather than having to serialize the FPSCR by twiddling bits. And VSX is way more complete than VMX used to be (I had to tie myself in knots to write good AltiVec routines). ISA 3.0 added a whopping number of instructions to what was already a large instruction set but I find I'm actually using them.
Title: Re: Learning POWER9 assembly
Post by: vmlinuz on November 09, 2021, 04:47:40 pm
https://twitter.com/ppcinstructions

oh no

And to be fair, from what I can tell the z/Architecture has MSCDFR0 instructions as well... seems to just be an IBM-ism
Title: Re: Learning POWER9 assembly
Post by: vmlinuz on November 10, 2021, 10:41:25 pm
Would it be quite accurate to say that to get anything resembling performance out of a POWER9 CPU, you need to be extremely careful about the order in which you schedule instructions, cache hints, etc., therefore requiring an extreme degree of knowledge regarding the CPU's inner workings? If so, since it clearly would be infeasible to just write programs in assembly for fun, where did you learn to do this? I would expect it to be at least a little more programmer-friendly than some older RISC designs.
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on November 11, 2021, 11:51:31 pm
No, you don't need to be extremely careful, and POWER9 has made this easier by getting rid of dispatch groups (this used to be sensitive to how instructions were packaged up into dispatch groups and made their way through the execution pipeline). Register renaming is also a lot more powerful than it used to be, though if you can make use of more registers manually, that always helps. The biggest sources of slowdowns are inappropriate use of the link register for branching or trampolines, which will foul the return address cache, and avoidable or aliased spills to memory, which is why the FPR<->GPR moves in VSX are so great (to move a GPR to the FPR used to require a memory spill and load, which inevitably aliased if you did this on the stack, so you also needed nops). As a general rule, Power chips are also way better at straightline code than branching, even if the straightline code seems to do more work.
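To make the straight-line versus branching point concrete, here is a minimal C sketch (not from the post; the function names are made up for illustration). It clamps a value once with ordinary `if` statements and once with a mask-based select that has no control flow. Whether a compiler turns either form into branches or into something like `isel` is its own decision, so check the generated assembly before assuming one is faster.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form: each comparison may become a conditional branch. */
int64_t clamp_branchy(int64_t x, int64_t lo, int64_t hi)
{
    if (x < lo)
        return lo;
    if (x > hi)
        return hi;
    return x;
}

/* Returns a when cond is nonzero, b otherwise, without control flow.
 * The mask is all ones when cond is true and all zeros when it is false. */
int64_t select64(int cond, int64_t a, int64_t b)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (int64_t)(((uint64_t)a & mask) | ((uint64_t)b & ~mask));
}

/* Straight-line form: always does both selects, never branches in source. */
int64_t clamp_straight(int64_t x, int64_t lo, int64_t hi)
{
    x = select64(x < lo, lo, x);
    x = select64(x > hi, hi, x);
    return x;
}

/* Hypothetical self-check: both forms must agree. */
void check_clamp(void)
{
    for (int64_t x = -20; x <= 20; x++)
        assert(clamp_branchy(x, -5, 7) == clamp_straight(x, -5, 7));
}
```

The straight-line form does more arithmetic per call, which is exactly the trade described above: more work, but nothing for the branch predictor to get wrong.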
Title: Re: Learning POWER9 assembly
Post by: vmlinuz on November 12, 2021, 01:02:56 am
Can I just say, thank you for making Firefox not a complete juddering mess on POWER9? My Talos II just got here and I definitely got my money's worth... at least!

And that's great to hear, I was afraid it would be like Itanium.
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on November 13, 2021, 10:18:59 am
Thanks! Most of what's in Firefox now is just taking better advantage of VMX and VSX. The JIT should blow those doors open, though just remember that the MVP is the first-stage JIT (there are three stages) and Wasm support, so it should not be interpreted as "as good as it gets." And don't forget to do an LTO-PGO build if your distro doesn't; I keep those patches up to date: https://www.talospace.com/2021/11/firefox-94-on-power.html
Title: Re: Learning POWER9 assembly
Post by: vmlinuz on November 14, 2021, 08:49:11 pm
Well, I'm on Fedora 35 and I can hardly tell it's not an x86 PC. As an aside, you wouldn't happen to know how to stop 60 FPS video on YouTube from stuttering and freezing? That's really the only weak point, video playback. I can just download the video and use VLC if I really need to... It would appear to be strictly an issue with vp9 playback as when I use an extension to disable vp9, I get better (though still not perfect) results.
Title: Re: Learning POWER9 assembly
Post by: MPC7500 on November 15, 2021, 06:32:37 am
I guess, you already installed ffmpeg-libs from the RPMFusion repo?
Title: Re: Learning POWER9 assembly
Post by: vmlinuz on November 15, 2021, 03:04:15 pm
Wait what? That's not installed by default?

Update: Video playback for non-vpx is now vastly improved. There are probably more libraries I need to install for vp8/9 to work?

Update 2: Could be an issue with my Polaris GPU; it's not clear whether they ever supported vp9
Title: Re: Learning POWER9 assembly
Post by: MPC7500 on November 16, 2021, 05:00:46 pm
Maybe you want to look at this wiki:
https://wiki.archlinux.org/title/Hardware_video_acceleration
Title: Re: Learning POWER9 assembly
Post by: vmlinuz on November 16, 2021, 08:06:02 pm
That did the trick, now I can watch penguinz0 complain about video games in glorious 1080p60 on PowerPC. Above 1080p60 is kind of a mess - not sure, but I think that's my internet connection's fault this time.
Title: Re: Learning POWER9 assembly
Post by: MauryG5 on December 06, 2021, 02:48:25 pm
Guys, excuse my curiosity. I read some time ago that the PowerPC instructions have been incorporated into the most recent Power ISAs, so in theory the old instructions that IBM and Motorola created specifically for PowerPC are now also usable on the new Power chips, right? I ask because I remember that when PowerPC was derived from POWER, it dropped some POWER instructions, which were said at the time to be useless for the home desktop environment, and added instructions made specifically for PowerPC in their place ... My other question is about the floating-point units such as VSX and the like, which derive from the legendary AltiVec. Some people say (I have read comments like this on Phoronix, for example) that the floating-point units of the Power processors are poor, especially compared to their x86 equivalents! I wonder how that is even possible, when back then it was PowerPC's AltiVec that taught Intel how to build such units, Intel's own being much less performant up to SSE2 ... I can't explain it, since this should really be one of Power's greatest strengths, especially given how much these units have surely evolved over the years ...
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on December 06, 2021, 08:10:10 pm
Almost all, but not every single instruction, of the PowerPC instruction set is in Power ISA. Obviously 601-specific instructions didn't make it past the 603, but also a couple of oddballs like mcrxr aren't in Power ISA (instead there is, more recently, mcrxrx), and there are differences in cache line size, which affect some instructions like dcbz. These were also issues on the G5, which is more POWER4 than it is PowerPC. In general, however, the vast majority of user-level code will translate to the extent it is 64-bit aware.
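As a rough illustration of the dcbz issue, the sketch below (plain C arithmetic, not real cache manipulation) computes which bytes a dcbz would zero for a given address and line size. The 32-byte and 128-byte figures are assumptions standing in for a G4-class part and a G5/POWER9-class part respectively, and the type and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache block sizes, for illustration only. */
enum { LINE_32 = 32, LINE_128 = 128 };

typedef struct {
    uintptr_t start;  /* first byte zeroed */
    size_t length;    /* number of bytes zeroed */
} zeroed_range;

/* dcbz zeroes the entire cache block that contains addr, so the affected
 * range starts at addr rounded down to a multiple of the block size.
 * line_size must be a power of two. */
zeroed_range dcbz_range(uintptr_t addr, size_t line_size)
{
    zeroed_range r;
    r.start = addr & ~((uintptr_t)line_size - 1);
    r.length = line_size;
    return r;
}

/* Hypothetical self-check: code that issues dcbz at 0x1040 expecting to
 * clear only 0x1040..0x105f would also wipe 0x1000..0x103f with 128-byte
 * blocks. */
void check_dcbz(void)
{
    zeroed_range small = dcbz_range(0x1040, LINE_32);
    zeroed_range large = dcbz_range(0x1040, LINE_128);
    assert(small.start == 0x1040 && small.length == 32);
    assert(large.start == 0x1000 && large.length == 128);
}
```

Old code that hardcodes a 32-byte stride around dcbz is therefore not just slower on a larger line size; it can zero data it never meant to touch.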
Title: Re: Learning POWER9 assembly
Post by: MauryG5 on December 07, 2021, 07:39:04 am
I understand, so a good part of those instructions are available to us. The 601-specific ones wouldn't even make sense today; I think they are too dated by now, so that's fair enough. The article I read a few years ago was correct, then, so fine. What about the vector and floating-point units: has Intel really outdone us in this area, or are we still the best with AltiVec's direct successors?
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on December 07, 2021, 11:09:39 am
Vectors are still "limited" to 128 bit, though I would also argue that the efficiency improvements with 256 and 512 bit vectors are more questionable. A good scalar system is what Power ISA needed, and VSX is a big improvement (no more spilling to memory to exchange between FP registers and integer registers, for example). I use it heavily in the Firefox JIT.
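For readers wondering what the FPR<->GPR exchange looks like from C, here is a hedged sketch (not taken from the Firefox JIT; the function names are invented). The portable path copies the bits of a double through memory with `memcpy`, which is conceptually the old store-then-load dance. The guarded path uses GCC-style inline assembly with `mfvsrd`, assuming a 64-bit Power target with VSX and at least a POWER8-level ISA; on any other target it falls back to the portable path. A compiler targeting such a CPU may well emit the direct move for the `memcpy` version on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable path: move the raw bits of a double into an integer by way of
 * memory. Assumes double is a 64-bit IEEE 754 value. */
uint64_t double_bits_via_memory(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Direct path: on a VSX-capable 64-bit Power target (an assumption checked
 * by the macros below), read the register directly with mfvsrd. */
uint64_t double_bits_direct(double d)
{
#if defined(__powerpc64__) && defined(__VSX__) && defined(_ARCH_PWR8)
    uint64_t bits;
    __asm__("mfvsrd %0,%x1" : "=r"(bits) : "wa"(d));
    return bits;
#else
    return double_bits_via_memory(d);
#endif
}

/* Hypothetical self-check: both paths must produce the same bit pattern. */
void check_double_bits(void)
{
    const double samples[] = { 0.0, 1.0, -2.0, 3.5 };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(double_bits_direct(samples[i]) ==
               double_bits_via_memory(samples[i]));
}
```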
Title: Re: Learning POWER9 assembly
Post by: MauryG5 on December 07, 2021, 01:34:20 pm
Understood, so from what I gather you think x86's 256-bit or 512-bit vectors are not really an advantage ... Do you think that in the end we will be able to have a Firefox without the limits this browser has had on Power so far, including compared to Chromium? Unfortunately I still can't visit every site the way I can in Chromium ... Given the work you do on Firefox with so much passion, I think it would be nice to finally package a version dedicated to us Power users, maybe changing some colors of the logo: for example, instead of orange, a fire red, which in my opinion would represent a Power version well ...
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on December 13, 2021, 06:27:17 pm
I don't really want to be in the business of making a Power-specific Firefox long term (I did that for over a decade with TenFourFox and it's a pain); I'd rather have a build that distros can pick up. The aim with the JIT is to do just that. There will necessarily be a separate branch for a bit while the code stabilizes but the goal is not to have a fork.
Title: Re: Learning POWER9 assembly
Post by: MauryG5 on December 14, 2021, 01:05:27 pm
Yes, I understand what you mean. I only suggested it to favor our build as much as possible: if it were Power-specific, all the code would be optimized for us and it would perform even better, but perhaps with the way you are already working the end result is just as good as a version optimized for us. Separately, I need your advice as soon as possible, since you probably know Firefox better than anyone else here, about an annoying problem that I can't solve, that affects the Ubuntu version, and that no one has solved so far ... If you tell me where to write to report the problem, I will be grateful ... Thanks
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on December 15, 2021, 10:07:36 pm
Well, what's the exact problem? You can certainly post a report on Bugzilla but they'd probably appreciate it being triaged first.
Title: Re: Learning POWER9 assembly
Post by: amock on December 15, 2021, 11:17:06 pm
Are you talking about the problem reported at https://bugzilla.mozilla.org/show_bug.cgi?id=1591164? I never got around to reopening that bug, but if you are having that problem I think that's a good place to start. I have been building my own Firefox to get around it, but I just tested with the system Firefox and it still happens.
Title: Re: Learning POWER9 assembly
Post by: MauryG5 on December 16, 2021, 07:40:47 am
No, I'm talking about the problem I already described some time ago among the anomalies related to Ubuntu on Power. Whenever you open Firefox on Ubuntu, it tells you that the history is not working because some other part of Ubuntu is using the file that Firefox needs for the history to work properly. This has never been solved, and the bad thing is that only Firefox on Ubuntu has this problem; in other distributions, including Debian, it does not happen ... I tried deleting the file that the Firefox guide recommends deleting, but unfortunately it keeps happening ...
Title: Re: Learning POWER9 assembly
Post by: ClassicHasClass on December 16, 2021, 10:37:53 am
I think that's the same bug. I don't encounter it on Fedora. Maybe it's Ubuntu-specific.
Title: Re: Learning POWER9 assembly
Post by: MauryG5 on December 16, 2021, 03:40:42 pm
Yes Classic, the problem is specific to the Ubuntu version, it seems that the security software that Ubuntu uses is in conflict with the Firefox bookmark and even trying to disable this Ubuntu security software, it still does not work the same ...