Author Topic: Learning POWER9 assembly  (Read 16256 times)

vmlinuz

  • Newbie
  • *
  • Posts: 25
  • Karma: +0/-0
    • View Profile
Learning POWER9 assembly
« on: November 02, 2021, 01:16:22 am »
Can anyone recommend some good resources for someone who wants to get into POWER9 assembly programming? I have an old book somewhere about 32-bit PowerPC assembly but don't know how much things have changed since the days of the Power Macintosh G4.

ClassicHasClass

  • Sr. Member
  • ****
  • Posts: 462
  • Karma: +35/-0
  • Talospace Earth Orbit
    • View Profile
    • Floodgap
Re: Learning POWER9 assembly
« Reply #1 on: November 02, 2021, 07:56:00 pm »
32-bit assembly is a start, but it will only get you part of the way; you should really start with 64-bit and just think in those terms. Fortunately, PowerPC was very forward-looking from the beginning, and 64-bit instructions were specified even in the days of the 601. My best reference for Power ISA is a very old book called Optimizing PowerPC Code by Gary Kacmarcik, which I supplement with the Power ISA 3.0b documentation and the ppc64le ELF v2 ABI specification. I have a physical copy of OPPCC, but there are PDF copies around.

mparnaudeau

  • Newbie
  • *
  • Posts: 26
  • Karma: +8/-0
  • Freelance software developer and PPC fan
    • View Profile
Re: Learning POWER9 assembly
« Reply #2 on: November 08, 2021, 04:33:25 am »
I also have a physical copy of Optimizing PowerPC Code, and I recommend it too for getting familiar with the PowerPC instruction set in general, which it describes by category (load-store, integer, FP, branches ...). The book also explains stack frames well and lists the instructions one per page, including the PPC64 instructions, since this clean architecture was specified for both 32-bit and 64-bit very early on. The part on optimization is good but rather short, and it may not match the current POWER architecture on some points.

Another good resource I remember is "Ensamblador del PowerPC con Mac OS X". It is in Spanish, but it is written more as a tutorial.

Searching for "powerpc asm tutorial" also turns up good resources, from IBM and from others.

vmlinuz

Re: Learning POWER9 assembly
« Reply #3 on: November 08, 2021, 11:34:48 pm »
I did some reading of the recommended materials thus far. How in the actual hell is this considered a Reduced Instruction Set?!?!? PowerPC from almost 30 years ago makes the z/Architecture look like a paragon of simplicity and elegance - God only knows what abominations await in POWER9.

Update:



Are you f**king kidding me
« Last Edit: November 08, 2021, 11:47:01 pm by vmlinuz »

ClassicHasClass

Re: Learning POWER9 assembly
« Reply #4 on: November 09, 2021, 01:09:54 pm »
No, actually, the mscdfr0-type instructions are the ones I hate. eieio is just fun for jokes though it has practical use as a lightweight barrier. But sheer number of instructions aside, RISC really is now just parlance for load-store. In that sense FISC might be more appropriate: https://news.ycombinator.com/item?id=28601455
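
As a rough illustration of the "lightweight barrier" use, here is a minimal C sketch, not taken from any real driver. The names store_barrier, struct mailbox and mailbox_post are made up for the example, and the non-PowerPC fallback is only an assumption so that the snippet builds elsewhere with GCC or Clang. The idea is that eieio keeps the two stores in order for whoever is watching the mailbox, without paying for a full sync.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: order earlier stores before later ones.
 * On PowerPC this emits eieio; elsewhere it falls back to a (stronger)
 * compiler-provided fence so the example still builds. */
static inline void store_barrier(void) {
#if defined(__powerpc__) || defined(__powerpc64__)
    __asm__ volatile("eieio" ::: "memory");
#else
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Hypothetical mailbox shared with another CPU or a device. */
struct mailbox {
    uint32_t payload;
    uint32_t ready;
};

/* Write the payload first, then raise the flag; the barrier keeps
 * the flag store from becoming visible before the payload store. */
static void mailbox_post(volatile struct mailbox *m, uint32_t value) {
    assert(m);
    m->payload = value;
    store_barrier();
    m->ready = 1;
}
```

The reader on the other side needs its own ordering between loading the flag and loading the payload; this sketch only covers the writer.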

ARM is just as bad (in fact worse due to their crazy encodings), and I think RISC-V will eventually metastasize. Even MIPS is getting that way. That said, I'll also add as someone handcoding assembly right now for the Firefox JIT that it was so nice to finally have instructions for GPR<->FPR moves in VSX plus a lot more rounding-type instructions rather than having to serialize the FPSCR by twiddling bits. And VSX is way more complete than VMX used to be (I had to tie myself in knots to write good AltiVec routines). ISA 3.0 added a whopping number of instructions to what was already a large instruction set but I find I'm actually using them.
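
For anyone who wants to see the GPR<->FPR moves from C rather than from a JIT, here is a small sketch. The function names double_to_bits and bits_to_double are just made up for illustration. My assumption is that on ppc64le, when targeting POWER8 or later, GCC and Clang will usually turn these into a single mfvsrd or mtvsrd, while for older targets the same source goes through a store and a load on the stack. Check the disassembly rather than taking my word for it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Copy the raw bits of a double into an integer (FPR -> GPR direction).
 * memcpy is the portable way to do this without aliasing problems. */
static inline uint64_t double_to_bits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Copy raw integer bits into a double (GPR -> FPR direction). */
static inline double bits_to_double(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Something like `gcc -O2 -mcpu=power9 -S` on a file containing these two functions should show whether the compiler used the direct moves or went through memory.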

vmlinuz

Re: Learning POWER9 assembly
« Reply #5 on: November 09, 2021, 04:47:40 pm »
https://twitter.com/ppcinstructions

oh no

And to be fair, from what I can tell the z/Architecture has MSCDFR0 instructions as well... seems to just be an IBM-ism
« Last Edit: November 09, 2021, 04:50:15 pm by vmlinuz »

vmlinuz

Re: Learning POWER9 assembly
« Reply #6 on: November 10, 2021, 10:41:25 pm »
Would it be quite accurate to say that to get anything resembling performance out of a POWER9 CPU, you need to be extremely careful about the order in which you schedule instructions, cache hints, etc., therefore requiring an extreme degree of knowledge regarding the CPU's inner workings? If so, since it clearly would be infeasible to just write programs in assembly for fun, where did you learn to do this? I would expect it to be at least a little more programmer-friendly than some older RISC designs.

ClassicHasClass

Re: Learning POWER9 assembly
« Reply #7 on: November 11, 2021, 11:51:31 pm »
No, you don't need to be extremely careful, and POWER9 has made this easier by getting rid of dispatch groups (performance used to be sensitive to how instructions were packaged up into dispatch groups and made their way through the execution pipeline). Register renaming is also a lot more powerful than it used to be, though if you can make use of more registers manually, that always helps. The biggest sources of slowdowns are inappropriate use of the link register for branching or trampolines, which will foul the return address cache, and avoidable or aliased spills to memory. That is why the FPR<->GPR moves in VSX are so great: moving a GPR to an FPR used to require a store to memory and a load back, which inevitably aliased if you did this on the stack, so you also needed nops. As a general rule, Power chips are also way better at straight-line code than branching, even if the straight-line code seems to do more work.
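
To make the straight-line point concrete, here is a toy C sketch of the same operation written with a branch and without one. The names max_branchy, max_branchless and self_check are invented for the example. Whether the compiler actually keeps the branch, or turns either version into something like isel, depends on the compiler and flags, so treat this as an illustration of the idea rather than a benchmark.

```c
#include <assert.h>
#include <stdint.h>

/* Branching version: the obvious way to write it. */
static inline int64_t max_branchy(int64_t a, int64_t b) {
    if (a > b) {
        return a;
    }
    return b;
}

/* Straight-line version: mask is all ones when a > b, else zero.
 * b ^ ((a ^ b) & mask) yields a when the mask is set and b otherwise.
 * More operations on paper, but no branch to mispredict. */
static inline int64_t max_branchless(int64_t a, int64_t b) {
    int64_t mask = -(int64_t)(a > b);
    return b ^ ((a ^ b) & mask);
}

/* Quick sanity check that both versions agree. */
static void self_check(void) {
    assert(max_branchless(3, 7) == max_branchy(3, 7));
    assert(max_branchless(-5, -9) == max_branchy(-5, -9));
}
```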

vmlinuz

Re: Learning POWER9 assembly
« Reply #8 on: November 12, 2021, 01:02:56 am »
Can I just say, thank you for making Firefox not a complete juddering mess on POWER9? My Talos II just got here and I definitely got my money's worth... at least!

And that's great to hear, I was afraid it would be like Itanium.

ClassicHasClass

Re: Learning POWER9 assembly
« Reply #9 on: November 13, 2021, 10:18:59 am »
Thanks! Most of what's in Firefox now is just taking better advantage of VMX and VSX. The JIT should blow those doors open, though just remember that the MVP is the first-stage JIT (there are three stages) and Wasm support, so it should not be interpreted as "as good as it gets." And don't forget to do an LTO-PGO build if your distro doesn't; I keep those patches up to date: https://www.talospace.com/2021/11/firefox-94-on-power.html

vmlinuz

Re: Learning POWER9 assembly
« Reply #10 on: November 14, 2021, 08:49:11 pm »
Well, I'm on Fedora 35 and I can hardly tell it's not an x86 PC. As an aside, you wouldn't happen to know how to stop 60 FPS video on YouTube from stuttering and freezing? That's really the only weak point, video playback. I can just download the video and use VLC if I really need to... It would appear to be strictly an issue with vp9 playback as when I use an extension to disable vp9, I get better (though still not perfect) results.
« Last Edit: November 15, 2021, 12:00:24 am by vmlinuz »

MPC7500

  • Hero Member
  • *****
  • Posts: 587
  • Karma: +41/-1
    • View Profile
    • Twitter
Re: Learning POWER9 assembly
« Reply #11 on: November 15, 2021, 06:32:37 am »
I guess you have already installed ffmpeg-libs from the RPM Fusion repo?

vmlinuz

Re: Learning POWER9 assembly
« Reply #12 on: November 15, 2021, 03:04:15 pm »
Wait what? That's not installed by default?

Update: Video playback for non-vpx is now vastly improved. There are probably more libraries I need to install for vp8/9 to work?

Update 2: Could be an issue with my Polaris GPU; it's not clear whether they ever supported vp9
« Last Edit: November 15, 2021, 07:21:29 pm by vmlinuz »

MPC7500

Re: Learning POWER9 assembly
« Reply #13 on: November 16, 2021, 05:00:46 pm »

vmlinuz

Re: Learning POWER9 assembly
« Reply #14 on: November 16, 2021, 08:06:02 pm »
That did the trick, now I can watch penguinz0 complain about video games in glorious 1080p60 on PowerPC. Above 1080p60 is kind of a mess - not sure, but I think that's my internet connection's fault this time.