Learning POWER9 assembly
vmlinuz:
https://twitter.com/ppcinstructions
oh no
And to be fair, from what I can tell the z/Architecture has MSCDFR0 instructions as well... seems to just be an IBM-ism
vmlinuz:
--- Quote from: ClassicHasClass on November 09, 2021, 01:09:54 pm ---No, actually, the mscdfr0-type instructions are the ones I hate. eieio is just fun for jokes though it has practical use as a lightweight barrier. But sheer number of instructions aside, RISC really is now just parlance for load-store. In that sense FISC might be more appropriate: https://news.ycombinator.com/item?id=28601455
ARM is just as bad (in fact worse due to their crazy encodings), and I think RISC-V will eventually metastasize. Even MIPS is getting that way. That said, I'll also add as someone handcoding assembly right now for the Firefox JIT that it was so nice to finally have instructions for GPR<->FPR moves in VSX plus a lot more rounding-type instructions rather than having to serialize the FPSCR by twiddling bits. And VSX is way more complete than VMX used to be (I had to tie myself in knots to write good AltiVec routines). ISA 3.0 added a whopping number of instructions to what was already a large instruction set but I find I'm actually using them.
--- End quote ---
Would it be accurate to say that getting anything resembling good performance out of a POWER9 CPU requires being extremely careful about instruction scheduling, cache hints, and so on, and therefore an extreme degree of knowledge about the CPU's inner workings? If so, since it clearly would be infeasible to just write programs in assembly for fun, where did you learn to do this? I would expect it to be at least a little more programmer-friendly than some older RISC designs.
ClassicHasClass:
No, you don't need to be extremely careful, and POWER9 has made this easier by getting rid of dispatch groups (this used to be sensitive to how instructions were packaged up into dispatch groups and made their way through the execution pipeline). Register renaming is also a lot more powerful than it used to be, though if you can make use of more registers manually, that always helps. The biggest sources of slowdowns are inappropriate use of the link register for branching or trampolines, which will foul the return address cache, and avoidable or aliased spills to memory, which is why the FPR<->GPR moves in VSX are so great (to move a GPR to the FPR used to require a memory spill and load, which inevitably aliased if you did this on the stack, so you also needed nops). As a general rule, Power chips are also way better at straightline code than branching, even if the straightline code seems to do more work.
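To make two of the points above concrete, here is a minimal C sketch rather than hand-written assembly. The helper names (double_to_bits, bits_to_double, max_branchless) are hypothetical and chosen only for illustration. The bit-cast helpers express a GPR<->FPR move at the source level; whether the compiler turns them into a direct register move or a store and reload depends on the target CPU and compiler flags, so treat that as an assumption and check the generated assembly. The last function shows the "straightline instead of branching" idea with a branch-free maximum.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* The bit-cast helpers below assume a 64-bit double. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Hypothetical helper: copy the raw bits of a double into an integer.
 * memcpy is the portable way to express this; an optimizing compiler
 * is free to implement it as a register-to-register move when the
 * target has one, instead of a spill to the stack. */
static inline uint64_t double_to_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Hypothetical helper: the reverse direction, integer bits to double. */
static inline double bits_to_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: maximum of two integers without a branch.
 * mask is all ones when a < b and zero otherwise, so exactly one of
 * the two operands survives the AND/OR selection. */
static inline int64_t max_branchless(int64_t a, int64_t b)
{
    int64_t mask = -(int64_t)(a < b);
    return (a & ~mask) | (b & mask);
}
```

The branch-free version does slightly more arithmetic than an if/else, which matches the remark that straightline code can win even when it seems to do more work. Whether it actually wins for a given loop is something to measure, not assume.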
vmlinuz:
--- Quote from: ClassicHasClass on November 11, 2021, 11:51:31 pm ---No, you don't need to be extremely careful, and POWER9 has made this easier by getting rid of dispatch groups (this used to be sensitive to how instructions were packaged up into dispatch groups and made their way through the execution pipeline). Register renaming is also a lot more powerful than it used to be, though if you can make use of more registers manually, that always helps. The biggest sources of slowdowns are inappropriate use of the link register for branching or trampolines, which will foul the return address cache, and avoidable or aliased spills to memory, which is why the FPR<->GPR moves in VSX are so great (to move a GPR to the FPR used to require a memory spill and load, which inevitably aliased if you did this on the stack, so you also needed nops). As a general rule, Power chips are also way better at straightline code than branching, even if the straightline code seems to do more work.
--- End quote ---
Can I just say, thank you for making Firefox not a complete juddering mess on POWER9? My Talos II just got here and I definitely got my money's worth... at least!
And that's great to hear, I was afraid it would be like Itanium.
ClassicHasClass:
Thanks! Most of what's in Firefox now is just taking better advantage of VMX and VSX. The JIT should blow those doors open, though just remember that the MVP is the first-stage JIT (there are three stages) and Wasm support, so it should not be interpreted as "as good as it gets." And don't forget to do an LTO-PGO build if your distro doesn't; I keep those patches up to date: https://www.talospace.com/2021/11/firefox-94-on-power.html