No, actually, the mscdfr0-type instructions are the ones I hate. eieio is just fun for jokes, though it does have practical use as a lightweight memory barrier. But sheer number of instructions aside, RISC is really now just parlance for load-store. In that sense FISC might be more appropriate: https://news.ycombinator.com/item?id=28601455
ARM is just as bad (in fact worse, due to their crazy encodings), and I think RISC-V will eventually metastasize. Even MIPS is getting that way. That said, I'll also add, as someone hand-coding assembly right now for the Firefox JIT, that it was so nice to finally have instructions for GPR<->FPR moves in VSX, plus a lot more rounding-type instructions, rather than having to serialize the FPSCR by twiddling bits. And VSX is way more complete than VMX used to be (I had to tie myself in knots to write good AltiVec routines). ISA 3.0 added a whopping number of instructions to what was already a large instruction set, but I find I'm actually using them.
Would it be accurate to say that to get anything resembling performance out of a POWER9 CPU, you need to be extremely careful about instruction scheduling, cache hints, etc., and therefore need extensive knowledge of the CPU's inner workings? If so, since it clearly would be infeasible to just write programs in assembly for fun, where did you learn to do this? I would expect it to be at least a little more programmer-friendly than some older RISC designs.