Author Topic: powerpc equivalent to x86 intrinsics?  (Read 225 times)

atomicdog

  • Newbie
  • *
  • Posts: 46
  • Karma: +4/-0
powerpc equivalent to x86 intrinsics?
« on: January 29, 2025, 11:07:59 pm »
Does anyone know of equivalent PPC asm or builtins for these GCC x86 intrinsics?
Code:
_mm_sfence();
_mm_lfence();  __builtin_ia32_lfence (void);
_mm_mfence();

ClassicHasClass

  • Sr. Member
  • ****
  • Posts: 479
  • Karma: +39/-0
  • Talospace Earth Orbit
Re: powerpc equivalent to x86 intrinsics?
« Reply #1 on: January 30, 2025, 09:54:51 pm »
The semantics are not exact, but _mm_lfence() basically holds until all prior local loads have completed (approximately a load fence), _mm_sfence() holds until all prior stores have completed (approximately a store fence), and _mm_mfence() is effectively a complete memory fence. The GCC __builtin_ia32_lfence built-in is effectively LFENCE.

There are no precise Power equivalents. If you're in doubt and you can afford the hit, something like __asm__ __volatile__("sync; isync" ::: "memory") should always work as a substitute for any x86 fence instruction in any situation, but it is the slowest option. (The "memory" clobber also keeps the compiler itself from moving loads and stores across the fence.) This combines a memory sync with an instruction sync: all prior instructions must complete and commit their results to memory, making a consistent result visible to other threads, and all succeeding instructions execute in that context.
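As a rough sketch of that heavyweight substitution, something like the following could stand in for all three intrinsics while you get the port working. The names PORT_FULL_FENCE, port_sfence, port_lfence, port_mfence and publish are made up for illustration; only the sync/isync pair comes from the suggestion above. The non-Power branch is an assumption added so the same file still builds on other hosts.

```c
/* Hypothetical helper names; only the "sync; isync" sequence is from the post. */
#if defined(__powerpc__) || defined(__powerpc64__) || defined(__PPC__)
/* The "memory" clobber stops the compiler from moving accesses across the fence. */
#define PORT_FULL_FENCE() __asm__ __volatile__("sync; isync" ::: "memory")
#else
/* Assumption: on non-Power hosts, fall back to GCC's full-barrier builtin. */
#define PORT_FULL_FENCE() __sync_synchronize()
#endif

/* Heaviest possible stand-ins for the three x86 fences. */
static inline void port_sfence(void) { PORT_FULL_FENCE(); }
static inline void port_lfence(void) { PORT_FULL_FENCE(); }
static inline void port_mfence(void) { PORT_FULL_FENCE(); }

/* Example use: write a payload, fence, then raise a "ready" flag. */
static inline void publish(volatile int *payload, volatile int *ready, int value)
{
    *payload = value;
    port_sfence();      /* payload store is ordered before the flag store */
    *ready = 1;
}
```

Every fence costs the same here, which is the point: correctness first, then relax individual call sites once the port runs.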

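That said, a plain __asm__ __volatile__("sync" ::: "memory") (sync is also spelled hwsync) may be sufficient to replace any of these fences. Note that without the isync it doesn't discard any prefetched instructions, so it's possible such instructions may still run in the old context. In most cases __asm__ __volatile__("lwsync" ::: "memory") will also work, and is lighter still. It won't work in certain cases (it doesn't order a store against a later load, so it isn't a full stand-in for _mm_mfence(), and it doesn't cover caching-inhibited memory), but that shouldn't affect most user programs.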

Another option is the eieio instruction (the best-named instruction in the ISA, right up there with xxlxor), which is intended for memory-mapped I/O. This makes pending loads and stores run in order. That's not exactly a memory fence but it can act like one and is also pretty quick.

I'd start with replacing them with the heavyweight version, making sure it works, and then seeing what you can get away with. x86's strong memory model can sometimes make things difficult for RISC, especially multi-core/multi-processor systems.
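One way to organize that "start heavy, then back off" approach is a build-time switch. This is only a sketch: PORT_FENCE_LEVEL, PPC_BARRIER and the PORT_*FENCE macros are hypothetical names, not part of GCC or any driver. The level 0 mapping also assumes the fenced accesses are to ordinary cacheable memory; device memory may still need sync or eieio.

```c
/* PORT_FENCE_LEVEL is a hypothetical build-time knob:
 *   2 = sync; isync (heaviest, default)
 *   1 = sync
 *   0 = lwsync for load/store fences, sync for the full fence */
#ifndef PORT_FENCE_LEVEL
#define PORT_FENCE_LEVEL 2
#endif

#if defined(__powerpc__) || defined(__powerpc64__) || defined(__PPC__)
#define PPC_BARRIER(insn) __asm__ __volatile__(insn ::: "memory")
#else
/* Assumption: non-Power hosts just get GCC's full-barrier builtin. */
#define PPC_BARRIER(insn) __sync_synchronize()
#endif

#if PORT_FENCE_LEVEL >= 2
#define PORT_SFENCE() PPC_BARRIER("sync; isync")
#define PORT_LFENCE() PPC_BARRIER("sync; isync")
#define PORT_MFENCE() PPC_BARRIER("sync; isync")
#elif PORT_FENCE_LEVEL == 1
#define PORT_SFENCE() PPC_BARRIER("sync")
#define PORT_LFENCE() PPC_BARRIER("sync")
#define PORT_MFENCE() PPC_BARRIER("sync")
#else
/* lwsync does not order a store against a later load, so the
 * full fence stays on sync even at the lightest level. */
#define PORT_SFENCE() PPC_BARRIER("lwsync")
#define PORT_LFENCE() PPC_BARRIER("lwsync")
#define PORT_MFENCE() PPC_BARRIER("sync")
#endif
```

Building with -DPORT_FENCE_LEVEL=1 or -DPORT_FENCE_LEVEL=0 then lets you retest the same code with lighter barriers without touching the call sites.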

atomicdog

  • Newbie
  • *
  • Posts: 46
  • Karma: +4/-0
Re: powerpc equivalent to x86 intrinsics?
« Reply #2 on: January 31, 2025, 10:07:38 pm »
Thanks. I bought a Grayskull off eBay and now need to try to port the user-mode driver to PPC.