« on: December 12, 2022, 03:24:28 am »
A tough question I think ...
It looks to me like the skiroot 5.5-based kernel isn't able to cope with some states that an XFS filesystem written by a much newer kernel can be left in. Thus the solution could be to switch to upstream PNOR firmware builds, which use 5.10 kernels (still quite old; no development for p9, a little bit for p10 ...). That should be doable for Blackbird, and there is some old work in progress for Talos from me. I agree the safe way around is to avoid XFS (and/or btrfs) for the host OS rootfs or /boot and rely on ext4. I think the misbehaviour won't be ppc64le specific, but it's uncommon to use a 5.5 kernel to read a filesystem written by a much newer kernel ...
And then there is the question of how to recover from the failure without physical access. If there were a way to set a host nvram variable from the BMC, one could disable the xfs module for the skiroot kernel and boot from other media.