Petitboot lock after Fedora 36 update
DKnoto:
Problem solved. I took out the SSD and checked the XFS partitions on my laptop. Nothing was wrong; no errors were found. I put it back in the Talos and it works.
Magic ;)
DKnoto:
I've had two more XFS failures in the past two weeks: once after another kernel update and once after an unexpected power outage. This is definitely too much. Fixing it is tedious and time-consuming, because I have to remove the drive from the Talos II and put it in another computer. Petitboot can't skip the stage where it reads the state of the file systems on the SSD, and plugging in a rescue system on USB does nothing.
With the current state of Petitboot software, it is impractical to install the main file system on XFS.
ClassicHasClass:
Well, this just happened to me after the Fedora 37 upgrade and I'm pissed. I don't have another machine set up to fix the filesystem and Petitboot won't respond to any keys before trying to mount filesystems. I'm not sure what I can do with it yet.
@sharkcz, what can we do?
sharkcz:
A tough question I think ...
It looks to me like the skiroot 5.5-based kernel can't cope with some states that an XFS filesystem written by a much newer kernel can be left in. Thus the solution could be to switch to upstream PNOR firmware builds, which use 5.10 kernels (still quite old, no development for p9, a little bit for p10 ...). Should be doable for Blackbird; there is some old work in progress for Talos from me. I agree the safe way around is to avoid XFS (and/or btrfs) for the host OS rootfs or /boot and rely on ext4. I think the misbehaviour isn't ppc64le specific, but it's uncommon to use a 5.5 kernel to read a filesystem written by a much newer kernel ...
As for how to recover from the failure without physical access: if there were a way to set a host NVRAM variable from the BMC, one could disable the xfs module for the skiroot kernel and boot from other media.
ClassicHasClass:
All right, I'm back up. The Blackbird fixed the XFS volume: there was a stuck log entry. Arguably Fedora may not have unmounted it cleanly, but Petitboot shouldn't just crash as a result. There really needs to be a way to bypass mounting completely (more details: https://www.talospace.com/2022/12/when-petitboot-barfs-everythings-vomit.html ).
Thanks to @sharkcz for letting me bounce ideas off him by e-mail.
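For anyone repeating the "pull the drive and check it on another machine" recovery described in this thread, here is a minimal sketch of the read-only check step. It is not what anyone above actually ran. It assumes the second machine has xfsprogs installed, that the partition is unmounted, and that the device path (the placeholder /dev/sdX1 below) is whatever your system assigns. The function names are made up for illustration. `xfs_repair -n` only inspects the filesystem and exits non-zero if it finds corruption.

```python
import shutil
import subprocess


def xfs_check_command(device: str) -> list[str]:
    """Build a read-only xfs_repair invocation (-n means 'no modify')."""
    return ["xfs_repair", "-n", device]


def xfs_looks_clean(device: str) -> bool:
    """Return True if `xfs_repair -n` reports no corruption.

    `device` is a hypothetical path such as /dev/sdX1; the partition
    must not be mounted, and the call normally needs root.
    """
    if shutil.which("xfs_repair") is None:
        raise RuntimeError("xfs_repair not found; install xfsprogs first")
    result = subprocess.run(
        xfs_check_command(device), capture_output=True, text=True
    )
    # Show whatever xfs_repair reported so the user can read the details.
    print(result.stdout + result.stderr)
    return result.returncode == 0
```

Run as root, `xfs_looks_clean("/dev/sdX1")` would report whether the check passed. If the log is dirty, mounting and cleanly unmounting the partition on the newer kernel lets XFS replay it; `xfs_repair -L` discards the log and is a last resort.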