Topic: OpenBMC reporting critical error Proc.FSI.Error.MasterDetectionFailure

r34per

  • Newbie
  • *
  • Posts: 22
  • Karma: +1/-0
I checked the OpenBMC web interface and found it reporting critical health, with 200 high-priority errors logged yesterday over the course of about 3 hours. As far as I can tell, it's the same error for all of them.

The error is
Code:
org.open_power.Proc.FSI.Error.MasterDetectionFailure
When I expand the entry, this is what it reads:
Code:
CALLOUT_DEVICE_PATH=/sys/devices/platform/gpio-fsi/fsi0/slave@00:00/raw CALLOUT_ERRNO=0 _PID=8606
Like I said, they all appear to be the same error with that same message, although the _PID= number is different. My Blackbird also appears to have powered off at some point, though I don't know when (can I check that somewhere in OpenBMC?). I stepped away from my PC for the evening before the first error was logged and forgot to shut it down for the night, and when I went to use it this morning it was not running.
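
For anyone who wants to confirm that a pile of entries like this really is one error repeated, here is a rough Python sketch. It assumes the expanded entries have been copied out as one line of space-separated KEY=VALUE fields each, like the one above, with no spaces inside values. The function names and the second sample line (with its _PID value) are made up for illustration and are not from the actual log.

```python
from collections import Counter


def parse_entry(text):
    """Split 'KEY=VALUE KEY=VALUE ...' into a dict (assumes no spaces in values)."""
    fields = {}
    for field in text.split():
        key, _, value = field.partition("=")
        fields[key] = value
    return fields


def signature(entry, ignore=("_PID",)):
    """Hashable key built from every field except the per-process ones."""
    return tuple(sorted((k, v) for k, v in entry.items() if k not in ignore))


def tally(lines):
    """Count how many entries share each signature, skipping blank lines."""
    return Counter(signature(parse_entry(line)) for line in lines if line.strip())


sample = [
    # First line is the entry quoted in the post.
    "CALLOUT_DEVICE_PATH=/sys/devices/platform/gpio-fsi/fsi0/slave@00:00/raw CALLOUT_ERRNO=0 _PID=8606",
    # Hypothetical second entry: same fields, different _PID.
    "CALLOUT_DEVICE_PATH=/sys/devices/platform/gpio-fsi/fsi0/slave@00:00/raw CALLOUT_ERRNO=0 _PID=9120",
]

for sig, count in tally(sample).items():
    print(count, dict(sig))
```

If everything collapses into a single group, the entries differ only by _PID, which would match what is described here.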

I'm running Void Linux as the OS, and I couldn't find any logs that would shed any light on it. It seems that by default Void doesn't have a syslog daemon, and I never bothered installing one, oops.

Is this a cause for concern, and should I put a ticket in with RCS about it? It happened once before, a few weeks or so ago, but I chalked it up to a fluke; I cleared the logs and it seemed to be fine.
« Last Edit: April 28, 2023, 04:50:04 pm by r34per »

Hasturtium

  • Full Member
  • ***
  • Posts: 115
  • Karma: +10/-0
You may be interested to know that Raptor weighed in on their Twitter after it was pointed out to them, with a first suggestion to try reseating the CPU.

The link:
https://nitter.poast.org/RaptorCompSys/status/1652858741635665920#m

r34per

I'll give that a try, thanks for the heads up!

cy384

  • Newbie
  • *
  • Posts: 8
  • Karma: +1/-0
    • http://cy384.com/
I got this error in my Blackbird's BMC log when there was a brief (like one second) power outage (brownout maybe?) last week.

MPC7500

  • Hero Member
  • *****
  • Posts: 562
  • Karma: +37/-1
Then I would try this:
https://wiki.raptorcs.com/wiki/Troubleshooting/Guard_Partition

Otherwise, if that doesn't help, I would re-flash the BMC and OpenPOWER firmware:
https://wiki.raptorcs.com/wiki/Updating_Firmware
« Last Edit: May 05, 2023, 01:31:09 pm by MPC7500 »

r34per

Quote from: cy384
I got this error in my Blackbird's BMC log when there was a brief (like one second) power outage (brownout maybe?) last week.

That could be what's happening to mine too, now that I think about it. Brief brownouts aren't uncommon when it's windy or stormy, which it was when it happened. I should probably invest in a UPS for it and see if it still gives me any trouble. I'll try MPC7500's suggestions too, thanks for the help!
« Last Edit: May 09, 2023, 05:17:33 pm by r34per »