petitboot error: PHB#0005[0:5]: PHB Freeze/Fence detected !

FlyingBlackbird:
My Blackbird no longer boots into the petitboot menu (black screen), although I can still see the boot log on screen and via the SSH console, which I reach with:


--- Code: ---ssh -p 2200 root@my.ip.address
--- End code ---

The last available console output is a failure saying:


--- Code: ---[   90.334100762,3] PHB#0005[0:5]: eeh_freeze_clear on fenced PHB
               XE autoconfiguration failed

--- End code ---

The only drive installed is a SATA HDD with a bootable Ubuntu Server 19.10 (no GPU).
I have not changed or updated any firmware.
The last thing I did was test a PCIe GPU (NVIDIA).
After removing the GPU card from the PCIe slot I could no longer boot into petitboot...

What is going wrong?

The relevant log excerpt follows (the full console output is attached as a file):


--- Code: -----== Welcome to Hostboot hostboot-3beba24/hbicore.bin ==--

  3.06480|secure|SecureROM valid - enabling functionality
  8.29259|Booting from SBE side 0 on master proc=00050000
  8.46568|ISTEP  6. 5 - host_init_fsi
  8.93808|ISTEP  6. 6 - host_set_ipl_parms
  9.49036|ISTEP  6. 7 - host_discover_targets
 10.09037|HWAS|PRESENT> DIMM[03]=8080000000000000
 10.09038|HWAS|PRESENT> Proc[05]=8000000000000000
 10.09040|HWAS|PRESENT> Core[07]=5565000000000000
 10.49937|ISTEP  6. 8 - host_update_master_tpm
 10.52369|SECURE|Security Access Bit> 0x0000000000000000
 10.52370|SECURE|Secure Mode Disable (via Jumper)> 0x8000000000000000
...
 57.11902|ISTEP 21. 2 - host_verify_hdat
 57.20205|ISTEP 21. 3 - host_start_payload
[   58.179010391,5] OPAL skiboot-c81f9d6 starting...
[   58.179013552,7] initial console log level: memory 7, driver 5
[   58.179015737,6] CPU: P9 generation processor (max 4 threads/core)
...
[   65.266762790,5] PHB#0000:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..ff SLOT=CPU1 Slot2 (16x)
 Petitboot (v1.10.3-pdd2d545)
 ──────────────────────────────────────────────────────────────────────────────

  System information
  System configuration
  System status log
  Language
  Rescan devices
  Retrieve config from URL
  Plugins (0)
 *Exit to shell           










 ──────────────────────────────────────────────────────────────────────────────
 Enter=accept, e=edit, n=new, x=exit, l=language, g=log, h=help
 Info: Waiting for device discovery[   85.086133287,3] PHB#0005[0:5]: PHB Freeze/Fence detected !
[   85.086197573,3] PHB#0005[0:5]:             PCI FIR=2000000000000000
[   85.086249297,3] PHB#0005[0:5]:         PCI FIR WOF=2000000000000000
[   85.086289203,3] PHB#0005[0:5]:            NEST FIR=0000800000000000
[   85.086354836,3] PHB#0005[0:5]:        NEST FIR WOF=0000800000000000
[   85.086394899,3] PHB#0005[0:5]:            ERR RPT0=0000000000000001
[   85.086489826,3] PHB#0005[0:5]:            ERR RPT1=0000000000000000
[   85.086534460,3] PHB#0005[0:5]:             AIB ERR=0000200000000000
[   85.086941635,3] PHB#0005[0:5]:                  brdgCtl = 00000002
[   85.087002150,3] PHB#0005[0:5]:             deviceStatus = 00200020
[   85.087036852,3] PHB#0005[0:5]:               slotStatus = 00402000
[   85.087081358,3] PHB#0005[0:5]:               linkStatus = a8120008
[   85.087137004,3] PHB#0005[0:5]:             devCmdStatus = 00100107
[   85.087181127,3] PHB#0005[0:5]:             devSecStatus = 00000000
[   85.087239088,3] PHB#0005[0:5]:          rootErrorStatus = 00000000
[   85.087285589,3] PHB#0005[0:5]:          corrErrorStatus = 00000000
[   85.087325009,3] PHB#0005[0:5]:        uncorrErrorStatus = 00000000
[   85.087370016,3] PHB#0005[0:5]:                   devctl = 00000020
[   85.087419580,3] PHB#0005[0:5]:                  devStat = 00000020
[   85.087466277,3] PHB#0005[0:5]:                  tlpHdr1 = 00000000
...
[   85.088610802,3] PHB#0005[0:5]:       phbRxeArbErrorLog1 = 0000000000000000
  [Disk: sda2 / ef49aa17-bb70-4fea-a8fc-29e235f7ab9f]
    Ubuntu, with Linux 5.3.0-26-generic (recovery mode)
    Ubuntu, with Linux 5.3.0-26-generic
    Ubuntu, with Linux 5.3.0-29-generic (recovery mode)
    Ubuntu, with Linux 5.3.0-29-generic
    Ubuntu
[   85.088655011,3] PHB#0005[0:5]:     phbRxeMrgErrorStatus = 0000000000000001
...
[   85.089573601,3] PHB#0005[0:5]:                PEST[0ff] = 3740002a01000000 0000000000000000
 [enP4p1s0f2] Probing from base tftp://192.168.178.1/pxelinux.cfg/[   90.311273403,3] PHB#0005[0:5]: PHB Freeze/Fence detected !
[   90.311357669,3] PHB#0005[0:5]:             PCI FIR=2000000000000000
...
[   90.315282185,3] PHB#0005[0:5]:         phbRegbErrorLog1 = 0001020000000000
[   90.315338900,3] PHB#0005[0:5]:                PEST[000] = 8000000000000000 8000000000000000
[   90.315413179,3] PHB#0005[0:5]:                PEST[001] = 8000000000000000 8000000000000000
[   90.315491213,3] PHB#0005[0:5]:                PEST[002] = 8000000000000000 8000000000000000
...
[   90.333937493,3] PHB#0005[0:5]:                PEST[0fe] = 8000000000000000 8000000000000000
[   90.334011680,3] PHB#0005[0:5]:                PEST[0ff] = b740002a01000000 8000000000000000
[   90.334100762,3] PHB#0005[0:5]: eeh_freeze_clear on fenced PHB
               XE autoconfiguration failed

--- End code ---
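
For anyone comparing dumps like the one above, here is a minimal sketch (Python) that groups the skiboot register lines by freeze event. The function name `parse_freeze_events` and the regular expressions are assumptions based only on the line format visible in this log; they are not part of skiboot or any Raptor tooling, and the script does not interpret what the register values mean.

```python
import re

# Assumed line shape, taken from the log above, e.g.:
# [   85.086197573,3] PHB#0005[0:5]:             PCI FIR=2000000000000000
# search() is used later because petitboot UI text can precede the bracket.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+),(?P<lvl>\d)\]\s+"
    r"(?P<phb>PHB#[0-9a-fA-F]{4})\[\d+:\d+\]:\s*(?P<msg>.*)$"
)

# "name=value" or "name = value"; PEST entries carry two hex words.
REG_RE = re.compile(r"^(?P<name>[^=]+?)\s*=\s*(?P<value>[0-9a-fA-F ]+)$")


def parse_freeze_events(text):
    """Return one dict per 'PHB Freeze/Fence detected' line.

    Each dict holds the PHB id, the timestamp in seconds, and the
    register name/value pairs that follow until the next event.
    """
    events = []
    current = None
    for raw in text.splitlines():
        m = LINE_RE.search(raw)
        if not m:
            continue  # petitboot menu text, hostboot lines, "..." markers
        msg = m.group("msg").strip()
        if "PHB Freeze/Fence detected" in msg:
            current = {
                "phb": m.group("phb"),
                "time": float(m.group("ts")),
                "regs": {},
            }
            events.append(current)
            continue
        if current is None:
            continue  # register line seen before any freeze event
        r = REG_RE.match(msg)
        if r:
            current["regs"][r.group("name").strip()] = r.group("value").strip()
    return events


if __name__ == "__main__":
    import sys

    # Usage: python parse_phb.py console.log   (file name is hypothetical)
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for ev in parse_freeze_events(fh.read()):
            print(ev["phb"], ev["time"], len(ev["regs"]), "registers")
```

Run over the excerpt above, this should yield two events for PHB#0005 (at roughly 85 s and 90 s), which makes it easier to compare the register values between them.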

PS: I logged in to OpenBMC via SSH and see these strange error messages, which may be related:


--- Code: ---root@blackbird:~# journalctl | grep fail
May 10 19:37:25 blackbird kernel: g_mass_storage 1e6a0000.usb-vhub:p1: failed to start g_mass_storage: -22
May 10 19:37:27 blackbird systemd-udevd[789]: Process 'mtd_probe /dev/mtd2ro' failed with exit code 1.
May 10 19:37:27 blackbird systemd-udevd[790]: Process 'mtd_probe /dev/mtd3ro' failed with exit code 1.
May 10 19:37:27 blackbird systemd-udevd[837]: Process 'mtd_probe /dev/mtd4ro' failed with exit code 1.
May 10 19:37:27 blackbird systemd-udevd[792]: Process 'mtd_probe /dev/mtd5ro' failed with exit code 1.
May 10 19:37:28 blackbird systemd-udevd[788]: Process 'mtd_probe /dev/mtd0ro' failed with exit code 1.
May 10 19:37:28 blackbird systemd-udevd[791]: Process 'mtd_probe /dev/mtd1ro' failed with exit code 1.
May 10 19:37:28 blackbird systemd-udevd[836]: Process 'mtd_probe /dev/mtd6ro' failed with exit code 1.
May 10 19:37:29 blackbird kernel: A link change request failed with some changes committed already. Interface eth0 may have been left with an inconsistent configuration, please check.
May 10 19:37:31 blackbird kernel: A link change request failed with some changes committed already. Interface sit0 may have been left with an inconsistent configuration, please check.
Feb 03 22:05:40 blackbird kernel[1052]: [    3.810720] g_mass_storage 1e6a0000.usb-vhub:p1: failed to start g_mass_storage: -22
Feb 03 22:05:43 blackbird kernel[1052]: [   22.397690] A link change request failed with some changes committed already. Interface eth0 may have been left with an inconsistent configuration, please check.
Feb 03 22:05:43 blackbird kernel[1052]: [   24.051627] A link change request failed with some changes committed already. Interface sit0 may have been left with an inconsistent configuration, please check.
Feb 07 21:45:05 blackbird systemd[1]: Starting Stop the ethernet link failover...
Feb 07 21:45:07 blackbird systemd[1]: Started Stop the ethernet link failover.

--- End code ---

PS2: This question is a more precise follow-up to this earlier question:
         https://forums.raptorcs.com/index.php?action=post;topic=49.0;last_msg=473

SiteAdmin:

--- Code: ---PHB Freeze/Fence detected !
--- End code ---

This is an unhappy planar; the ASpeed VGA controller is not functioning correctly.  Try carefully removing the system from your case and powering on in a static-free environment (not on an antistatic mat as they are conductive).  If the problem persists you will need to submit an RMA request via the "My Account" link at https://www.raptorcs.com.

FlyingBlackbird:
Thanks a lot for your quick response. I will try it out this weekend and come back with my results.

FlyingBlackbird:

--- Quote from: FlyingBlackbird on February 07, 2020, 06:16:24 pm ---I will try it out this weekend and come back with my results

--- End quote ---

OK, I have tested my Blackbird with minimal hardware attached and the planar unmounted (detached) from the case's mounting points, and I can still reproduce the error
(see the attached log captured via the serial console + the picture of the planar).

Oh unlucky day, I have to open an RMA...


--- Code: ---...
[    4.095231] IMC PMU (null) Register failed
[    7.230872] kAFS: failed to register: -97
[    7.613717] udevd[1694]: specified group 'kvm' unknown
[    7.621590] udevd[1695]: specified group 'kvm' unknown
nvram process returned non-zero exit status
dmesg: klogctl: Operation not permitted
 Petitboot (v1.10.3-pdd2d545)
 ──────────────────────────────────────────────────────────────────────────────
  System information
  System configuration
  System status log
  Language
  Rescan devices
  Retrieve config from URL
  Plugins (0)
 *Exit to shell
 ──────────────────────────────────────────────────────────────────────────────
 Enter=accept, e=edit, n=new, x=exit, l=language, g=log, h=help
 Welcome to Petitboot
 Info: Waiting for device discovery[   86.421879937,3] PHB#0005[0:5]: PHB Freeze/Fence detected !
...
[   92.219717338,3] PHB#0005[0:5]: eeh_freeze_clear on fenced PHB

--- End code ---
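
A raw serial capture of the Petitboot UI is interleaved with terminal control sequences (cursor positioning, attribute changes), which makes the skiboot lines harder to read. Below is a minimal sketch for stripping them, assuming the capture file still contains the original ESC (0x1b) bytes. The helper name `strip_ansi` is hypothetical, and the pattern only covers CSI sequences and character-set designations, not every possible escape.

```python
import re

# CSI sequences (ESC [ parameters intermediates final-byte) and
# character-set designations such as ESC ) 0.
ANSI_RE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"
    r"|\x1b[()][0-9A-Za-z]"
)


def strip_ansi(text):
    """Remove terminal escape sequences, keeping the printable text."""
    return ANSI_RE.sub("", text)
```

Note that removing cursor-positioning sequences also removes the screen layout, so menu entries that were drawn on separate rows may end up on one line.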

MPC7500:
That's sad. Would be interesting to know what caused this error.
