UPDATE: The issue is still present in version 2.0 of the firmware.
Hi all
I am trying to get a Gigabyte Radeon RX Vega 64 graphics card (GV-RXVEGA64GAMING OC-8GD) working at boot. I followed the instructions at
https://wiki.raptorcs.com/wiki/Add_GPU_Firmware_To_BOOTKERNFW to bundle the firmware and then flash it to the PNOR:
$ sudo dnf install @development-tools
$ git clone https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
$ mkdir -p /tmp/firmware/amdgpu
$ cp linux-firmware/amdgpu/vega10_* /tmp/firmware/amdgpu/
$ cd /tmp/firmware
$ mksquashfs * /tmp/firmware.bin -all-root -keep-as-directory
$ scp /tmp/firmware.bin root@bmc-ip-address:/tmp/firmware.bin
$ ssh root@bmc-ip-address
$ pflash -P BOOTKERNFW -e -p /tmp/firmware.bin
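Before flashing, it may be worth checking that the staging directory really contains the Vega 10 blobs, and rebuilding the image without extended attributes, since the kernel log further down reports "SQUASHFS error: Xattrs in filesystem". The following is only a sketch: `stage_ok` and `build_image` are hypothetical helper names, `-no-xattrs` is an existing mksquashfs option, and whether dropping the xattrs changes the boot behaviour here is an assumption, not something confirmed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if DIR/amdgpu holds at least one vega10_* file.
stage_ok() {
    dir=$1
    for f in "$dir"/amdgpu/vega10_*; do
        [ -f "$f" ] && return 0
    done
    return 1
}

# Hypothetical helper: rebuild the image from STAGE into OUT without xattrs.
# Assumption: the xattrs come from the build host (for example SELinux labels)
# and are not needed inside the firmware image.
build_image() {
    stage=$1
    out=$2
    stage_ok "$stage" || { echo "no vega10 firmware staged in $stage" >&2; return 1; }
    command -v mksquashfs >/dev/null 2>&1 || { echo "mksquashfs not installed" >&2; return 2; }
    rm -f "$out"
    ( cd "$stage" && mksquashfs * "$out" -all-root -keep-as-directory -no-xattrs )
}
```

With the paths used above, this would be called as `build_image /tmp/firmware /tmp/firmware.bin`, followed by the same scp and pflash steps.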
When the system boots, no video output is detected on my screen. I switched back to the built-in HDMI output of the AST2500, and the system seems to hang with the following output:
--== Welcome to Hostboot hostboot-3beba24/hbicore.bin ==--
3.08320|secure|SecureROM valid - enabling functionality
8.62232|Booting from SBE side 0 on master proc=00050000
8.66832|ISTEP 6. 5 - host_init_fsi
9.20956|ISTEP 6. 6 - host_set_ipl_parms
9.80358|ISTEP 6. 7 - host_discover_targets
10.45804|HWAS|PRESENT> DIMM[03]=8080000000000000
10.45805|HWAS|PRESENT> Proc[05]=8000000000000000
10.45806|HWAS|PRESENT> Core[07]=5055500000000000
10.76196|ISTEP 6. 8 - host_update_master_tpm
10.83124|SECURE|Security Access Bit> 0x0000000000000000
10.83125|SECURE|Secure Mode Disable (via Jumper)> 0x8000000000000000
10.86796|ISTEP 6. 9 - host_gard
11.33699|HWAS|Deconfig HUID 0x00030000, Physical:/Sys0/Node0/DIMM0
11.33711|HWAS|FUNCTIONAL> DIMM[03]=0080000000000000
11.33712|HWAS|FUNCTIONAL> Proc[05]=8000000000000000
11.33714|HWAS|FUNCTIONAL> Core[07]=5055500000000000
11.33914|ISTEP 6.11 - host_start_occ_xstop_handler
12.86084|ISTEP 6.12 - host_voltage_config
13.01363|ISTEP 7. 1 - mss_attr_cleanup
14.48736|ISTEP 7. 2 - mss_volt
14.84739|ISTEP 7. 3 - mss_freq
15.53796|ISTEP 7. 4 - mss_eff_config
16.81043|ISTEP 7. 5 - mss_attr_update
16.84049|ISTEP 8. 1 - host_slave_sbe_config
16.94450|ISTEP 8. 2 - host_setup_sbe
16.94554|ISTEP 8. 3 - host_cbs_start
16.94665|ISTEP 8. 4 - proc_check_slave_sbe_seeprom_complete
16.95431|ISTEP 8. 5 - host_attnlisten_proc
16.95527|ISTEP 8. 6 - host_p9_fbc_eff_config
16.96162|ISTEP 8. 7 - host_p9_eff_config_links
17.03091|ISTEP 8. 8 - proc_attr_update
17.03201|ISTEP 8. 9 - proc_chiplet_fabric_scominit
17.06735|ISTEP 8.10 - proc_xbus_scominit
17.10310|ISTEP 8.11 - proc_xbus_enable_ridi
17.12720|ISTEP 8.12 - host_set_voltages
17.14691|ISTEP 9. 1 - fabric_erepair
17.35688|ISTEP 9. 2 - fabric_io_dccal
17.37554|ISTEP 9. 3 - fabric_pre_trainadv
17.38003|ISTEP 9. 4 - fabric_io_run_training
17.38356|ISTEP 9. 5 - fabric_post_trainadv
17.38559|ISTEP 9. 6 - proc_smp_link_layer
17.39132|ISTEP 9. 7 - proc_fab_iovalid
17.84717|ISTEP 9. 8 - host_fbc_eff_config_aggregate
17.86589|ISTEP 10. 1 - proc_build_smp
18.32489|ISTEP 10. 2 - host_slave_sbe_update
20.82944|ISTEP 10. 4 - proc_cen_ref_clk_enable
20.96002|ISTEP 10. 5 - proc_enable_osclite
20.96099|ISTEP 10. 6 - proc_chiplet_scominit
21.12679|ISTEP 10. 7 - proc_abus_scominit
21.14613|ISTEP 10. 8 - proc_obus_scominit
21.14745|ISTEP 10. 9 - proc_npu_scominit
21.16494|ISTEP 10.10 - proc_pcie_scominit
21.22242|ISTEP 10.11 - proc_scomoverride_chiplets
21.22526|ISTEP 10.12 - proc_chiplet_enable_ridi
21.24276|ISTEP 10.13 - host_rng_bist
21.27032|ISTEP 10.14 - host_update_redundant_tpm
21.27243|ISTEP 11. 1 - host_prd_hwreconfig
21.87940|ISTEP 11. 2 - cen_tp_chiplet_init1
21.88167|ISTEP 11. 3 - cen_pll_initf
21.88472|ISTEP 11. 4 - cen_pll_setup
21.90654|ISTEP 11. 5 - cen_tp_chiplet_init2
21.90887|ISTEP 11. 6 - cen_tp_arrayinit
21.91180|ISTEP 11. 7 - cen_tp_chiplet_init3
21.91599|ISTEP 11. 8 - cen_chiplet_init
21.91904|ISTEP 11. 9 - cen_arrayinit
21.92274|ISTEP 11.10 - cen_initf
21.92493|ISTEP 11.11 - cen_do_manual_inits
21.92722|ISTEP 11.12 - cen_startclocks
21.93014|ISTEP 11.13 - cen_scominits
21.93381|ISTEP 12. 1 - mss_getecid
23.07910|ISTEP 12. 2 - dmi_attr_update
23.11676|ISTEP 12. 3 - proc_dmi_scominit
23.16648|ISTEP 12. 4 - cen_dmi_scominit
23.17511|ISTEP 12. 5 - dmi_erepair
23.19312|ISTEP 12. 6 - dmi_io_dccal
23.19612|ISTEP 12. 7 - dmi_pre_trainadv
23.19915|ISTEP 12. 8 - dmi_io_run_training
23.21495|ISTEP 12. 9 - dmi_post_trainadv
23.21729|ISTEP 12.10 - proc_cen_framelock
23.22026|ISTEP 12.11 - host_startprd_dmi
23.22243|ISTEP 12.12 - host_attnlisten_memb
23.22604|ISTEP 12.13 - cen_set_inband_addr
23.24306|ISTEP 13. 1 - host_disable_memvolt
24.17510|ISTEP 13. 2 - mem_pll_reset
24.24792|ISTEP 13. 3 - mem_pll_initf
24.30519|ISTEP 13. 4 - mem_pll_setup
24.33331|ISTEP 13. 6 - mem_startclocks
24.34941|ISTEP 13. 7 - host_enable_memvolt
24.35151|ISTEP 13. 8 - mss_scominit
25.70075|ISTEP 13. 9 - mss_ddr_phy_reset
26.14936|ISTEP 13.10 - mss_draminit
26.61721|ISTEP 13.11 - mss_draminit_training
27.51642|ISTEP 13.12 - mss_draminit_trainadv
27.61042|ISTEP 13.13 - mss_draminit_mc
27.78375|ISTEP 14. 1 - mss_memdiag
31.58001|ISTEP 14. 2 - mss_thermal_init
31.64733|ISTEP 14. 3 - proc_pcie_config
31.71197|ISTEP 14. 4 - mss_power_cleanup
31.71609|ISTEP 14. 5 - proc_setup_bars
31.77851|ISTEP 14. 6 - proc_htm_setup
31.79344|ISTEP 14. 7 - proc_exit_cache_contained
31.85528|ISTEP 15. 1 - host_build_stop_image
35.09151|ISTEP 15. 2 - proc_set_pba_homer_bar
35.15084|ISTEP 15. 3 - host_establish_ex_chiplet
35.16463|ISTEP 15. 4 - host_start_stop_engine
35.39656|ISTEP 16. 1 - host_activate_master
36.68418|ISTEP 16. 2 - host_activate_slave_cores
36.77981|ISTEP 16. 3 - host_secure_rng
36.80049|ISTEP 16. 4 - mss_scrub
36.83218|ISTEP 16. 5 - host_load_io_ppe
36.87857|ISTEP 16. 6 - host_ipl_complete
37.75469|ISTEP 18.11 - proc_tod_setup
38.02326|ISTEP 18.12 - proc_tod_init
38.05033|ISTEP 20. 1 - host_load_payload
39.03394|ISTEP 20. 2 - host_load_hdat
40.11956|ISTEP 21. 1 - host_runtime_setup
46.40398|htmgt|OCCs are now running in ACTIVE state
52.37718|ISTEP 21. 2 - host_verify_hdat
52.41944|ISTEP 21. 3 - host_start_payload
[ 53.303670341,5] OPAL skiboot-c81f9d6 starting...
[ 53.303673446,7] initial console log level: memory 7, driver 5
[ 53.303675558,6] CPU: P9 generation processor (max 4 threads/core)
[ 53.303677518,7] CPU: Boot CPU PIR is 0x0024 PVR is 0x004e1203
[ 53.303680283,7] OPAL table: 0x30103930 .. 0x30103f10, branch table: 0x30002000
[ 53.303683401,7] Assigning physical memory map table for nimbus
[ 53.303686056,7] Parsing HDAT...
[ 53.303687465,7] SPIRA-S found.
[ 53.303689697,6] BMC #0: HW version 3, SW version 2, chip DD1.0
[ 53.303770751,6] SP Family is ibm,ast2500,openbmc
[ 53.303776925,7] LPC: IOPATH chip id = 0
[ 53.303778302,7] LPC: FW BAR = f0000000
[ 53.303779833,7] LPC: MEM BAR = e0000000
[ 53.303781399,7] LPC: IO BAR = d0010000
[ 53.303782895,7] LPC: Internal BAR = c0012000
[ 53.303795512,7] LPC UART: base addr = 3f8 (3f8) size = 1 clk = 1843200, baud = 115200
[ 53.303798310,7] LPC: BT [0, 0] sms_int: 0, bmc_int: 0
[ 53.305153464,5] HDAT I2C: found e3p0 - unknown@18 dp:ff (ff:)
[ 53.305273899,5] HDAT I2C: found e3p1 - unknown@1c dp:ff (ff:)
[ 53.306035877,5] CHIP: Chip ID 0000 type: P9N DD2.30
[ 53.306474802,5] PLAT: Detected Blackbird platform
[ 53.306544669,5] PLAT: Detected BMC platform ast2500:openbmc
[ 53.323024841,5] CPU: All 32 processors called in...
[ 53.501903768,7] LPC: Routing irq 10, policy: 0 (r=1)
[ 53.501904733,7] LPC: SerIRQ 10 using route 0 targetted at OPAL
[ 54.509575320,5] HIOMAP: Negotiated hiomap protocol v2
[ 54.510806979,5] HIOMAP: Block size is 64KiB
[ 54.510832611,5] HIOMAP: BMC suggested flash timeout of 8s
[ 54.510876435,5] HIOMAP: Flash size is 64MiB
[ 54.510970596,5] HIOMAP: Erase granule size is 64KiB
[ 56.418150405,5] FLASH: Found system flash: (unnamed) id:0
[ 57.207119491,7] LPC: Routing irq 4, policy: 0 (r=1)
[ 57.207120295,7] LPC: SerIRQ 4 using route 1 targetted at OPAL
[ 57.207242358,5] OCC: All Chip Rdy after 0 ms
[ 58.009544728,3] STB: VERSION verification FAILED. log=0xffffffffffff8160
[ 59.124974141,3] STB: IMA_CATALOG verification FAILED. log=0xffffffffffff8160
[ 59.320998010,3] CAPP: Error loading ucode lid. index=203d1
[ 59.335020541,5] PCI: Resetting PHBs and training links...
[ 60.355590956,5] PCI: Probing slots...
[ 60.412005477,5] PCI Summary:
[ 60.412048918,5] PHB#0000:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..03 SLOT=CPU1 Slot2 (16x)
[ 60.412163589,5] PHB#0000:01:00.0 [SWUP] 1022 1470 R:c1 C:060400 B:02..03 LOC_CODE=CPU1 Slot2 (16x)
[ 60.412317704,5] PHB#0000:02:00.0 [SWDN] 1022 1471 R:00 C:060400 B:03..03
[ 60.412419910,5] PHB#0000:03:00.0 [LGCY] 1002 687f R:c1 C:030000 ( vga) LOC_CODE=CPU1 Slot2 (16x)
[ 60.412569858,5] PHB#0000:03:00.1 [EP ] 1002 aaf8 R:00 C:040300 (multimedia-device) LOC_CODE=CPU1 Slot2 (16x)
[ 60.412755922,5] PHB#0001:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..01 SLOT=CPU1 Slot1 (8x)
[ 60.412866802,5] PHB#0001:01:00.0 [EP ] 144d a802 R:01 C:010802 ( mass-storage) LOC_CODE=CPU1 Slot1 (8x)
[ 60.413042976,5] PHB#0002:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..01 SLOT=Builtin SATA
[ 60.413236298,5] PHB#0002:01:00.0 [LGCY] 1b4b 9235 R:11 C:010601 ( sata) LOC_CODE=Builtin SATA
[ 60.413376374,5] PHB#0003:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..01 SLOT=Builtin USB
[ 60.413520796,5] PHB#0003:01:00.0 [EP ] 104c 8241 R:02 C:0c0330 ( usb-xhci) LOC_CODE=Builtin USB
[ 60.413672005,5] PHB#0004:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..01 SLOT=Builtin Ethernet
[ 60.413905142,5] PHB#0004:01:00.0 [EP ] 14e4 1657 R:01 C:020000 ( ethernet) LOC_CODE=Builtin Ethernet
[ 60.414016347,5] PHB#0004:01:00.1 [EP ] 14e4 1657 R:01 C:020000 ( ethernet) LOC_CODE=Builtin Ethernet
[ 60.414222156,5] PHB#0004:01:00.2 [EP ] 14e4 1657 R:01 C:020000 ( ethernet) LOC_CODE=Builtin Ethernet
[ 60.414373675,5] PHB#0005:00:00.0 [ROOT] 1014 04c1 R:00 C:060400 B:01..02 SLOT=BMC
[ 60.414454930,5] PHB#0005:01:00.0 [ETOX] 1a03 1150 R:04 C:060400 B:02..02 LOC_CODE=BMC
[ 60.414582932,5] PHB#0005:02:00.0 [PCID] 1a03 2000 R:41 C:040000 ( video) LOC_CODE=BMC
[ 60.424173661,5] IPMI: Resetting boot count on successful boot
[ 60.424270580,5] INIT: Waiting for kernel...
[ 73.210357140,3] STB: BOOTKERNEL verification FAILED. log=0xffffffffffff8160
[ 73.210954847,5] INIT: 64-bit LE kernel discovered
[ 73.358043529,5] INIT: Starting kernel at 0x20011000, fdt at 0x306f65d0 217248 bytes
[ 74.469358228,3] LPC[000]: Got SYNC no-response error. Error address reg: 0xd0010080
[ 74.469369652,6] IPMI: dropping non severe PEL event
[ 74.469419009,7] UART: IRQ functional !
[ 4.135656] IMC PMU (null) Register failed
[ 4.995288] kAFS: failed to register: -97
[ 5.393954] squashfs: SQUASHFS error: Xattrs in filesystem, these will be ignored
[ 5.393979] squashfs: SQUASHFS error: unable to read xattr id index table
[ 5.401894] udevd[1677]: specified group 'kvm' unknown
[ 5.410641] udevd[1678]: specified group 'kvm' unknown
nvram process returned non-zero exit status
dmesg: klogctl: Operation not permitted
[ 81.461400396,3] PHB#0000[0:0]: phbRegbFirstErrorStatus = 0000000000000000
[ 81.461461441,3] PHB#0000[0:0]: phbRegbErrorLog0 = 0000000000000000
[ 81.461522427,3] PHB#0000[0:0]: phbRegbErrorLog1 = 0000000000000000
[ 81.461585112,3] PHB#0000[0:0]: PEST[1ff] = 3740002a03000000 0000000000000000
cpu 0x0: Vector: 300 (Data Access) at [c0000001f96df950]
pc: c0080000022312fc: amdgpu_fence_process+0xc0/0x13c [amdgpu]
lr: c0080000022312c0: amdgpu_fence_process+0x84/0x13c [amdgpu]
sp: c0000001f96dfbd0
msr: 900000000280b033
dar: 8
dsisr: 80000
current = 0xc0000001f95b5900
paca = 0xc0000001ff7ff480 irqmask: 0x03 irq_happened: 0x01
pid = 1798, comm = kworker/0:4
Linux version 4.19.0-openpower1 (root@raptor-build-public-staging-01) (gcc version 6.5.0 (Buildroot 2019.02.1-05273-gef2bf42027)) #2 SMP Wed May 22 00:16:10 UTC 2019
enter ? for help
[c0000001f96dfc20] c008000002231640 amdgpu_fence_count_emitted+0x20/0x40 [amdgpu]
[c0000001f96dfc50] c0080000022d8fb4 amdgpu_uvd_idle_work_handler+0x90/0x160 [amdgpu]
[c0000001f96dfcb0] c0000000000943f8 process_one_work+0x204/0x32c
[c0000001f96dfd40] c000000000094af0 worker_thread+0x2d0/0x394
[c0000001f96dfdc0] c00000000009a780 kthread+0x14c/0x154
[c0000001f96dfe30] c00000000000b6b0 ret_from_kernel_thread+0x5c/0x6c
The full kernel log can be found at
https://gist.github.com/runlevel5/89f87c2296f0f82f634921248309e7be
and the full OPAL log can be found at
https://gist.github.com/runlevel5/94b44fa48cae84ffcb7d2f15fe4945a7
I am wondering whether the firmware is loaded correctly. Is there anything I could do to determine the root cause?
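One way to start narrowing this down might be to pull only the higher-severity lines out of a saved copy of the OPAL log. The sketch below assumes that the number after the comma inside the brackets is the skiboot log level, with lower numbers being more severe (the "FAILED" and "Error" lines above all carry a 3). `opal_errors` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print OPAL log lines whose level is <= MAX (default 3).
# Reads stdin; expects lines shaped like "[ 59.320998010,3] CAPP: Error ...".
# Kernel lines such as "[ 4.135656] ..." have no level field and are skipped.
opal_errors() {
    max=${1:-3}
    awk -v max="$max" '
        match($0, /^\[ *[0-9]+\.[0-9]+,[0-9]+\]/) {
            hdr = substr($0, RSTART, RLENGTH)   # the bracketed prefix
            sub(/^.*,/, "", hdr)                # drop everything up to the comma
            sub(/\]$/, "", hdr)                 # drop the closing bracket
            if (hdr + 0 <= max + 0) print
        }'
}
```

For example, `opal_errors < opal.log` would list only the level 3 and lower entries, and `opal_errors 4 < opal.log` would include the next level up as well. The file name `opal.log` is a placeholder for wherever the log was saved.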