Messages - kmarek

Pages: [1]
1
User Zone / Re: Graphics Card install
« on: December 27, 2019, 12:34:28 pm »
They're similar. Both are using Fedora's kernel sources (but mine from 2019-12-05). Packages are easier to reproduce and clean up. Package sources contain their own build steps and list of dependencies.

2
User Zone / Re: Graphics Card install
« on: December 27, 2019, 12:59:48 am »
Woah... RSS stopped working so I fell behind.

I made packages for the kernel with the patches already applied to make this easy.

While there's no reason to trust me, I tried to make it clean and reproducible.
Original fedora kernel source RPM: https://kojipkgs.fedoraproject.org/packages/kernel/5.4.0/2.fc32/src/kernel-5.4.0-2.fc32.src.rpm
My modified source RPM: https://files.gigabyteproductions.net/srv/devel/linux-navi10/fedora/f32/try6/kernel-5.4.0-2.fc32.ppc64le/kernel-5.4.0-2.fc32.src.rpm

You can rebuild using rpmbuild, or Mock.
See: https://wiki.centos.org/HowTos/RebuildSRPM
See: https://fedoramagazine.org/how-rpm-packages-are-made-the-source-rpm/
See: https://blog.packagecloud.io/eng/2015/05/11/building-rpm-packages-with-mock/#building-an-rpm-with-mock
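For reference, a minimal sketch of both rebuild routes, assuming the source RPM is already downloaded into the current directory. The function name `rebuild_srpm`, the `DRY_RUN` and `MOCK_CONFIG` variables, and the Mock config name `fedora-rawhide-ppc64le` are my own assumptions, not something from the links above; use whichever config under /etc/mock matches your release.

```shell
# Rebuild a source RPM with rpmbuild (default) or Mock.
# With DRY_RUN=1 the command is only printed, not executed.
rebuild_srpm() {
    srpm=$1
    tool=${2:-rpmbuild}
    case $tool in
        rpmbuild) set -- rpmbuild --rebuild "$srpm" ;;
        # The Mock config name is an assumption; override it with MOCK_CONFIG.
        mock)     set -- mock -r "${MOCK_CONFIG:-fedora-rawhide-ppc64le}" --rebuild "$srpm" ;;
        *)        echo "unknown tool: $tool" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Usage would be something like `rebuild_srpm kernel-5.4.0-2.fc32.src.rpm mock`. The plain rpmbuild route needs the build dependencies installed first, while Mock pulls them into its own chroot.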

My pre-built package can be installed like:
Code: [Select]
sudo dnf install \
  'https://files.gigabyteproductions.net/srv/devel/linux-navi10/fedora/f32/try6/kernel-5.4.0-2.fc32.ppc64le/kernel-5.4.0-2.fc32.ppc64le.rpm' \
  'https://files.gigabyteproductions.net/srv/devel/linux-navi10/fedora/f32/try6/kernel-5.4.0-2.fc32.ppc64le/kernel-core-5.4.0-2.fc32.ppc64le.rpm' \
  'https://files.gigabyteproductions.net/srv/devel/linux-navi10/fedora/f32/try6/kernel-5.4.0-2.fc32.ppc64le/kernel-modules-5.4.0-2.fc32.ppc64le.rpm' \
  'https://files.gigabyteproductions.net/srv/devel/linux-navi10/fedora/f32/try6/kernel-5.4.0-2.fc32.ppc64le/kernel-modules-extra-5.4.0-2.fc32.ppc64le.rpm' \
  ;

3
User Zone / Re: Graphics Card install
« on: December 02, 2019, 02:04:03 pm »
Make sure you plug a screen into the AMD GPU, because the AST will not show anything since we're disabling it.

Also remove rhgb and quiet to give us more information about where it gets stuck, if we can see anything.
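To make the edit concrete, here is a small sketch of what "remove rhgb and quiet" does to a boot arguments line. The helper name `strip_quiet_args` is made up for illustration; in practice you just delete the two words by hand in the bootloader editor.

```shell
# Print the boot arguments from $1 with rhgb and quiet removed.
strip_quiet_args() {
    out=
    for arg in $1; do
        case $arg in
            rhgb|quiet) ;;                  # drop the splash/quiet switches
            *) out="${out:+$out }$arg" ;;   # keep everything else, space-separated
        esac
    done
    printf '%s\n' "$out"
}
```

For example, `strip_quiet_args "root=/dev/mapper/fedora-root ro rhgb quiet modprobe.blacklist=ast"` prints the same line without the two switches.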

4
User Zone / Re: Graphics Card install
« on: December 02, 2019, 09:49:32 am »
Add to the last line, the one that says "Boot arguments". The original line ends in quiet.

You add modprobe.blacklist=ast and video=offb:off

Overall, your "Boot arguments" line will look like:
Code: [Select]
root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet modprobe.blacklist=ast video=offb:off
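
The same edit as a sketch in shell, in case it helps to see that the options are simply appended with single spaces between them. The function name `append_boot_args` is hypothetical; it only builds the string and does not touch the bootloader.

```shell
# Print the boot arguments in $1 followed by each extra option,
# skipping options that are already present.
append_boot_args() {
    line=$1
    shift
    for opt in "$@"; do
        case " $line " in
            *" $opt "*) ;;              # already there, do not duplicate
            *) line="$line $opt" ;;
        esac
    done
    printf '%s\n' "$line"
}
```

Calling it with the original line plus `modprobe.blacklist=ast video=offb:off` gives the line shown above, and running it a second time on the result changes nothing.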

5
User Zone / Re: Graphics Card install
« on: December 01, 2019, 04:51:42 pm »

6
User Zone / Re: Graphics Card install
« on: December 01, 2019, 04:41:03 pm »
What error? Maybe post picture?

7
User Zone / Re: Graphics Card install
« on: December 01, 2019, 04:21:46 pm »
btw, just FYI, there's no ttyS0 for serial console on talos/blackbird or any openpower hardware, as the kernel does not see a physical serial port; the correct serial console is hvc0 (without a baudrate, as it's not a serial console per se) for powernv, or hvsi0 (baudrate 19200) for pseries machines.

Oh, interesting. I thought hvc0 was for virtual machines and paravirtualized kernels. I'm now noticing that my console output always includes hvc0 regardless of console=*. Perhaps an effect of kexec-ing from the bootloader's kernel? (like /sys/devices/virtual/tty/console/active is copied?)
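
A quick way to check this on a running system, as a sketch: the sysfs file mentioned above lists the active consoles separated by spaces. The function name `console_is_active` is mine, and the second argument exists only so the check can be pointed at a different file.

```shell
# Succeed if console name $1 is listed in the active-console file $2
# (defaults to the sysfs path mentioned above).
console_is_active() {
    name=$1
    file=${2:-/sys/devices/virtual/tty/console/active}
    for c in $(cat "$file"); do
        [ "$c" = "$name" ] && return 0
    done
    return 1
}
```

For example: `console_is_active hvc0 && echo "hvc0 is active"`.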

8
User Zone / Re: Graphics Card install
« on: December 01, 2019, 03:47:25 pm »
Sidenote: your dmesg output does not indicate an error:
Code: [Select]
[    2.636093] fb0: switching to astdrmfb from OFfb vga
[    2.636255] [drm] platform has no IO space, trying MMIO
[    2.636258] [drm] Using device-tree for configuration
[    2.636258] [drm] AST 2500 detected
[    2.636261] [drm] Using Sil164 TMDS transmitter
[    2.636266] [drm] dram MCLK=800 Mhz type=7 bus_width=16 size=01000000
[    2.639371] ast 0005:02:00.0: fb0: astdrmfb frame buffer device
[    2.737473] [drm] Initialized ast 0.1.0 20120228 for 0005:02:00.0 on minor 0

[    2.859718] [drm] amdgpu kernel modesetting enabled.
[    2.859755] amdgpu 0000:03:00.0: remove_conflicting_pci_framebuffers: bar 0: 0x6000000000000 -> 0x600000fffffff
[    2.859758] amdgpu 0000:03:00.0: remove_conflicting_pci_framebuffers: bar 2: 0x6000010000000 -> 0x60000101fffff
[    2.859760] amdgpu 0000:03:00.0: remove_conflicting_pci_framebuffers: bar 5: 0x600c000000000 -> 0x600c00007ffff
[    2.859775] amdgpu 0000:03:00.0: enabling device (0140 -> 0142)
[    2.859995] [drm] initializing kernel modesetting (NAVI10 0x1002:0x731F 0x1002:0x0B36 0xC0).
[    2.860002] [drm] register mmio base: 0x00000000
[    2.860003] [drm] register mmio size: 524288
[    2.860003] [drm] PCI I/O BAR is not found.
[    2.860013] [drm] PCIE atomic ops is not supported
[    2.902060] [drm] set register base offset for ATHUB
[    2.902062] [drm] set register base offset for CLKA
[    2.902063] [drm] set register base offset for CLKA
[    2.902064] [drm] set register base offset for CLKA
[    2.902064] [drm] set register base offset for CLKA
[    2.902065] [drm] set register base offset for CLKA
[    2.902067] [drm] set register base offset for DF
[    2.902068] [drm] set register base offset for DMU
[    2.902069] [drm] set register base offset for GC
[    2.902070] [drm] set register base offset for HDP
[    2.902071] [drm] set register base offset for MMHUB
[    2.902071] [drm] set register base offset for MP0
[    2.902072] [drm] set register base offset for MP1
[    2.902073] [drm] set register base offset for NBIF
[    2.902074] [drm] set register base offset for NBIF
[    2.902075] [drm] set register base offset for OSSSYS
[    2.902076] [drm] set register base offset for SDMA0
[    2.902077] [drm] set register base offset for SDMA1
[    2.902078] [drm] set register base offset for SMUIO
[    2.902079] [drm] set register base offset for THM
[    2.902080] [drm] set register base offset for UVD
[    2.902083] [drm] add ip block number 0 <nv_common>
[    2.902084] [drm] add ip block number 1 <gmc_v10_0>
[    2.902085] [drm] add ip block number 2 <navi10_ih>
[    2.902086] [drm] add ip block number 3 <psp>
[    2.902087] [drm] add ip block number 4 <smu>
[    2.902088] [drm] add ip block number 5 <gfx_v10_0>
[    2.902089] [drm] add ip block number 6 <sdma_v5_0>
[    2.902090] [drm] add ip block number 7 <vcn_v2_0>
[    2.996953] [drm] VCN decode is enabled in VM mode
[    2.996954] [drm] VCN encode is enabled in VM mode
[    2.996955] [drm] VCN jpeg decode is enabled in VM mode
[    2.996958] [drm] GPU posting now...
[    2.997016] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
[    2.997043] amdgpu 0000:03:00.0: VRAM: 8176M 0x0000008000000000 - 0x00000081FEFFFFFF (8176M used)
[    2.997046] amdgpu 0000:03:00.0: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[    2.997050] [drm] Detected VRAM RAM=8176M, BAR=256M
[    2.997052] [drm] RAM width 256bits GDDR6
[    2.997064] [drm] amdgpu: 8176M of VRAM memory ready
[    2.997068] [drm] amdgpu: 8176M of GTT memory ready.
[    2.997111] [drm] GART: num cpu pages 8192, num gpu pages 131072
[    2.997200] [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
[    2.997302] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    2.997303] [drm] Driver supports precise vblank timestamp query.
[    2.997469] [drm] ppt_offset_bytes: 3
[    2.997471] [drm] ppt_size_bytes: 262912
[    2.999386] [drm] use_doorbell being set to: [true]
[    2.999451] [drm] use_doorbell being set to: [true]
[    2.999577] [drm] Found VCN firmware Version ENC: 1.4 DEC: 3 VEP: 0 Revision: 0
[    2.999584] [drm] PSP loading VCN firmware
[    3.437470] [drm] reserve 0x7200000 from 0x8000400000 for PSP TMR
[    3.961257] amdgpu: [powerplay] SMU is initialized successfully!
[    3.961856] [drm] kiq ring mec 2 pipe 1 q 0
[    3.961980] [drm] ring test on 10 succeeded in 65 usecs
[    3.962029] [drm] ring test on 10 succeeded in 11 usecs
[    3.962103] [drm] gfx 0 ring me 0 pipe 0 q 0
[    3.962123] [drm] ring test on 0 succeeded in 8 usecs
[    3.962125] [drm] gfx 1 ring me 0 pipe 1 q 0
[    3.962132] [drm] ring test on 1 succeeded in 1 usecs
[    3.962134] [drm] compute ring 0 mec 1 pipe 0 q 0
[    3.962146] [drm] ring test on 2 succeeded in 3 usecs
[    3.962147] [drm] compute ring 1 mec 1 pipe 1 q 0
[    3.962161] [drm] ring test on 3 succeeded in 1 usecs
[    3.962163] [drm] compute ring 2 mec 1 pipe 2 q 0
[    3.962178] [drm] ring test on 4 succeeded in 1 usecs
[    3.962179] [drm] compute ring 3 mec 1 pipe 3 q 0
[    3.962194] [drm] ring test on 5 succeeded in 1 usecs
[    3.962195] [drm] compute ring 4 mec 1 pipe 0 q 1
[    3.962210] [drm] ring test on 6 succeeded in 1 usecs
[    3.962211] [drm] compute ring 5 mec 1 pipe 1 q 1
[    3.962226] [drm] ring test on 7 succeeded in 1 usecs
[    3.962227] [drm] compute ring 6 mec 1 pipe 2 q 1
[    3.962242] [drm] ring test on 8 succeeded in 1 usecs
[    3.962243] [drm] compute ring 7 mec 1 pipe 3 q 1
[    3.962258] [drm] ring test on 9 succeeded in 1 usecs
[    3.962356] [drm] ring test on 11 succeeded in 37 usecs
[    3.962377] [drm] ring test on 12 succeeded in 5 usecs
[    3.989875] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[    3.989986] amdgpu 0000:03:00.0: ring 0(gfx_0.0.0) uses VM inv eng 4 on hub 0
[    3.989988] amdgpu 0000:03:00.0: ring 1(gfx_0.1.0) uses VM inv eng 5 on hub 0
[    3.989990] amdgpu 0000:03:00.0: ring 2(comp_1.0.0) uses VM inv eng 6 on hub 0
[    3.989992] amdgpu 0000:03:00.0: ring 3(comp_1.1.0) uses VM inv eng 7 on hub 0
[    3.989993] amdgpu 0000:03:00.0: ring 4(comp_1.2.0) uses VM inv eng 8 on hub 0
[    3.989995] amdgpu 0000:03:00.0: ring 5(comp_1.3.0) uses VM inv eng 9 on hub 0
[    3.989997] amdgpu 0000:03:00.0: ring 6(comp_1.0.1) uses VM inv eng 10 on hub 0
[    3.989998] amdgpu 0000:03:00.0: ring 7(comp_1.1.1) uses VM inv eng 11 on hub 0
[    3.990000] amdgpu 0000:03:00.0: ring 8(comp_1.2.1) uses VM inv eng 12 on hub 0
[    3.990001] amdgpu 0000:03:00.0: ring 9(comp_1.3.1) uses VM inv eng 13 on hub 0
[    3.990003] amdgpu 0000:03:00.0: ring 10(kiq_2.1.0) uses VM inv eng 14 on hub 0
[    3.990005] amdgpu 0000:03:00.0: ring 11(sdma0) uses VM inv eng 15 on hub 0
[    3.990006] amdgpu 0000:03:00.0: ring 12(sdma1) uses VM inv eng 16 on hub 0
[    3.990008] amdgpu 0000:03:00.0: ring 13(vcn_dec) uses VM inv eng 4 on hub 1
[    3.990010] amdgpu 0000:03:00.0: ring 14(vcn_enc0) uses VM inv eng 5 on hub 1
[    3.990011] amdgpu 0000:03:00.0: ring 15(vcn_enc1) uses VM inv eng 6 on hub 1
[    3.990013] amdgpu 0000:03:00.0: ring 16(vcn_jpeg) uses VM inv eng 7 on hub 1
[    3.990094] [drm] Initialized amdgpu 3.33.0 20150101 for 0000:03:00.0 on minor 1
[    5.999221] [drm] ib test on ring 0 succeeded
[    5.999846] [drm] ib test on ring 1 succeeded
[    6.000508] [drm] ib test on ring 2 succeeded
[    6.001142] [drm] ib test on ring 3 succeeded
[    6.001793] [drm] ib test on ring 4 succeeded
[    6.002460] [drm] ib test on ring 5 succeeded
[    6.003114] [drm] ib test on ring 6 succeeded
[    6.003773] [drm] ib test on ring 7 succeeded
[    6.004426] [drm] ib test on ring 8 succeeded
[    6.005088] [drm] ib test on ring 9 succeeded
[    6.005720] [drm] ib test on ring 10 succeeded
[    6.006344] [drm] ib test on ring 11 succeeded
[    6.006995] [drm] ib test on ring 12 succeeded
[   19.143576] amdgpu 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none

However, there is never a line indicating that the amdgpu framebuffer was initialized, like:
Code: [Select]
[   10.849100] amdgpu 0000:03:00.0: fb0: amdgpudrmfb frame buffer device

We'll see once you blacklist AST.
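
If you want to check for that line yourself after the next boot, a minimal sketch follows. The function name `amdgpu_fb_present` is made up, and the pattern is based only on the example line above, so other kernel versions may word it differently.

```shell
# Read kernel log text on stdin; succeed if amdgpu registered a framebuffer.
amdgpu_fb_present() {
    grep -q 'amdgpu .*: fb[0-9]*: amdgpudrmfb frame buffer device'
}
```

Usage: `dmesg | amdgpu_fb_present && echo "amdgpu framebuffer is up"`. The astdrmfb line from the AST driver does not match this pattern.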

9
User Zone / Re: Graphics Card install
« on: December 01, 2019, 03:37:59 pm »
You start your computer. At first it looks like this while it scans disks:


Then you'll see what it finds and it'll start counting down. Arrow down to your Fedora entry with the asterisk (*):


Press "e" for edit, it'll take you here to temporarily change it:


You want to change "Boot arguments". Notice that the options are separated by spaces:


So you want to add "modprobe.blacklist=ast video=offb:off", with no spaces around the equals signs. This is wrong:


This is correct:


Go to "OK", press enter:


It'll take you back to the menu, with your change applied (for this boot), press enter:


Let us know if it fixes it. Next step will be to permanently fix the boot arguments.
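
For the permanent fix, one common route on Fedora is grubby, which updates the kernel arguments of the installed boot entries. This is a sketch under that assumption, not necessarily the method that ends up being recommended in this thread. The wrapper name `persist_boot_args` and the `DRY_RUN` switch are mine; `--update-kernel=ALL` and `--args` are regular grubby options.

```shell
# Persist extra kernel arguments for all installed kernels via grubby.
# With DRY_RUN=1 the command is printed instead of run.
persist_boot_args() {
    args=$*
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'sudo grubby --update-kernel=ALL --args="%s"\n' "$args"
    else
        sudo grubby --update-kernel=ALL --args="$args"
    fi
}
```

Try `DRY_RUN=1 persist_boot_args modprobe.blacklist=ast video=offb:off` first to see the exact command before running it for real.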

10
User Zone / Re: Graphics Card install
« on: December 01, 2019, 10:46:56 am »
AMD GPUs deal fine with reboots as long as the driver/firmware wasn't previously loaded as a different version, I've been using fast reboot for as long as I can remember on both Talos and Blackbird and there have been no issues whatsoever.

You and I discussed this on IRC before. RX Vega 64 is still subject to the two reboot issue even if firmware version hasn't changed.

Just using `modprobe.blacklist` works fine. Void uses dracut and there have never been any issues with that, nor I experienced them on any other OS. It's a parameter passed on the kernel, and any call to `modprobe` regardless of whether from initramfs or from the target system will not load it unless overridden, and the kernel will not auto-load it either.

Ah, okay. I probably made a typo when I had issues in the past, then...

Also, RX 5700 cards are extremely problematic on Linux right now (even on x86). To get anywhere near having a remotely stable experience, you need a patched kernel 5.4, LLVM 10 from svn, and mesa from git built against this LLVM, any other configuration will result in frequent hangs (https://gitlab.freedesktop.org/drm/amd/issues/892)

I was unaware.

11
User Zone / Re: Graphics Card install
« on: December 01, 2019, 10:28:13 am »
q66:

PCIe device resets will still affect reboots. It affects me even having the AST GPU disabled by jumper.

I was going to suggest blacklisting ast once it was clear that it was the issue. Other people have AST and AMD working at the same time. Also add:
Code: [Select]
rd.driver.blacklist=ast
because modprobe.blacklist doesn't blacklist from dracut initrd.

I am also not certain that blacklisting it (not jumper disabling) solves DMA issues.
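
To verify what the running kernel was actually booted with, here is a sketch that checks the command line for both blacklist parameters. The function name `blacklisted_everywhere` is hypothetical. Note that a later reply in this thread says modprobe.blacklist alone works fine with dracut, so requiring both is the cautious reading, not a hard rule.

```shell
# Read a kernel command line on stdin; succeed if module $1 is blacklisted
# both for the initrd (rd.driver.blacklist) and for modprobe (modprobe.blacklist).
# Comma-separated module lists are handled.
blacklisted_everywhere() {
    mod=$1
    in_initrd=no
    in_system=no
    for arg in $(cat); do
        case $arg in
            rd.driver.blacklist=*)
                case ",${arg#*=}," in *",$mod,"*) in_initrd=yes ;; esac ;;
            modprobe.blacklist=*)
                case ",${arg#*=}," in *",$mod,"*) in_system=yes ;; esac ;;
        esac
    done
    [ "$in_initrd" = yes ] && [ "$in_system" = yes ]
}
```

Usage: `blacklisted_everywhere ast < /proc/cmdline && echo "ast is blacklisted in both places"`.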

12
Blackbird / Re: HDMI output for mother board
« on: December 01, 2019, 10:07:14 am »
Is this the HDMI of your AMD GPU from the other thread, or the HDMI that's built into the motherboard?

I'm a little confused by your wording of reloading vs standby vs switch.

However, I think you're describing an issue I've had before with AMD cards. Let's see if I understand:
- You turn off the computer using the power supply switch
- You turn it back on and boot, you can see output on the AMD card
- You "reboot" (computer goes back to bootloader, but the lights don't go out)
- You don't see output in the AMD card this time

It is unclear if we share this part, but this also happens for me:
- You turn on the computer and you see bootloader on AMD card
- You "shutdown" (don't flip switch, but the lights go out until you press power button again)
- You turn it back on and boot
- You should see bootloader (I did)

Basically, powering off the GPU entirely makes it work. It's a bug in AMD GPUs that doesn't have a perfect software workaround yet. My current workaround is to disable fast reboots so all reboots turn into "shutdown, then power on". You can do this by running the following command in the bootloader shell:
Code: [Select]
nvram -p ibm,skiboot --update-config fast-reset=0
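
To confirm the setting stuck, you can print the partition back with `nvram -p ibm,skiboot --print-config` and look for the fast-reset line. A small sketch of that check follows; the function name `fast_reset_setting` is mine, and I'm assuming the output lists one `key=value` pair per line.

```shell
# Read `nvram -p ibm,skiboot --print-config` output on stdin and print the
# fast-reset value, or "unset" if the key is absent.
fast_reset_setting() {
    value=$(sed -n 's/^fast-reset=//p' | tail -n 1)
    printf '%s\n' "${value:-unset}"
}
```

Usage: `nvram -p ibm,skiboot --print-config | fast_reset_setting`.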

13
User Zone / Re: Graphics Card install
« on: December 01, 2019, 09:46:30 am »
You may not be able to use the embedded GPU and the AMD GPU at the same time. There may be DMA issues in having both enabled at the same time.

See: https://wiki.raptorcs.com/wiki/POWER9_Hardware_Compatibility_List/PCIe_Devices#AMD
See: https://wiki.raptorcs.com/wiki/Troubleshooting/GPU#Workaround_1:_Disable_on-board_VGA

Note, as the second link says, it may be necessary to disable the embedded GPU by putting a jumper cap on a specific place on the motherboard. Reference page 37 of the Blackbird manual: https://wiki.raptorcs.com/w/images/c/ce/C1P9S01_users_guide_version_1_0.pdf

There isn't a spare jumper there, so if you don't have one (maybe from the back of a junk IDE hard drive), then you may need to buy some: https://www.amazon.com/dp/B00N552DWK/

Hopefully you don't need to disable onboard. More on that later.

That being said, there are a couple of other notes regardless:

Many AMD GPUs do not honor PCIe device resets, so you might have something working one boot, and not have output after a reboot (vs shutdown and fresh power up). Newer versions of the amdgpu driver have a workaround to reset the card, but for a lot of cards, this seems to work exactly twice. This is probably not what's causing your card to not work in Fedora 31 right now, but this may be interfering with some of your tests if you're rebooting every time. Personally, I'm "avoiding" the issue by configuring skiboot to go through a full power cycle when requested to reboot (run in bootloader):
Code: [Select]
nvram -p ibm,skiboot --update-config fast-reset=0
Xorg (typically your display server) does some really weird things when multiple GPUs are active. Usually I find that having displays on both GPUs causes both GPUs to need to render the whole desktop (to then show only the part that they are displaying), but doing this usually limits your performance or features to that of the lesser card. Someone on IRC said they had an AMD GPU working with the embedded AST GPU, but they were missing gamma (display brightness) in having both enabled. Since you're using GNOME, you might be using Wayland, which may behave differently (I'm not well versed in Wayland).

You need firmware in your bootloader if you want your bootloader to display, but it may not play nicely with your Linux OS, anyway. Your bootloader probably doesn't display out your AMD GPU. I'm guessing you're using the embedded HDMI because of this. The bootloader is itself another Linux environment with the amdgpu driver, and will want firmware to actually enable the card. The firmware is not already included because it's not open source/auditable and because there isn't enough space to generically include all GPU firmware. Raptor might be telling you to copy the amdgpu firmware into the bootloader to get the bootloader to display, which isn't necessary to get your Linux OS to display. However, AMD cards do not accept firmware more than once (with resets being a possible exception). The specific issue I ran into after installing firmware is:
- the bootloader would display on a fresh boot
- Fedora's amdgpu would reset the card and load their more up to date firmware
- I'd reboot and the bootloader (having an older amdgpu driver that doesn't reset the card) would fail to load the firmware and panic.
Some of us are avoiding these sorts of issues by not installing GPU firmware into the bootloader. I use serial console when I need to interact with my bootloader (serial console will work regardless of functioning GPU, too).

We'll learn more from dmesg (the command that shows Linux kernel logs).

Please:

If Fedora boots and you have a desktop on the embedded GPU, please open a terminal (search "terminal" in GNOME) and run:
Code: [Select]
dmesg
and copy-paste the output here into a code block.

If Fedora boots, but finally ends up at a black screen or something, switch to a virtual terminal by pressing ctrl-alt-F2 (ctrl-alt-F1 will go back to the first desktop on Fedora, or ctrl-alt-F7 on Ubuntu). When you're at a virtual terminal, log in and run:
Code: [Select]
dmesg | fpaste
then go to the link it outputs on another computer, and copy the output from the link (link is only valid for a day).

If Fedora boots but breaks such that you cannot go to a virtual terminal on the embedded card, reboot, press "e" (edit) in bootloader before booting, and add the following to cmdline to temporarily disable using the AMD GPU at all:
Code: [Select]
rd.driver.blacklist=amdgpu modprobe.blacklist=amdgpu rd.driver.blacklist=amdgpudrmfb modprobe.blacklist=amdgpudrmfb
then, instead of dmesg, run the following and paste that output:
Code: [Select]
sudo cat /var/log/messages

If you're having a different issue, please give more details.

We'll suggest a fix if dmesg makes it clear why the GPU doesn't start.
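
If the full log is too long to paste, a rough filter for the graphics-related lines is sketched below. The function name `gpu_log_lines` and the choice of patterns are my own guess at what is relevant; the unfiltered output is still more useful if you can post it.

```shell
# Keep only log lines from the DRM subsystem or the amdgpu/ast framebuffer drivers.
gpu_log_lines() {
    grep -E '\[drm\]|amdgpu|astdrmfb'
}
```

Usage: `dmesg | gpu_log_lines` or `sudo cat /var/log/messages | gpu_log_lines`.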

14
User Zone / Re: Graphics Card install
« on: November 30, 2019, 04:25:21 pm »
linux-firmware on Fedora 31 has navi10 .bin files. You're likely not missing firmware files.
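
To double-check on your own install, here is a sketch that counts the navi10 firmware files. The function name `count_navi10_firmware` is made up, and the default directory /lib/firmware/amdgpu is where I'd expect the files to be, so treat the path as an assumption.

```shell
# Count navi10 firmware blobs in a firmware directory
# (plain .bin files, or compressed ones such as .bin.xz).
count_navi10_firmware() {
    dir=${1:-/lib/firmware/amdgpu}
    find "$dir" -maxdepth 1 -name 'navi10_*.bin*' 2>/dev/null | wc -l | tr -d ' '
}
```

Running `count_navi10_firmware` with no arguments should print a number greater than zero if the files are installed.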
