Messages - meklort

Firmware / Re: network card to reduce attack surface?
« on: August 19, 2022, 09:46:58 am »
Take a look at the Talos2 schematics: T2P9D01, v3.8, page 96.  The NCSI management interface for the ethernet chip goes to the BMC.  This is essentially the management interface for the tiny switch inside the BCM5719.  It can select the filtering criteria (vlan, mac address, etc) for which packets get passed to the BMC.  Or it can set no filtering at all in order to let the BMC sniff all traffic.

Think about it: both "tcpdump" and "ip link set dev address $MAC_ADDRESS" work on the BMC.  You can sniff traffic and inject packets on either ethernet port from the BMC, without "loading a custom host OS image".
Yes, I understand how NC-SI works and how it is used to connect the BMC and the NIC.

That said, there is *no* switch in the NIC, and it doesn't operate the way you described. The NC-SI traffic is handled by firmware, not by hardware.

You are correct that NC-SI allows setting the VLAN, MAC, etc, however this is done by sending a command to the APE on the BCM5719, which then sets the appropriate registers to enable it.
Please see here for how packets from the BMC are handled:
and here for how the NC-SI packets are handled, such as the set MAC command:
You may also note that the open source firmware does not support setting a different VLAN either, but in any case, as the firmware is open source, you can always explicitly lock it down to a specific MAC and VLAN.

In this case, there is no checking that the requested MAC is different from the host MAC, and so I will add that in a future release, as it's a very good point.

Firmware / Re: network card to reduce attack surface?
« on: August 18, 2022, 09:46:57 pm »
The original post is a bit old, but since there is now activity, I figured I should add a few clarifications.

so to clarify, the BMC on the blackbird is isolated and not accessible if one has network access on the other two ethernet ports. correct?

By all accounts that’s correct- the official documentation says that much, and it’s repeated on several pages on the official wiki.

To clarify this a bit more, as it's partially correct:
- The proprietary firmware, if you are still on it, technically allows all ports to be used for network traffic on the BMC. The latest BMC firmware is configured to select only the correct port for the Blackbird or Talos II, however in some cases this could malfunction. It is also relatively easy to reconfigure this once connected to the BMC.
- If you are using the open source firmware, it is configured at build time to connect only to the specified port, so the BMC cannot mistakenly communicate on a different port. There are of course ways for the BMC to turn on the host and then instruct the POWER9 CPU to flash the NIC firmware, but that's not something the BMC can do as easily as the option with the proprietary code.

I'm not sure about the Blackbird, but please note that this is most definitely not true for Talos2.

The Talos2 BMC is connected directly to the management interface of the two-port ethernet chip, and there is nothing you can do to prevent an attacker with control of the BMC from having total control over both network adapters.

All of my Talos2 machines use separate PCIe cards for networking as a result of this unfortunate situation.  Hopefully Arctic Tern will eventually allow me to re-pinstrap the BMC and hold its reset pin in the asserted state so I can go back to using the on-motherboard Ethernet ports.
Technically speaking, the only way that the BMC can take control of the network controller is by loading a custom host OS image, that then talks to the device. The BMC does not have a way to re-flash the network card firmware directly, nor does it have a way to load new firmware on the device directly. This can only be done from the host, which the BMC does have full control of. Your model of adding a second Ethernet card does make things harder, but the BMC can still take control of this by replacing the host image.
The general threat model for the Talos II and Blackbird is that the BMC is in control of the host, and not the other way around, and so this is how things are designed. The BMC can always compromise the host.

Is there a simpler way to achieve this? Perhaps a BMC configuration trick that disables NC-SI?
You can disable network access to the BMC a couple of ways:
- Remove the firmware on the NIC, specifically the APE. This will prevent the BMC from accessing the network (without first reflashing the firmware).
- Build a custom version of the firmware that disables NC-SI. At this point, there's not much benefit to running any firmware, but it's still an option.
- Use the open source firmware and *don't* use port 0 (Talos II) or port 2 (Blackbird), as those are the ports configured for NC-SI access to the network.

For those of you interested in using the open source firmware instead of the proprietary firmware for the BCM5719 network card, the latest stable release (0.6.12) is available via fwupd and LVFS. My understanding is that this version will be included in future shipments from RCS as well.

If you're on a Linux distribution that includes fwupd 1.5.2+, you can switch using the following command:
Code: [Select]
sudo fwupdmgr switch-branch
Note that after the update, it's best to completely power cycle the system. The host (POWER9), BMC (AST2500), and NIC (BCM5719) all have to be restarted, so powering off the machine and unplugging it for a good 30 seconds is your best option after the update completes.
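As a rough sketch, the version requirement above can be checked before switching branches. The helper name `version_ge` is hypothetical, it assumes GNU `sort -V` is available, and the installed version has to be read from `fwupdmgr --version` by hand since its output format differs between releases.

```shell
# Hypothetical helper: succeeds if version $1 is greater than or equal to $2.
# Assumes GNU coreutils "sort -V" (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Example usage (not run here; replace 1.7.9 with your installed version):
#   fwupdmgr --version
#   version_ge 1.7.9 1.5.2 && sudo fwupdmgr switch-branch
```

The comparison is done on version strings rather than by parsing tool output, so it stays usable even if the `--version` format changes.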

You can find the source code here:

Special thanks to:
  • Hugo Landau for his effort on reverse engineering the proprietary firmware. You can read about that here:
  • Richard Hughes for fwupd and his changes to support flashing the bcm5719.

General CPU Discussion / Re: Faster and more powerful 8-core POWER9 CPUs
« on: November 24, 2020, 10:30:58 pm »
02CY977 DD2.2 model or equivalent DD2.3
Just a quick comment, but please be aware that the CPUs you mentioned are not supported out of the box by the existing firmware on the boards:

There's nothing stopping you from adding the needed tables, but the 190W 8 core CPU will *not* turbo unless you also install the correct WOF tables into the PNOR.
For reference, the relevant WOF tables can be found upstream here:
If the tables are not installed, the CPU will not work in turbo mode and will be quite a bit slower.

(Note: I'm personally running 12 core parts that are normally rated for 105W but have increased the TDP to ~145W. Modifying the tables isn't too much work, and I expect RCS would be open to including additional tables in the firmware image for future release, but you'd have to ask RCS to be certain.)

Blackbird / Re: RAM slot B1 not showing
« on: July 16, 2020, 09:25:42 pm »
It sounds like the RAM was GUARDed out by the firmware. It probably detected some issue with it and disabled it.

You can try clearing the GUARD partition via the BMC when the host is off:
Code: [Select]
pflash -P GUARD -c
This won't fix the reason why it was GUARDed out, but it should re-enable the slot.
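If you want to keep a copy of the GUARD records before clearing them (for example, to look into why the slot was disabled), something like the following may work. This is only a sketch: the `run` and `clear_guard` helpers and the backup path are hypothetical, and it assumes your `pflash` build supports reading a partition to a file with `-r`.

```shell
# Hypothetical wrapper: with DRY_RUN=1 it prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: back up the GUARD partition to a file, then clear it.
# The clear only happens if the backup command succeeded.
clear_guard() {
    backup="${1:-/tmp/GUARD.bak}"
    run pflash -P GUARD -r "$backup" && run pflash -P GUARD -c
}

# Example usage on the BMC, with the host powered off (not run here):
#   DRY_RUN=1 clear_guard /tmp/GUARD.bak   # preview the commands
#   clear_guard /tmp/GUARD.bak             # actually run them
```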

User Zone / Re: Graphics Card install
« on: December 15, 2019, 09:56:39 pm »
@meklort: Thanks for the patch. I have another question, after patching the kernel, will it be replaced in the next update of Fedora or will it remain?

I believe it would be replaced if you update to the official version from the Fedora repos. You'll probably want to change the version string to something custom so that it does not get replaced. To do so, I think you'll want to set the CONFIG_LOCALVERSION value in the .config file.
Code: [Select]

You'll need to use the bold commands in the guide.
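For the CONFIG_LOCALVERSION change mentioned above, a minimal sketch of editing the .config file could look like this. The helper name `set_localversion` and the `-navi` suffix are hypothetical, GNU `sed -i` is assumed, and the suffix should not contain `|` or `&` since it is pasted into a sed expression.

```shell
# Hypothetical helper: set CONFIG_LOCALVERSION in a kernel .config file.
# $1 = path to .config, $2 = suffix to append to the kernel version string.
set_localversion() {
    config_file="$1"
    suffix="$2"
    if grep -q '^CONFIG_LOCALVERSION=' "$config_file"; then
        # Replace the existing value in place.
        sed -i "s|^CONFIG_LOCALVERSION=.*|CONFIG_LOCALVERSION=\"$suffix\"|" "$config_file"
    else
        # Option not present yet: append it.
        printf 'CONFIG_LOCALVERSION="%s"\n' "$suffix" >> "$config_file"
    fi
}

# Example usage from the kernel source directory (not run here):
#   set_localversion .config "-navi"
```

After rebuilding, `uname -r` should show the suffix, which makes it easy to confirm that the patched kernel is the one running.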

User Zone / Re: Graphics Card install
« on: December 12, 2019, 11:28:43 pm »
The best option would be to follow the guide; each step should be listed (in bold). Some steps may take a while to complete. I'd suggest giving it a try before falling back to another method.

If you are unable to get it to work, you can try installing the following prebuilt ones from GigabytePro:
You'll need to use
Code: [Select]
rpm -i kernel-core-5.4.0-2.fc32.ppc64le.rpm
and
Code: [Select]
rpm -i kernel-modules-5.4.0-2.fc32.ppc64le.rpm
Note that while this is the Fedora 32 kernel, it should be OK on Fedora 31.

User Zone / Re: Graphics Card install
« on: December 10, 2019, 10:23:40 pm »
I put together something on the wiki that has instructions on building a patched kernel and installing it. You'll also need to follow the information on enabling the discrete display.

Let me know if you run into problems.

User Zone / Re: Graphics Card install
« on: December 09, 2019, 12:35:08 am »
In this case my instincts let me down.  I had drilled in my head for so long "don't use floating point in kernel space!" that I didn't even think to look for an x86-only guard around the DCN code.  I hope the patches make up for it!  ;)
No worries. I didn't really have the time this weekend to work on fixing the issue, so it was certainly good to have you work on it and on upstreaming the fixes.

They are testing the work that was done; it already works, as you can read above. As soon as they finish the tests, they will tell me how to activate this beautiful GPU ...
Everything seems to be working reasonably well here on Fedora 32/Rawhide. I'll try to do a fresh Fedora 31 install here (making sure everything works) and put together a quick guide on the steps needed to get Navi 10 working in the next day or two.

User Zone / Re: Graphics Card install
« on: December 06, 2019, 07:03:21 pm »
I was able to test the patch from madscientist159 and with some modifications I have the Radeon 5700 XT running on my system. We're still working on cleaning up the patch, but Navi is now working on POWER. Once we're further along, I'll test on Fedora 31 instead of Rawhide and hopefully get you a kernel build to try.

User Zone / Re: Graphics Card install
« on: December 05, 2019, 06:50:02 am »
I've filed a bug report here, feel free to add comments to it:

User Zone / Re: Graphics Card install
« on: December 04, 2019, 11:32:55 pm »
This issue is due to Navi display support only being enabled for X86.
See here:
And here:
And here:

Basically, before this will work the following needs to happen:
  • Either the DC code needs to be modified to use integer math instead of floating point math, or the kernel_fpu_begin / kernel_fpu_end APIs need to be added for POWER (currently only supported on x86 and s390)
  • The KConfig files need to be updated to enable POWER in addition to X86
  • The Makefiles need to be modified to not assume x86 (and sse / sse2)
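To see which architectures in a given kernel tree mention kernel_fpu_begin, a quick search like the one below can help. This is a sketch: the helper name `list_fpu_arches` is hypothetical, and it simply reports every top-level directory under `arch/` that has a file containing the string, which is a rough proxy for "supported".

```shell
# Hypothetical helper: list arch/ subdirectories whose files mention kernel_fpu_begin.
# $1 = path to the root of a kernel source tree.
list_fpu_arches() {
    tree="$1"
    (
        cd "$tree/arch" || exit 1
        # grep prints paths like ./x86/include/...; field 2 is the arch name.
        grep -rl 'kernel_fpu_begin' . 2>/dev/null | cut -d/ -f2 | sort -u
    )
}

# Example usage (not run here; the path is a placeholder):
#   list_fpu_arches ~/src/linux
```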

User Zone / Re: Graphics Card install
« on: December 04, 2019, 06:19:14 am »
Code: [Select]
No outputs definitely connected, trying again...
This isn't a POWER problem, this is an AMD GPU driver / hardware problem.  We're going to need a lot more info including the monitor model etc. -- last time I saw this you had to flip DisplayCore on or off, but Navi may require DisplayCore to operate at all.  If the latter is the case, you'll need to contact AMD support to get the driver fixes.
Yes, agreed, it's a problem with the driver (again, I'm assuming it's only a driver issue on ppc64le and not x86_64, but that's only an assumption) as it's not even detecting that there's anything that a monitor could be plugged into.
Note that I did test with amdgpu.dc=1 and amdgpu.dc=0 and there was no change, however my understanding is that Navi only works with it enabled, like you said.
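When testing both amdgpu.dc settings, it helps to confirm which value the running kernel was actually booted with. A small sketch, assuming the option was passed on the kernel command line; the helper name `cmdline_param` is hypothetical.

```shell
# Hypothetical helper: print the value of a name=value kernel command line parameter.
# $1 = parameter name, $2 = command line string (defaults to the running kernel's).
cmdline_param() {
    name="$1"
    line="${2:-$(cat /proc/cmdline)}"
    for word in $line; do
        case "$word" in
            "$name"=*)
                echo "${word#*=}"
                return 0
                ;;
        esac
    done
    # Parameter not present on the command line.
    return 1
}

# Example usage (not run here):
#   cmdline_param amdgpu.dc || echo "amdgpu.dc not set on the command line"
```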

Do you know the best way to contact AMD about this? I can run through the appropriate channels to try to get the support improved.

User Zone / Re: Graphics Card install
« on: December 03, 2019, 10:53:03 pm »
FYI, I tested this out a month or so ago with builds of the kernel / mesa / etc. from git and was never able to get it to work. I've also just re-tested with Fedora Rawhide, and am seeing the same behaviour.

Effectively, the graphics card is detected fine, however no output ports are detected when starting X11, and as a result, no screens are found.
Normally, I'd expect to see something like the following in the X11 log:
Code: [Select]
[   716.370] (II) AMDGPU(0): Output DisplayPort-0 has no monitor section
[   716.370] (II) AMDGPU(0): Output DisplayPort-1 has no monitor section
[   716.370] (II) AMDGPU(0): Output DisplayPort-2 has no monitor section
[   716.371] (II) AMDGPU(0): Output HDMI-A-0 has no monitor section
[   716.404] (II) AMDGPU(0): EDID for output DisplayPort-0

With Navi 10 on Rawhide, I instead see the following (no output types are even detected, so it doesn't probe them):
Code: [Select]
[  1002.413] (II) AMDGPU(0): glamor X acceleration enabled on AMD NAVI10 (DRM 3.35.0, 5.4.0-2.fc32.ppc64le, LLVM 9.0.0)
[  1002.413] (II) AMDGPU(0): glamor detected, initialising EGL layer.
[  1002.413] (==) AMDGPU(0): TearFree property default: auto
[  1002.413] (==) AMDGPU(0): VariableRefresh: disabled
[  1002.413] (II) AMDGPU(0): KMS Pageflipping: enabled
[  1002.413] (WW) AMDGPU(0): No outputs definitely connected, trying again...
[  1002.413] (WW) AMDGPU(0): Unable to find connected outputs - setting 1024x768 initial framebuffer
[  1002.413] (II) AMDGPU(0): mem size init: gart size :1fe810000 vram size: s:1f7b70000 visible:fd50000
[  1002.413] (==) AMDGPU(0): DPI set to (96, 96)
[  1002.413] (==) AMDGPU(0): Using gamma correction (1.0, 1.0, 1.0)
Fatal server error:
[  1002.416] (EE) no screens found(EE)

So, my assumption right now is that the current code has a bug on ppc64 where output ports are not detected properly. Note that I'll do some additional tests this weekend, but I expect this will require some sort of fix in the kernel/amdgpu driver.
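To compare a working and a failing X11 log quickly, the output names can be pulled out of the "Output ... has no monitor section" style lines shown above. This is a sketch: the helper name `list_x_outputs` is hypothetical, and `/var/log/Xorg.0.log` is only a typical log location, not necessarily where your distribution puts it.

```shell
# Hypothetical helper: list the output names the AMDGPU X driver reported in a log.
# $1 = path to an Xorg log file. Prints nothing if no outputs were detected.
list_x_outputs() {
    sed -n 's/.*AMDGPU([0-9]*): Output \([^ ]*\) .*/\1/p' "$1" | sort -u
}

# Example usage (not run here; the path is an assumption):
#   list_x_outputs /var/log/Xorg.0.log
```

An empty result on the Navi 10 log, next to a populated list on a working card, matches the symptom described here: the outputs are never enumerated at all.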
