Onboard nic not working

power9mm:
Yeah.

It's just a Blackbird with an 8-core CPU, a Vega 64, a 2.5" SSD and a PCIe NIC. Nothing special. It's just hard-wired into a libreCMC router, which is wired to a modem. Cat6 cable, I think.
I can try connecting to the BMC with an airgapped laptop, I suppose. I don't want to connect that port to any internet-connected devices.

meklort:
So, this is the state as I see it:

- Linux appears to be unable to acquire a lock in the hardware. (see https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/broadcom/tg3.c#L15439)
- The APE or RX firmware in the part has priority over Linux
- The APE or RX firmware likely has acquired the lock, and failed to release it due to an infinite loop / PHY hardware not responding how the firmware expected it to.
(Note: If you have an older version of the firmware, the RX firmware acquired the lock for the endpoint it was running on, however this failed to work in some cases, and is disabled in newer firmware. The APE firmware still grabs a lock as well, however this should be for one port only, not all of them, and so I don't expect this to be where the issue is)

All that said, when the lock failed to be acquired, Linux stopped initializing the device, and so you don't get the eth port showing up.
This is also why fwupd and ethtool failed to work - they depend on the tg3 driver being loaded.
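
One way to confirm this from a shell is to check whether tg3 has bound to any PCI function at all. This is only a minimal sketch, assuming the standard sysfs layout under /sys/bus/pci/drivers/tg3; the function name and the optional root argument are made up for illustration and are not part of any tool mentioned in this thread.

```shell
# List the PCI functions currently bound to the tg3 driver.
# $1 is an optional sysfs root (hypothetical parameter, defaults to /sys).
# Returns 1 if the driver directory is missing, 2 if nothing is bound.
tg3_bound_devices() {
    tb_dir="${1:-/sys}/bus/pci/drivers/tg3"
    if [ ! -d "$tb_dir" ]; then
        echo "tg3 driver is not loaded" >&2
        return 1
    fi
    tb_found=0
    # Bound devices show up as entries named like 0004:01:00.0
    for tb_dev in "$tb_dir"/[0-9a-f]*:*; do
        [ -e "$tb_dev" ] || continue
        basename "$tb_dev"
        tb_found=1
    done
    if [ "$tb_found" -ne 1 ]; then
        echo "tg3 is loaded but no device is bound to it" >&2
        return 2
    fi
}
```

Running `tg3_bound_devices` with no arguments on the affected machine shows which ports, if any, made it through initialization. `dmesg | grep -i tg3` should show what the driver logged when it gave up.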

To fix this, you'll need to do one of two things:
- Try connecting to a different external device like a switch before going to the router and see if you get a different result. This is probably the simplest option if it works (unlikely due to all ports failing to initialize)
- If you're unable to get Linux to ever see the device, you'll need to install the development tools for the OSS firmware and at that point there are a couple of options.
(1) You can try loading the proprietary firmware image and see if that works. If it does *not* work, you'll need to RMA with Raptor as it's a hardware issue. If, on the other hand, the proprietary firmware works, you can leave it as-is, or we can flash the latest OSS firmware release and continue to (2).
(2) You/We can try debugging the issue. You'll need to provide dumps of the registers using the bcmregtool utility. That'll let me know if the firmware is spinning / locked up, the firmware version, etc and if there's an easy fix.

vikings.thum:
Is your BMC not working as well?
I've had the same issue with one of my Blackbirds and contacted RaptorCS:

--- Quote ---I know there was a manufacturing process change to fix data corruption seen on the first handful of boards, it's very possible this one shipped out prior to the change.
--- End quote ---
They fixed it by providing an update bundle, which I then flashed. When you contact them, provide your ports' MAC addresses so they can embed them in the image.
They provided the following commands to flash the new image:

--- Code: ---
# Don't abort if the unbind fails (e.g. tg3 is not bound to this device).
set +e
echo 0004:01:00.1 > /sys/bus/pci/drivers/tg3/unbind
set -e
bcmflash -t raw -i 1 -r bcm5719.rom
--- End code ---
After powering down and unplugging the system from the wall socket, boot it up again and it should work as expected.
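
The set +e / set -e pair around the unbind in the commands above is there so a failed unbind doesn't stop the script. An alternative, as a rough sketch only, is to unbind only when the device is actually bound. This assumes the usual sysfs driver layout; the function name and the optional root argument are hypothetical and did not come from RaptorCS.

```shell
# Unbind a PCI function from tg3 only if it is currently bound.
# $1 is the PCI address (e.g. 0004:01:00.1).
# $2 is an optional sysfs root (hypothetical parameter, defaults to /sys).
tg3_unbind_if_bound() {
    tu_dev="$1"
    tu_drv="${2:-/sys}/bus/pci/drivers/tg3"
    if [ -e "$tu_drv/$tu_dev" ]; then
        # Writing the address to "unbind" detaches the driver from the device.
        echo "$tu_dev" > "$tu_drv/unbind"
    else
        echo "$tu_dev is not bound to tg3, nothing to unbind" >&2
    fi
}
```

It would be called as `tg3_unbind_if_bound 0004:01:00.1` (as root) before the bcmflash step.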

power9mm:
I haven't tried the BMC yet. I suppose I will now since you brought that up.
