Author Topic: What does it take to support a new CPU?  (Read 927 times)

adaptl

  • Newbie
  • *
  • Posts: 16
  • Karma: +0/-0
    • View Profile
What does it take to support a new CPU?
« on: February 19, 2025, 07:21:42 pm »
I'm just wondering what I should know about modifying the firmware to support a new CPU  :D

Here is my situation: I've been trying to upgrade my firmware to the newest version, but it won't IPL on anything past System Package v2.00  :'(  I got a weird Talos II Lite development board at a discount back in 2018 that I think uses a different DD2.1 stepping CPU (02AA883)  :o  So I managed to pick up two CPUs off of eBay with the number 02CY230  8)

If anyone wants, I'll send you the other 02CY230 in the mail for free if you're interested in working on the firmware with it (I've only got a single-socket board anyways)  ;D

I found this post which discusses altering the firmware with a 02CY231:
tl;dr I bought an unsupported CPU, which was mostly ok, and I tweaked some firmware to make it work properly...

I figured I'd try to get a newer version of linux on the PNOR to get petitboot running on amdgpu, cause right now I've got no output even after adding firmware to BOOTKERNFW. I'll also probably try to adjust the fan curves to be quieter and undervolt if possible  :P

adaptl

Re: What does it take to support a new CPU?
« Reply #1 on: February 19, 2025, 07:23:16 pm »
Oh, I should mention my card is a RX6600 (Navi23, dimgrey_cavefish)  ::)

bobpaul

  • Newbie
  • *
  • Posts: 22
  • Karma: +3/-0
    • View Profile
Re: What does it take to support a new CPU?
« Reply #2 on: February 25, 2025, 08:17:42 pm »
I'm currently going through this process, too, except I'm hoping to get the 02AA883 working slightly better. Mine boots fine on the latest firmware, but it doesn't load the WOF table.

Have you tried booting the 02CY230 yet?


adaptl

Re: What does it take to support a new CPU?
« Reply #3 on: February 25, 2025, 10:01:53 pm »
Have you tried booting the 02CY230 yet?

I've tried booting the 02CY230 on several firmware versions but it just doesn't IPL or give me any info from the PNOR serial interface or from the BMC.

The 02AA883 boots with anything except the latest firmware, and on the second-latest firmware (v2.00) it complains about not having WOF tables in the PNOR serial's dmesg. If I ever get around to building the firmware or making WOF tables I'll keep you updated. I might try to set up some sort of archive for a lot of this talos repo stuff just in case it all becomes abandonware. Currently it's not building because it fails to fetch repositories from bitbake, dead links I suppose.

I haven't upgraded the FPGA with the PNOR/BMC flash, so that may be causing it; looking at the FPGA changes, there are some minor things related to Talos II Lite. I've got a flash programmer here, I just don't want to brick the FPGA.

bobpaul

Re: What does it take to support a new CPU?
« Reply #4 on: February 26, 2025, 07:33:27 am »



I've tried booting the 02CY230 on several firmware versions but it just doesn't IPL or give me any info from the PNOR serial interface or from the BMC.

When you say it doesn't IPL, does absolutely nothing come up on the console, or do you at least get some output from Hostboot?

There are 4 WOF CSV files meant for 16-core SFORZA over on https://github.com/open-power/WOF-Tables which aren't included in Raptor's PNOR, but if you're not getting any Hostboot output at all, I think it's more than just missing WOF tables. But I can build a PNOR that simply includes all of the 16-core WOF tables.

The 02AA883 boots with anything except the latest firmware, and on the second-latest firmware (v2.00) it complains about not having WOF tables in the PNOR serial's dmesg.

Try using the PNOR from v2.00 and upgrading the BMC to v2.10. My guess is the BMC firmware is unrelated to your issue.

I don't think we're quite there yet, but here's the diff between the v2.00 and v2.10 PNOR. There are changes to talos_config and talos_defconfig that we'd probably want to start with.

I might try to set up some sort of archive for a lot of this talos repo stuff just in case it all becomes abandonware. Currently it's not building because it fails to fetch repositories from bitbake, dead links I suppose.

Easiest way to do this is to clone the repo, fix any broken URLs, and get it building. After it's built, commit any changes you made (hopefully only URLs) and then save your bitbake download cache. If bitbake sees a file in the download cache that matches a download, it verifies the checksum, and if the checksum is good it doesn't try to fetch from the URL. This unfortunately means project developers often don't notice when a dependency URL has changed until a new dev joins the team and tries a fresh checkout.
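The cache-first behaviour described above can be sketched as follows. This is an illustration of the idea only, not bitbake's or buildroot's actual code: `fetch_cached`, its arguments and the flat cache layout are hypothetical. The real download directory is configurable (bitbake uses `DL_DIR`, buildroot documents `BR2_DL_DIR`); where it lives in this particular tree is something to check rather than assume.

```shell
#!/usr/bin/env bash
# Illustration only: "use the cached file if its checksum matches,
# otherwise fetch". fetch_cached and its arguments are hypothetical.

fetch_cached() {
    local url="$1" want_sha256="$2" cache_dir="$3"
    local file
    file="$cache_dir/$(basename "$url")"

    if [ -f "$file" ]; then
        local have_sha256
        have_sha256="$(sha256sum "$file" | cut -d' ' -f1)"
        if [ "$have_sha256" = "$want_sha256" ]; then
            echo "cached"   # checksum matches: the URL is never contacted
            return 0
        fi
    fi

    # Cache miss or checksum mismatch: only now would a dead URL be noticed.
    echo "fetch"
    wget -q -O "$file" "$url"
}
```

Because a good cached copy short-circuits the download, archiving the whole download directory next to the repo clone should be enough to keep a build reproducible even after upstream URLs die.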

I haven't upgraded the FPGA with the PNOR/BMC flash, so that may be causing it; looking at the FPGA changes, there are some minor things related to Talos II Lite. I've got a flash programmer here, I just don't want to brick the FPGA.

Check your motherboard revision. If it's HW rev1.00 then don't upgrade past FPGA v1.07. If it's HW rev 1.01 then definitely upgrade it but don't downgrade it.

bobpaul

Re: What does it take to support a new CPU?
« Reply #5 on: February 26, 2025, 12:34:57 pm »
If the issue with the 02CY230 is strictly a WOF table issue, then try this: https://gitlab.com/bobpaul/talos-op-build/-/releases/wof-16-20250226

This has all of the upstream WOF tables for 16 core and below. When there were conflicts between upstream and what Raptor had, I kept Raptor's.
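A merge with that "existing copy wins" rule could look roughly like the sketch below. It assumes a conflict means "same file name in both directories", which the post does not actually state; `merge_keep_existing` and both directory arguments are hypothetical names for illustration.

```shell
#!/usr/bin/env bash
# merge_keep_existing SRC_DIR DEST_DIR
# Copy every *.csv from SRC_DIR into DEST_DIR unless a file with the same
# name already exists there (the existing copy is kept untouched).
# Assumption: a "conflict" is a file-name clash.

merge_keep_existing() {
    local src="$1" dest="$2" f name
    for f in "$src"/*.csv; do
        [ -e "$f" ] || continue     # no CSVs at all: the glob stayed literal
        name="$(basename "$f")"
        if [ -e "$dest/$name" ]; then
            echo "keep $name"
        else
            cp "$f" "$dest/$name"
            echo "add  $name"
        fi
    done
}
```

Called as `merge_keep_existing upstream-wof/ raptor-wof/` (both paths made up here), it prints one line per table so it is easy to see which upstream files were actually added.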

But my understanding (not sure if correct) is that SFORZA CPUs should work without WOF; they just won't turbo boost.

Not sure if you saw, but you can test the PNOR without flashing it.

Edit: I just noticed I'm also including a previously unreleased change that bumps the "X frequency to 2GHz". From the commit date I had assumed that was already in v2.10, but it's not. If this doesn't work I can exclude that change. I can also try building a v2.00 PNOR that has the extra WOF tables.
« Last Edit: February 26, 2025, 09:08:47 pm by bobpaul »

adaptl

Re: What does it take to support a new CPU?
« Reply #6 on: February 27, 2025, 04:07:24 am »

When you say it doesn't IPL, does absolutely nothing come up on the console, or do you at least get some output from Hostboot?


I don't get anything from hostboot's serial console. I just remembered that last time I tried, I got a bootup screen on the VGA, but nothing else. I'll try again without the VGA disable pin. It just boot loops 3 or 4 times and stays on. The loading bar loads a bit but gets stuck about 1/6th of the way. Looks like this: https://files.catbox.moe/fzskrz.jpg

There are 4 WOF CSV files meant for 16-core SFORZA over on https://github.com/open-power/WOF-Tables which aren't included in Raptor's PNOR, but if you're not getting any Hostboot output at all, I think it's more than just missing WOF tables. But I can build a PNOR that simply includes all of the 16-core WOF tables.

I've just started to build the PNOR without the debian chroot and it is coming along pretty well. I am going to leave it compiling overnight. I think it's complaining about some XML dependency?

The 02AA883 boots with anything except the latest firmware, and on the second-latest firmware (v2.00) it complains about not having WOF tables in the PNOR serial's dmesg.

Try using the PNOR from v2.00 and upgrading the BMC to v2.10. My guess is the BMC firmware is unrelated to your issue.

Isn't the BMC firmware the same in System Package v2.00 and v2.10? BMC v2.10 and PNOR v2.00 is what I've usually got on, and it works but complains about WOF tables.


I don't think we're quite there yet, but here's the diff between the v2.00 and v2.10 PNOR. There are changes to talos_config and talos_defconfig that we'd probably want to start with.

That is a good start

If the issue with the 02CY230 is strictly a WOF table issue, then try this: https://gitlab.com/bobpaul/talos-op-build/-/releases/wof-16-20250226

Oh wow this is awesome

bobpaul

Re: What does it take to support a new CPU?
« Reply #7 on: February 27, 2025, 10:50:15 am »

I don't get anything from hostboot's serial console. I just remembered that last time I tried, I got a bootup screen on the VGA, but nothing else. I'll try again without the VGA disable pin. It just boot loops 3 or 4 times and stays on. The loading bar loads a bit but gets stuck about 1/6th of the way. Looks like this: https://files.catbox.moe/fzskrz.jpg


The VGA disable jumper doesn't actually disable the VGA output during hostboot, so that shouldn't matter. Your jpg link is showing a bad cert (default traefik certificate) and a 404.


I've just started to build the PNOR without the debian chroot and it is coming along pretty well. I am going to leave it compiling overnight. I think it's complaining about some XML dependency?


I added a debian Dockerfile and instructions that worked for me.


Isn't the BMC firmware the same in System Package v2.00 and v2.10? BMC v2.10 and PNOR v2.00 is what I've usually got on, and it works but complains about WOF tables.


Oh, you're right. I suppose I should downgrade my PNOR and see how my machine behaves. There weren't any WOF file changes between v2.00 and v2.10. Both use the same revision of the machine-xml repo (they renamed the repo, but it's the same rev hash). EDIT: I tried v2.00 on mine and it also works for me, and I also still get the WOF complaints.

Code: [Select]
$ git checkout raptor-v2.10
$ git diff raptor-v2.00 openpower/configs/talos_defconfig

....

 BR2_HOSTBOOT_CONFIG_FILE="talos.config"
-BR2_OPENPOWER_MACHINE_XML_GITHUB_PROJECT_VALUE="talos-xml"
+BR2_HOSTBOOT_USE_ALTERNATE_GCC=y
+BR2_OPENPOWER_MACHINE_XML_GITHUB_PROJECT_VALUE="machine-talos-ii/machine-xml"
 BR2_OPENPOWER_MACHINE_XML_VERSION="cbd11e9450325378043069d7e638668ea26c2074"
 BR2_OPENPOWER_MACHINE_XML_FILENAME="talos.xml"

....


EDIT: Here's a re-build of PNOR v2.00 but with the extra WOF tables for 16-core and below: https://gitlab.com/bobpaul/talos-op-build/-/releases/wof-4-8-12-16-core_raptor-v2.00
« Last Edit: February 28, 2025, 12:00:35 am by bobpaul »

adaptl

Re: What does it take to support a new CPU?
« Reply #8 on: February 27, 2025, 10:20:30 pm »
Alright, I got the XML dependencies installed; I also had to install a static zlib to get it compiling. I misremembered the build failing with bitbake URLs earlier, it was actually buildroot that's failing to fetch stuff.

Code: [Select]
--2025-02-27 22:50:28--  http://sources.buildroot.net/ppe42-gcc/ppe42-gcc-84a6a88e95d3b52cf4a6979a5ca47a12daa6ec49-br1.tar.gz
Resolving sources.buildroot.net... 172.67.72.56, 104.26.0.37, 104.26.1.37, ...
Connecting to sources.buildroot.net|172.67.72.56|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2025-02-27 22:50:33 ERROR 404: Not Found.

bobpaul

Re: What does it take to support a new CPU?
« Reply #9 on: March 04, 2025, 01:24:25 pm »
Alright, I got the XML dependencies installed; I also had to install a static zlib to get it compiling. I misremembered the build failing with bitbake URLs earlier, it was actually buildroot that's failing to fetch stuff.

Code: [Select]
--2025-02-27 22:50:28--  http://sources.buildroot.net/ppe42-gcc/ppe42-gcc-84a6a88e95d3b52cf4a6979a5ca47a12daa6ec49-br1.tar.gz

2 things I notice. That revision (84a6a88e95d3b5) indicates you're building a v2.1 firmware or newer. The second thing I notice is ppe42-gcc should come from one of raptor's gitlab repos, not from sources.buildroot.net. Did you make local changes, forget to run git submodule update, or switch revisions without clearing your `output/` folder? What does git status show? Which git revision are you on? Also I think buildroot is sensitive to environment variables; if you're not using `bash` as your shell, it might not work.
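The questions above can be answered in one go with a small helper like this. It is only a sketch: `report_build_tree` is a made-up name, and it uses nothing beyond standard git subcommands.

```shell
#!/usr/bin/env bash
# report_build_tree [DIR]: print shell, revision, local changes and
# submodule state for a checkout. Hypothetical helper for illustration.

report_build_tree() {
    local dir="${1:-.}"
    echo "shell: ${SHELL:-unknown}"
    if ! git -C "$dir" rev-parse --git-dir >/dev/null 2>&1; then
        echo "git: not a git checkout"
        return 0
    fi
    echo "revision: $(git -C "$dir" describe --tags --always --dirty)"
    echo "local changes:"
    git -C "$dir" status --short
    # A leading '-' means the submodule is not initialised,
    # a leading '+' means it is checked out at a different commit.
    echo "submodules:"
    git -C "$dir" submodule status --recursive
}
```

If any submodule line starts with `-` or `+`, running `git submodule update --init --recursive` and clearing `output/` before rebuilding would be the first thing to try.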

Basically I would only expect it to try to fetch from sources.buildroot.net if the PPE42_GCC_SITE variable were messed with, and I doubt you edited openpower/package/ppe42-gcc/ppe42-gcc.mk and changed it. It seems something is just messed up in your build environment...
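A quick way to confirm what the package file says is to print its `_SITE` assignment. The helper below is a hypothetical sketch; the only path taken from this thread is openpower/package/ppe42-gcc/ppe42-gcc.mk. It may also be worth reading the lines just before the 404 in the build log: buildroot has a backup mirror setting (`BR2_BACKUP_SITE`) that it tries after the primary site fails, so the sources.buildroot.net request could be a fallback attempt rather than the configured site. That is a possibility to check, not a diagnosis.

```shell
#!/usr/bin/env bash
# show_site_var MK_FILE: print, with line numbers, every line that assigns
# a variable whose name ends in _SITE (e.g. PPE42_GCC_SITE).
# Hypothetical helper for illustration.

show_site_var() {
    grep -nE '^[A-Z0-9_]+_SITE[[:space:]]*[:?+]?=' "$1"
}

# Usage, from the top of the op-build checkout:
#   show_site_var openpower/package/ppe42-gcc/ppe42-gcc.mk
```

If the printed value still points at the expected gitlab repo, the package file is fine and the problem is more likely in the fetch itself or in a stale `output/` directory.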



You don't have to check out my copy of raptor's firmware repo to do so, but I would recommend following my README instructions and using the Dockerfile.stretch (for building raptor-v2.00 based pnor) and Dockerfile.debian (for building raptor-v2.10 based pnor) which I added to my repo. I plan to update the wiki eventually (and probably add a page about using docker for the build environment). The page on troubleshooting/debugging hostboot could use some expansion as well.

Both bitbake and buildroot can be rather sensitive to the local environment. Conflicting environment variables (maybe from using a shell other than bash) could cause the build to do weird things. My Dockerfiles should behave the same as a properly set-up debian chroot, but since the Dockerfile acts as a script, the environment is always the same, and resetting the environment is as simple as exiting and then running the specific `docker run` command again. At some point I'll probably get Fedora and Ubuntu based Dockerfiles working.
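For a quick test of whether stray environment variables are the culprit, without setting up Docker, a command can be run under bash with a nearly empty environment. This is a rough approximation and not a substitute for the Dockerfiles (installed package versions still differ); `run_clean` and the fixed `PATH` are assumptions made for this sketch.

```shell
#!/usr/bin/env bash
# run_clean CMD [ARGS...]: run CMD under bash with an almost empty
# environment (only HOME, PATH and TERM are passed through).
# Hypothetical helper; adjust PATH if the toolchain lives elsewhere.

run_clean() {
    env -i HOME="${HOME:-/}" PATH="/usr/local/bin:/usr/bin:/bin" \
        TERM="${TERM:-dumb}" \
        bash --noprofile --norc -c '"$@"' run_clean "$@"
}

# Usage: run_clean make    (or whatever build command is normally used)
```

If the build behaves differently under `run_clean`, something in the regular shell environment is leaking into it.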