Recent Posts

81
General Discussion / Re: Is Raptor Computing Systems still open for business?
« Last post by markr87 on February 27, 2025, 07:20:44 pm »
@MPC7500, Still no reply from support, sales or X.  For some reason I am not seeing his email listed on the link you provided.

EDIT:  Never mind, it popped up once I followed him.
82
User Zone / Re: Calling for gaming experiences
« Last post by lepidotos on February 27, 2025, 05:17:50 pm »
83
Firmware / Re: What does it take to support a new CPU?
« Last post by bobpaul on February 27, 2025, 10:50:15 am »

I don't get anything from hostboot's serial console. I just remembered that last time I tried, I got a boot-up screen on the VGA, but nothing else. I'll try again without the VGA disable pin. It just boot loops 3 or 4 times and stays on. The loading bar fills a bit but gets stuck about 1/6th of the way. Looks like this: https://files.catbox.moe/fzskrz.jpg


The VGA disable jumper doesn't actually disable the VGA output during hostboot, so that shouldn't matter. Your jpg link is showing a bad cert (default traefik certificate) and a 404.


I've just started to build the PNOR without the debian chroot and it is coming along pretty well. I am going to leave it compiling overnight. I think it's complaining about some XML dependency?


I added a debian Dockerfile and instructions that worked for me.


Isn't the BMC firmware the same in System Package v2.00 and v2.10? BMC v2.10 and PNOR v2.00 is what I've usually got on, and it works but complains about WOF tables.


Oh, you're right. I suppose I should downgrade my PNOR and see how my machine behaves. There weren't any WOF file changes between v2.00 and v2.10. Both use the same revision of the machine-xml repo (they renamed the repo, but it's the same rev hash). EDIT: I tried v2.00 on mine; it also works for me, and I still get the WOF complaints.

Code:
$ git checkout raptor-v2.10
$ git diff raptor-v2.00 openpower/configs/talos_defconfig

....

 BR2_HOSTBOOT_CONFIG_FILE="talos.config"
-BR2_OPENPOWER_MACHINE_XML_GITHUB_PROJECT_VALUE="talos-xml"
+BR2_HOSTBOOT_USE_ALTERNATE_GCC=y
+BR2_OPENPOWER_MACHINE_XML_GITHUB_PROJECT_VALUE="machine-talos-ii/machine-xml"
 BR2_OPENPOWER_MACHINE_XML_VERSION="cbd11e9450325378043069d7e638668ea26c2074"
 BR2_OPENPOWER_MACHINE_XML_FILENAME="talos.xml"

....
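
For anyone who wants to compare the two defconfigs without reading the raw diff, here is a minimal Python sketch. It assumes the file is plain `KEY=value` lines, and the two snippets below contain only the lines visible in the diff above, not the full files. The function names are hypothetical.

```python
def parse_defconfig(text):
    """Parse KEY=value lines of a Buildroot-style defconfig into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key] = value.strip('"')
    return settings


def changed_keys(old, new):
    """Return {key: (old_value, new_value)} for added, removed or changed keys."""
    return {k: (old.get(k), new.get(k))
            for k in sorted(set(old) | set(new))
            if old.get(k) != new.get(k)}


# Only the lines shown in the diff above (assumption: the rest is elided).
V200 = '''
BR2_HOSTBOOT_CONFIG_FILE="talos.config"
BR2_OPENPOWER_MACHINE_XML_GITHUB_PROJECT_VALUE="talos-xml"
BR2_OPENPOWER_MACHINE_XML_VERSION="cbd11e9450325378043069d7e638668ea26c2074"
BR2_OPENPOWER_MACHINE_XML_FILENAME="talos.xml"
'''
V210 = '''
BR2_HOSTBOOT_CONFIG_FILE="talos.config"
BR2_HOSTBOOT_USE_ALTERNATE_GCC=y
BR2_OPENPOWER_MACHINE_XML_GITHUB_PROJECT_VALUE="machine-talos-ii/machine-xml"
BR2_OPENPOWER_MACHINE_XML_VERSION="cbd11e9450325378043069d7e638668ea26c2074"
BR2_OPENPOWER_MACHINE_XML_FILENAME="talos.xml"
'''

diff = changed_keys(parse_defconfig(V200), parse_defconfig(V210))
for key, (old, new) in diff.items():
    print(f"{key}: {old!r} -> {new!r}")
```

The machine-xml version key does not show up in the result, which matches the point that only the repo name changed and not the revision.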


EDIT: Here's a rebuild of PNOR v2.00 with the extra WOF tables for 16-core and below: https://gitlab.com/bobpaul/talos-op-build/-/releases/wof-4-8-12-16-core_raptor-v2.00
84
Firmware / Re: What does it take to support a new CPU?
« Last post by adaptl on February 27, 2025, 04:07:24 am »

When you say it doesn't IPL, does absolutely nothing come up on the console, or do you at least get some output from Hostboot?


I don't get anything from hostboot's serial console. I just remembered that last time I tried, I got a boot-up screen on the VGA, but nothing else. I'll try again without the VGA disable pin. It just boot loops 3 or 4 times and stays on. The loading bar fills a bit but gets stuck about 1/6th of the way. Looks like this: https://files.catbox.moe/fzskrz.jpg

There are 4 WOF CSV files meant for 16-core SFORZA over on https://github.com/open-power/WOF-Tables which aren't included in Raptor's PNOR, but if you're not getting any Hostboot output at all, I think it's more than just missing WOF tables. But I can build a PNOR that simply includes all of the 16-core WOF tables.

I've just started to build the PNOR without the debian chroot and it is coming along pretty well. I am going to leave it compiling overnight. I think it's complaining about some XML dependency?

The 02AA883 boots with anything except the latest firmware, and on the second-latest firmware (v2.00) it complains about not having WOF tables in the PNOR serial's dmesg.

Try using the PNOR from v2.00 and upgrading the BMC to v2.10. My guess is the BMC firmware is unrelated to your issue.

Isn't the BMC firmware the same in System Package v2.00 and v2.10? BMC v2.10 and PNOR v2.00 is what I've usually got on, and it works but complains about WOF tables.


I don't think we're quite there yet, but here's the diff between the v2.00 and v2.10 PNOR. There are changes to talos_config and talos_defconfig that we'd probably want to start with.

That is a good start

If the issue with the 02CY230 is strictly a WOF table issue, then try this: https://gitlab.com/bobpaul/talos-op-build/-/releases/wof-16-20250226

Oh wow this is awesome
85
Firmware / Re: Messing with WOF Tables
« Last post by bobpaul on February 26, 2025, 10:59:58 pm »
I copied that table CSV from GIT and filtered it to generate plots of frequency versus "CORE_CEFF" for a couple of different "VRATIO" values.
This link declares:  https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.905&rep=rep1&type=pdf

     Power ~= VDD^2 x Fclk x Ceff
     where the effective switched capacitance, Ceff, is commonly expressed as the product of the physical capacitance CL, and the activity weighting factor α, each averaged over the N nodes.
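
To make the relation above concrete, here is a minimal Python sketch. The voltage, capacitance and frequency numbers are made up for illustration and are not taken from any WOF table; the function names are hypothetical.

```python
def dynamic_power_w(vdd_volts, fclk_hz, ceff_farads):
    """Approximate dynamic power: P ~= VDD^2 * Fclk * Ceff."""
    return vdd_volts ** 2 * fclk_hz * ceff_farads


def max_fclk_hz(power_budget_w, vdd_volts, ceff_farads):
    """Same relation rearranged for frequency: Fclk ~= P / (VDD^2 * Ceff)."""
    return power_budget_w / (vdd_volts ** 2 * ceff_farads)


# Illustrative values only (assumptions, not measured data).
VDD = 1.0           # volts
CEFF_HEAVY = 20e-9  # farads, a workload that switches a lot of capacitance
CEFF_LIGHT = 10e-9  # farads, a lighter workload

budget = dynamic_power_w(VDD, 2.5e9, CEFF_HEAVY)
light_fclk = max_fclk_hz(budget, VDD, CEFF_LIGHT)
print(f"heavy workload at 2.5 GHz: {budget:.1f} W")
print(f"light workload, same budget: {light_fclk / 1e9:.1f} GHz")
```

With the voltage held fixed, halving Ceff doubles the frequency that fits in the same power budget. In practice the voltage also has to rise with frequency, so the real gain is smaller, but this is presumably why the table is indexed by CORE_CEFF.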

That link doesn't seem to work. Do you happen to still know how to navigate to it, or know the name of the paper? I tried looking on doi.org using the number in the link, but I don't think that's a complete DOI number.

Trying to follow along and I'm not certain what most of the abbreviations mean, so the column titles are a bit nonsense. Fratio ... Frequency ratio? But ratio of what to what? And

But I did reproduce the graph in python, which should make it a bit easier to iteratively view a bunch of plots:

Code:
import pandas as pd
import matplotlib.pyplot as plt
wof = pd.read_csv('WOF_V7_4_2_SFORZA_16_160_2500_TM.csv')
print("Unique Vratio indexes are: ", wof['VRATIO_INDEX'].unique())

# filter the table down to a plotable slice
def plotme(wf, vratio_index, nest_ceff=0.25, active_quads=6, fratio=1, plotit=True):
    vratio = wf['VRATIO_START'][0] + vratio_index * wf['VRATIO_STEP'][0]
    wf=wf[
          (wf['FRATIO'] == fratio) &
          (wf['ACTIVE_QUADS']==active_quads) &
          (wf['NEST_CEFF']==nest_ceff) &
          (wf['VRATIO_INDEX']==vratio_index) &
          (wf['VRATIO']==vratio)
          ]
    wf[['WOF_FREQ','CORE_CEFF']].plot(x='CORE_CEFF',
            title=f'Fratio={fratio}, Vratio={vratio}, ACTIVE_QUADS={active_quads}, NEST_CEFF={nest_ceff}')
    # plt.show() blocks the thread. If you don't call it, you can render several plots in the background and plt.show() them at once all later
    if plotit:
        plt.show()

plotme(wof, 12)



This works well pasted into an ipython prompt. Most values of Vratio_index just plot a flat line; 12 and a couple of values near 12 make a nice graph. But why 12? And somewhat surprisingly, changing the number of active quads doesn't seem to have any effect. Also, what defines a quad as active? Simply that it's not power-gated and completely off, or does "active" mean high load?
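
One way to check the active-quads observation without eyeballing plots is to group rows by every other column and see whether WOF_FREQ ever changes within a group. This is a minimal sketch using only the standard library. The rows below are synthetic and NOT real WOF data, the column subset is an assumption, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Synthetic rows for illustration only; NOT real WOF data.
SAMPLE = """FRATIO,VRATIO_INDEX,NEST_CEFF,CORE_CEFF,ACTIVE_QUADS,WOF_FREQ
1,12,0.25,0.5,1,3800
1,12,0.25,0.5,6,3800
1,12,0.25,1.0,1,3200
1,12,0.25,1.0,6,3000
"""


def groups_where_target_varies(rows, varied, target="WOF_FREQ"):
    """Group rows by all columns except `varied` and `target`, and return the
    groups in which `target` takes more than one value as `varied` changes."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple((k, v) for k, v in row.items() if k not in (varied, target))
        groups[key].add(row[target])
    return {key: values for key, values in groups.items() if len(values) > 1}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
varying = groups_where_target_varies(rows, "ACTIVE_QUADS")
print(f"{len(varying)} operating point(s) where ACTIVE_QUADS changes WOF_FREQ")
```

To run it on a real table, replace the `io.StringIO(SAMPLE)` argument with an open file handle for the CSV. An empty result would confirm that ACTIVE_QUADS has no effect anywhere in that file, not just in the slices that were plotted.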

And I think the number 1 question is "how does one know which CSV gets selected for a given CPU"?

Edit
Plotting all Vratio values as a 3d surface plot is interesting:

Code:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D

def plotsurf(wf, file="", nest_ceff=0.25, active_quads=6, fratio=1, showplot=True, use_index=False):
    # the VRATIO_START column is always the same value, as far as I've noticed
    wf=wf[
          (wf['FRATIO'] == fratio) &
          (wf['ACTIVE_QUADS']==active_quads) &
          (wf['NEST_CEFF']==nest_ceff)
          ]
    vratio = wf['VRATIO_INDEX'] if use_index else wf['VRATIO']

    ax = plt.figure().add_subplot(projection='3d')
    ax.plot_trisurf(wf['CORE_CEFF'], vratio, wf['WOF_FREQ'], cmap=cm.jet, linewidth=0.2)
    ax.set_title(f'{file}\nFratio={fratio}, ACTIVE_QUADS={active_quads}, NEST_CEFF={nest_ceff}')
    ax.set_xlabel('Core_Ceff')
    ax.set_ylabel('Vratio')
    ax.set_zlabel('WOF_Hz')
    ax.view_init(15, 60, 0)
    if showplot:
        plt.show()
    return ax   

(EDIT: graphs should show vratio from 0-1. I fixed a bug in the above code which caused values greater than 1 to be shown, but I didn't take new screenshots.)



and I see from the file in hostboot where the error is generated that

Code:
  4.98699|  UserData1  Number of cores : 0x00100002000000a0
  4.98700|  UserData2  WOF Power Mode (1=Nominal, 2=Turbo) : 0x000009c400000012

means 16core (0x0010), "mode=2" (turbo?), 160w, 2500MHz, with header().size=0x12. Looking at the SFORZA list, that's a 1800-2500MHz part with 3.8GHz turbo. Intuitively it makes sense that the table matches that CPU. But, also looking at the SFORZA list, every CPU on the list supports turbo mode to at least 3.8GHz, sometimes 4.1GHz.
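
The two UserData words can be unpacked mechanically. This is a minimal sketch that assumes UserData1 is packed as 16-bit core count, 16-bit mode and 32-bit socket power, and UserData2 as 32-bit frequency plus a 32-bit trailing word. That layout is a guess that happens to reproduce the 160 W and 2500 MHz reading; it is not taken from the hostboot source, and the function and key names are hypothetical.

```python
def decode_wof_userdata(userdata1, userdata2):
    """Split the two 64-bit words into fields (ASSUMED layout, see above)."""
    return {
        "cores": (userdata1 >> 48) & 0xFFFF,
        "mode": (userdata1 >> 32) & 0xFFFF,
        "socket_power_w": userdata1 & 0xFFFFFFFF,
        "freq_mhz": (userdata2 >> 32) & 0xFFFFFFFF,
        "trailing_word": userdata2 & 0xFFFFFFFF,
    }


fields = decode_wof_userdata(0x00100002000000A0, 0x000009C400000012)
for name, value in fields.items():
    print(f"{name}: {value} (0x{value:x})")
```

Under that assumed layout the core field is 0x0010, which is 16 in decimal, and the trailing word is 0x12, which is 18 in decimal, so it is worth keeping hex and decimal apart when matching these values against table names.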

So why are so many of the tables suffixed with _NM.csv? There's a WOF_V7_4_2_SFORZA_16_140_2200_NM.csv, but I don't see any 140w 16core SFORZAs on the list. And plotting it the same as the other, it still goes to 3.8GHz despite the "2200MHz normal mode" file label.  ???
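
To compare the table names against the decoded error fields, a small parser helps. This is a minimal sketch: the meaning of each name component (version, family, cores, watts, MHz, mode suffix) is inferred from the two file names mentioned in this post, not from any documented naming scheme, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class WofName(NamedTuple):
    version: str
    family: str
    cores: int
    watts: int
    mhz: int
    mode: str  # e.g. "TM" or "NM"; interpretation is an assumption


_NAME_RE = re.compile(
    r"^WOF_V(?P<version>\d+(?:_\d+)*)_(?P<family>[A-Z]+)"
    r"_(?P<cores>\d+)_(?P<watts>\d+)_(?P<mhz>\d+)_(?P<mode>[A-Z]+)\.csv$"
)


def parse_wof_name(filename: str) -> Optional[WofName]:
    """Return the parsed components, or None if the name does not fit."""
    m = _NAME_RE.match(filename)
    if m is None:
        return None
    return WofName(m["version"], m["family"], int(m["cores"]),
                   int(m["watts"]), int(m["mhz"]), m["mode"])


for name in ("WOF_V7_4_2_SFORZA_16_160_2500_TM.csv",
             "WOF_V7_4_2_SFORZA_16_140_2200_NM.csv"):
    print(parse_wof_name(name))
```

Running this over a directory listing of the WOF-Tables repo would give a quick list of every (cores, watts, MHz, mode) combination that has a table, which can then be checked against the SFORZA part list by hand.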
86
Firmware / Re: What does it take to support a new CPU?
« Last post by bobpaul on February 26, 2025, 12:34:57 pm »
If the issue with the 02CY230 is strictly a WOF table issue, then try this: https://gitlab.com/bobpaul/talos-op-build/-/releases/wof-16-20250226

This has all of the upstream WOF tables for 16 core and below. When there were conflicts between upstream and what Raptor had, I kept Raptor's.

But my understanding (not sure if correct) is that SFORZA CPUs should work without WOF; they just won't turbo boost.

Not sure if you saw, but you can test the PNOR without flashing it.

Edit: I just noticed I'm also including a previously unreleased change that bumps the "X frequency to 2GHz". From the commit date I had assumed that was already in v2.10, but it's not. If this doesn't work I can exclude that change. I can also try building a v2.00 PNOR that has the extra WOF tables.
87
Firmware / Re: What does it take to support a new CPU?
« Last post by bobpaul on February 26, 2025, 07:33:27 am »



I've tried booting it on several firmware versions with the 02CY230 but it just doesn't IPL or give me any info from the PNOR serial interface or from the BMC.

When you say it doesn't IPL, does absolutely nothing come up on the console, or do you at least get some output from Hostboot?

There are 4 WOF CSV files meant for 16-core SFORZA over on https://github.com/open-power/WOF-Tables which aren't included in Raptor's PNOR, but if you're not getting any Hostboot output at all, I think it's more than just missing WOF tables. But I can build a PNOR that simply includes all of the 16-core WOF tables.

The 02AA883 boots with anything except the latest firmware, and on the second-latest firmware (v2.00) it complains about not having WOF tables in the PNOR serial's dmesg.

Try using the PNOR from v2.00 and upgrading the BMC to v2.10. My guess is the BMC firmware is unrelated to your issue.

I don't think we're quite there yet, but here's the diff between the v2.00 and v2.10 PNOR. There are changes to talos_config and talos_defconfig that we'd probably want to start with.

I might try to set up some sort of archive for a lot of this talos repo stuff just in case it all becomes abandonware. Currently it's not building because it fails to fetch repositories from bitbake, dead links I suppose.

The easiest way to do this is to clone the repo, fix any broken URLs, and get it building. After it builds, commit any changes you made (hopefully only URLs) and then save your bitbake download cache. If bitbake sees a file in the download cache that matches a download, it verifies the checksum, and if the checksum is good it doesn't try to fetch from the URL. Unfortunately, this means project developers often don't notice when a dependency URL has changed until a new dev joins the team and tries a fresh checkout.
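
The cache behaviour described above boils down to "only fetch if the cached file is missing or its checksum is wrong". This is a minimal Python sketch of that idea and not bitbake's actual code; the function names, the use of SHA-256 and the file names are all assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large tarballs don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_fetch(cache_dir, filename, expected_sha256):
    """True if the file must be downloaded: missing, or checksum mismatch."""
    cached = Path(cache_dir) / filename
    return not (cached.is_file() and sha256_of(cached) == expected_sha256)


# Demo with a throwaway directory standing in for the download cache.
with tempfile.TemporaryDirectory() as tmp:
    payload = b"example payload"
    (Path(tmp) / "dep-1.0.tar.gz").write_bytes(payload)
    good = hashlib.sha256(payload).hexdigest()
    cache_hit = not needs_fetch(tmp, "dep-1.0.tar.gz", good)
    cache_miss = needs_fetch(tmp, "missing-2.0.tar.gz", good)
    print("cached file reused:", cache_hit)
    print("missing file fetched:", cache_miss)
```

Because a good cached copy short-circuits the fetch, a dead URL is never exercised on a machine with a populated cache, which is why archiving the cache alongside the fixed recipes is worthwhile.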

I haven't upgraded the FPGA along with the PNOR/BMC flash, so that may be causing it; looking at the FPGA changes, there are some minor things related to Talos II Lite. I've got a flash programmer here, I just don't want to brick the FPGA.

Check your motherboard revision. If it's HW rev 1.00, then don't upgrade past FPGA v1.07. If it's HW rev 1.01, then definitely upgrade it, but don't downgrade it.
88
Firmware / Re: What does it take to support a new CPU?
« Last post by adaptl on February 25, 2025, 10:01:53 pm »
Have you tried booting the 02CY230 yet?

I've tried booting it on several firmware versions with the 02CY230 but it just doesn't IPL or give me any info from the PNOR serial interface or from the BMC.

The 02AA883 boots with anything except the latest firmware, and on the second-latest firmware (v2.00) it complains about not having WOF tables in the PNOR serial's dmesg. If I ever get around to building the firmware or making WOF tables I'll keep you updated. I might try to set up some sort of archive for a lot of this talos repo stuff just in case it all becomes abandonware. Currently it's not building because it fails to fetch repositories from bitbake, dead links I suppose.

I haven't upgraded the FPGA along with the PNOR/BMC flash, so that may be causing it; looking at the FPGA changes, there are some minor things related to Talos II Lite. I've got a flash programmer here, I just don't want to brick the FPGA.
89
Firmware / Re: What does it take to support a new CPU?
« Last post by bobpaul on February 25, 2025, 08:17:42 pm »
I'm currently going through this process, too, except I'm hoping to get the 02AA883 working slightly better. Mine boots fine on the latest firmware, but it doesn't load the WOF table.

Have you tried booting the 02CY230 yet?

90
Firmware / Re: Updating PNOR/BMC firmware on talos II Lite
« Last post by bobpaul on February 25, 2025, 07:29:33 pm »
I built one of the v1.xx BMC firmwares about a year ago. I might still have notes on that, but it involved switching a lot of the repo urls in bitbake packages. All the source is still online, just not at the same URLs.

You can find Raptor's Repos on


and at some point in time they moved some repos from one server to another.

Quote
Maybe this is because it's a special developer system with DD2.1-stepped CPU?

Do you know what part number your CPU is?

Edit: Oh, I see from your other thread and IRC that it's the same 02AA883 CPU I have. I already mentioned this on IRC, but I'm going to repeat it here so it's searchable by others: aside from "just" being DD2.1 stepping, this CPU uses paired cores. At least for adaptl and myself, all 4 of the active cores are even on the same quad (cores 0x14, 0x15, 0x16, and 0x17 are the active cores). All of the production (DD2.2+) 4-core CPUs were made for RaptorCS, and only from dies that had 4 usable cores which weren't paired (i.e., if cores 0x14 and 0x15 were both good, only one was used). The big ramification is L2 and L3 cache sharing, which would worsen the spectre/meltdown concerns.

Here's the core layout in the Sforza CPUs (pg37). (core 0x00 from linux's perspective has address 0x20 on the diagram)
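
To see which active cores share a pair or a quad, a few lines of Python are enough. This is a minimal sketch that assumes consecutive core IDs form pairs (2 cores) and quads (4 cores), which is inferred from the 0x14/0x15 example and the "same quad" remark above rather than read off the diagram. The function name is hypothetical.

```python
from collections import Counter

CORES_PER_PAIR = 2  # assumption: IDs 2n and 2n+1 share a pair
CORES_PER_QUAD = 4  # assumption: IDs 4n..4n+3 share a quad


def locate(core_id):
    """Return the (assumed) quad and pair index for a core ID."""
    return {"quad": core_id // CORES_PER_QUAD, "pair": core_id // CORES_PER_PAIR}


active = [0x14, 0x15, 0x16, 0x17]
quads = {locate(c)["quad"] for c in active}
pair_counts = Counter(locate(c)["pair"] for c in active)
shared_pairs = sorted(p for p, n in pair_counts.items() if n > 1)

print("quads in use:", sorted(quads))
print("pairs with both cores active:", shared_pairs)
```

Under those assumptions all four active cores land in one quad and fill two complete pairs, so each active core has an active partner in its pair. A production part built the way described above would show no pair with both cores active.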


I wasn't sure if all of the DD2.1 special developer systems were the same or if some had unpaired cores.