Author Topic: Messing with WOF Tables  (Read 13388 times)

cy384

  • Newbie
  • *
  • Posts: 14
  • Karma: +6/-0
    • View Profile
    • http://cy384.com/
Messing with WOF Tables
« on: October 06, 2022, 08:44:59 pm »
tl;dr I bought an unsupported CPU, which was mostly ok, and I tweaked some firmware to make it work properly

This will be a bit of a narrative, documenting it in case anyone else is ever in the same situation:

I saw an astonishingly cheap used POWER9 CPU on ebay and knew it was finally time to buy a Raptor Blackbird.  Specifically, I now have a 02CY231, which is a 16 core, 160W part (not one of the chips that Raptor sells).  I figured since the Blackbird is rated for 160W it should be fine, and it does work out of the box, except it would only hit 90W, which I assume leaves a lot of performance on the table (for the record, I believe my BB shipped with 2.00 firmware).  I spotted a section in the boot log like this:

Code: [Select]
  4.94593|================================================
  4.96605|Error reported by fapi2 (0x3300) EID 0x90000566
  4.98696|  No WOF table match found
  4.98697|  ModuleId   0x10 fapi2::MOD_FAPI2_PLAT_PARSE_WOF_TABLES
  4.98698|  ReasonCode 0x332d fapi2::RC_WOF_TABLE_NOT_FOUND
  4.98699|  UserData1  Number of cores : 0x00100002000000a0
  4.98700|  UserData2  WOF Power Mode (1=Nominal, 2=Turbo) : 0x000009c400000012
  4.98700|------------------------------------------------
  4.98701|  Callout type             : Procedure Callout
  4.98702|  Procedure                : EPUB_PRC_HB_CODE
  4.98702|  Priority                 : SRCI_PRIORITY_HIGH
  4.98703|------------------------------------------------
  4.98704|  Callout type             : Hardware Callout
  4.98705|  Target                   : Physical:/Sys0/Node0/Proc0
  4.98706|  Deconfig State           : NO_DECONFIG
  4.98706|  GARD Error Type          : GARD_NULL
  4.98707|  Priority                 : SRCI_PRIORITY_MED
  4.98707|------------------------------------------------

Ok, seems suspicious, but what's a WOF table? Apparently, it's a CSV file, containing specifications of frequencies and voltages to manage the CPU, which gets compiled into the PNOR image (they're named something like "WOF_V7_4_2_SFORZA_16_160_2500_TM.csv").  What's PNOR?  Early stage bootloader flash.  Fortunately(?) this is all open source and can in theory be modified to support my CPU, so I've been messing with this every evening this week.  Gotta love a long day at work messing with build systems followed by a long evening of messing with build systems.

The instructions on the wiki to build the firmware are basically solid, just replace "talos" with "blackbird" in the obvious places.  One gotcha is that you definitely want to compile on an older distro, I ran Ubuntu 18.04 in a VM to do this.  The other gotcha I ran into was this one ( https://forums.raptorcs.com/index.php/topic,241.0.html ) but as far as I can tell, you don't need to modify OpenBMC if you're just tweaking the WOF tables in the PNOR.

Anyway, I got the firmware building.  I dug around in the files it downloads and found a Raptor repository called "blackbird-xml" which contains the WOF tables; sure enough, it didn't contain any for a 16 core 160W chip.  I searched around and did find a repository on github ( https://github.com/open-power/WOF-Tables ) with a bunch more, so I made a copy of "blackbird-xml" and added all the new WOF tables.  I changed the address of the repository in "machine-xml.mk" to point towards mine, and added the commit hash for my changes to the "blackbird_defconfig" file.  I built and got a new error, like this:

Code: [Select]
ERROR: PnorUtils::checkSpaceConstraints: Image provided (/home/cy384/blackbird-op-build/output/host/powerpc64le-buildroot-linux-gnu/sysroot/openpower_pnor_scratch//wofdata.bin.ecc) has size (6285312) which is greater than allocated space (3145728) for section=WOFDATA.  Aborting! at /home/cy384/blackbird-op-build/output/host/powerpc64le-buildroot-linux-gnu/sysroot/hostboot_build_images/PnorUtils.pm line 462.

I assume there's either a hard limit, or configured limit, on the size of the WOF table data in the PNOR, so I deleted all the WOF tables I didn't care about from my repository, updated the commit hash again, and it built successfully.
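
The pruning step can be scripted rather than done by hand. This is a minimal sketch, assuming the table file names follow the pattern of the two names quoted in this thread ("WOF_V7_4_2_SFORZA_16_160_2500_TM.csv" and "WOF_V7_4_2_SFORZA_16_140_2200_NM.csv"): core count, socket power in watts, a frequency in MHz, then a TM or NM suffix. The helper names `parse_wof_name` and `tables_to_keep` are made up for illustration and are not part of op-build.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed naming pattern, inferred only from the file names quoted in this thread:
#   WOF_V<version>_<socket>_<cores>_<watts>_<MHz>_<TM|NM>.csv
WOF_NAME = re.compile(
    r"^WOF_V(?P<version>[0-9_]+)_(?P<socket>[A-Z]+)_"
    r"(?P<cores>\d+)_(?P<watts>\d+)_(?P<mhz>\d+)_(?P<mode>TM|NM)\.csv$"
)


class WofName(NamedTuple):
    cores: int
    watts: int
    mhz: int
    mode: str


def parse_wof_name(filename: str) -> Optional[WofName]:
    """Split a WOF table file name into its fields, or return None if it does not match."""
    m = WOF_NAME.match(filename)
    if m is None:
        return None
    return WofName(int(m["cores"]), int(m["watts"]), int(m["mhz"]), m["mode"])


def tables_to_keep(filenames: List[str], cores: int, watts: int) -> List[str]:
    """Return only the table names for the given core count and socket power."""
    keep = []
    for name in filenames:
        info = parse_wof_name(name)
        if info is not None and info.cores == cores and info.watts == watts:
            keep.append(name)
    return keep


if __name__ == "__main__":
    names = [
        "WOF_V7_4_2_SFORZA_16_160_2500_TM.csv",
        "WOF_V7_4_2_SFORZA_16_140_2200_NM.csv",
    ]
    # prints ['WOF_V7_4_2_SFORZA_16_160_2500_TM.csv']
    print(tables_to_keep(names, cores=16, watts=160))
```

Filtering this way keeps only the tables for one CPU type; tables for any other CPU you might install later would need to stay in the repository too.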

I followed the instructions on the wiki page to test out the new PNOR, and my BB booted without the WOF table error!  I did some load testing and sensors does report power usage near 160W, so I'm calling this a success.  The voltage regulators do get really spicy very quick, but that's a subject for another post.

MPC7500

  • Hero Member
  • *****
  • Posts: 606
  • Karma: +42/-1
    • View Profile
    • Twitter
Re: Messing with WOF Tables
« Reply #1 on: October 07, 2022, 10:29:01 am »
Great! It's a pity that there is no more user-friendly way to edit the WOF tables. This also applies to the fan curves.

MauryG5

  • Hero Member
  • *****
  • Posts: 777
  • Karma: +22/-1
    • View Profile
Re: Messing with WOF Tables
« Reply #2 on: October 07, 2022, 01:10:40 pm »
Your problem does seem to stem from the fact that this Power9 model is not currently on the list of those directly supported by Raptor. In fact I did not know about this particular 16 core model; I remembered that there is a 12 core model and then the classic 18 and 22, but I did not know anything about this 16. In any case, you can try asking Raptor if you still have problems; they may be able to tell you what you need to correct to make it work fully.

ejfluhr

  • Newbie
  • *
  • Posts: 48
  • Karma: +6/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #3 on: October 07, 2022, 01:19:54 pm »
Can you read out the processor frequencies and tell if it is boosting properly?    Do you see ~160W when at idle?   What about when running max threads off some heavy workload (e.g. Mersenne primes is a good one)?
« Last Edit: October 07, 2022, 01:22:53 pm by ejfluhr »

ejfluhr

  • Newbie
  • *
  • Posts: 48
  • Karma: +6/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #4 on: October 07, 2022, 01:55:37 pm »
Too bad this isn't publicly available:
   https://research.ibm.com/publications/deterministic-frequency-boost-and-voltage-enhancements-on-the-power10tm-processor

Abstract
Digital droop sensors with core throttling mitigate microprocessor voltage droops and enable a voltage control loop (undervolting) to offset loadline uplift plus noise effects, protecting reliability VDDMAX.  These combine with a runtime algorithm for Workload Optimized Frequency (WOF) that deterministically maximizes core frequency.  The combined effect is demonstrated across a range of workloads including SPEC™, and provides up to a 15% frequency boost and a 10% reduction in core voltage.


cy384

  • Newbie
  • *
  • Posts: 14
  • Karma: +6/-0
    • View Profile
    • http://cy384.com/
Re: Messing with WOF Tables
« Reply #5 on: October 07, 2022, 02:53:44 pm »
Can you read out the processor frequencies and tell if it is boosting properly?    Do you see ~160W when at idle?   What about when running max threads off some heavy workload (e.g. Mersenne primes is a good one)?

I haven't looked closely at the frequencies, it idles around 30W and only approaches 160W under heavy load.

ejfluhr

  • Newbie
  • *
  • Posts: 48
  • Karma: +6/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #6 on: October 07, 2022, 05:58:39 pm »
I copied that table CSV from GIT and filtered it to generate plots of frequency versus "CORE_CEFF" for a couple of different "VRATIO" values.
This link declares:  https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.905&rep=rep1&type=pdf

     Power ~= VDD^2 x Fclk x Ceff
     where the effective switched capacitance, Ceff, is commonly expressed as the product of the physical capacitance CL, and the activity weighting factor α, each averaged over the N nodes.

I don't know how you can identify what "CORE_CEFF" is in your processor, but the equation shows how that correlates to power.  I.e. smaller Ceff equals lower power.
Then the plot looks meaningful since the lowest frequency is at the highest CORE_CEFF and the frequency climbs as CORE_CEFF gets smaller, up to some limit.
Since the largest value of CORE_CEFF in the table is 1.0, that would be the highest power condition presumably associated with the 160W power rating of the table/processor.
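
The scaling relation quoted above is easy to play with numerically. This is a rough sketch in arbitrary units, not a model of the actual WOF algorithm; the function names are made up for illustration.

```python
def dynamic_power(vdd: float, f_clk: float, c_eff: float) -> float:
    """Dynamic power estimate: P ~= VDD^2 * Fclk * Ceff (arbitrary units)."""
    return vdd ** 2 * f_clk * c_eff


def relative_power(vdd_ratio: float = 1.0, f_ratio: float = 1.0, ceff_ratio: float = 1.0) -> float:
    """Power relative to a baseline where every ratio equals 1.0."""
    return dynamic_power(vdd_ratio, f_ratio, ceff_ratio)


if __name__ == "__main__":
    # Halving Ceff at the same voltage and frequency halves the power.
    print(relative_power(ceff_ratio=0.5))
    # A 10% lower voltage alone cuts power to about 81%, since VDD enters squared.
    print(relative_power(vdd_ratio=0.9))
```

A workload with a smaller Ceff leaves power headroom at a fixed budget, which is consistent with the table allowing a higher frequency as CORE_CEFF drops.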

I could not figure out how to post an image of the graphs, nor will the forum let me post the XLSX file with base data plus graphing tab, since it is too big.   So I deleted a bunch of rows from the base data that had "NEST_CEFF" > 0.25.   This let me shrink the XLSX file enough to post it.   

The first tab is the CSV data as posted.  The second tab "Plotme" is a filter + graph that can be manipulated by the red-colored cells; one variable showing a big difference is the VRATIO which can be modified by adjusting the VRATIO_INDEX box in integer values from 0 to 23 (the table has entries for all of those).  The other 2 tabs are copies of the Plotme tab with just the values & graphs at VRATIO=1.0 and VRATIO=0.7498; this let me save and review them side-by-side.   You could probably get fancy and plot all the variations on a single graph but I didn't care to go that far.

I picked VRATIO=1.0 and VRATIO=0.7498 because the maximum frequency changes substantially between those two values, starting at 3.4GHz and climbing to 3.8GHz.  You can play with the VRATIO_INDEX in the Plotme tab and see how the frequency curve continues to increase at different CORE_CEFF values, though always capped to that 3.8GHz.

Raptor quotes the 190W 18-Core CPU as:  2.8GHz - 3.8GHz, so presumably you now have a 160W 16-Core CPU  of 2.5GHz - 3.8GHz.
https://raptorcs.com/content/CP9M36/intro.html
   CP9M36
   IBM POWER9 v2 CPU (18-Core)
       18 cores per package
           2.8GHz base / 3.8GHz turbo (WoF)
           190W TDP

User @deepblue was running an 18-Core CPU on a Blackbird mainboard, though with extra cooling:
   https://forums.raptorcs.com/index.php/topic,99.0.html
Hopefully you will find out if, long term, the Blackbird can handle a 16-core P9 when it matches the TDP of the supported 8-core version.

Nice work!   Please report back in a few months....

« Last Edit: October 07, 2022, 06:03:54 pm by ejfluhr »

ClassicHasClass

  • Sr. Member
  • ****
  • Posts: 491
  • Karma: +39/-0
  • Talospace Earth Orbit
    • View Profile
    • Floodgap
Re: Messing with WOF Tables
« Reply #7 on: October 07, 2022, 11:21:05 pm »
Just wanted to say, big fan of ssheven.

Nice work on getting it ramped up, though you've already found out that the vregs on the Blackbird could get marginal with high load. Basically you have the 8-core that Raptor sells but with the paired cores still on (the 8-core they sell, two of which are in this T2, is more or less the same 16-core you have there with the paired cores fused off). Like ejfluhr, I'll be interested to hear how it works out long term with the components; my Blackbird is a little 4-core.

ejfluhr

  • Newbie
  • *
  • Posts: 48
  • Karma: +6/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #8 on: October 10, 2022, 05:58:15 pm »
>Basically you have the 8-core that Raptor sells but with the paired cores still on

What a neat observation!   Something to consider is that 160W at 2.5GHz is probably achieved at much lower VDD than the 8c 160W at 3.45GHz (https://raptorcs.com/content/CP9M32/intro.html).

P=IxV  =>  160W = I_3.45 x V_3.45 = I_2.5 x V_2.5

If voltage moves 1:1 with frequency (I don't know if it does, could be 1:2 or 2:1, but got to start somewhere), then 3.45GHz/2.5GHz = 1.38x.   So V_3.45 = V_2.5 x 1.38, alternatively V_2.5 = V_3.45 / 1.38.

So if P is constant, and V_2.5 is that much below V_3.45, then I_2.5 has to be increased by 1.38.

This is probably why the Blackbird wiki states:  Other CPUs (CPUs with a TDP greater than 160W) may operate without WoF due to power regulator limitations.

Even without WOF (i.e. at the base of 2.5GHz), you are probably pushing those regulators much harder than the 8c module does.

I couldn't find any information on how much VDD current the Blackbird regulators can support.  In the post by user deepblue, graphs indicate the processor was exceeding VDD load of 130A at 0.89v under some load.    Doing stupid translation to 16c just to see where that lands is (130A / 18c) x 16c ~= 115, presumably lower as the 18c is 190W and the 16c is 160W.   If your temps are running high, then perhaps the regs are only built to support ~100A and you are at or over that spec...
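
The back-of-envelope numbers above can be reproduced in a few lines. This is only a sketch of the same arithmetic, under the same stated guesses (voltage moving 1:1 with frequency, current scaling linearly with core count); the function names are made up for illustration.

```python
def current_scale(f_high_ghz: float, f_low_ghz: float) -> float:
    """If V scales 1:1 with frequency and P = I x V is held constant,
    the lower-frequency part draws f_high / f_low times the current."""
    return f_high_ghz / f_low_ghz


def scale_current_by_cores(amps: float, cores_from: int, cores_to: int) -> float:
    """Naive per-core scaling of a measured current to a different core count."""
    return amps / cores_from * cores_to


if __name__ == "__main__":
    # 8-core 160W part at 3.45GHz versus 16-core 160W part at 2.5GHz
    print(round(current_scale(3.45, 2.5), 2))                 # 1.38
    # 130A measured on an 18-core, scaled to 16 cores
    print(round(scale_current_by_cores(130.0, 18, 16), 1))    # 115.6
```

Both results are upper-level estimates only; the real voltage/frequency relationship and the regulator limits are not known from this thread.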


lepidotos

  • Jr. Member
  • **
  • Posts: 55
  • Karma: +7/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #9 on: January 31, 2025, 07:46:21 pm »
I bought the same CPU and I'm looking at some of the other tables, I wonder if one of the ones for 140 W could be used instead, or the voltage dropped a little and the nest and base frequency dropped a lot. I guess I'll test undervolts when I get the chance, but if I do go too far and don't give the CPU enough power, is there a way to bump it back up to regular voltages?

cy384

  • Newbie
  • *
  • Posts: 14
  • Karma: +6/-0
    • View Profile
    • http://cy384.com/
Re: Messing with WOF Tables
« Reply #10 on: February 01, 2025, 10:59:32 am »
I wouldn't use any of the other ones, but modifying the matching one would be easy enough.  Ultimately, they're just spreadsheets with a bunch of values.

I'm probably dropping mine back down to the defaults anyway.  The extra heat and power usage doesn't seem worth it, especially since the blackbird is so limited on memory bandwidth.

lepidotos

  • Jr. Member
  • **
  • Posts: 55
  • Karma: +7/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #11 on: February 01, 2025, 06:58:59 pm »
Interesting, and that seems fair enough. Would you say there's a noticeable performance difference before and after, even if it's not enough to make it worthwhile? My case does have a bit more cooling capacity and internal volume so I have a little more headroom on heat, I'll still see how far undervolting gets me.

And yeah, the halved memory bandwidth does kind of hurt, but it seems like it's still on par with 7800X3D and such, so silver linings?
« Last Edit: February 01, 2025, 07:02:18 pm by lepidotos »

bobpaul

  • Newbie
  • *
  • Posts: 23
  • Karma: +3/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #12 on: February 26, 2025, 10:59:58 pm »
I copied that table CSV from GIT and filtered it to generate plots of frequency versus "CORE_CEFF" for a couple of different "VRATIO" values.
This link declares:  https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.905&rep=rep1&type=pdf

     Power ~= VDD^2 x Fclk x Ceff
     where the effective switched capacitance, Ceff, is commonly expressed as the product of the physical capacitance CL, and the activity weighting factor α, each averaged over the N nodes.

That link doesn't seem to work. Do you happen to still know how to navigate to it, or know the name of the paper? I tried looking on doi.org using the number in the link, but I don't think that's a complete DOI number.

I'm trying to follow along, but I'm not certain what most of the abbreviations mean, so the column titles are a bit nonsense. Fratio ... Frequency ratio? But ratio of what to what?

But I did reproduce the graph in python, which should make it a bit easier to iteratively view a bunch of plots:

Code: [Select]
import pandas as pd
import matplotlib.pyplot as plt
wof = pd.read_csv('WOF_V7_4_2_SFORZA_16_160_2500_TM.csv')
print("Unique Vratio indexes are: ", wof['VRATIO_INDEX'].unique())

# filter the table down to a plotable slice
def plotme(wf, vratio_index, nest_ceff=0.25, active_quads=6, fratio=1, plotit=True):
    vratio = wf['VRATIO_START'][0] + vratio_index * wf['VRATIO_STEP'][0]
    wf=wf[
          (wf['FRATIO'] == fratio) &
          (wf['ACTIVE_QUADS']==active_quads) &
          (wf['NEST_CEFF']==nest_ceff) &
          (wf['VRATIO_INDEX']==vratio_index) &
          (wf['VRATIO']==vratio)
          ]
    wf[['WOF_FREQ','CORE_CEFF']].plot(x='CORE_CEFF',
            title=f'Fratio={fratio}, Vratio={vratio}, ACTIVE_QUADS={active_quads}, NEST_CEFF={nest_ceff}')
    # plt.show() blocks the thread. If you skip it here, you can render several plots in the background and call plt.show() once later to display them all
    if plotit:
        plt.show()

plotme(wof, 12)



This works well pasted into an ipython prompt. I see most values for Vratio_index just plot a flat line; 12 and a couple of values near 12 make a nice graph. But why 12? And somewhat surprisingly, changing the number of active quads doesn't seem to have any effect. Also, what defines a quad as active? Simply that it's not power-gated and completely off, or does "active" mean high load?

And I think the number 1 question is "how does one know which CSV gets selected for a given CPU"?

Edit
Plotting all Vratio values as a 3d surface plot is interesting:

Code: [Select]
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D

def plotsurf(wf, file="", nest_ceff=0.25, active_quads=6, fratio=1, showplot=True, use_index=False):
    # the VRATIO_START column is always the same value, as far as I've noticed
    wf=wf[
          (wf['FRATIO'] == fratio) &
          (wf['ACTIVE_QUADS']==active_quads) &
          (wf['NEST_CEFF']==nest_ceff)
          ]
    vratio = wf['VRATIO_INDEX'] if use_index else wf['VRATIO']

    ax = plt.figure().add_subplot(projection='3d')
    ax.plot_trisurf(wf['CORE_CEFF'], vratio, wf['WOF_FREQ'], cmap=cm.jet, linewidth=0.2)
    ax.set_title(f'{file}\nFratio={fratio}, ACTIVE_QUADS={active_quads}, NEST_CEFF={nest_ceff}')
    ax.set_xlabel('Core_Ceff')
    ax.set_ylabel('Vratio')
    ax.set_zlabel('WOF_Hz')
    ax.view_init(15, 60, 0)
    if showplot:
        plt.show()
    return ax   

(EDIT: graphs should show vratio from 0-1. I fixed a bug in the above code which caused values greater than 1 to be shown, but I didn't take new screenshots...)



and I see from the file in hostboot where the error is generated that

Code: [Select]
  4.98699|  UserData1  Number of cores : 0x00100002000000a0
  4.98700|  UserData2  WOF Power Mode (1=Nominal, 2=Turbo) : 0x000009c400000012

means 16 cores (0x0010), "mode=2" (turbo?), 160 W (0xa0), 2500 MHz (0x9c4), with header().size=0x12. Looking at the SFORZA list, that's a 1800-2500MHz part with 3.8GHz turbo. Intuitively it makes sense that table matches that CPU. But, also looking at the SFORZA list, every CPU on the list supports turbo mode to at least 3.8GHz, sometimes 4.1GHz.
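
The two UserData words can be unpacked with a few shifts. This is a sketch, assuming the fields split as 16/16/32 bits in UserData1 and 32/32 bits in UserData2; that split is inferred from the values in the log, not taken from hostboot documentation, and `decode_wof_userdata` is a made-up name.

```python
def decode_wof_userdata(userdata1: int, userdata2: int) -> dict:
    """Unpack the two 64-bit UserData words from the 'No WOF table match found' error.

    The field boundaries are an assumption inferred from the logged values.
    """
    return {
        "cores": (userdata1 >> 48) & 0xFFFF,           # top 16 bits
        "mode": (userdata1 >> 32) & 0xFFFF,            # 1=Nominal, 2=Turbo per the log text
        "socket_power_w": userdata1 & 0xFFFFFFFF,      # low 32 bits
        "frequency_mhz": (userdata2 >> 32) & 0xFFFFFFFF,
        "last_field": userdata2 & 0xFFFFFFFF,          # read above as header().size
    }


if __name__ == "__main__":
    fields = decode_wof_userdata(0x00100002000000A0, 0x000009C400000012)
    # prints {'cores': 16, 'mode': 2, 'socket_power_w': 160, 'frequency_mhz': 2500, 'last_field': 18}
    print(fields)
```

Under that assumed layout the log describes a 16-core, 160 W, 2500 MHz part in mode 2, which lines up with the WOF_V7_4_2_SFORZA_16_160_2500_TM.csv file name.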

So why are so many of the tables suffixed with _NM.csv? There's a WOF_V7_4_2_SFORZA_16_140_2200_NM.csv, but I don't see any 140w 16core SFORZAs on the list. And plotting it the same as the other, it still goes to 3.8GHz despite the "2200MHz normal mode" file label.  ???
« Last Edit: March 17, 2025, 07:55:04 pm by bobpaul »

ejfluhr

  • Newbie
  • *
  • Posts: 48
  • Karma: +6/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #13 on: March 14, 2025, 12:05:51 am »
>That link doesn't seem to work, do you happen to still know how to navigate to it or know the name of the paper.

Sorry, I don't.  As I recall, it was an educational presentation/paper on how processor power is modeled.  It is pretty well known material, maybe you can find other references.

Nominal mode is just a lower power TDP.  Turbo mode is a higher power TDP.   WOF should be able to boost from both, the proc will just start and get to higher frequencies sooner with the TM table.

It's been a long time, but I believe VRATIO just means the number of ON cores (a.k.a. active) relative to the maximum available.  As the # of active cores goes down, the power from those cores is applied to the remaining cores, allowing them to boost to higher frequencies.   Depending on the system & power limit, the proc could pretty quickly apply so much power credit from offline cores that it flat-lines at the maximum possible frequency, i.e. it becomes technology limited not power limited.  It sounds like, for that table, the processor is only power-limited when >= 12 cores ON.    Technically VRATIO is "voltage ratio" intended to handle quads using the internal voltage regulator at some % below the input voltage, but it ended up not getting supported and devolved to tracking # of cores.   A core in a stopped state with the power headers off has voltage of 0v hence VRATIO=0 for that core, and you add up the # of ON vs OFF cores to get the VRATIO.

FRATIO, or "frequency ratio", was intended to handle some cores operating at lower frequencies than the processor-level frequency target, since each quad has its own clock generation.  I don't believe Linux ever implemented support for different quad frequencies so they all just run at the same frequency?   If true, WOF only uses FRATIO=1.0 indices.


>And I think the number 1 question is "how does one know which CSV gets selected for a given CPU"?

WOF has a bunch of cross-checks between the data table and the module VPD data....I'm not exactly sure which, but possibly core count, nominal/turbo/max frequencies, & the sort power target?    E.g. each table is designed to manage to a certain power target, and the processor code is trying to find a table appropriate for that processor.

Cool 3D plot!   For reference, CORE_CEFF is the ratio of the workload switching power relative to TDP, where TDP=1.0.   So 0.5 means the workload has half the switching power of TDP.     If the workload is using less power, the WOF table should attempt to raise frequency, up to the maximum allowed, where it flatlines.   Similarly, if there are fewer cores active, the frequency will go up, and if both are true, the frequency will go up more.   So that explains the shape of that plot.

« Last Edit: March 14, 2025, 12:09:41 am by ejfluhr »

bobpaul

  • Newbie
  • *
  • Posts: 23
  • Karma: +3/-0
    • View Profile
Re: Messing with WOF Tables
« Reply #14 on: March 17, 2025, 07:51:37 pm »
>That link doesn't seem to work, do you happen to still know how to navigate to it or know the name of the paper.

Sorry, I don't.  As I recall, it was an educational presentation/paper on how processor power is modeled.  It is pretty well known material, maybe you can find other references.

hmm, ok. I thought it might be more specific to these processors.

It's been a long time, but I believe VRATIO just means the number of ON cores (a.k.a. active) relative to the maximum available.  As the # of active cores goes down, the power from those cores is applied to the remaining cores, allowing them to boost to higher frequencies.   Depending on the system & power limit, the proc could pretty quickly apply so much power credit from offline cores that it flat-lines at the maximum possible frequency, i.e. it becomes technology limited not power limited.  It sounds like, for that table, the processor is only power-limited when >= 12 cores ON.    Technically VRATIO is "voltage ratio" intended to handle quads using the internal voltage regulator at some % below the input voltage, but it ended up not getting supported and devolved to tracking # of cores.   A core in a stopped state with the power headers off has voltage of 0v hence VRATIO=0 for that core, and you add up the # of ON vs OFF cores to get the VRATIO.

Ah-ha. So that's why VRATIO_INDEX is always 0-23 and VRATIO_STEP is always exactly 1/24 to 3 figures. VRATIO_START is always 0.0409 to account for rounding in VRATIO_STEP, ensuring that 23*VRATIO_STEP + VRATIO_START == 1. So really, VRATIO_INDEX is the easier number to follow since it's just the number of active cores. And for a 16 core CPU, one should be able to ignore any VRATIO_INDEX > 15, as 0-15 cover 1-16 active cores, right?
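
The index-to-ratio mapping can be checked with a couple of lines. This is a sketch, assuming VRATIO_START = 0.0409 and VRATIO_STEP = 0.0417 (1/24 rounded to three figures); both constants are my reading of the numbers in this thread rather than values copied from the CSV, and `vratio_for_index` is a made-up name.

```python
VRATIO_START = 0.0409   # assumed: the 0.0409 value mentioned in this post
VRATIO_STEP = 0.0417    # assumed: 1/24 rounded to three significant figures
VRATIO_MAX_INDEX = 23   # indices run 0..23 per the thread


def vratio_for_index(index: int) -> float:
    """Return VRATIO_START + index * VRATIO_STEP, rounded to 4 decimals."""
    if not 0 <= index <= VRATIO_MAX_INDEX:
        raise ValueError(f"index must be in 0..{VRATIO_MAX_INDEX}, got {index}")
    return round(VRATIO_START + index * VRATIO_STEP, 4)


if __name__ == "__main__":
    print(vratio_for_index(0))    # 0.0409
    print(vratio_for_index(17))   # 0.7498
    print(vratio_for_index(23))   # 1.0
```

With those constants the top index lands exactly on 1.0, and index 17 gives the 0.7498 value used for one of the spreadsheet tabs earlier in the thread.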

I don't believe Linux ever implemented support for different quad frequencies so they all just run at the same frequency?   If true, WOF only uses FRATIO=1.0 indices.

Certainly in Linux I see different frequencies on a per-core basis, but I understood that to be simply what the kernel was requesting rather than what it's actually running at. I had thought it was the case where if 2 active cores in a quad had requested frequencies of 2200 and 1800 then the quad (and thus both cores) would run at 2200MHz. Linux knows that 4 threads are the same core, so those always show the same frequency, but I understood Linux didn't actually know about the quads and rather I thought it was the PNOR that was really in control of the quad frequencies. But I don't know of any way of checking what frequency a core (or quad) is actually running at.

All of the tables have FRATIO_STEP of 0.1, FRATIO_START of 1, and FRATIO of 0.6, 0.7, 0.8, 0.9, or 1.0. But... yeah, holding vratio constant and looking at fratio, it does not look like the fratio value has an impact, at least for this csv:



For reference, CORE_CEFF is the ratio of the workload switching power relative to TDP, where TDP=1.0.   So 0.5 means the workload has half the switching power of TDP.     If the workload is using less power, the WOF table should attempt to raise frequency, up to the maximum allowed, where it flatlines.   Similarly, if there are fewer cores active, the frequency will go up, and if both are true, the frequency will go up more.   So that explains the shape of that plot.

OK, I don't understand the name, but that makes more sense than my assumption that it was a physical property related to the switching capacitance.