Messages - ejfluhr

31
General Hardware Discussion / Re: 2u Blackbird Build with 18 cores?!
« on: December 11, 2021, 12:25:55 am »
Re: the xz compression behavior, can you tell if you are memory bandwidth starved?  Lots of cores vs. Blackbird's meager DIMM capacity isn't a very balanced combination. 

BTW, I looked at your awesome graphs up above more carefully and you can see how current & voltage are moving around --> that looks like the processor is indeed dynamically boosting/dropping frequency to stay within its power limit.



32
General CPU Discussion / Re: The point about Power 10 currently
« on: December 10, 2021, 09:57:41 am »
https://www.itjungle.com/2021/11/08/inside-the-ibm-denali-power-e1080-system/

Pictures/info here show fancy "on the substrate" connectors for external cables, plus requirement for memory-buffer-based DIMMs.
It does seem that the expense of building around these features is pretty high, not something for an end-user-affordable system.


33
Talos II / Re: Trying to overclock, what is raptor-aggressive
« on: November 29, 2021, 08:31:18 am »
P. 35 of https://www.ibm.com/downloads/cas/6GZMODN3 indicates this (below).   Maybe "folding" means putting cores to sleep?

  Page 35
Processor Folding in Linux
It is essential to install a daemon package based on the host OS to enable utilization-based processor
folding for Static Power Saver and Idle Power Saver modes:
pseries-energy-1.4.0-1.el7.ppc64.rpm
pseries-energy-1.4.0-1.el6.ppc64.rpm
pseries-energy-1.4.0-1.sles11.ppc64.rpm
Version 5.4 has the necessary user space tools required to enable CPU Folding.
 
Once this package is installed, the energyd daemon will monitor the system power mode and activate
processor folding when system power mode is set to "Static Power Saver" and deactivate processor
folding in all other modes.  The utilization-based CPU folding daemon will deactivate unused cores and
transition them to low power idle states until the CPU utilization increases and those cores are activated
to run a workload.
 
Utilization-based processor folding can be manually disabled using the following commands:
 
 /etc/init.d/energyd stop #Stop daemon now, activate all cores
 chkconfig energyd off #Do not restart daemon on startup
 
 -or-
 
 rpm -e pseries-energy #un-install the package completely
 
Alternatively, CPU cores can be folded or set to low power idle state in any power mode
manually using the following command line:
 
 echo 0 > /sys/devices/system/cpu/cpuN/online #Where N is the logical CPU number
 
Please note that all active hardware threads of a core needs to be taken off-line using the above
command in order to move the core to a low power idle state.
 
The cores can be activated again with the following command:
 
 echo 1 > /sys/devices/system/cpu/cpuN/online #Where N is the logical CPU number
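
The note above about taking every hardware thread of a core offline can be sketched as a small helper. This is only a sketch under assumptions: it assumes logical CPUs are numbered contiguously per core (core C with T threads per core owns cpu(C*T) through cpu(C*T+T-1)), which should be checked with `lscpu -e` first. The names `core_threads`, `set_core_state`, and `SYSFS_CPU_ROOT` are made up for illustration and are not from the IBM document.

```bash
#!/usr/bin/env bash
# Sketch: fold (offline) or unfold (online) every hardware thread of one core.
# ASSUMPTION: logical CPUs are numbered contiguously per core. Verify with `lscpu -e`.

# Hypothetical override so the helpers can be tried on a scratch directory
# instead of the live sysfs tree.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# Print the logical CPU numbers that belong to a core, one per line.
# $1 = core index, $2 = hardware threads per core (4 for SMT4)
core_threads() {
    local core=$1 threads_per_core=$2
    local first=$(( core * threads_per_core ))
    seq "$first" $(( first + threads_per_core - 1 ))
}

# Write 0 (offline) or 1 (online) to every thread of a core.
# $1 = core index, $2 = threads per core, $3 = 0 or 1
set_core_state() {
    local core=$1 threads_per_core=$2 state=$3 cpu
    for cpu in $(core_threads "$core" "$threads_per_core"); do
        echo "$state" > "$SYSFS_CPU_ROOT/cpu$cpu/online"
    done
}
```

Run as root, `set_core_state 3 4 0` would offline cpu12 through cpu15 on an SMT4 system, and `set_core_state 3 4 1` would bring them back. Whether the core then actually drops into a low power idle state depends on the firmware and kernel, as the quoted text says.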

34
Talos II / Re: Trying to overclock, what is raptor-aggressive
« on: November 29, 2021, 08:08:47 am »
Is the lack of "sleep" mode something to do with Linux support?    The processor has lots of core power-down modes as indicated by this doc, p. 7:
   https://www.ibm.com/downloads/cas/6GZMODN3

Seems like it should be possible to use Linux tools to manage frequency.  P. 15 says: 
>>#Use cpupower tool to query and set frequency
>>Available frequency steps from cpupower will list only the nominal range, but user can select full frequency range to set and it will take effect.

>Does the CPU have any of these settings in it?
I believe the CPU contains processor voltage/frequency limits as defined in the VPD poundv, which is why Raptor's code overrides that.
Once that is done, possibly the WOF boost table also has to be overwritten to match the VPD values --> I saw that in a comment in one of the scripts.

On top of that, Linux should be able to provide "direction" as to what cores are active and what frequency range to target.  I played with that a bit on x86 a few years ago, so hopefully it also works on POWER9?  I don't have a POWER9 system to play with, but here is what I get on my x86 laptop, which shows a min - max of 800MHz - 3600MHz:
>ls /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo*
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency

>cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo*
3600000
800000
0
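
The cpufreq values above are in kHz, so a minimal sketch for reading the range in MHz, and for capping the maximum through the standard `scaling_max_freq` file, could look like the following. It is untested on POWER9. The function names and the `CPUFREQ_ROOT` override are hypothetical, and whether a cap outside the advertised nominal range is accepted depends on the cpufreq driver.

```bash
#!/usr/bin/env bash
# Sketch: read and cap per-CPU frequency limits through cpufreq sysfs.
# All values in these sysfs files are in kHz.

# Hypothetical override so the helpers can be pointed at a scratch directory.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu}"

# Convert kHz to whole MHz.
khz_to_mhz() { echo $(( $1 / 1000 )); }

# Print "min - max MHz" for one logical CPU. $1 = CPU number
freq_range_mhz() {
    local dir="$CPUFREQ_ROOT/cpu$1/cpufreq" min max
    min=$(cat "$dir/cpuinfo_min_freq")
    max=$(cat "$dir/cpuinfo_max_freq")
    echo "$(khz_to_mhz "$min") - $(khz_to_mhz "$max") MHz"
}

# Ask the governor not to exceed a given frequency (needs root on real sysfs).
# $1 = CPU number, $2 = cap in kHz
cap_max_freq_khz() {
    echo "$2" > "$CPUFREQ_ROOT/cpu$1/cpufreq/scaling_max_freq"
}
```

On the laptop above, `freq_range_mhz 0` should print `800 - 3600 MHz`.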

35
Talos II / Re: Trying to overclock, what is raptor-aggressive
« on: November 22, 2021, 11:40:35 am »
Well, then sorry, I cannot help.  I would guess this is a pretty technical rabbit hole that likely few people have gone thru; maybe even only Raptor.   I was trying to contribute to the cause, exploring how to make it work by crawling thru all the code from Raptor & IBM in the repos.   It seems that Raptor figured it out, so the path to success is in there, somewhere.

36
Talos II / Re: Trying to overclock, what is raptor-aggressive
« on: November 20, 2021, 04:52:22 am »
So it looks like the WOF boost tables are posted here?
https://github.com/open-power/WOF-Tables

Lots of interesting info in this.  Seems like the 4-core target frequency is 3200 and the ultraturbo frequency is 3800.  Last column looks to be the "WOF frequency" according to the header.
Here is the header paired with the values from a single row (of thousands), which seems to represent something about how it decides the target frequency:
MOPT = OpenPOWER Raptor
YIELD = 95
PACKAGE = Sforza
VERSION = v7.3.3
SOCKET_POWER = 90
RDP_CAPACITY = 108
CORE_COUNT = 4
PDV_SORT_POWER_TARGET_FREQ = 3200
PDV_SORT_POWER_ULTRA_TURBO_FREQ = 3800
NEST_FREQ = 1867
VRATIO_START = 0.0409
VRATIO_STEP = 0.0417
FRATIO_START = 1
FRATIO_STEP = 0.1
CORE_CEFF = 0
CORE_CEFF_INDEX = 0
NEST_CEFF = 0.25
NEST_CEFF_INDEX = 0
ACTIVE_QUADS = 6
VRATIO = 0.0409
VRATIO_INDEX = 0
FRATIO = 0.6
FRATIO_INDEX = 4
WOF_FREQ = 3800
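
To make rows like this easier to read, a small helper can pair each header name with its value. This is a sketch that assumes the table files are plain comma-separated text with the header on the first line and no quoted commas; `wof_row` is a made-up name.

```bash
#!/usr/bin/env bash
# Sketch: print one data row of a CSV table as NAME=VALUE lines.
# ASSUMPTION: comma-separated, header on line 1, no quoted fields containing commas.
# $1 = CSV file, $2 = data row number (1 = first row after the header)
wof_row() {
    awk -F',' -v want="$2" '
        NR == 1 { n = split($0, names, ","); next }   # remember the column names
        NR == want + 1 {                              # the requested data row
            for (i = 1; i <= n; i++) printf "%s=%s\n", names[i], $i
            exit
        }
    ' "$1"
}
```

For example, `wof_row WOF_V7_3_3_SFORZA_4_90_3200_TM.csv 1 | grep FREQ` would show only the frequency columns of the first data row.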



Still cannot find how they get consumed by the boot process, though.

37
Talos II / Re: Trying to overclock, what is raptor-aggressive
« on: November 20, 2021, 04:26:22 am »
Ok, now I see that raptor-aggressive mods the CSV version of the WOF boost table to crank up the max frequency to 4.2G or 4.4G.
>#!/bin/bash
>
>cd ../wofdata
>php ../raptor-util/woferclock.php 5 3536 WOF_V7_3_3_SFORZA_4_90_3200_TM.csv.original WOF_V7_3_3_SFORZA_4_90_3200_TM.csv 1.187 4200
>php ../raptor-util/woferclock.php 5 4052 WOF_V7_3_3_SFORZA_8_160_3500_TM.csv.original WOF_V7_3_3_SFORZA_8_160_3500_TM.csv 1.05 4400
>php ../raptor-util/woferclock.php 5 3094 WOF_V7_3_3_SFORZA_18_190_2800_TM.csv.original WOF_V7_3_3_SFORZA_18_190_2800_TM.csv 1.10 4200
>php ../raptor-util/woferclock.php 5 3039 WOF_V7_3_3_SFORZA_22_190_2750_TM.csv.original WOF_V7_3_3_SFORZA_22_190_2750_TM.csv 1.10 4200

I can't find those boost tables in the Raptor git tree; maybe it is somewhere else?   Perhaps on your system already and you can just replace the old with the hacked version?

Maybe some PNOR updater here?  https://git.raptorcs.com/git/pnor/tree/update_image.pl
>if(-e $wof_binary_filename)
>    {
>        $sections{WOFDATA}{in} = "$wof_binary_filename";
>    }




38
Talos II / Re: Trying to overclock, what is raptor-aggressive
« on: November 20, 2021, 04:04:13 am »
Aha, looking further it seems that they are programming poundv for a voltage uplift of 1.142x over the existing ultraturbo value.
https://git.raptorcs.com/git/vpdtools/plain/woferclock/woferclock_cpu
# Reasonable defaults
# Partly validated on initial silicon
# NOT GUARANTEED, starting point ONLY!
if [[ "$CORE_COUNT" == "4" ]]; then
   NEW_ULTRATURBO_MHZ=4200
   VOLTAGE_MULTIPLIER=1.142
fi
if [[ "$CORE_COUNT" == "8" ]]; then
   NEW_ULTRATURBO_MHZ=4400
   VOLTAGE_MULTIPLIER=1.142

If you can read out your original poundv values, then you know how much "headroom" your parts have to the maximum allowed:
https://git.raptorcs.com/git/vpdtools/plain/woferclock/update_poundv_buckets
># Rated limit plus safety margin
>#max_voltage = 1098
>
># Absolute maximum process limit
># Hardware damage WILL occur above this value!
>max_voltage = 1150
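
The headroom check is simple arithmetic: multiply the original ultraturbo voltage by the multiplier and compare against the limit. A minimal sketch follows, assuming the poundv voltages are in millivolts like `max_voltage` above. The function names are hypothetical, and the 950 mV figure in the usage note is an invented input, not a value read from any part.

```bash
#!/usr/bin/env bash
# Sketch: check a voltage uplift against a limit.
# ASSUMPTION: all voltages are in millivolts, matching max_voltage = 1150.

# Print the uplifted voltage, rounded to the nearest mV.
# $1 = original voltage in mV, $2 = multiplier (e.g. 1.142)
uplifted_mv() {
    awk -v v="$1" -v m="$2" 'BEGIN { printf "%.0f\n", v * m }'
}

# Succeed (exit status 0) only if the uplifted voltage is at or below the limit.
# $1 = original voltage in mV, $2 = multiplier, $3 = limit in mV
within_limit() {
    local new
    new=$(uplifted_mv "$1" "$2")
    [ "$new" -le "$3" ]
}
```

With an invented original value of 950 mV, `uplifted_mv 950 1.142` gives 1085, which is below both the 1098 rated limit and the 1150 absolute limit. Substitute the values read from your own VPD.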

BTW, do you want or need WOF to manage the frequency/thermals, or are you just playing around and can manage it yourself?   With just the VPD update, perhaps you can disable WOF and manually override frequency via OS controls up to the newly-programmed ultraturbo maximum.   I don't know specifically how to do that; I just believe it is possible based on what little I know of Linux frequency control.


39
Talos II / Re: Trying to overclock, what is raptor-aggressive
« on: November 20, 2021, 03:32:43 am »
Are you updating the VPD voltage/frequency table, or the WOF boost table, or both?    I'm not sure how it works on Linux, but the general architecture is that the VPD sets the maximum tested voltage/frequency and the WOF boost table defines how to set frequency based on (1) active workload power vs. TDP and (2) the number of active cores.   The official boost table is designed to raise frequency no further than the module's AC power spec allows, and also to prevent exceeding the RDP spec for the system VDD VRM (well, a transient spec on top of the RDP "DC" spec).

40
General Hardware Discussion / Re: 2u Blackbird Build with 18 cores?!
« on: November 12, 2021, 11:42:44 pm »
Nice update...tx!  Do you know if the xz compression uses the in-processor compression accelerator, or does it just run on the cores?

Do you know if WOF (dynamic frequency boosting) is active?   Since you are so power limited, perhaps it is lowering the core frequency when you add more threads than 1 per core?


42
General CPU Discussion / Re: IBM POWER9+ ?
« on: September 24, 2020, 02:06:37 am »
Even "simple" die shrinks are very, very costly, and POWER sales volumes cannot justify such an expense.  IBM succeeds due to its system sales, and OpenPOWER support is an offshoot of whatever the system roadmap drives for processor development.


43
Blackbird / Re: using bigger CPUs with some cores disabled on Blackbird?
« on: September 09, 2020, 05:50:08 pm »
Can you tell what Stop state the cores are put into before and after your command?    Technically the cores can be power-gated hence you can make larger-core-count modules behave similarly to the small-core-count variants.   The frequencies won't be identical, as the boost tables may adjust frequencies commensurate with extra regulator headroom but it won't line up identically to lower-core-count-tuned modules.  Higher frequency could affect your power readings, but taking cores offline should reduce power far faster than raising voltage/frequency, especially with 16->2 cores.

I'm not sure how well Linux supports all of the power management features of POWER9, though.   Are all cores active when at idle, or has it already power-gated many such that formally disabling them doesn't actually change their state?
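
One way to look into the idle-state question from the Linux side is the kernel's cpuidle sysfs tree, which lists the idle states registered for each CPU and how often each has been entered. A minimal sketch follows, with a made-up function name and a hypothetical `CPU_ROOT` override. It only reports what the kernel exposes, so it cannot by itself prove that a core or its caches are power-gated.

```bash
#!/usr/bin/env bash
# Sketch: list the idle states the kernel exposes for one logical CPU,
# with the number of times each state has been entered.

# Hypothetical override so the helper can be pointed at a scratch directory.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Print "name usage" for every cpuidle state of a CPU. $1 = CPU number
idle_states() {
    local state
    for state in "$CPU_ROOT/cpu$1"/cpuidle/state*; do
        [ -r "$state/name" ] || continue   # skip if the glob matched nothing
        echo "$(cat "$state/name") $(cat "$state/usage")"
    done
}
```

Running `idle_states 0` before and after the command, and comparing how the usage counts grow, should give a rough idea of which states the cores actually spend time in. `cat /sys/devices/system/cpu/online` shows which logical CPUs are currently online.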

May be worth posting your power/current readings if you have them.  VDD current is what matters, if you can separate that out.

Regards, Eric

44
Do the cores remain powered on but idle when doing that, or does the OS+firmware actually power-gate the cores and caches?   Ideally the cores and caches are power gated for maximum savings, but I'm not sure if the open-source setups fully implement all the stop states.

Note that the maximum boost frequency may also go up, offsetting the power reduction slightly until the technology-limited frequency is reached.  Although given the plots in this link, it may be that the procs naturally run at Fmax/Vmax most of the time as their active power tends to run toward idle:
   https://forums.raptorcs.com/index.php/topic,99.0.html

Some other good posts on power readings that seem relevant:
   https://forums.raptorcs.com/index.php/topic,199.0.html
   https://forums.raptorcs.com/index.php/topic,135.0.html
   
(I know a lot about the processor, not so much about Linux and I don't have a Raptor system to play with so can't test anything.)

Regards, Eric
