Topic: CPU hotplug / completely power down one CPU in a multi-CPU system

pocock


Linux provides a mechanism to take individual cores offline.  This can be useful to reduce peak power consumption or to simulate a smaller machine.  For example, a developer with a Talos II who wants to know how their application would perform on a Blackbird with a 4-core CPU can turn off all but 4 cores.

echo 0 | sudo tee /sys/devices/system/cpu/cpu1/online
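To leave only the first few CPUs online, the same write can be looped over the sysfs entries.  This is a minimal sketch, not a tested tool: the function name keep_first_cpus and the CPU_ROOT variable are made up for this example, and it has to run as root because it writes to the files directly.  Note that the cpuN entries are logical CPUs (hardware threads), so with SMT enabled one core shows up as several entries and the count has to be scaled accordingly.

```shell
#!/bin/sh
# Hypothetical helper: keep logical CPUs 0..KEEP-1 online, offline the rest.
# CPU_ROOT is an assumption of this sketch so the path can be overridden.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

keep_first_cpus() {
    keep="$1"
    for dir in "$CPU_ROOT"/cpu[0-9]*; do
        n="${dir##*cpu}"          # numeric suffix, e.g. 12 for .../cpu12
        ctl="$dir/online"
        # Skip CPUs without a writable 'online' file (often the boot CPU).
        [ -w "$ctl" ] || continue
        if [ "$n" -lt "$keep" ]; then
            echo 1 > "$ctl"       # make sure it is online
        else
            echo 0 > "$ctl"       # take it offline
        fi
    done
}
```

Calling keep_first_cpus 16 as root would then leave logical CPUs 0 to 15 online.  Writing 1 back to each online file reverses it.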

Spoiler: If you take all the cores of one CPU offline this way, you won't be able to access the RAM and PCI slots connected to that CPU and you might observe strange behaviour.
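To avoid emptying a socket by accident, it helps to check which logical CPUs belong to which NUMA node before taking anything offline.  A small sketch, assuming the kernel exposes the usual /sys/devices/system/node/nodeN/cpulist files; list_node_cpus and NODE_ROOT are names invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: print each NUMA node with its list of logical CPUs.
# NODE_ROOT is an assumption of this sketch so the path can be overridden.
NODE_ROOT="${NODE_ROOT:-/sys/devices/system/node}"

list_node_cpus() {
    for dir in "$NODE_ROOT"/node[0-9]*; do
        # Skip anything that does not look like a populated node directory.
        [ -r "$dir/cpulist" ] || continue
        printf '%s: %s\n' "${dir##*/}" "$(cat "$dir/cpulist")"
    done
}
```

Running list_node_cpus needs no root access.  Keeping at least one CPU from each listed node online should leave every socket with a running core.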

Is it possible to go one step further and completely power down a CPU socket and maybe the associated RAM banks too, almost as if they were removed from the board?

There is some documentation about Linux kernel hotplug, and it suggests that this is x86 only.  Maybe this would be good for another bounty, but first it is important to understand whether the Raptor and POWER9 hardware supports this and whether it would lead to energy savings or other benefits.

Problems that would be solved with this:

- reducing heat output from Talos II workstations during summer heatwaves

- extending runtime for a system on UPS batteries

Debian Developer
https://danielpocock.com

ejfluhr

Re: CPU hotplug / completely power down one CPU in a multi-CPU system
« Reply #1 on: July 29, 2020, 05:31:38 pm »
Do the cores remain powered on but idle when doing that, or does the OS+firmware actually power-gate the cores and caches?   Ideally the cores and caches are power gated for maximum savings, but I'm not sure if the open-source setups fully implement all the stop states.
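One way to see which idle (stop) states the running kernel exposes is the standard cpuidle sysfs directory.  The sketch below is only a starting point: list_idle_states and CPU_ROOT are names made up for this example, and the output covers the idle states of an online CPU, so it does not by itself show what state an offlined core is placed in.

```shell
#!/bin/sh
# Hypothetical helper: list cpuidle states for one logical CPU (default 0).
# CPU_ROOT is an assumption of this sketch so the path can be overridden.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

list_idle_states() {
    cpu="${1:-0}"
    for dir in "$CPU_ROOT/cpu$cpu"/cpuidle/state[0-9]*; do
        [ -r "$dir/name" ] || continue
        # Columns: state directory, state name, number of times entered.
        printf '%s %s %s\n' "${dir##*/}" "$(cat "$dir/name")" "$(cat "$dir/usage")"
    done
}
```

If no lines are printed, the cpuidle directory is missing or empty for that CPU.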

Note that the maximum boost frequency may also go up, offsetting the power reduction slightly until the technology-limited frequency is reached.  Although given the plots in this link, it may be that the procs naturally run at Fmax/Vmax most of the time as their active power tends to run toward idle:
   https://forums.raptorcs.com/index.php/topic,99.0.html

Some other good posts on power readings that seem relevant:
   https://forums.raptorcs.com/index.php/topic,199.0.html
   https://forums.raptorcs.com/index.php/topic,135.0.html
   
(I know a lot about the processor, not so much about Linux and I don't have a Raptor system to play with so can't test anything.)

Regards, Eric
« Last Edit: October 23, 2020, 03:54:24 pm by ejfluhr »