Blackbird Cooling

cy384:
I've been struggling a bit to figure out how best to cool my Blackbird. This is possibly made worse by my choice of case (the smallest mATX I could find, a SAMA IM01) and CPU (160W 16-core). I am using the 3U heatsink module with the provided fan. Main complaints:

* CPU fan is oriented vertically and blows downward, about an inch away from whatever's in the first PCIe slot, so I flipped the fan around to blow upward, which means hot air goes over the voltage regulators.
* No heatsink on the voltage regulators, while the Talos and Talos Lite both have them.  The BB doesn't have holes to mount a heatsink there, either.
* I'm not sure where various heat sensors are, physically, on the board.  Where's the ambient temperature sensor?  The PCIe sensor?  The CPU ambient sensor?
* RAM slots are as close as physically possible to the CPU (good for signal integrity, bad for airflow).
* Voltage regulator temps do not seem to be considered in setting fan speed.
* Changing the cooling parameters requires recompiling firmware.

I don't really care about any of these except that the voltage regulators hit 90C within a minute under heavy load.  I've ordered some tiny little heatsinks that can be stuck directly to the chips but I'm wondering if anyone else has a nice solution here.  They're really low on the board and in an awkward spot.

I do have a 3D printer and will design some ducts/shrouds if I can't get temps low enough otherwise.

Attaching some pics for the curious.

ClassicHasClass:
I do think the case is a big part of the problem. I certainly wouldn't run even an 8-core in an mATX case. Even the 4-core in my mATX system probably runs the fans more often than I'd like.

Vikings is working on a liquid cooling setup and this might be an option for you when they get it up for sale.

There are no ambient sensors on the board that I know of, and the manual doesn't mention any. The manual adds, for what it's worth, "The C1P9S01 factory setpoint for CPU core temperature is 60°C, and the system will attempt to maintain the cores at that temperature even under light or no load. As a result, a lightly loaded system may not benefit from air drawn over the CPU heatsink(s), and mainboard / peripheral cooling predominantly comes from chassis fans in this situation. For this reason, it is important to connect at least one chassis fan providing airflow over the mainboard surface in order to provide cooling for memory modules and other active components."
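
To make the setpoint behaviour in that manual quote a bit more concrete, here is a rough Python sketch of a proportional fan curve built around the 60°C figure. This is not how the Blackbird firmware actually computes fan speed; the `fan_duty` function, the gain and the minimum/maximum duty values are all made up for illustration. It only shows why cores sitting well under the setpoint leave the CPU fan at its floor, so chassis fans end up doing most of the board cooling.

```python
# Illustrative only: a simple proportional fan curve around a setpoint.
# SETPOINT_C comes from the manual quote; everything else is an assumption.
SETPOINT_C = 60.0


def fan_duty(core_temp_c, gain=0.05, min_duty=0.2, max_duty=1.0):
    """Return a fan duty cycle (0.0-1.0) for a given core temperature.

    Below the setpoint the fan stays at min_duty; above it, duty rises
    linearly with the error until it is clamped at max_duty.
    """
    error = core_temp_c - SETPOINT_C
    duty = min_duty + gain * max(error, 0.0)
    return min(max_duty, duty)


if __name__ == "__main__":
    for temp in (44.0, 60.0, 68.0, 90.0):
        print(f"{temp:5.1f} C -> duty {fan_duty(temp):.2f}")
```

With these made-up numbers, a core at 44°C keeps the fan at the 20% floor, which matches the manual's point that a lightly loaded system gets little airflow from the CPU fan.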

cy384:
Just to share some numbers, here's what "sensors" reports at idle:


--- Code: ---
nvme-pci-0100
Adapter: PCI adapter
Composite:    +39.9°C  (low  = -273.1°C, high = +76.8°C)
                       (crit = +79.8°C)
Sensor 1:     +39.9°C  (low  = -273.1°C, high = +65261.8°C)

ibmpowernv-isa-0000
Adapter: ISA adapter
Chip 0 Vdd Remote Sense: 683.00 mV (lowest =  +0.67 V, highest =  +1.01 V)
Chip 0 Vdn Remote Sense: 674.00 mV (lowest =  +0.67 V, highest =  +0.67 V)
Chip 0 Vdd:              685.00 mV (lowest =  +0.68 V, highest =  +1.02 V)
Chip 0 Vdn:              675.00 mV (lowest =  +0.68 V, highest =  +0.68 V)
Chip 0 Core 0:            +44.0°C  (lowest = +22.0°C, highest = +65.0°C)
Chip 0 Core 4:            +44.0°C  (lowest = +22.0°C, highest = +65.0°C)
Chip 0 Core 8:            +44.0°C  (lowest = +23.0°C, highest = +65.0°C)
Chip 0 Core 12:           +44.0°C  (lowest = +23.0°C, highest = +65.0°C)
Chip 0 Core 16:           +45.0°C  (lowest = +21.0°C, highest = +65.0°C)
Chip 0 Core 20:           +45.0°C  (lowest = +23.0°C, highest = +67.0°C)
Chip 0 Core 24:           +45.0°C  (lowest = +23.0°C, highest = +66.0°C)
Chip 0 Core 28:           +44.0°C  (lowest = +24.0°C, highest = +68.0°C)
Chip 0 Core 32:           +44.0°C  (lowest = +22.0°C, highest = +67.0°C)
Chip 0 Core 36:           +44.0°C  (lowest = +22.0°C, highest = +67.0°C)
Chip 0 Core 40:           +44.0°C  (lowest = +22.0°C, highest = +66.0°C)
Chip 0 Core 44:           +44.0°C  (lowest = +22.0°C, highest = +67.0°C)
Chip 0 Core 48:           +45.0°C  (lowest = +23.0°C, highest = +66.0°C)
Chip 0 Core 52:           +45.0°C  (lowest = +23.0°C, highest = +67.0°C)
Chip 0 Core 56:           +45.0°C  (lowest = +23.0°C, highest = +68.0°C)
Chip 0 Core 60:           +45.0°C  (lowest = +24.0°C, highest = +68.0°C)
Chip 0 DIMM 0 :           +49.0°C  (lowest = +30.0°C, highest = +51.0°C)
Chip 0 DIMM 1 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 2 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 3 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 4 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 5 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 6 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 7 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 8 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 9 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 10 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 11 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 12 :          +50.0°C  (lowest = +30.0°C, highest = +53.0°C)
Chip 0 DIMM 13 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 14 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 15 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 Nest:              +45.0°C  (lowest = +23.0°C, highest = +63.0°C)
Chip 0 VRM VDD:           +53.0°C  (lowest = +35.0°C, highest = +90.0°C)
Chip 0 :                  32.00 W  (lowest =  28.00 W, highest = 156.00 W)
Chip 0 Vdd:                4.00 W  (lowest =   2.00 W, highest = 127.00 W)
Chip 0 Vdn:                9.00 W  (lowest =   7.00 W, highest =  11.00 W)
Chip 0 :                 326.27 kJ
Chip 0 Vdd:               47.31 kJ
Chip 0 Vdn:               89.58 kJ
Chip 0 Vdd:                6.38 A  (lowest =  +4.00 A, highest = +129.75 A)
Chip 0 Vdn:               14.38 A  (lowest = +11.50 A, highest = +17.38 A)

--- End code ---

I've also attached an image of what the OpenBMC web interface reports.

Now that I look at it again, the "Temperature Pcie" might just be the NVMe drive.
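
For anyone who wants to pull the interesting readings out of a dump like the one above rather than eyeballing it, here is a minimal Python sketch. It assumes the `ibmpowernv` line layout shown in this post (`name: current (lowest = ..., highest = ...)`); `parse_temps` and `SAMPLE` are names invented here, and slots that read 0.0°C across the board are treated as unpopulated, which is a guess based on the output.

```python
import re

# Matches lines such as:
#   Chip 0 VRM VDD:   +53.0°C  (lowest = +35.0°C, highest = +90.0°C)
TEMP_LINE = re.compile(
    r"^(?P<name>.+?)\s*:\s+(?P<cur>[+-]?\d+(?:\.\d+)?)°C\s+"
    r"\(lowest\s*=\s*(?P<lo>[+-]?\d+(?:\.\d+)?)°C,\s*"
    r"highest\s*=\s*(?P<hi>[+-]?\d+(?:\.\d+)?)°C\)"
)


def parse_temps(text):
    """Map sensor name -> (current, lowest, highest) in degrees C."""
    readings = {}
    for line in text.splitlines():
        m = TEMP_LINE.match(line.strip())
        if not m:
            continue  # voltage, power, NVMe and header lines are skipped
        cur, lo, hi = (float(m.group(k)) for k in ("cur", "lo", "hi"))
        if cur == lo == hi == 0.0:
            continue  # assumed to be an empty DIMM slot
        readings[m.group("name")] = (cur, lo, hi)
    return readings


# A few lines copied from the idle output in this post.
SAMPLE = """\
Chip 0 Vdd:              685.00 mV (lowest =  +0.68 V, highest =  +1.02 V)
Chip 0 Core 0:            +44.0°C  (lowest = +22.0°C, highest = +65.0°C)
Chip 0 DIMM 0 :           +49.0°C  (lowest = +30.0°C, highest = +51.0°C)
Chip 0 DIMM 1 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 VRM VDD:           +53.0°C  (lowest = +35.0°C, highest = +90.0°C)
"""

if __name__ == "__main__":
    for name, (cur, lo, hi) in parse_temps(SAMPLE).items():
        print(f"{name}: now {cur}, peak {hi}")
```

On a live system the same function could be fed the text captured from the `sensors` command instead of `SAMPLE`.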

ClassicHasClass:
Yes, that's what it is. Here's this T2, for comparison (dual-8, two NVMe drives, BTO WX7100 GPU):


--- Code: ---
nvme-pci-330100
Adapter: PCI adapter
Composite:    +48.9°C  (low  = -273.1°C, high = +82.8°C)
                       (crit = +84.8°C)
Sensor 1:     +48.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +57.9°C  (low  = -273.1°C, high = +65261.8°C)

ibmpowernv-isa-0000
Adapter: ISA adapter
Chip 0 Vdd Remote Sense: 800.00 mV (lowest =  +0.66 V, highest =  +1.00 V)
Chip 0 Vdn Remote Sense: 701.00 mV (lowest =  +0.70 V, highest =  +0.70 V)
Chip 8 Vdd Remote Sense: 648.00 mV (lowest =  +0.64 V, highest =  +0.93 V)
Chip 8 Vdn Remote Sense: 661.00 mV (lowest =  +0.66 V, highest =  +0.66 V)
Chip 0 Vdd:              804.00 mV (lowest =  +0.67 V, highest =  +1.00 V)
Chip 0 Vdn:              702.00 mV (lowest =  +0.70 V, highest =  +0.70 V)
Chip 8 Vdd:              650.00 mV (lowest =  +0.65 V, highest =  +0.93 V)
Chip 8 Vdn:              662.00 mV (lowest =  +0.66 V, highest =  +0.66 V)
Chip 0 Core 0:            +53.0°C  (lowest =  +6.0°C, highest = +87.0°C)
Chip 0 Core 4:            +53.0°C  (lowest = +10.0°C, highest = +89.0°C)
Chip 0 Core 8:            +53.0°C  (lowest =  +6.0°C, highest = +88.0°C)
Chip 0 Core 12:           +53.0°C  (lowest = +33.0°C, highest = +89.0°C)
Chip 0 Core 16:           +53.0°C  (lowest = +31.0°C, highest = +87.0°C)
Chip 0 Core 20:           +53.0°C  (lowest = +31.0°C, highest = +87.0°C)
Chip 0 Core 24:           +54.0°C  (lowest = +32.0°C, highest = +87.0°C)
Chip 0 Core 28:           +54.0°C  (lowest = +32.0°C, highest = +87.0°C)
Chip 8 Core 32:           +44.0°C  (lowest = +30.0°C, highest = +74.0°C)
Chip 8 Core 36:           +44.0°C  (lowest = +29.0°C, highest = +75.0°C)
Chip 8 Core 40:           +43.0°C  (lowest = +31.0°C, highest = +80.0°C)
Chip 8 Core 44:           +43.0°C  (lowest = +30.0°C, highest = +74.0°C)
Chip 8 Core 48:           +43.0°C  (lowest = +29.0°C, highest = +75.0°C)
Chip 8 Core 52:           +44.0°C  (lowest = +29.0°C, highest = +75.0°C)
Chip 8 Core 56:           +44.0°C  (lowest = +30.0°C, highest = +75.0°C)
Chip 8 Core 60:           +44.0°C  (lowest = +29.0°C, highest = +72.0°C)
Chip 0 DIMM 0 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 1 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 2 :           +51.0°C  (lowest = +35.0°C, highest = +58.0°C)
Chip 0 DIMM 3 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 4 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 5 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 6 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 7 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 8 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 9 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 10 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 11 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 12 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 13 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 DIMM 14 :          +49.0°C  (lowest = +38.0°C, highest = +58.0°C)
Chip 0 DIMM 15 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 0 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 1 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 2 :           +41.0°C  (lowest = +35.0°C, highest = +44.0°C)
Chip 8 DIMM 3 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 4 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 5 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 6 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 7 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 8 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 9 :            +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 10 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 11 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 12 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 13 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 8 DIMM 14 :          +39.0°C  (lowest = +34.0°C, highest = +42.0°C)
Chip 8 DIMM 15 :           +0.0°C  (lowest =  +0.0°C, highest =  +0.0°C)
Chip 0 Nest:              +53.0°C  (lowest = +33.0°C, highest = +82.0°C)
Chip 8 Nest:              +44.0°C  (lowest = +31.0°C, highest = +68.0°C)
Chip 0 VRM VDD:           +53.0°C  (lowest = +40.0°C, highest = +71.0°C)
Chip 8 VRM VDD:           +39.0°C  (lowest = +35.0°C, highest = +56.0°C)
Chip 0 :                  44.00 W  (lowest =  28.00 W, highest = 140.00 W)
Chip 0 Vdd:               16.00 W  (lowest =   0.00 W, highest = 110.00 W)
Chip 0 Vdn:                9.00 W  (lowest =   7.00 W, highest =  12.00 W)
Chip 8 :                  31.00 W  (lowest =  28.00 W, highest = 127.00 W)
Chip 8 Vdd:                4.00 W  (lowest = 1000.00 mW, highest =  98.00 W)
Chip 8 Vdn:                8.00 W  (lowest =   7.00 W, highest =  11.00 W)
Chip 0 :                  28.18 MJ
Chip 0 Vdd:                4.20 MJ
Chip 0 Vdn:                7.71 MJ
Chip 8 :                  27.36 MJ
Chip 8 Vdd:                4.23 MJ
Chip 8 Vdn:                6.85 MJ
Chip 0 Vdd:               12.38 A  (lowest =  +0.38 A, highest = +113.38 A)
Chip 0 Vdn:               12.88 A  (lowest = +11.13 A, highest = +17.88 A)
Chip 8 Vdd:               12.63 A  (lowest =  +1.63 A, highest = +109.50 A)
Chip 8 Vdn:               13.13 A  (lowest = +11.63 A, highest = +17.00 A)

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:      750.00 mV
fan1:        1272 RPM  (min =  700 RPM, max = 4500 RPM)
edge:         +55.0°C  (crit = +99.0°C, hyst = -273.1°C)
PPT:           9.24 W  (cap =  95.00 W)

nvme-pci-310100
Adapter: PCI adapter
Composite:    +39.9°C  (low  = -273.1°C, high = +82.8°C)
                       (crit = +84.8°C)
Sensor 1:     +39.9°C  (low  = -273.1°C, high = +65261.8°C)
Sensor 2:     +47.9°C  (low  = -273.1°C, high = +65261.8°C)
--- End code ---

cy384:
Got these nice little copper heatsinks installed and did some quick temperature testing... not a huge difference, unfortunately!  Need to redirect more airflow over them.

Edit: also, the voltage regulators seem to be TDA21472, with a maximum recommended temperature of 125C and thermal shutdown at 140C.  I guess running them hot is only moderately concerning.
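
To put a number on "moderately concerning", here is the headroom arithmetic as a small Python sketch. The 125°C and 140°C limits are the figures quoted in this post, not values checked against the datasheet here, and the 90°C reading is the peak reported earlier in the thread; the `headroom` helper is just an illustrative name.

```python
# Limits as quoted in this post for the TDA21472 (not independently verified).
VRM_MAX_RECOMMENDED_C = 125.0
VRM_THERMAL_SHUTDOWN_C = 140.0


def headroom(observed_c):
    """Return the margin in degrees C to each quoted limit."""
    return {
        "to_max_recommended": VRM_MAX_RECOMMENDED_C - observed_c,
        "to_shutdown": VRM_THERMAL_SHUTDOWN_C - observed_c,
    }


if __name__ == "__main__":
    # 90.0 is the VRM peak seen under heavy load in this thread.
    print(headroom(90.0))
```

So the 90°C peak leaves 35°C before the quoted recommended maximum and 50°C before thermal shutdown, assuming the sensor reading reflects the regulator temperature itself.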
