I just got a new Talos II TL2SV4 server a few days ago, with the 500GB Samsung internal NVMe storage option. Shortly after installing the OS, smartd started reporting "Device: /dev/nvme0, Critical Warning (0x02): Temperature" about once per day.
Looking at "smartctl -a /dev/nvme0", with the system totally idle, I see temperatures like this:
Temperature Sensor 1: 45 Celsius
Temperature Sensor 2: 52 Celsius
After running "dd if=/dev/nvme0n1 of=/dev/null bs=1M" for only a few seconds to generate load, I see temperatures like this:
Temperature Sensor 1: 46 Celsius
Temperature Sensor 2: 80 Celsius
These temperatures seem way too high. The BMC was also reporting a warning that "Temperature Pcie" was 49 C. I looked inside the case, and it looks like there's basically no way for any air to flow over the NVMe drive. It's all the way at the bottom against the edge of the case, with no ventilation around it.
Has anyone run into this before? What should I do about it?
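
In case it helps anyone compare numbers, here is a minimal sketch of how the sensor readings could be logged while the dd load runs. The helper name nvme_temps is made up for illustration, and it assumes smartctl prints the sensors as lines starting with "Temperature Sensor N:", as in the output quoted above.

```shell
#!/bin/sh
# nvme_temps (hypothetical helper): read `smartctl -a` output on stdin and
# print only the per-sensor temperature lines.
nvme_temps() {
    grep -E '^Temperature Sensor [0-9]+:'
}

# Usage sketch (needs root; /dev/nvme0 is the device from this post).
# Run in one terminal while dd runs in another:
#   while sleep 2; do date +%T; smartctl -a /dev/nvme0 | nvme_temps; done
```

Sampling every couple of seconds should be enough to see how quickly Sensor 2 climbs once the reads start and how long it takes to settle after dd is stopped.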