Messages - jas

1
General Discussion / More than two NVME's on HighPoint SSD7505?
« on: October 15, 2023, 04:55:09 am »
Hi. I have had great success with two NVMe drives in a HighPoint SSD7505 in a production system for months.  I'm building a second machine and wanted to populate all four NVMe slots, but I got kernel messages on boot.  I have tried both four and three drives, and swapped in a couple of different drives in case the problem was with a drive rather than with the HighPoint device.

See links below for photos of kernel output (the forum says the upload folder is full so I can't attach them here):

https://josefsson.org/tmp/talos-highpoint-nvme-kernel515.jpg
https://josefsson.org/tmp/talos-highpoint-nvme-kernel6.jpg

I'm using Trisquel aramo with Linux-libre kernel 5.15 and 6, both relatively recently upgraded.  At this point the boot seems to hang, even though the kernel continues to print things when I shut down the machine.  When I remove the last two NVMe drives and leave only two in the card, the boot works fine, and that configuration has been stable for some time already.

This is a Talos II Lite system, and the HighPoint SSD7505 is in the only PCIe slot that fits.

Any ideas?

2
General CPU Discussion / DD2.4?
« on: September 22, 2022, 04:44:39 am »
Hi.  Is a DD2.4 CPU stepping planned, and are there any public resources to track what would go into it and its timeline?

I noticed Raptor CS stopped selling DD2.2.  Does this indicate they are no longer being manufactured?

/Simon

3
Talos II / ipmi_sdr_cache_create: SDR record count invalid
« on: August 24, 2022, 07:26:08 am »
Hi.  I'm using ipmi-sensors from freeipmi (version 1.6.6-4+deb11u1 from Debian 11), but it complains about SDR problems:

Code:
root@vello:~# ipmi-sensors --flush-cache
Flushing cache: /root/.freeipmi/sdr-cache/sdr-cache-vello.localhost
root@vello:~# ipmi-sensors
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-vello.localhost
Caching SDR record 105 of 105 (current record ID 275)
ipmi_sdr_cache_create: SDR record count invalid
root@vello:~#

According to

https://www.mail-archive.com/freeipmi-users@gnu.org/msg01542.html

this suggests something is wrong in the hardware.  Any ideas?

For reference, the workaround works fine, and I get output like the following.

Code:
root@vello:~# ipmi-sensors -W assumemaxsdrrecordcount
Caching SDR repository information: /root/.freeipmi/sdr-cache/sdr-cache-vello.localhost
Caching SDR record 105 of 105 (current record ID 275)
ID  | Name           | Type         | Reading    | Units | Event
3   | occ            | Processor    | N/A        | N/A   | N/A
4   | occ            | Processor    | N/A        | N/A   | N/A
8   | occ0           | Power Unit   | N/A        | N/A   | 'Device Enabled'
9   | occ1           | Power Unit   | N/A        | N/A   | 'Device Disabled'
17  | p0_core0_temp  | Temperature  | N/A        | C     | N/A
20  | p0_core1_temp  | Temperature  | N/A        | C     | N/A
23  | p0_core2_temp  | Temperature  | 37.00      | C     | 'OK'
26  | p0_core3_temp  | Temperature  | 37.00      | C     | 'OK'
29  | p0_core4_temp  | Temperature  | 37.00      | C     | 'OK'
32  | p0_core5_temp  | Temperature  | 37.00      | C     | 'OK'
35  | p0_core6_temp  | Temperature  | 37.00      | C     | 'OK'
38  | p0_core7_temp  | Temperature  | 37.00      | C     | 'OK'
41  | p0_core8_temp  | Temperature  | 37.00      | C     | 'OK'
44  | p0_core9_temp  | Temperature  | 37.00      | C     | 'OK'
47  | p0_core10_temp | Temperature  | 37.00      | C     | 'OK'
50  | p0_core11_temp | Temperature  | 37.00      | C     | 'OK'
53  | p0_core12_temp | Temperature  | N/A        | C     | N/A
56  | p0_core13_temp | Temperature  | N/A        | C     | N/A
59  | p0_core14_temp | Temperature  | N/A        | C     | N/A
62  | p0_core15_temp | Temperature  | N/A        | C     | N/A
65  | p0_core16_temp | Temperature  | 37.00      | C     | 'OK'
68  | p0_core17_temp | Temperature  | 37.00      | C     | 'OK'
71  | p0_core18_temp | Temperature  | 37.00      | C     | 'OK'
74  | p0_core19_temp | Temperature  | 37.00      | C     | 'OK'
77  | p0_core20_temp | Temperature  | 37.00      | C     | 'OK'
80  | p0_core21_temp | Temperature  | 37.00      | C     | 'OK'
83  | p0_core22_temp | Temperature  | 37.00      | C     | 'OK'
86  | p0_core23_temp | Temperature  | 37.00      | C     | 'OK'
91  | p1_core0_temp  | Temperature  | N/A        | C     | N/A
94  | p1_core1_temp  | Temperature  | N/A        | C     | N/A
97  | p1_core2_temp  | Temperature  | N/A        | C     | N/A
100 | p1_core3_temp  | Temperature  | N/A        | C     | N/A
103 | p1_core4_temp  | Temperature  | N/A        | C     | N/A
106 | p1_core5_temp  | Temperature  | N/A        | C     | N/A
109 | p1_core6_temp  | Temperature  | N/A        | C     | N/A
112 | p1_core7_temp  | Temperature  | N/A        | C     | N/A
115 | p1_core8_temp  | Temperature  | N/A        | C     | N/A
118 | p1_core9_temp  | Temperature  | N/A        | C     | N/A
121 | p1_core10_temp | Temperature  | N/A        | C     | N/A
124 | p1_core11_temp | Temperature  | N/A        | C     | N/A
127 | p1_core12_temp | Temperature  | N/A        | C     | N/A
130 | p1_core13_temp | Temperature  | N/A        | C     | N/A
133 | p1_core14_temp | Temperature  | N/A        | C     | N/A
136 | p1_core15_temp | Temperature  | N/A        | C     | N/A
139 | p1_core16_temp | Temperature  | N/A        | C     | N/A
142 | p1_core17_temp | Temperature  | N/A        | C     | N/A
145 | p1_core18_temp | Temperature  | N/A        | C     | N/A
148 | p1_core19_temp | Temperature  | N/A        | C     | N/A
151 | p1_core20_temp | Temperature  | N/A        | C     | N/A
154 | p1_core21_temp | Temperature  | N/A        | C     | N/A
157 | p1_core22_temp | Temperature  | N/A        | C     | N/A
160 | p1_core23_temp | Temperature  | N/A        | C     | N/A
161 | p0_vdd_temp    | Temperature  | 41.00      | C     | 'OK'
162 | p1_vdd_temp    | Temperature  | N/A        | C     | N/A
165 | dimm0_temp     | Temperature  | N/A        | C     | N/A
167 | dimm1_temp     | Temperature  | N/A        | C     | N/A
169 | dimm2_temp     | Temperature  | 42.00      | C     | 'OK'
171 | dimm3_temp     | Temperature  | 41.00      | C     | 'OK'
173 | dimm4_temp     | Temperature  | N/A        | C     | N/A
175 | dimm5_temp     | Temperature  | N/A        | C     | N/A
177 | dimm6_temp     | Temperature  | 39.00      | C     | 'OK'
179 | dimm7_temp     | Temperature  | 39.00      | C     | 'OK'
181 | dimm8_temp     | Temperature  | N/A        | C     | N/A
183 | dimm9_temp     | Temperature  | N/A        | C     | N/A
185 | dimm10_temp    | Temperature  | N/A        | C     | N/A
187 | dimm11_temp    | Temperature  | N/A        | C     | N/A
189 | dimm12_temp    | Temperature  | N/A        | C     | N/A
191 | dimm13_temp    | Temperature  | N/A        | C     | N/A
193 | dimm14_temp    | Temperature  | N/A        | C     | N/A
195 | dimm15_temp    | Temperature  | N/A        | C     | N/A
221 | fan0           | Fan          | 19900.00   | RPM   | 'OK'
222 | fan1           | Fan          | 19900.00   | RPM   | 'OK'
223 | fan2           | Fan          | 0.00       | RPM   | 'OK'
226 | fan3           | Fan          | 0.00       | RPM   | 'OK'
227 | fan4           | Fan          | 1700.00    | RPM   | 'OK'
228 | fan5           | Fan          | 0.00       | RPM   | 'OK'
229 | fan6           | Fan          | N/A        | RPM   | N/A
231 | p0_power       | Power Supply | 30.00      | W     | 'OK'
232 | p0_vdd_power   | Power Supply | 2.00       | W     | 'OK'
233 | p0_vdn_power   | Power Supply | 9.00       | W     | 'OK'
234 | p1_power       | Power Supply | N/A        | W     | N/A
235 | p1_vdd_power   | Power Supply | N/A        | W     | N/A
236 | p1_vdn_power   | Power Supply | N/A        | W     | N/A
252 | cpu_1_ambient  | Temperature  | 26.10      | C     | 'OK'
253 | pcie           | Temperature  | 39.00      | C     | 'OK'
254 | ambient        | Temperature  | 27.20      | C     | 'OK'
root@vello:~#
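
Most rows in the table above are N/A (unpopulated cores and DIMM slots), so a filter makes the live readings easier to spot. This is only a minimal sketch: `sensors_with_readings` is a hypothetical helper name, not part of freeipmi, and it assumes the six-column, pipe-delimited layout shown above.

```shell
# Hypothetical helper (not part of freeipmi): keep the header row and every
# sensor row whose Reading column is something other than "N/A".
# Reads the pipe-delimited ipmi-sensors table on stdin.
sensors_with_readings() {
    awk -F'|' '
        NF < 6 { next }                                  # skip prompts and cache messages
        {
            reading = $4
            gsub(/^[ \t]+|[ \t]+$/, "", reading)         # trim column padding
        }
        reading != "N/A" { print }                       # header row passes too
    '
}

# Usage (assumes the workaround flag from the post is still needed):
#   ipmi-sensors -W assumemaxsdrrecordcount | sensors_with_readings
```

The filter only looks at the Reading column, so sensors that report 0.00 (such as idle fans) are still shown.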

/Simon

4
General Discussion / Re: HighPoint SSD7505 RAID tools for ppc?
« on: August 14, 2022, 01:02:52 am »
Slightly off topic:
What is your reasoning behind "mdraid is not optimal for SSD/NVMe devices"?

In my experience the exact opposite is the case: hardware RAIDs are inflexible, often require proprietary tools (as is the case with the HighPoint adapters), and can't be fixed without the device and its proprietary driver and tools. If there's an issue with an mdraid array, I can fix it on any other computer with tools I have ready there.

NVMe drives generate more heat (and draw more power) when writing compared to spinning disks, and mdraid can cause quite a lot of write activity (rebuilds) that could be optimized a bit by reading and comparing before doing a write.

Otherwise and in general I agree with you, and I'm now running mdraid on the NVMe SSDs on these devices so we'll see how it works.
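
For keeping an eye on such an array, here is a minimal sketch that lists md arrays with a missing member. `degraded_md_arrays` is a hypothetical helper name, and the sketch assumes the usual /proc/mdstat layout, where each array's status line ends in a field such as [UU] (all members up) or [U_] (one member missing).

```shell
# Hypothetical helper: print the names of md arrays that /proc/mdstat reports
# with a missing member, i.e. an underscore in the [UU]-style status field.
# Reads mdstat-formatted text on stdin.
degraded_md_arrays() {
    awk '
        /^md[0-9A-Za-z_]* *:/ { name = $1 }      # remember which array is being described
        match($0, /\[[U_]+\]/) {                 # status field such as [UU] or [U_]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    '
}

# Usage:
#   degraded_md_arrays < /proc/mdstat
```

An array that is still rebuilding also shows an underscore, so it is listed until the rebuild completes.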

/Simon

5
Talos II / Re: Indium pads required for 18-core CPU on Talos II Lite?
« on: August 14, 2022, 12:51:06 am »
Thanks for the responses!  Seems like there are some different opinions on this.  The second T2 was bought from Vikings, so I'll follow their suggestion to use Arctic MX-4 for that system.  My first T2 came from Raptor and came with the indium pad.

/Simon

6
Talos II / ECC memory error reporting tools?
« on: August 14, 2022, 12:47:27 am »
Hi.  On x86 systems, I often use 'edac-util' to read out ECC errors from the machine, and 'dmidecode' to read out information about the DIMM modules.  Is there something similar for Talos?  I prefer to work from user-space, not via the BMC.

/Simon

7
Talos II / Re: Secure Mode?
« on: July 29, 2022, 05:15:12 pm »
Thank you for the pointer!  Case closed (unless I run into problems following those steps...).

8
General Discussion / WD Black SN850 problems?
« on: July 29, 2022, 04:45:17 pm »
Hi.  I have tried two different WD Black SN850 NVMe SSDs, but I'm not able to get a Talos II Lite machine to recognize them.  I tried them via a HighPoint SSD7505 and via a cheap M.2->PCIe adapter.  Samsung 980 Pro and Kingston KC3000 drives work fine.  At first I suspected the SSD was broken, so I returned it and got another one, but this one behaves exactly the same: it is as if it isn't connected to the machine at all.

Has anyone gotten the SN850 to work on a Raptor machine?
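
One way to put a number on "not recognized" is to count the NVMe controllers that actually enumerate on the PCIe bus. This is a minimal sketch, with `count_nvme_controllers` as a hypothetical helper name; it relies on lspci labelling NVMe devices with the PCI class name "Non-Volatile memory controller".

```shell
# Hypothetical helper: count NVMe controllers in lspci output read on stdin.
# lspci prints the PCI class name "Non-Volatile memory controller" for them.
count_nvme_controllers() {
    grep -ci 'Non-Volatile memory controller'
}

# Usage:
#   lspci | count_nvme_controllers
```

If the count is lower than the number of installed drives, the missing drive likely never showed up on the bus at all, which would point at the link, adapter, or slot rather than at the kernel's NVMe driver.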

9
Talos II / Indium pads required for 18-core CPU on Talos II Lite?
« on: July 29, 2022, 03:01:53 pm »
Hi.  My recent Talos II Lite & 18-core v2 CPU came without an indium pad, and I'm not sure whether it is required.  The manual says to install it, and the wiki says the following at https://wiki.raptorcs.com/wiki/Talos_II/Building_FAQ#What_is_an_indium_pad.3F_Does_the_stock_HSF_include_it.3F:

Code:
Indium pads help heat transfer from the CPU to the HSF. 4-core and 8-core CPUs do not require them (and do not ship with them). More powerful CPUs should ship with them if required (TBD whether pre-applied to the HSF, or separately).

This doesn't really answer my question.  My previous T2Lite system came with an indium pad, so I'm curious whether it is no longer necessary, or whether I should be careful about using this system too much.  CPU temperature seems to be 35-50 on both systems.

/Simon

10
Talos II / Secure Mode?
« on: July 29, 2022, 02:53:49 pm »
Hi.  I have a Talos II Lite system.  Is there any documentation on the 'Secure Mode' jumper, and on how to set things up and enable the jumper?  It shipped disabled, and I reckon that is responsible for the following boot log output.  Has anyone tried setting the jumper?  I'd rather not touch things without more information.  Knowing exactly what it does would be a good starting point for deciding whether this is worth pursuing.

Code:
  9.16916|SECURE|Security Access Bit> 0x0000000000000000
  9.16917|SECURE|Secure Mode Disable (via Jumper)> 0x8000000000000000
...
[   50.223613319,3] STB: VERSION verification FAILED. log=0xffffffffffff8160
[   51.341625520,3] STB: IMA_CATALOG verification FAILED. log=0xffffffffffff8160
[   52.027211979,3] CAPP: Error loading ucode lid. index=203d1
...
[   64.478188034,3] STB: BOOTKERNEL verification FAILED. log=0xffffffffffff8160
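
To see at a glance which components fail verification, the STB lines can be pulled out of a saved boot log. This is a minimal sketch: `stb_failed_components` is a hypothetical helper name, and it assumes the "STB: NAME verification FAILED" wording shown in the excerpt above.

```shell
# Hypothetical helper: list the components whose STB verification failed,
# one per line, from a boot log read on stdin.
stb_failed_components() {
    sed -n 's/.*STB: \([A-Za-z0-9_]*\) verification FAILED.*/\1/p'
}

# Usage (boot.log is a placeholder for wherever the log was saved):
#   stb_failed_components < boot.log
```

Lines that do not match, such as the CAPP ucode error, are ignored.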

11
General Discussion / HighPoint SSD7505 RAID tools for ppc?
« on: July 29, 2022, 02:27:56 pm »
Hi.  I have a Talos II Lite with a HighPoint SSD7505 holding Samsung 980 PRO and Kingston KC3000 drives.  Everything works fine, and I'm using Linux md-raid, which isn't really optimal for SSD/NVMe devices.  Has anyone managed to configure the SSD7505 for RAID from a ppc machine?  Or do I have to move the SSD7505 to an x86 platform, run the proprietary HighPoint tools (which I haven't even tried yet) to configure it, and then move it back?  I'm reluctant to do that unless I can find a good guide for the device that is tailored to Debian/Ubuntu/Fedora users.
