The SATA card works fine; I tried it in my x86 system. It even has a POST screen.
Turns out, the power supply connection wasn't right. The power supply is modular, and the cable was connected to the "Peripheral and SATA" section, but moving it to another spot and rebooting made it work. The error messages were rather misleading; they had me thinking it was an issue with the card. I would have thought a power issue with the hard drive would have resulted in lspci showing the controller, fdisk -l not showing the drives, and no crazy "freak out" messages in dmesg.
Edit: after a reboot, it's back to not working with the device failure messages in dmesg.
Edit2: works again. The SATA cable needed to be replaced; the old one had a broken piece of plastic, so the connection might not have been snug. But there are still errors in dmesg, even though I was able to mount the drive and see it in df -h. So it seems to work, but isn't stable?
[ 79.075136] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
[ 171.863222] ata4.00: exception Emask 0x52 SAct 0x800000 SErr 0xffffffff action 0xe frozen
[ 171.863243] EEH: Recovering PHB#31-PE#fd
[ 171.863248] ata4: SError: { RecovData RecovComm UnrecovData Persist Proto HostInt PHYRdyChg PHYInt CommWake 10B8B Dispar BadCRC Handshk LinkSeq TrStaTrns UnrecFIS DevExch }
[ 171.863274] EEH: PE location: UOPWR.A100059-Node0-CPU2 Slot3 (8x), PHB location: N/A
[ 171.863315] ata4.00: failed command: WRITE FPDMA QUEUED
[ 171.863347] EEH: Frozen PHB#31-PE#fd detected
[ 171.863372] ata4.00: cmd 61/00:b8:00:59:40/08:00:52:00:00/40 tag 23 ncq dma 1048576 ou
res 40/00:81:82:00:00/00:00:00:00:00/40 Emask 0x56 (ATA bus error)
[ 171.863392] EEH: Call Trace:
[ 171.863445] ata4.00: status: { DRDY }
[ 171.863467] EEH: [00000000b45087cd] __eeh_send_failure_event+0x7c/0x160
[ 171.863483] ata4: hard resetting link
[ 171.863514] EEH: [0000000073b36745] eeh_dev_check_failure+0x2c0/0x680
[ 171.863534] ahci 0031:01:00.0: AHCI controller unavailable!
[ 171.863561] EEH: [000000009272f56d] ahci_scr_read+0xe4/0x110 [libahci]
[ 171.863609] EEH: [000000000ee1ab2b] sata_scr_read+0xa0/0xc0
[ 171.863642] EEH: [00000000556c9f0a] ata_eh_link_autopsy+0xb8/0xdb0
[ 171.863677] EEH: [000000006a94c504] ata_eh_autopsy+0x58/0x130
[ 171.863713] EEH: [000000002996a79c] sata_pmp_error_handler+0x84/0xc70
[ 171.863750] EEH: [000000004a7b4d3d] ahci_error_handler+0x70/0xe0 [libahci]
[ 171.863787] EEH: [00000000cb070bbc] ata_scsi_port_error_handler+0x304/0x7f0
[ 171.863825] EEH: [0000000063c7adf8] ata_scsi_error+0xb4/0x100
[ 171.863865] EEH: [000000006d77a5c3] scsi_error_handler+0x11c/0x740
[ 171.863902] EEH: [00000000a8877226] kthread+0x124/0x130
[ 171.863938] EEH: [00000000a5874d14] ret_from_kernel_thread+0x5c/0x64
[ 171.863973] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5 failures.
[ 171.864025] EEH: Notify device drivers to shutdown
[ 171.864050] EEH: Beginning: 'error_detected(IO frozen)'
[ 171.864080] PCI 0031:01:00.0#00fd: EEH: driver not EEH aware
[ 171.864084] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'none'
[ 171.864155] EEH: Collect temporary log
[ 171.864199] EEH: of node=0031:01:00.0
[ 171.864217] EEH: PCI device/vendor: ffffffff
[ 171.864251] EEH: PCI cmd/status register: ffffffff
[ 171.864294] EEH: PCI-E capabilities and status follow:
[ 171.864324] EEH: PCI-E 00: ffffffff ffffffff ffffffff ffffffff
[ 171.864367] EEH: PCI-E 10: ffffffff ffffffff ffffffff ffffffff
[ 171.864399] EEH: PCI-E 20: ffffffff
[ 171.864418] EEH: PCI-E AER capability register set follows:
[ 171.864457] EEH: PCI-E AER 00: ffffffff ffffffff ffffffff ffffffff
[ 171.864494] EEH: PCI-E AER 10: ffffffff ffffffff ffffffff ffffffff
[ 171.864544] EEH: PCI-E AER 20: ffffffff ffffffff ffffffff ffffffff
[ 171.864579] EEH: PCI-E AER 30: ffffffff ffffffff
[ 171.864610] PHB4 PHB#49 Diag-data (Version: 1)
[ 171.864642] brdgCtl: 00000002
[ 171.864661] RootSts: 00070040 00402000 c1010008 00100107 00000000
[ 171.864685] RootErrSts: 00000003 00000020 00000001
[ 171.864719] sourceId: 00000100
[ 171.864739] PhbSts: 0000001c00000000 0000001c00000000
[ 171.864784] Lem: 0000000102000000 0000000000000000 0000000002000000
[ 171.864828] PhbErr: 000008c000000000 0000080000000000 2148000098000240 a008400000000000
[ 171.864876] RxeArbErr: 0000000000000010 0000000000000010 80fd010000000000 0000000000000000
[ 171.864922] RegbErr: 00d0000810000000 0000000810000000 8800005800000000 0000000007011000
[ 171.864959] PE[000] A/B: 8000000000000000 8000000000000000
[ 171.864992] PE[..0ff] A/B: as above
[ 171.865013] EEH: Reset with hotplug activity
[ 172.903967] ata4: failed to resume link (SControl FFFFFFFF)
[ 172.904030] ata4: SATA link down (SStatus FFFFFFFF SControl FFFFFFFF)
[ 172.904094] ata4: EH complete
[ 172.905112] ------------[ cut here ]------------
[ 172.905135] WARNING: CPU: 31 PID: 343 at drivers/ata/libata-core.c:5919 ata_port_detach+0x88/0x1c0
[ 172.905180] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq kvm_hv kvm nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs overlay qrtr cmac algif_hash algif_skcipher af_alg bnep binfmt_misc btusb btrtl btbcm btintel btmtk rtl2832_sdr bluetooth r820t rtl2832 i2c_mux uvcvideo dvb_usb_rtl28xxu jitterentropy_rng videobuf2_vmalloc dvb_usb_v2 videobuf2_memops sha512_generic videobuf2_v4l2 snd_usb_audio dvb_core videobuf2_common drbg snd_hda_codec_hdmi snd_usbmidi_lib rc_core ansi_cprng videodev cdc_acm snd_rawmidi joydev evdev ecdh_generic snd_seq_device snd_hda_intel rfkill mc snd_intel_dspcfg ecc snd_hda_codec ctr snd_hda_core vmx_crypto snd_hwdep ofpart gf128mul ipmi_powernv snd_pcm powernv_flash sg ipmi_devintf ipmi_msghandler mtd snd_timer opal_prd at24 snd regmap_i2c soundcore nfsd auth_rpcgss nfs_acl lockd grace parport_pc lp parport fuse sunrpc loop configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic xts ecb hid_generic usbhid hid dm_crypt
[ 172.905330] dm_mod amdgpu gpu_sched drm_buddy i2c_algo_bit drm_display_helper drm_ttm_helper ttm sd_mod drm_kms_helper syscopyarea sysfillrect sysimgblt xhci_pci fb_sys_fops nvme xhci_hcd tg3 nvme_core crc32c_vpmsum t10_pi drm libphy usbcore crc64_rocksoft_generic ahci ptp crc64_rocksoft libahci crc_t10dif pps_core usb_common crct10dif_generic drm_panel_orientation_quirks crc64 crct10dif_common
[ 172.905706] CPU: 31 PID: 343 Comm: eehd Not tainted 6.1.0-10-powerpc64le #1 Debian 6.1.37-1
[ 172.905736] Hardware name: T2P9S01 REV 1.01 POWER9 0x4e1203 opal:skiboot-9858186 PowerNV
[ 172.905764] NIP: c0000000009bca48 LR: c0000000009bca38 CTR: 0000000000007ffe
[ 172.905800] REGS: c00000000b5d36d0 TRAP: 0700 Not tainted (6.1.0-10-powerpc64le Debian 6.1.37-1)
[ 172.905838] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28002820 XER: 20040000
[ 172.905876] CFAR: c0000000009cdec4 IRQMASK: 0
GPR00: c0000000009bca38 c00000000b5d3970 c00000000110c200 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000028002820 c00000000001ab38
GPR08: 0000000000000000 0000000000000001 0000000000000000 0000000000004000
GPR12: c0000000001a9ad0 c000000ffffc6d00 c000000000173458 c00020000790e400
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 c00000000c868068 00000000000000ab
GPR24: 0000000000000001 0000000000000001 0000000000000000 0000000000000080
GPR28: 0000000000000000 c000200005e30600 0000000000000000 c000200007b98000
[ 172.906191] NIP [c0000000009bca48] ata_port_detach+0x88/0x1c0
[ 172.906217] LR [c0000000009bca38] ata_port_detach+0x78/0x1c0
[ 172.906266] Call Trace:
[ 172.906284] [c00000000b5d3970] [c0000000009bca38] ata_port_detach+0x78/0x1c0 (unreliable)
[ 172.906325] [c00000000b5d39b0] [c0000000009bcd2c] ata_pci_remove_one+0x6c/0xa0
[ 172.906367] [c00000000b5d39f0] [c00800000df201d4] ahci_remove_one+0x5c/0x80 [ahci]
[ 172.906416] [c00000000b5d3a20] [c00000000085ce70] pci_device_remove+0x60/0x110
[ 172.906479] [c00000000b5d3a60] [c000000000938480] device_remove+0x70/0xd0
[ 172.906526] [c00000000b5d3a90] [c00000000093a75c] device_release_driver_internal+0x2cc/0x360
[ 172.906570] [c00000000b5d3ae0] [c00000000084c798] pci_stop_bus_device+0xb8/0x110
[ 172.906631] [c00000000b5d3b20] [c00000000084cb78] pci_stop_and_remove_bus_device+0x28/0x40
[ 172.906672] [c00000000b5d3b50] [c000000000066ea0] pci_hp_remove_devices+0x80/0x120
[ 172.906718] [c00000000b5d3bd0] [c000000000048510] eeh_reset_device+0xf0/0x2d0
[ 172.906766] [c00000000b5d3c80] [c0000000000474b4] eeh_handle_normal_event+0x744/0xa10
[ 172.906817] [c00000000b5d3d60] [c000000000048808] eeh_event_handler+0x118/0x1a0
[ 172.906867] [c00000000b5d3dc0] [c000000000173574] kthread+0x124/0x130
[ 172.906904] [c00000000b5d3e10] [c00000000000cedc] ret_from_kernel_thread+0x5c/0x64
[ 172.906955] Instruction dump:
[ 172.906977] 48010eb1 60000000 e87f0010 7fc4f378 483e2d41 60000000 7fe3fb78 48011375
[ 172.907015] 60000000 813f0020 69290400 7929b7e2 <0b090000> 387f3b80 4b7ab249 60000000
[ 172.907058] ---[ end trace 0000000000000000 ]---
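To make sense of the SError line above, here is a minimal sketch that decodes an SErr value into the flag names libata prints. The bit positions are my understanding of the SATA SError register layout and are an assumption to check against the kernel source; the helper names are made up for illustration. With SErr 0xffffffff every bit is set, which would explain why the log lists every flag at once. An all-ones value, like the FFFFFFFF SStatus/SControl values and the ffffffff registers in the EEH dump, is usually what a read from an unreachable PCI device returns, so the flags may not describe real link errors at all.

```python
# Sketch only: bit positions are assumed from the SATA SError register
# layout; the names match the ones printed in the dmesg SError line.
SERR_FLAGS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the flag names whose bits are set in an SErr value."""
    return [name for bit, name in sorted(SERR_FLAGS.items()) if value & (1 << bit)]


def looks_like_dead_read(value):
    """All-ones usually means the register read itself failed."""
    return value == 0xFFFFFFFF


if __name__ == "__main__":
    serr = 0xFFFFFFFF  # value from the "exception Emask" line in the log
    print(decode_serror(serr))
    print("register read probably failed:", looks_like_dead_read(serr))
```

A value with a single meaningful bit, for example only BadCRC, would point at the cable or link; all bits at once points more toward the controller dropping off the PCI bus.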
$ df -h /datafs/
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 7.3T 28K 6.9T 1% /datafs
$ ls -lah /datafs/
ls: reading directory '/datafs/': Input/output error
total 0
$ dmesg
<snip from last dmesg>
[ 204.038662] ata4.00: disable device
[ 204.038708] sd 3:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=62s
[ 204.038728] sd 3:0:0:0: [sda] tag#23 Sense Key : Not Ready [current]
[ 204.038754] sd 3:0:0:0: [sda] tag#23 Add. Sense: Logical unit not ready, hard reset required
[ 204.038783] sd 3:0:0:0: [sda] tag#23 CDB: Write(16) 8a 00 00 00 00 00 52 40 59 00 00 00 08 00 00 00
[ 204.038810] I/O error, dev sda, sector 1379948800 op 0x1:(WRITE) flags 0x800 phys_seg 16 prio class 2
[ 204.038934] device offline error, dev sda, sector 7814252720 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
[ 204.038978] Aborting journal on device sda1-8.
[ 204.039006] device offline error, dev sda, sector 7814252544 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
[ 204.039058] device offline error, dev sda, sector 7814252544 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
[ 204.039107] Buffer I/O error on dev sda1, logical block 976781312, lost sync page write
[ 204.039151] JBD2: I/O error when updating journal superblock for sda1-8.
[ 204.056711] Buffer I/O error on dev sda1, logical block 83, lost async page write
[ 204.114689] sd 3:0:0:0: [sda] Synchronizing SCSI cache
[ 204.114745] sd 3:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 204.114777] sd 3:0:0:0: [sda] Stopping disk
[ 204.114818] sd 3:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 204.115192] ahci 0031:01:00.0: AHCI controller unavailable!
[ 204.115220] ata1: failed to stop engine (-19)
[ 204.115257] ahci 0031:01:00.0: AHCI controller unavailable!
[ 204.115289] ata2: failed to stop engine (-19)
[ 204.115318] ahci 0031:01:00.0: AHCI controller unavailable!
[ 204.115349] ata3: failed to stop engine (-19)
[ 204.115384] ahci 0031:01:00.0: AHCI controller unavailable!
[ 204.115406] ata4: failed to stop engine (-19)
[ 204.205262] pci 0031:01:00.0: Removing from iommu group 5
[ 204.205459] pci 0031:01 : [PE# fd] Releasing PE
[ 204.205493] pci 0031:01 : [PE# fd] Removing DMA window #0
[ 204.205536] pci 0031:01 : [PE# fd] Disabling 64-bit DMA bypass
[ 216.633037] EEH: Sleep 5s ahead of complete hotplug
[ 221.653197] pci 0031:00:00.0: PCI bridge to [bus 01]
[ 221.653229] pci 0031:00:00.0: bridge window [mem 0x620c080000000-0x620c0ffefffff]
[ 221.653257] EEH: Notify device driver to resume
[ 221.653278] EEH: Beginning: 'resume'
[ 221.653287] EEH: Finished:'resume'
[ 221.653289] EEH: Recovery successful.
[ 426.512672] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 426.512678] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.512719] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.512778] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 426.512782] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.512919] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.512993] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.513054] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.513176] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.513286] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.513360] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 426.513492] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 427.336189] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm ls: error -5 reading directory block
[ 427.343453] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 427.343549] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 429.694157] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm ls: error -5 reading directory block
[ 429.701421] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 429.701507] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 441.008383] EXT4-fs error: 35 callbacks suppressed
[ 441.008378] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 441.008389] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.008441] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.008516] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 441.008538] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.008649] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.008716] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.008773] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.008887] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.008948] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.009006] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 441.009109] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.791980] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm ls: error -5 reading directory block
[ 448.800098] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 448.800124] EXT4-fs error: 5 callbacks suppressed
[ 448.800129] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800194] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[ 448.800253] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800344] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800453] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800558] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800619] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800701] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800805] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800879] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[ 448.800989] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
Starship, the process named in the "comm starship" entries, is the shell prompt from https://starship.rs/.
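Since most of the second dmesg excerpt is the same few EXT4 messages repeated, a small script that strips the timestamps and counts identical messages makes it easier to read. This is a minimal sketch; the function name and the sample lines (copied from the log above) are just for illustration.

```python
import re
from collections import Counter

# Matches the leading "[ 426.512678] " style timestamp of a dmesg line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def summarize(lines):
    """Strip timestamps and count identical messages, most frequent first."""
    counts = Counter()
    for line in lines:
        message = TIMESTAMP.sub("", line.rstrip("\n"))
        if message:
            counts[message] += 1
    return counts.most_common()


SAMPLE = [
    "[ 204.115192] ahci 0031:01:00.0: AHCI controller unavailable!",
    "[ 426.512678] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0",
    "[ 426.512719] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0",
]

if __name__ == "__main__":
    for message, count in summarize(SAMPLE):
        print(f"{count:4d}  {message}")
```

To run it over a real log, pass an open file or sys.stdin to summarize() instead of SAMPLE.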
The drive worked on my x86 machine, both directly connected to the motherboard and through the Raptor-provided SATA controller (this controller). So is the Marvell 88SE9215 controller not really supported on POWER9?