Author Topic: Raptor-provided SATA card issues  (Read 1633 times)

rjzak

Raptor-provided SATA card issues
« on: December 27, 2022, 09:50:33 am »
I'm using the SATA card provided by Raptor when I bought the Talos II. lspci identifies it as: SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev ff).

Void Linux can see the optical drive as /dev/cdrom, which is a link to /dev/sr0. However, I cannot mount a disc, nor can fdisk open it; both fail with the same error: "no medium found". I was trying some other distros, and after a few warm reboots, the OS no longer sees the optical drive. I attached a hard drive as well, and the distros I tried couldn't see the hard drive at all (though the hard drive works via a USB-SATA cable).

dmesg | grep ata has this output:
Code:
[    2.522439] libata version 3.00 loaded.
[    2.547068] ata1: SATA max UDMA/133 abar m2048@0x620c080040000 port 0x620c080040100 irq 128
[    2.547072] ata2: SATA max UDMA/133 abar m2048@0x620c080040000 port 0x620c080040180 irq 128
[    2.547075] ata3: SATA max UDMA/133 abar m2048@0x620c080040000 port 0x620c080040200 irq 128
[    2.547078] ata4: SATA max UDMA/133 abar m2048@0x620c080040000 port 0x620c080040280 irq 128
[    2.862532] ata1: SATA link down (SStatus 0 SControl 300)
[    2.862564] ata2: SATA link down (SStatus 0 SControl 300)
[    2.863697] ata4: SATA link down (SStatus 0 SControl 300)
[   12.544009] ata3: softreset failed (1st FIS failed)
[   22.544008] ata3: softreset failed (1st FIS failed)
[   57.544575] ata3: softreset failed (1st FIS failed)
[   62.544519] ata3: softreset failed (1st FIS failed)
[   62.544542] ata3: reset failed, giving up
[   62.598911] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Quota mode: none.
[   62.671972] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Quota mode: none.
[   65.356285] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[60247.550077] PHB4 PHB#49 Diag-data (Version: 1)
[60247.572032] ata1: failed to stop engine (-19)
[60247.572050] ata2: failed to stop engine (-19)
[60247.572087] ata3: failed to stop engine (-19)
[60247.572103] ata4: failed to stop engine (-19)

So I'm wondering, is there a missing firmware file for this? Should I buy another SATA adapter?

Kernel: 6.0.13_1. Didn't work on older kernels either.
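
A side note on the filter: `grep ata` also matches unrelated lines, such as the EXT4 "data mode" and cfg80211 "database" messages in the output above. Below is a minimal sketch of a tighter filter, assuming Python 3 is available and that libata port messages begin with `ataN:` or `ataN.MM:` right after the timestamp. The function name and the sample text are illustrative only, and the filter deliberately drops the "libata version" and `ahci` lines, which you may also want to keep.

```python
import re

# Matches "ata3:" or "ata4.00:" immediately after the "[ timestamp ]" prefix.
ATA_LINE = re.compile(r"^\[\s*\d+\.\d+\]\s+ata\d+(?:\.\d+)?:")

def filter_ata_lines(dmesg_text):
    """Return only the lines logged for a libata port or device."""
    return [line for line in dmesg_text.splitlines() if ATA_LINE.match(line)]

# Sample lines copied from the dmesg output in this post.
sample = """\
[    2.862532] ata1: SATA link down (SStatus 0 SControl 300)
[   62.544542] ata3: reset failed, giving up
[   62.598911] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Quota mode: none.
[   65.356285] cfg80211: Loading compiled-in X.509 certificates for regulatory database
"""

for line in filter_ata_lines(sample):
    print(line)
```

On a live system the same function could be fed the text captured from `dmesg`; only the two `ata` lines of the sample survive the filter.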

ClassicHasClass

« Reply #1 on: December 27, 2022, 06:21:56 pm »
Can Petitboot see it from the shell?

My T2 has an 88SE9235, which we've found is slightly different. Raptor may have sold both. I have two optical drives in mine and both read fine and burn BD-ROMs (but I have to use an Apple USB SuperDrive to burn CDs and DVDs, though I suspect this is the drive, not the controller).

rjzak

« Reply #2 on: January 07, 2023, 12:03:05 pm »
Petitboot recognised it and was able to boot an OS installed on a SATA HDD. I'll keep investigating.

rjzak

« Reply #3 on: July 09, 2023, 02:06:55 pm »
I finally got around to investigating further. I just installed a Seagate FireCuda drive, connected to the Raptor-provided SATA controller (it is the only SATA device). fdisk -l does not see the drive, and dmesg has a lot of errors for the ata4 port:

Code:
[  281.755562] ata4.00: exception Emask 0x52 SAct 0x8 SErr 0xffffffff action 0xe frozen
[  281.755590] ata4: SError: { RecovData RecovComm UnrecovData Persist Proto HostInt PHYRdyChg PHYInt CommWake 10B8B Dispar BadCRC Handshk LinkSeq TrStaTrns UnrecFIS DevExch }
[  281.755649] ata4.00: failed command: READ FPDMA QUEUED
[  281.755671] ata4.00: cmd 60/00:18:00:00:00/01:00:00:00:00/40 tag 3 ncq dma 131072 in
                        res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x56 (ATA bus error)
[  281.755717] ata4.00: status: { DRDY }
[  281.755740] ata4: hard resetting link
[  281.755756] ahci 0031:01:00.0: AHCI controller unavailable!
[  281.756773] EEH: Recovering PHB#31-PE#fd
[  281.756794] EEH: PE location: UOPWR.A100059-Node0-CPU2 Slot3 (8x), PHB location: N/A
[  281.756810] EEH: Frozen PHB#31-PE#fd detected
[  281.756840] EEH: Call Trace:
[  281.756850] EEH: [0000000092db4cb8] __eeh_send_failure_event+0x7c/0x160
[  281.756891] EEH: [0000000033818003] eeh_dev_check_failure+0x2c0/0x680
[  281.756926] EEH: [0000000082f0e2ee] ahci_scr_read+0xe4/0x110 [libahci]
[  281.756971] EEH: [0000000078b561cf] sata_scr_read+0xa0/0xc0
[  281.757006] EEH: [00000000d90f381f] ata_eh_link_autopsy+0xb8/0xdb0
[  281.757031] EEH: [0000000077f11bd7] ata_eh_autopsy+0x58/0x130
[  281.757066] EEH: [00000000fae1c6e2] sata_pmp_error_handler+0x84/0xc70
[  281.757103] EEH: [00000000178f8dc8] ahci_error_handler+0x70/0xe0 [libahci]
[  281.757150] EEH: [000000001cb2b4f8] ata_scsi_port_error_handler+0x304/0x7f0
[  281.757180] EEH: [0000000050cb919b] ata_scsi_error+0xb4/0x100
[  281.757204] EEH: [000000000fd788ee] scsi_error_handler+0x11c/0x740
[  281.757232] EEH: [000000001d98b5e6] kthread+0x124/0x130
[  281.757255] EEH: [0000000042537880] ret_from_kernel_thread+0x5c/0x64
[  281.757290] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5 failures.
[  281.757328] EEH: Notify device drivers to shutdown
[  281.757350] EEH: Beginning: 'error_detected(IO frozen)'
[  281.757371] PCI 0031:01:00.0#00fd: EEH: driver not EEH aware
[  281.757376] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'none'
[  281.757435] EEH: Collect temporary log
[  281.757483] EEH: of node=0031:01:00.0
[  281.757504] EEH: PCI device/vendor: ffffffff
[  281.757536] EEH: PCI cmd/status register: ffffffff
[  281.757567] EEH: PCI-E capabilities and status follow:
[  281.757605] EEH: PCI-E 00: ffffffff ffffffff ffffffff ffffffff
[  281.757634] EEH: PCI-E 10: ffffffff ffffffff ffffffff ffffffff
[  281.757666] EEH: PCI-E 20: ffffffff
[  281.757685] EEH: PCI-E AER capability register set follows:
[  281.757723] EEH: PCI-E AER 00: ffffffff ffffffff ffffffff ffffffff
[  281.757752] EEH: PCI-E AER 10: ffffffff ffffffff ffffffff ffffffff
[  281.757780] EEH: PCI-E AER 20: ffffffff ffffffff ffffffff ffffffff
[  281.757804] EEH: PCI-E AER 30: ffffffff ffffffff
[  281.757825] PHB4 PHB#49 Diag-data (Version: 1)
[  281.757846] brdgCtl:    00000002
[  281.757866] RootSts:    00070040 00402000 c1010008 00100107 00000000
[  281.757899] RootErrSts: 00000003 00000020 00000001
[  281.757920] sourceId:   00000100
[  281.757939] PhbSts:     0000001c00000000 0000001c00000000
[  281.757960] Lem:        0000000102000000 0000000000000000 0000000002000000
[  281.757993] PhbErr:     000008c000000000 0000080000000000 2148000098000240 a008400000000000
[  281.758028] RxeArbErr:  0000000000000010 0000000000000010 80fd010000000000 0000000000000000
[  281.758064] RegbErr:    00d0000810000000 0000000810000000 8800005800000000 0000000007011000
[  281.758110] PE[000] A/B: 8000000000000000 8000000000000000
[  281.758143] PE[..0ff] A/B: as above
[  281.758162] EEH: Reset with hotplug activity
[  282.794468] ata4: failed to resume link (SControl FFFFFFFF)
[  282.794518] ata4: SATA link down (SStatus FFFFFFFF SControl FFFFFFFF)
[  282.794569] ata4: EH complete
[  282.795760] ------------[ cut here ]------------
[  282.795785] WARNING: CPU: 27 PID: 343 at drivers/ata/libata-core.c:5919 ata_port_detach+0x88/0x1c0
[  282.795818] Modules linked in: hid_logitech_hidpp uhid rfcomm snd_seq_dummy snd_hrtimer snd_seq kvm_hv kvm nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs overlay qrtr cmac algif_hash algif_skcipher af_alg bnep binfmt_misc btusb btrtl btbcm btintel btmtk bluetooth rtl2832_sdr r820t rtl2832 i2c_mux uvcvideo jitterentropy_rng videobuf2_vmalloc videobuf2_memops sha512_generic videobuf2_v4l2 dvb_usb_rtl28xxu snd_usb_audio videobuf2_common dvb_usb_v2 drbg snd_hda_codec_hdmi cdc_acm dvb_core ansi_cprng snd_usbmidi_lib videodev joydev snd_rawmidi rc_core ecdh_generic snd_seq_device evdev snd_hda_intel rfkill mc snd_intel_dspcfg ecc snd_hda_codec snd_hda_core snd_hwdep ctr snd_pcm vmx_crypto sg ofpart snd_timer gf128mul snd powernv_flash at24 ipmi_powernv soundcore mtd regmap_i2c ipmi_devintf ipmi_msghandler opal_prd nfsd auth_rpcgss nfs_acl lockd grace parport_pc lp sunrpc parport fuse loop configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic xts ecb
[  282.795974]  hid_generic usbhid hid dm_crypt dm_mod amdgpu gpu_sched drm_buddy i2c_algo_bit drm_display_helper drm_ttm_helper ttm xhci_pci sd_mod drm_kms_helper xhci_hcd syscopyarea sysfillrect sysimgblt fb_sys_fops nvme tg3 crc32c_vpmsum nvme_core libphy t10_pi ahci usbcore ptp drm libahci crc64_rocksoft_generic pps_core crc64_rocksoft crc_t10dif usb_common crct10dif_generic crc64 crct10dif_common drm_panel_orientation_quirks
[  282.796371] CPU: 27 PID: 343 Comm: eehd Not tainted 6.1.0-10-powerpc64le #1  Debian 6.1.37-1
[  282.796410] Hardware name: T2P9S01 REV 1.01 POWER9 0x4e1203 opal:skiboot-9858186 PowerNV
[  282.796434] NIP:  c0000000009bca48 LR: c0000000009bca38 CTR: 0000000000007ffe
[  282.796471] REGS: c00000000b50f6d0 TRAP: 0700   Not tainted  (6.1.0-10-powerpc64le Debian 6.1.37-1)
[  282.796521] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 24002822  XER: 20040000
[  282.796569] CFAR: c0000000009cdec4 IRQMASK: 0
               GPR00: c0000000009bca38 c00000000b50f970 c00000000110c200 0000000000000000
               GPR04: 0000000000000000 0000000000000000 0000000024002822 c00000000001ab38
               GPR08: 0000000000000000 0000000000000001 0000000000000000 0000000000000000
               GPR12: c0000000001a00d0 c000000ffffca100 c000000000173458 c000200007a5bd40
               GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
               GPR20: 0000000000000000 0000000000000000 c00000000c794068 0000000000000119
               GPR24: 0000000000000001 0000000000000001 0000000000000000 0000000000000080
               GPR28: 0000000000000000 c000200005d82400 0000000000000000 c00020000900c000
[  282.796777] NIP [c0000000009bca48] ata_port_detach+0x88/0x1c0
[  282.796803] LR [c0000000009bca38] ata_port_detach+0x78/0x1c0
[  282.796830] Call Trace:
[  282.796847] [c00000000b50f970] [c0000000009bca38] ata_port_detach+0x78/0x1c0 (unreliable)
[  282.796888] [c00000000b50f9b0] [c0000000009bcd2c] ata_pci_remove_one+0x6c/0xa0
[  282.796926] [c00000000b50f9f0] [c00800000e1001d4] ahci_remove_one+0x5c/0x80 [ahci]
[  282.796957] [c00000000b50fa20] [c00000000085ce70] pci_device_remove+0x60/0x110
[  282.796994] [c00000000b50fa60] [c000000000938480] device_remove+0x70/0xd0
[  282.797020] [c00000000b50fa90] [c00000000093a75c] device_release_driver_internal+0x2cc/0x360
[  282.797060] [c00000000b50fae0] [c00000000084c798] pci_stop_bus_device+0xb8/0x110
[  282.797099] [c00000000b50fb20] [c00000000084cb78] pci_stop_and_remove_bus_device+0x28/0x40
[  282.797147] [c00000000b50fb50] [c000000000066ea0] pci_hp_remove_devices+0x80/0x120
[  282.797198] [c00000000b50fbd0] [c000000000048510] eeh_reset_device+0xf0/0x2d0
[  282.797236] [c00000000b50fc80] [c0000000000474b4] eeh_handle_normal_event+0x744/0xa10
[  282.797288] [c00000000b50fd60] [c000000000048808] eeh_event_handler+0x118/0x1a0
[  282.797327] [c00000000b50fdc0] [c000000000173574] kthread+0x124/0x130
[  282.797353] [c00000000b50fe10] [c00000000000cedc] ret_from_kernel_thread+0x5c/0x64
[  282.797371] Instruction dump:
[  282.797391] 48010eb1 60000000 e87f0010 7fc4f378 483e2d41 60000000 7fe3fb78 48011375
[  282.797431] 60000000 813f0020 69290400 7929b7e2 <0b090000> 387f3b80 4b7ab249 60000000
[  282.797471] ---[ end trace 0000000000000000 ]---
[  314.662353] ata4.00: disable device
[  314.662393] sd 3:0:0:0: [sda] tag#5 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=63s
[  314.662414] sd 3:0:0:0: [sda] tag#5 Sense Key : Not Ready [current]
[  314.662439] sd 3:0:0:0: [sda] tag#5 Add. Sense: Logical unit not ready, hard reset required
[  314.662456] sd 3:0:0:0: [sda] tag#5 CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00
[  314.690402] sd 3:0:0:0: [sda] Synchronizing SCSI cache
[  314.690506] sd 3:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  314.690538] sd 3:0:0:0: [sda] Stopping disk
[  314.690574] sd 3:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  314.691025] ahci 0031:01:00.0: AHCI controller unavailable!
[  314.691047] ata1: failed to stop engine (-19)
[  314.691063] ahci 0031:01:00.0: AHCI controller unavailable!
[  314.691084] ata2: failed to stop engine (-19)
[  314.691100] ahci 0031:01:00.0: AHCI controller unavailable!
[  314.691121] ata3: failed to stop engine (-19)
[  314.691135] ahci 0031:01:00.0: AHCI controller unavailable!
[  314.691156] ata4: failed to stop engine (-19)
[  314.903793] pci 0031:01:00.0: Removing from iommu group 5
[  314.903992] pci 0031:01     : [PE# fd] Releasing PE
[  314.904016] pci 0031:01     : [PE# fd] Removing DMA window #0
[  314.904038] pci 0031:01     : [PE# fd] Disabling 64-bit DMA bypass
[  339.777377] EEH: Sleep 5s ahead of complete hotplug
[  344.831241] pci 0031:00:00.0: PCI bridge to [bus 01]
[  344.831270] pci 0031:00:00.0:   bridge window [mem 0x620c080000000-0x620c0ffefffff]
[  344.831299] EEH: Notify device driver to resume
[  344.831320] EEH: Beginning: 'resume'
[  344.831330] EEH: Finished:'resume'
[  344.831332] EEH: Recovery successful.

uname -a: Linux behemoth 6.1.0-10-powerpc64le #1 SMP Debian 6.1.37-1 (2023-07-03) ppc64le GNU/Linux

I suppose the next step is to try a SATA controller from the Wiki.
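
For anyone reading the logs in this thread, the `SStatus` values libata prints can be decoded by hand. The following is a minimal sketch, assuming the standard SATA SStatus layout (DET in bits 3:0, SPD in bits 7:4) and that the kernel prints the register in hexadecimal. The function name is hypothetical, and `133` is a typical healthy-link value included for contrast, not one taken from this thread. An all-ones value is treated separately, since reads from a PCI device that is not responding return all ones, which would be consistent with the "AHCI controller unavailable!" and EEH freeze messages.

```python
# DET and SPD field meanings from the SATA SStatus register definition.
DET = {
    0: "no device detected",
    1: "device present, no PHY communication",
    3: "device present, PHY communication established",
    4: "PHY offline",
}
SPD = {
    0: "no negotiated speed",
    1: "Gen1 (1.5 Gb/s)",
    2: "Gen2 (3 Gb/s)",
    3: "Gen3 (6 Gb/s)",
}

def decode_sstatus(hex_text):
    """Decode the hex SStatus value as printed in a libata log line."""
    value = int(hex_text, 16)
    if value == 0xFFFFFFFF:
        # All ones means the register itself could not be read.
        return "register unreadable (all ones): controller not responding"
    det = value & 0xF
    spd = (value >> 4) & 0xF
    return f"{DET.get(det, 'reserved')}; {SPD.get(spd, 'reserved')}"

for raw in ("0", "FFFFFFFF", "133"):
    print(raw, "->", decode_sstatus(raw))
```

Read this way, `SStatus 0` only says the port saw no device, while `SStatus FFFFFFFF` suggests the controller itself had dropped off the bus by the time the register was read.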

rjzak

« Reply #4 on: July 09, 2023, 04:21:32 pm »
The SATA card works fine; I tried it in my x86 system. It even has a POST screen.

Turns out the power supply connection wasn't right. The power supply is modular, and the cable was connected to the "Peripheral and SATA" section, but moving it to another socket and rebooting made it work. The error messages were rather misleading; they had me thinking it was an issue with the card. I would have thought a power issue with the hard drive would have resulted in lspci still showing the controller, fdisk -l not showing the drive, and no crazy "freak out" messages in dmesg.

Edit: after a reboot, it's back to not working with the device failure messages in dmesg.
Edit2: works again. The SATA cable needed to be replaced; the old one had a piece of plastic broken off, so the connection might not have been snug. But there are still errors in dmesg, even though I was able to mount the drive and see it in df -h. So it seems to work, but isn't stable?

Code:
[   79.075136] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
[  171.863222] ata4.00: exception Emask 0x52 SAct 0x800000 SErr 0xffffffff action 0xe frozen
[  171.863243] EEH: Recovering PHB#31-PE#fd
[  171.863248] ata4: SError: { RecovData RecovComm UnrecovData Persist Proto HostInt PHYRdyChg PHYInt CommWake 10B8B Dispar BadCRC Handshk LinkSeq TrStaTrns UnrecFIS DevExch }
[  171.863274] EEH: PE location: UOPWR.A100059-Node0-CPU2 Slot3 (8x), PHB location: N/A
[  171.863315] ata4.00: failed command: WRITE FPDMA QUEUED
[  171.863347] EEH: Frozen PHB#31-PE#fd detected
[  171.863372] ata4.00: cmd 61/00:b8:00:59:40/08:00:52:00:00/40 tag 23 ncq dma 1048576 ou
                        res 40/00:81:82:00:00/00:00:00:00:00/40 Emask 0x56 (ATA bus error)
[  171.863392] EEH: Call Trace:
[  171.863445] ata4.00: status: { DRDY }
[  171.863467] EEH: [00000000b45087cd] __eeh_send_failure_event+0x7c/0x160
[  171.863483] ata4: hard resetting link
[  171.863514] EEH: [0000000073b36745] eeh_dev_check_failure+0x2c0/0x680
[  171.863534] ahci 0031:01:00.0: AHCI controller unavailable!
[  171.863561] EEH: [000000009272f56d] ahci_scr_read+0xe4/0x110 [libahci]
[  171.863609] EEH: [000000000ee1ab2b] sata_scr_read+0xa0/0xc0
[  171.863642] EEH: [00000000556c9f0a] ata_eh_link_autopsy+0xb8/0xdb0
[  171.863677] EEH: [000000006a94c504] ata_eh_autopsy+0x58/0x130
[  171.863713] EEH: [000000002996a79c] sata_pmp_error_handler+0x84/0xc70
[  171.863750] EEH: [000000004a7b4d3d] ahci_error_handler+0x70/0xe0 [libahci]
[  171.863787] EEH: [00000000cb070bbc] ata_scsi_port_error_handler+0x304/0x7f0
[  171.863825] EEH: [0000000063c7adf8] ata_scsi_error+0xb4/0x100
[  171.863865] EEH: [000000006d77a5c3] scsi_error_handler+0x11c/0x740
[  171.863902] EEH: [00000000a8877226] kthread+0x124/0x130
[  171.863938] EEH: [00000000a5874d14] ret_from_kernel_thread+0x5c/0x64
[  171.863973] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5 failures.
[  171.864025] EEH: Notify device drivers to shutdown
[  171.864050] EEH: Beginning: 'error_detected(IO frozen)'
[  171.864080] PCI 0031:01:00.0#00fd: EEH: driver not EEH aware
[  171.864084] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'none'
[  171.864155] EEH: Collect temporary log
[  171.864199] EEH: of node=0031:01:00.0
[  171.864217] EEH: PCI device/vendor: ffffffff
[  171.864251] EEH: PCI cmd/status register: ffffffff
[  171.864294] EEH: PCI-E capabilities and status follow:
[  171.864324] EEH: PCI-E 00: ffffffff ffffffff ffffffff ffffffff
[  171.864367] EEH: PCI-E 10: ffffffff ffffffff ffffffff ffffffff
[  171.864399] EEH: PCI-E 20: ffffffff
[  171.864418] EEH: PCI-E AER capability register set follows:
[  171.864457] EEH: PCI-E AER 00: ffffffff ffffffff ffffffff ffffffff
[  171.864494] EEH: PCI-E AER 10: ffffffff ffffffff ffffffff ffffffff
[  171.864544] EEH: PCI-E AER 20: ffffffff ffffffff ffffffff ffffffff
[  171.864579] EEH: PCI-E AER 30: ffffffff ffffffff
[  171.864610] PHB4 PHB#49 Diag-data (Version: 1)
[  171.864642] brdgCtl:    00000002
[  171.864661] RootSts:    00070040 00402000 c1010008 00100107 00000000
[  171.864685] RootErrSts: 00000003 00000020 00000001
[  171.864719] sourceId:   00000100
[  171.864739] PhbSts:     0000001c00000000 0000001c00000000
[  171.864784] Lem:        0000000102000000 0000000000000000 0000000002000000
[  171.864828] PhbErr:     000008c000000000 0000080000000000 2148000098000240 a008400000000000
[  171.864876] RxeArbErr:  0000000000000010 0000000000000010 80fd010000000000 0000000000000000
[  171.864922] RegbErr:    00d0000810000000 0000000810000000 8800005800000000 0000000007011000
[  171.864959] PE[000] A/B: 8000000000000000 8000000000000000
[  171.864992] PE[..0ff] A/B: as above
[  171.865013] EEH: Reset with hotplug activity
[  172.903967] ata4: failed to resume link (SControl FFFFFFFF)
[  172.904030] ata4: SATA link down (SStatus FFFFFFFF SControl FFFFFFFF)
[  172.904094] ata4: EH complete
[  172.905112] ------------[ cut here ]------------
[  172.905135] WARNING: CPU: 31 PID: 343 at drivers/ata/libata-core.c:5919 ata_port_detach+0x88/0x1c0
[  172.905180] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq kvm_hv kvm nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs overlay qrtr cmac algif_hash algif_skcipher af_alg bnep binfmt_misc btusb btrtl btbcm btintel btmtk rtl2832_sdr bluetooth r820t rtl2832 i2c_mux uvcvideo dvb_usb_rtl28xxu jitterentropy_rng videobuf2_vmalloc dvb_usb_v2 videobuf2_memops sha512_generic videobuf2_v4l2 snd_usb_audio dvb_core videobuf2_common drbg snd_hda_codec_hdmi snd_usbmidi_lib rc_core ansi_cprng videodev cdc_acm snd_rawmidi joydev evdev ecdh_generic snd_seq_device snd_hda_intel rfkill mc snd_intel_dspcfg ecc snd_hda_codec ctr snd_hda_core vmx_crypto snd_hwdep ofpart gf128mul ipmi_powernv snd_pcm powernv_flash sg ipmi_devintf ipmi_msghandler mtd snd_timer opal_prd at24 snd regmap_i2c soundcore nfsd auth_rpcgss nfs_acl lockd grace parport_pc lp parport fuse sunrpc loop configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic xts ecb hid_generic usbhid hid dm_crypt
[  172.905330]  dm_mod amdgpu gpu_sched drm_buddy i2c_algo_bit drm_display_helper drm_ttm_helper ttm sd_mod drm_kms_helper syscopyarea sysfillrect sysimgblt xhci_pci fb_sys_fops nvme xhci_hcd tg3 nvme_core crc32c_vpmsum t10_pi drm libphy usbcore crc64_rocksoft_generic ahci ptp crc64_rocksoft libahci crc_t10dif pps_core usb_common crct10dif_generic drm_panel_orientation_quirks crc64 crct10dif_common
[  172.905706] CPU: 31 PID: 343 Comm: eehd Not tainted 6.1.0-10-powerpc64le #1  Debian 6.1.37-1
[  172.905736] Hardware name: T2P9S01 REV 1.01 POWER9 0x4e1203 opal:skiboot-9858186 PowerNV
[  172.905764] NIP:  c0000000009bca48 LR: c0000000009bca38 CTR: 0000000000007ffe
[  172.905800] REGS: c00000000b5d36d0 TRAP: 0700   Not tainted  (6.1.0-10-powerpc64le Debian 6.1.37-1)
[  172.905838] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28002820  XER: 20040000
[  172.905876] CFAR: c0000000009cdec4 IRQMASK: 0
               GPR00: c0000000009bca38 c00000000b5d3970 c00000000110c200 0000000000000000
               GPR04: 0000000000000000 0000000000000000 0000000028002820 c00000000001ab38
               GPR08: 0000000000000000 0000000000000001 0000000000000000 0000000000004000
               GPR12: c0000000001a9ad0 c000000ffffc6d00 c000000000173458 c00020000790e400
               GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
               GPR20: 0000000000000000 0000000000000000 c00000000c868068 00000000000000ab
               GPR24: 0000000000000001 0000000000000001 0000000000000000 0000000000000080
               GPR28: 0000000000000000 c000200005e30600 0000000000000000 c000200007b98000
[  172.906191] NIP [c0000000009bca48] ata_port_detach+0x88/0x1c0
[  172.906217] LR [c0000000009bca38] ata_port_detach+0x78/0x1c0
[  172.906266] Call Trace:
[  172.906284] [c00000000b5d3970] [c0000000009bca38] ata_port_detach+0x78/0x1c0 (unreliable)
[  172.906325] [c00000000b5d39b0] [c0000000009bcd2c] ata_pci_remove_one+0x6c/0xa0
[  172.906367] [c00000000b5d39f0] [c00800000df201d4] ahci_remove_one+0x5c/0x80 [ahci]
[  172.906416] [c00000000b5d3a20] [c00000000085ce70] pci_device_remove+0x60/0x110
[  172.906479] [c00000000b5d3a60] [c000000000938480] device_remove+0x70/0xd0
[  172.906526] [c00000000b5d3a90] [c00000000093a75c] device_release_driver_internal+0x2cc/0x360
[  172.906570] [c00000000b5d3ae0] [c00000000084c798] pci_stop_bus_device+0xb8/0x110
[  172.906631] [c00000000b5d3b20] [c00000000084cb78] pci_stop_and_remove_bus_device+0x28/0x40
[  172.906672] [c00000000b5d3b50] [c000000000066ea0] pci_hp_remove_devices+0x80/0x120
[  172.906718] [c00000000b5d3bd0] [c000000000048510] eeh_reset_device+0xf0/0x2d0
[  172.906766] [c00000000b5d3c80] [c0000000000474b4] eeh_handle_normal_event+0x744/0xa10
[  172.906817] [c00000000b5d3d60] [c000000000048808] eeh_event_handler+0x118/0x1a0
[  172.906867] [c00000000b5d3dc0] [c000000000173574] kthread+0x124/0x130
[  172.906904] [c00000000b5d3e10] [c00000000000cedc] ret_from_kernel_thread+0x5c/0x64
[  172.906955] Instruction dump:
[  172.906977] 48010eb1 60000000 e87f0010 7fc4f378 483e2d41 60000000 7fe3fb78 48011375
[  172.907015] 60000000 813f0020 69290400 7929b7e2 <0b090000> 387f3b80 4b7ab249 60000000
[  172.907058] ---[ end trace 0000000000000000 ]---

Code:
$ df -h /datafs/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       7.3T   28K  6.9T   1% /datafs

$ ls -lah /datafs/
ls: reading directory '/datafs/': Input/output error
total 0

$ dmesg
<snip from last dmesg>
[  204.038662] ata4.00: disable device
[  204.038708] sd 3:0:0:0: [sda] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=62s
[  204.038728] sd 3:0:0:0: [sda] tag#23 Sense Key : Not Ready [current]
[  204.038754] sd 3:0:0:0: [sda] tag#23 Add. Sense: Logical unit not ready, hard reset required
[  204.038783] sd 3:0:0:0: [sda] tag#23 CDB: Write(16) 8a 00 00 00 00 00 52 40 59 00 00 00 08 00 00 00
[  204.038810] I/O error, dev sda, sector 1379948800 op 0x1:(WRITE) flags 0x800 phys_seg 16 prio class 2
[  204.038934] device offline error, dev sda, sector 7814252720 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
[  204.038978] Aborting journal on device sda1-8.
[  204.039006] device offline error, dev sda, sector 7814252544 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
[  204.039058] device offline error, dev sda, sector 7814252544 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 2
[  204.039107] Buffer I/O error on dev sda1, logical block 976781312, lost sync page write
[  204.039151] JBD2: I/O error when updating journal superblock for sda1-8.
[  204.056711] Buffer I/O error on dev sda1, logical block 83, lost async page write
[  204.114689] sd 3:0:0:0: [sda] Synchronizing SCSI cache
[  204.114745] sd 3:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  204.114777] sd 3:0:0:0: [sda] Stopping disk
[  204.114818] sd 3:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[  204.115192] ahci 0031:01:00.0: AHCI controller unavailable!
[  204.115220] ata1: failed to stop engine (-19)
[  204.115257] ahci 0031:01:00.0: AHCI controller unavailable!
[  204.115289] ata2: failed to stop engine (-19)
[  204.115318] ahci 0031:01:00.0: AHCI controller unavailable!
[  204.115349] ata3: failed to stop engine (-19)
[  204.115384] ahci 0031:01:00.0: AHCI controller unavailable!
[  204.115406] ata4: failed to stop engine (-19)
[  204.205262] pci 0031:01:00.0: Removing from iommu group 5
[  204.205459] pci 0031:01     : [PE# fd] Releasing PE
[  204.205493] pci 0031:01     : [PE# fd] Removing DMA window #0
[  204.205536] pci 0031:01     : [PE# fd] Disabling 64-bit DMA bypass
[  216.633037] EEH: Sleep 5s ahead of complete hotplug
[  221.653197] pci 0031:00:00.0: PCI bridge to [bus 01]
[  221.653229] pci 0031:00:00.0:   bridge window [mem 0x620c080000000-0x620c0ffefffff]
[  221.653257] EEH: Notify device driver to resume
[  221.653278] EEH: Beginning: 'resume'
[  221.653287] EEH: Finished:'resume'
[  221.653289] EEH: Recovery successful.
[  426.512672] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  426.512678] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.512719] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.512778] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  426.512782] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.512919] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.512993] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.513054] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.513176] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.513286] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.513360] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  426.513492] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  427.336189] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm ls: error -5 reading directory block
[  427.343453] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  427.343549] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  429.694157] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm ls: error -5 reading directory block
[  429.701421] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  429.701507] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  441.008383] EXT4-fs error: 35 callbacks suppressed
[  441.008378] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  441.008389] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.008441] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.008516] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  441.008538] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.008649] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.008716] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.008773] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.008887] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.008948] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.009006] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  441.009109] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.791980] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm ls: error -5 reading directory block
[  448.800098] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  448.800124] EXT4-fs error: 5 callbacks suppressed
[  448.800129] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800194] EXT4-fs warning (device sda1): htree_dirblock_to_tree:1080: inode #2: lblock 0: comm starship: error -5 reading directory block
[  448.800253] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800344] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800453] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800558] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800619] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800701] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800805] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800879] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0
[  448.800989] EXT4-fs error (device sda1): __ext4_find_entry:1678: inode #2: comm starship: reading directory lblock 0

Starship is https://starship.rs/.

The drive worked on my x86 machine, both directly connected to the motherboard and through the Raptor-provided SATA controller (this controller). So is the Marvell 88SE9215 controller not really supported on POWER9?
« Last Edit: July 09, 2023, 04:50:29 pm by rjzak »
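
The `CDB:` lines in the logs above can be decoded to see exactly which request was in flight when the controller went away. This is a minimal sketch, assuming the standard 16-byte READ(16)/WRITE(16) layout (opcode in byte 0, big-endian LBA in bytes 2 to 9, big-endian transfer length in bytes 10 to 13) and 512-byte logical sectors. The function name and the sector-size default are assumptions for illustration.

```python
OPCODES = {0x88: "READ(16)", 0x8A: "WRITE(16)"}

def decode_rw16_cdb(cdb_hex, sector_size=512):
    """Decode a READ(16)/WRITE(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] not in OPCODES:
        raise ValueError("not a READ(16)/WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # starting logical block
    blocks = int.from_bytes(cdb[10:14], "big")  # transfer length in blocks
    return {
        "op": OPCODES[cdb[0]],
        "lba": lba,
        "blocks": blocks,
        "bytes": blocks * sector_size,  # assumes 512-byte logical sectors
    }

# CDBs copied from the two failures logged in this thread.
print(decode_rw16_cdb("8a 00 00 00 00 00 52 40 59 00 00 00 08 00 00 00"))
print(decode_rw16_cdb("88 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00"))
```

The write decodes to LBA 1379948800 and 2048 blocks, which lines up with the `I/O error, dev sda, sector 1379948800` message and the `dma 1048576` size on the failed WRITE FPDMA QUEUED command. The earlier read decodes to LBA 0 and 256 blocks, matching `dma 131072`. So the failures hit ordinary reads and writes rather than one particular bad sector.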

rjzak

« Reply #5 on: July 09, 2023, 05:59:10 pm »
The attachments show that Petitboot is able to see the Seagate drive. It seems to mount fine as read-only. Trying to re-mount as read-write shows a lot of errors.

I also tried using a Samsung 850 SSD, and that didn't show up in Petitboot.

I connected the SATA controller and Seagate HDD back to my x86 machine, mounted the filesystem, and was able to compile some Rust projects on it without issues, and without complaints in dmesg.

Please advise: what might I be doing wrong? Which SATA controller should I use?

xilinder

« Reply #6 on: July 10, 2023, 07:54:17 am »
I've been using a JMicron JMB363 SATA/IDE controller (rev 2) for a couple of years now without problems. That is where my DVD drive is connected.
I also have the on-board Adaptec SATA controller that came with the Talos II.

Wish I could help. :(
Talos II 2x8, 32GB RAM, onboard Microsemi RAID,  AMD WX7100, J.Micron SATA/PATA PCIe adapter. Debian with Mate.

rjzak

« Reply #7 on: July 15, 2023, 09:55:53 am »
Quote from: xilinder on July 10, 2023, 07:54:17 am
I've been using a JMicron JMB363 SATA/IDE controller (rev 2) for a couple of years now without problems. That is where my DVD drive is connected.
I also have the on-board Adaptec SATA controller that came with the Talos II.

Wish I could help. :(

Something like this? https://www.amazon.com/Expansion-ATA133-ESATA-JMB363-Adapter/dp/B09W5TKT15/

Your Talos II has a SATA connector? Mine does not, sadly. I had the option of the SAS controller, and I regret not taking it.

ClassicHasClass

« Reply #8 on: July 15, 2023, 11:47:24 am »
There may be some kernel variations here. Petitboot unfailingly sees my Raptor BTO Marvell card, but with recent Linux kernels it's been hit or miss and I'm not sure what the pattern is.

xilinder

« Reply #9 on: July 16, 2023, 08:33:24 am »
@rjzak
Yes, that's the one. Mine is older and has both SATA slots internal.
Caution: you can only use SATA or PATA, not both. But it works.