Gilbo
Storage is cool
I'm installing a new Linux software RAID 5 in one of my fileservers. I've done this twice before. Partitioning the disks went fine. I built the array successfully once, using mdadm --create --force (on my third attempt), but every other time it gets built, three of the five disks are marked as failed immediately. The one time it built properly, I copied 364GB of my ripped DVD collection to it before it froze. When it rebooted, the three disks were marked failed.
cat /proc/mdstat:
Code:
Personalities : [raid5] [raid4]
md0 : active raid5 sdd1[5](F) sdc1[3] sdb1[6](F) sda1[7](F) hda1[0]
980446464 blocks level 5, 64k chunk, algorithm 0 [5/2] [U__U_]
unused devices: <none>
Obviously that doesn't make for a very useful RAID 5...
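For anyone who wants to check the same thing programmatically, here is a rough Python sketch that reads the mdstat text above and lists which members are flagged (F). The function name summarize_mdstat and the "usable" rule (a RAID 5 tolerates at most one missing member) are my own illustration, not something mdadm provides, and the regexes only cover the layout shown in this output.

```python
import re

# Sample taken verbatim from the /proc/mdstat output above.
MDSTAT = """\
Personalities : [raid5] [raid4]
md0 : active raid5 sdd1[5](F) sdc1[3] sdb1[6](F) sda1[7](F) hda1[0]
980446464 blocks level 5, 64k chunk, algorithm 0 [5/2] [U__U_]
unused devices: <none>
"""


def summarize_mdstat(text):
    """Return a dict per md array with failed/active members and device counts."""
    arrays = {}
    current = None
    for line in text.splitlines():
        # Array header line, e.g. "md0 : active raid5 sdd1[5](F) sdc1[3] ..."
        header = re.match(r"^(md\d+)\s*:\s*(\w+)\s+(\w+)\s+(.*)$", line)
        if header:
            name, _state, level, members = header.groups()
            failed, active = [], []
            # Each member looks like "sdd1[5]" with an optional "(F)" flag.
            for dev, flag in re.findall(r"(\w+)\[\d+\](\(F\))?", members):
                (failed if flag else active).append(dev)
            current = arrays[name] = {
                "level": level,
                "failed": sorted(failed),
                "active": sorted(active),
            }
            continue
        # Status line, e.g. "... [5/2] [U__U_]" = 5 expected, 2 currently up.
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if counts and current is not None:
            current["expected"] = int(counts.group(1))
            current["up"] = int(counts.group(2))
            # Assumption: single-parity RAID 5, so at most one member may be missing.
            current["usable"] = current["up"] >= current["expected"] - 1
    return arrays


if __name__ == "__main__":
    for name, info in summarize_mdstat(MDSTAT).items():
        print(name, info)
```

On a live system the text would come from reading /proc/mdstat instead of the embedded sample.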
dmesg notes the problem as being SCSI parity errors on the three bad disks:
Code:
ata4: command 0x35 timeout, stat 0xd0 host_stat 0x21
ata4: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata4: status=0xd0 { Busy }
sd 3:0:0:0: SCSI error: return code = 0x8000002
sdd: Current: sense key: Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sdd, sector 490234559
ATA: abnormal status 0xD0 on port 0x967
ATA: abnormal status 0xD0 on port 0x967
ATA: abnormal status 0xD0 on port 0x967
ata2: command 0x35 timeout, stat 0xd0 host_stat 0x21
ata2: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata2: status=0xd0 { Busy }
sd 1:0:0:0: SCSI error: return code = 0x8000002
sdb: Current: sense key: Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sdb, sector 490223295
ATA: abnormal status 0xD0 on port 0x977
ATA: abnormal status 0xD0 on port 0x977
ATA: abnormal status 0xD0 on port 0x977
ata1: command 0x35 timeout, stat 0xd0 host_stat 0x21
ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata1: status=0xd0 { Busy }
sd 0:0:0:0: SCSI error: return code = 0x8000002
sda: Current: sense key: Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sda, sector 490234559
ATA: abnormal status 0xD0 on port 0x9F7
ATA: abnormal status 0xD0 on port 0x9F7
ATA: abnormal status 0xD0 on port 0x9F7
ata4: command 0xea timeout, stat 0xd0 host_stat 0x0
ata4: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata4: status=0xd0 { Busy }
raid5: Disk failure on sdd1, disabling device. Operation continuing on 4 devices
ata2: command 0xea timeout, stat 0xd0 host_stat 0x0
ata2: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata2: status=0xd0 { Busy }
raid5: Disk failure on sdb1, disabling device. Operation continuing on 3 devices
ata1: command 0xea timeout, stat 0xd0 host_stat 0x0
ata1: translated ATA stat/err 0xd0/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata1: status=0xd0 { Busy }
raid5: Disk failure on sda1, disabling device. Operation continuing on 2 devices
RAID5 conf printout:
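To make the hex values in that log a bit easier to read, here is a small Python sketch that decodes the ATA status byte and groups the timed-out commands by port. The bit names follow the standard ATA status register layout, and 0x35 and 0xea are the WRITE DMA EXT and FLUSH CACHE EXT opcodes. The helper names (decode_status, timeouts_by_port) are made up for illustration, and the command table only covers the two opcodes that appear here.

```python
import re

# Standard ATA status register bits, highest bit first.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error
}

# Only the two opcodes seen in the dmesg output above.
ATA_COMMANDS = {0x35: "WRITE DMA EXT", 0xEA: "FLUSH CACHE EXT"}

# A few lines from the log above, including one where two messages ran together.
SAMPLE = """\
ata4: command 0x35 timeout, stat 0xd0 host_stat 0x21
ata4: command 0xea timeout, stat 0xd0 host_stat 0x0
raid5: Disk failure on sdd1, disabling device. Operation continuing on 4 devicesata2: command 0xea timeout, stat 0xd0 host_stat 0x0
"""


def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if value & bit]


def timeouts_by_port(log):
    """Map each ataN port to the list of commands that timed out on it."""
    # No word boundary before "ata" on purpose, so fused lines still match.
    pattern = re.compile(r"(ata\d+): command (0x[0-9a-fA-F]+) timeout")
    ports = {}
    for port, opcode in pattern.findall(log):
        code = int(opcode, 16)
        ports.setdefault(port, []).append(ATA_COMMANDS.get(code, opcode))
    return ports


if __name__ == "__main__":
    print(decode_status(0xD0))
    print(timeouts_by_port(SAMPLE))
```

Note that while BSY is set the other status bits are not considered valid, which is presumably why the kernel only prints { Busy } for 0xd0. Every failure here is a write or cache flush that timed out with the drive still busy.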
I also have difficulty accessing the disks: I can't erase the superblocks or build filesystems on the partitions. The dmesg output makes me believe it's a hardware error. However:
1. Three out of four new disks being bad would be very poor luck.
2. I don't know enough about Linux to rule out a software error (SCSI support is enabled, and so is the low-level SATA driver for AMD/NVidia).
I'm leaning towards cables right now, or motherboard headers, but I wanted to get opinions while I continue troubleshooting, along with advice on what to test.
P.S. I haven't once managed to shut the system down without it freezing on "Remounting remaining filesystems readonly ...". I don't know if that's related or useful, but hitting reset all the time is annoying.