RAID using SX6000

andersbodilsen

What is this storage?
Joined
Jan 1, 2005
Messages
4
Hi all.

First, all the specs.

RH9 Linux server consisting of:
PSU: ATX with a max load of 360W
Motherboard: Asus A7M266-D (with only one 1900+ MP CPU)
RAM: 1024MB
RAID controller:
Promise SuperTrak SX6000
BIOS version: 1.20.0.4
Cache: 64MB RAM
HDDs: Maxtor DiamondMax Plus 9 120GB

Partition table.

/ 5GB
/home 5GB
/var 5GB
/etc 5GB
/boot 100MB
SWAP 1GB
/usr 10GB
/var/ftp 575GB
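
As a quick sanity check, the partition sizes listed above can be added up and compared with the capacity the kernel reports for the array in the boot log further down (1191406080 sectors of 512 bytes). This is only a rough sketch: it assumes the sizes in the list are decimal units (1 GB = 1000 MB), which the post does not state, and the variable names are made up for illustration.

```python
# Partition sizes from the list above, in MB.
# Assumption: 1 GB = 1000 MB (the post does not say which convention is used).
partitions_mb = {
    "/": 5_000,
    "/home": 5_000,
    "/var": 5_000,
    "/etc": 5_000,
    "/boot": 100,
    "swap": 1_000,
    "/usr": 10_000,
    "/var/ftp": 575_000,
}

# Values taken from the kernel's "SCSI device sda" line in the boot log.
sectors = 1_191_406_080
sector_size = 512

total_partitions_mb = sum(partitions_mb.values())
array_bytes = sectors * sector_size
array_mb = array_bytes / 1_000_000  # decimal megabytes

print(f"Sum of partitions: {total_partitions_mb} MB")
print(f"Array capacity:    {array_mb:.0f} MB ({array_bytes} bytes)")
print(f"Unaccounted for:   {array_mb - total_partitions_mb:.0f} MB")
```

Under these assumptions the partitions add up to about 606 GB against roughly 610 GB reported by the kernel, so the listed sizes are plausible for this array.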


A few days ago I formatted my server. Well, not all the partitions anyway...
I did not format the "/var/ftp" partition. This is where I store backups of
all my configuration files, home pages, family photos, etc.

Usually it is not a problem to format the server and still keep the data stored
on the "/var/ftp" partition. I have done this three times before with this specific
setup without any problems occurring.
This time it only worked for about 24 hours.


Now I get a really strange error. The system runs through POST nice and smoothly. The array status is set to "Functional" and all the disks are found when the controller has finished its tests.
Then the system proceeds to load the operating system, and I get this error message:

Loading pti_st.o module
AMD756: dev 8086:1962, router pirq : 2 get irq : 10
PCI: Found IRQ 10 for device 02:05.1
IRQ routing conflict for 00:09.1, have irq 5, want irq 10
IRQ routing conflict for 00:05.1, have irq 5, want irq 10
Found PTI SuperTrak at mbase: 0xf70000000, irq 5.
scsi0: PROMISE SuperTrak SX6000 Driver
Vendor: PTI Model: SuperTrak Rev:
Type: Direct-Access ANSI SCSI revision: 02
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 1191406080 512-byte hdwr sectors(610000 MB)
Partition check:
sda: sda1 sda2 sda3 sda4 ( sda5 sda6 sda7 sda8 )
Loading jbd.o module
Journalled Block Device driver loaded
Loading ext3.o module
Mounting /proc filesystem
Creating block devices
Creating root devices
Mounting root filesystem
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.

write scsi: aborting command due to timeout: pid 122, scsi0, channel 0, id 0, lun 0
WRITE (10) 00 00 03 30 20 00 00 08 00
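
The timed-out command in the last two lines can be decoded by hand. A minimal sketch, assuming the nine bytes printed after "WRITE (10)" are bytes 1 to 9 of a standard 10-byte SCSI WRITE(10) command (flags, a 4-byte big-endian logical block address, group number, a 2-byte transfer length, control) and that the opcode byte itself is not printed. The function name is hypothetical.

```python
def decode_write10(tail_hex: str, sector_size: int = 512) -> dict:
    """Decode bytes 1..9 of a WRITE(10) CDB as printed in the log line.

    Assumption: the opcode byte is not part of tail_hex.
    """
    tail = bytes.fromhex(tail_hex)
    if len(tail) != 9:
        raise ValueError("expected the 9 bytes that follow the opcode")
    lba = int.from_bytes(tail[1:5], "big")     # CDB bytes 2-5: logical block address
    blocks = int.from_bytes(tail[6:8], "big")  # CDB bytes 7-8: transfer length
    return {
        "lba": lba,
        "blocks": blocks,
        "byte_offset": lba * sector_size,
        "length_bytes": blocks * sector_size,
    }


# Bytes copied from the log; 512-byte sectors as reported for sda.
info = decode_write10("00 00 03 30 20 00 00 08 00")
print(info)
```

Under these assumptions the failing request is an 8-sector (4 KiB) write at block 208928, roughly 102 MiB into the array. That says where on the logical device the write was aimed, but not which physical disk is involved.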

And now the system just beeps: beep, pause, beep, pause, beep, pause, and so on.

I thought "Hey, this is an IRQ conflict" and I began ripping out circuit boards until my SX6000 was the only one left in the case.
Just before booting I moved the RAID controller to another PCI slot.
I rebooted the system and hoped for the best, but the exact same thing happened...

Please help me fix this problem. I really would hate to lose all my family photos :(
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
22,297
Location
I am omnipresent
I suppose the safe thing to do would be to configure another PC to host the array temporarily, until you get your data off. Yes, that's a lot of work. Yes, your family photos are worth it (and a more permanent backup!).

With a functional Linux box only lasting 24 hours, I'd say you've got a serious problem somewhere. I'll bet the error message with your array isn't the only symptom, either.

I have a funny feeling it's probably the card or perhaps the memory on the card, but that's just a guess at this point. At any rate, you'll find out when you swap that hardware to a different PC.
 

andersbodilsen

Well, I have not had time to run the test yet...

But if I have to replace the controller with a new one, would it be possible for me to keep my data?
Is the information about the array kept on the controller or on the disks in the array?
 

Bozo

Storage? I am Storage!
Joined
Feb 12, 2002
Messages
4,396
Location
Twilight Zone
Usually the RAID setup is stored on the hard drives. Some of the better RAID cards have it on the cards.

Bozo :mrgrn:
 

Buck

Storage? I am Storage!
Joined
Feb 22, 2002
Messages
4,514
Location
Blurry.
Website
www.hlmcompany.com
I can't fix your problem, but it is good to see that the SX6000 card worked for you under RH9. I tried this about 2 years ago with RH8 and RH9, and it was a no-go. The card has been sitting in its retail box ever since. (Prior to that it was working flawlessly on a Windows NT 4.0 system.)
 

andersbodilsen

Borrowed a new SX6000 controller

Buck -> Thanks for your reply. It's always nice to get some feedback from other Linux users :) Until this unfortunate problem occurred, I had had my RH9 running for about a year.

Anyway, I have now managed to test the array with a new RAID controller (I borrowed one from work). But it did not make any difference. I still get the same error. This has to indicate that the array is messing things up... right?

Is it possible at all to save any of the data on the disks?
 

andersbodilsen

CLOSED

Hi all.

I just got my server up and running again... :lol: This issue was caused by the controller. It could not see that one of the disks was dead. It just went on as if nothing had happened.
I became aware of the problem when I decided to run the disk utility provided by Maxtor. I found that one of the disks had a surface problem.
I simply replaced this disk with a similar one, and then I was able to rebuild the entire array.

I have to admit that this episode has been quite scary for me, and I've already bought a tape drive for my Linux box.
No one knows how important backing up data is before they have lost
(or almost lost) some of their precious data.
 

Buck

Re: CLOSED

andersbodilsen said:
I have to admit that this episode has been quite scary for me, and I've already bought a tape drive for my Linux box.
No one knows how important backing up data is before they have lost
(or almost lost) some of their precious data.

Indeed!
 