Why you should back up your RAID 1 mirroring array

Discussion in 'other software & services' started by Devinco, Oct 11, 2005.

Thread Status:
Not open for further replies.
  1. Devinco

    Devinco Registered Member

    Jul 2, 2004
    Even if you have a RAID 1 mirroring array, you still need to back up your data (preferably off-site).
    While mirroring protects you against data loss from a single hard drive failure, there are many other ways you could still lose data.
    The odds of 2 hard drives failing at exactly the same moment are very small. But a flaky power supply, a power spike, RAID controller failure, malware infection, accidental deletion, data corruption, theft, or natural disaster could all cause data loss in your RAID 1 array.
    You can reduce these risks by having a quality power supply, UPS, and a reasonable layered security setup.

    A while ago, I took all these precautions and still lost data. Below is what happened and what to do better.
    The problem came from a totally unexpected source.
    Here was the setup:
    Gigabyte motherboard, 3.0 GHz P4 (no overclocking), Silicon Image 3112 SATA RAID controller, 2 HighPoint RocketHead 100 SATA to PATA adapters, 2 PATA mobile racks, 2 Seagate 7200.7 160GB PATA drives.
    At the time, they didn't make a lot of SATA mobile racks, so I opted for some nice PATA mobile racks and the RocketHead 100 adapters. The adapters fit well (not loose), but just to be sure, I used a little electrical tape to hold them to the mobile rack (not blocking air flow) so they wouldn't be accidentally knocked loose while working on the computer.
    The system worked flawlessly for several months.
    Then on one boot, after POST, the Silicon Image controller showed drive 1 of the array as 153GB (normal) and drive 2 as only 136GB. The system did not halt; the drive listing simply flashed by during the normal boot process. No warnings, just the drives were different sizes. Previously, they both showed 153GB. Warm boot, cold boot, the size difference remained. The system appeared to operate normally. But I recall slower Windows startups, more activity from the drive LED on the drive 1 rack (the good one), and little or no activity from the drive 2 rack.
    I tried removing and reinserting the mobile racks (power off), same problem.
    I tried swapping the mobile racks, same problem.
    So I rebuilt the array from the good drive 1.
    The problem disappeared and did not reappear until a few months later.
    Same problem, same solution. Very intermittent, no specific rhyme or reason.
    Sometimes the problem didn't recur for a month, sometimes for 3 months; there was no pattern to it.
    I even replaced the complete drive 2 mobile rack, but the problem remained.
    The data was fine, so unfortunately, I didn't look into it any further.
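    Looking back, the size mismatch was the early warning I should have acted on. As a rough illustration only, here is a minimal Python sketch of that check. It assumes you already have the capacity each mirror member reports (for example, read off the controller's boot screen). The function name and the tolerance value are made up for this example and are not part of any RAID tool.

```python
def mirror_sizes_match(sizes_gb, tolerance_gb=0.5):
    """Return True when all mirror members report about the same capacity.

    sizes_gb: capacities reported for each drive in the mirror, in GB.
    tolerance_gb: hypothetical allowance for rounding in the reported values.
    """
    if len(sizes_gb) < 2:
        raise ValueError("a mirror needs at least two drives to compare")
    # Members of a healthy RAID 1 pair should report the same size.
    return max(sizes_gb) - min(sizes_gb) <= tolerance_gb
```

    With the numbers from my boot screen, mirror_sizes_match([153, 153]) passes and mirror_sizes_match([153, 136]) does not. A mismatch like that is a reason to stop and investigate, not just rebuild.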

    Several months later on boot up, the RAID controller reported the RAID array had failed and the system halted. That's when I made my next mistake by acting too quickly and trying to rebuild the array the same way that had fixed the problem previously. I rebuilt the array from drive 1, but after the rebuild, the array was blank, empty, no data!!!
    In an attempt to recover the data, I broke the RAID 1 array and tried to use SpinRite 6.0 on the individual drives (SpinRite doesn't work with RAID). But it was too late, the data was gone. If I hadn't rushed to rebuild the array, the data might have been recovered by SpinRite. SpinRite showed the drives to be in good health, so it wasn't a hard drive failure.
    It may have been the Silicon Image 3112 RAID controller, but more than likely, it was one or both of the SATA to PATA adapters.
    It was a bad idea to begin with. SATA is timing sensitive and RAID is timing sensitive, so why put a cheap converting adapter right in the middle of an important data flow?

    I was able to recover older data from a DVD backup, but still, quite a lot was lost. I disabled the Silicon Image RAID controller and instead hooked up the PATA mobile racks directly to the (slower) GigaRAID PATA RAID controller. It was plenty fast and has worked perfectly ever since. No more synchronization problems, no problems at all. And I now back up this RAID 1 array regularly.
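    A backup only helps if it actually matches the data. As a minimal sketch (not the tool I used, just one way to do it), the Python below compares every file under a source folder with its copy in a backup folder using SHA-256 hashes. The function names and the idea of a plain folder-to-folder backup are assumptions for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing or different in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        # Flag files the backup lacks or whose contents do not match.
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(str(rel))
    return problems
```

    An empty list from verify_backup(source, backup) means every source file has an identical copy in the backup. Anything listed needs to be copied again before you rely on that backup.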

    So please remember to back up your RAID 1 array. It can fail when you least expect it, even if you take precautions.
    And if you have a data loss event, don't act too quickly.
    Stop and think it over so you can choose the right action.

    Hope this helps somebody.