RAID Recovery

From Linux Raid Wiki

Revision as of 00:24, 1 December 2010

When Things Go Wrong

RAID systems fail in two ways: failures that reduce the array's resilience, and failures that stop the RAID device from operating.

Normally a single disk failure degrades the RAID device, but the device continues operating (that is the point of RAID, after all).

However, there will come a point when enough component devices have failed that the RAID device stops working.

If this happens then, first of all: don't panic. Seriously. Don't rush into anything, and don't issue any commands that will write to the disks (such as mdadm -C, fsck, or even mount).

The first thing to do is to start preserving information. You'll need data from /var/log/messages, dmesg, and so on.
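
As a minimal sketch of this step, the shell function below copies the kernel log and the syslog into a directory of your choosing before anything else is run. The function name save_raid_logs and the output file names are illustrative assumptions, and /proc/mdstat is an extra source not listed above. Nothing here writes to the array's member disks, provided the destination directory lives somewhere else.

```shell
# save_raid_logs DEST
# Hypothetical helper: copy log data into DEST (which should NOT be on the
# failed array). It only reads from the system; nothing is written to the
# RAID member disks.
save_raid_logs() {
    dest=$1
    mkdir -p "$dest" || return 1

    # Kernel ring buffer; may need root on some systems, so errors are
    # captured in the file rather than aborting the function.
    dmesg > "$dest/dmesg.txt" 2>&1 || true

    # Current md status as the kernel sees it (assumption: md driver loaded).
    cat /proc/mdstat > "$dest/mdstat.txt" 2>&1 || true

    # Syslog location varies by distribution; copy it only if it is readable.
    if [ -r /var/log/messages ]; then
        cp /var/log/messages "$dest/messages.txt" || true
    fi

    echo "$dest"
}
```

It could be called as save_raid_logs /root/raid-recovery, where the path is only an example; any location on a disk that is not part of the array will do.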

Recreating an array

When an array is created, the data areas are not written to, *provided* the array is created in degraded mode, that is, with a 'missing' device.

So if you somehow screw up your array and can't remember how it was originally created, you can re-run the create command using various permutations until the data is readable.

This Perl script is an untested prototype: permute_array.pl
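
To make the permutation idea concrete, here is a dry-run sketch in shell. It is not the Perl prototype; it only prints candidate create commands and never runs them. The helper names, /dev/md0, RAID level 5, and the device names in the usage note are all hypothetical, and creation parameters other than level, device count, and slot order are left out.

```shell
# permute PREFIX ITEM...
# Print every ordering of ITEM..., one ordering per line.
# The body runs in a subshell "( ... )" so each recursion level keeps its
# own variables. Items must not contain whitespace.
permute() (
    prefix=$1
    shift
    if [ "$#" -eq 0 ]; then
        printf '%s\n' "${prefix# }"
    else
        i=1
        for item in "$@"; do
            # Collect every argument except the i-th one.
            rest=
            j=1
            for other in "$@"; do
                if [ "$j" -ne "$i" ]; then
                    rest="$rest $other"
                fi
                j=$((j + 1))
            done
            # $rest is deliberately unquoted so it splits back into words.
            permute "$prefix $item" $rest
            i=$((i + 1))
        done
    fi
)

# print_create_candidates MD_DEVICE LEVEL SLOT...
# Hypothetical helper: PRINT (never run) one "mdadm --create" command per
# ordering of the SLOTs. A SLOT is a device path or the word "missing".
print_create_candidates() {
    md=$1
    level=$2
    shift 2
    count=$#
    permute "" "$@" | while read -r order; do
        echo "mdadm --create $md --level=$level --raid-devices=$count $order"
    done
}
```

For example, print_create_candidates /dev/md0 5 /dev/sdb1 /dev/sdc1 missing prints six candidate commands, the first being mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 missing. Printing rather than executing keeps the decision to run any of them with the person doing the recovery.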

Preserving RAID drive configuration

One of the most useful things to do first when trying to recover a broken RAID array is to preserve the information reported in the RAID superblocks on each device at the time the array went down (and before you start trying to recreate the array). Something like

mdadm --examine /dev/sd[bcdefghijklmn]1 > raid.status

(adjusted to suit your drives) creates a file, raid.status, which is a sequential listing of the mdadm --examine output for every RAID device on the system, in order. More importantly, the file is still there five minutes later when we start messing with mdadm --create.
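
A slightly more defensive variant of that command, as a sketch: it examines each device in turn, stamps the output file with the current time, and write-protects it so that a later slip does not overwrite it. The function name save_examine and the file naming scheme are assumptions made for illustration.

```shell
# save_examine OUTDIR DEVICE...
# Hypothetical helper: record "mdadm --examine" output for each DEVICE in a
# timestamped, read-only file under OUTDIR and print that file's path.
save_examine() {
    outdir=$1
    shift
    out="$outdir/raid.status.$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$outdir" || return 1
    : > "$out" || return 1

    for dev in "$@"; do
        # --examine only reads the superblock; failures (for example a dead
        # disk) are recorded in the file instead of stopping the loop.
        mdadm --examine "$dev" >> "$out" 2>&1 || echo "examine failed: $dev" >> "$out"
    done

    chmod a-w "$out"
    echo "$out"
}
```

A call such as save_examine /root/raid-recovery /dev/sd[bcdefghijklmn]1 would mirror the one-liner above; the output directory is an example and should sit on a disk outside the array.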
