Irreversible mdadm failure recovery

Introduction

If you feel comfortable using overlays, doing so is always a good idea (so that no accidental write can reach the real members).
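
As a minimal, hedged sketch of one way to set up such an overlay (the member /dev/sdb1, the file path and the mapping name below are placeholders), a device-mapper snapshot backed by a sparse file absorbs every write while reads still come from the real member:

truncate -s 50G /tmp/overlay-sdb1                      # sparse file that will absorb the writes (size is an example)
loopdev=$(losetup --find --show /tmp/overlay-sdb1)
dmsetup create sdb1-overlay --table "0 $(blockdev --getsz /dev/sdb1) snapshot /dev/sdb1 $loopdev P 8"
# then use /dev/mapper/sdb1-overlay instead of /dev/sdb1 in the commands below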

All the information and situations you can find here were seriously tested on mdadm 3.4-4 and 4.1-1 before being published.

Coming here is normally not needed, unless the conventional mdadm ways have no chance to work anymore (we assume that you explored other solutions before coming here). This section is about searching for the right parameters to play --create over an existing array, in order to get back a correctly configured access to whatever data is still available (possibly the entire filesystem, if it is still there).

Be warned: playing mdadm --create erases and overwrites at least the RAID array information area of every involved RAID member. A misuse of mdadm --create may be the very reason why you are here. Without the --assume-clean parameter, one member's data area can be entirely reconstructed (sometimes it's fine, sometimes it's not, so it's better to avoid it). And if you write anything to the array before checking everything, it can cause problems if the create parameters were incorrect: this is why you need to be so careful.

If the correct RAID array information is still available (then are you sure nothing else can be done by conventional means?), to save some time, please keep the output of cat /proc/mdstat (to record the member order) and the output of mdadm --examine against each of your RAID members. This must be done BEFORE calling any mdadm --create command. The Creation and Update times are provided in the output, so you can easily know whether it still contains your original array information, or whether it's already lost/overwritten.
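
For example (the member names are placeholders), both outputs can be written to files before anything else is attempted:

cat /proc/mdstat > mdstat-before.txt
mdadm --examine /dev/sdb1 /dev/sdc1 /dev/sdd1 > examine-before.txt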

Members can be entire drives or partitions on them. If you erased a GPT table describing a partition by mistake, or removed the partition by mistake, please create it again, unformatted, at the same place (and with the same size) it had when used as a RAID member. The content of the partition may then be completely or mostly untouched.
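
As a hedged sketch (the partition number, start sector and end sector below are pure placeholders: they must match exactly what the original partition looked like), a deleted GPT partition can be recreated in place without touching its contents, for instance with sgdisk:

sgdisk --new=1:2048:15628053134 --typecode=1:fd00 /dev/sdb    # recreate partition 1 at its original boundaries, type "Linux RAID"
partprobe /dev/sdb                                            # make the kernel re-read the partition table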

Situation analysis

A RAID array member has two areas: the RAID array information (which may be lost if you are here) and the Data area. As long as the Data area is not erased, you can consider the member as useful for the following steps. If the Data area is erased, it should be considered as missing for the following steps (and if the missing member is mandatory, don't quit this page before taking a look at the "Unlucky situations" section: maybe some files can be recovered).

You should be able to recover full access to your filesystem as long as:

  • Your array is redundant and you didn't lose/overwrite any member data area
  • Your array isn't redundant but, hopefully, you didn't lose/overwrite any member data area
  • Your array is redundant and you have lost/overwritten one member data area, but there are still enough other members left to get your data back
  • The data inside your members is arranged in one consistent and predictable way across the whole array

If you aren't that lucky, don't quit this page before taking a look at the "Unlucky situations" section: there are still several things to verify before giving up.

Parameters recovery

Playing --create over an existing array requires you to know:

  • The number of members in your array
  • The RAID level of your array
  • The chunk size of your array if using RAID0, RAID4, RAID5, RAID6, or RAID10 (generally 512K on modern mdadm versions, at least for 3.4-4 and 4.1-1)
  • The metadata version of your array (depends on the mdadm version initially used to create the array - 1.2 on modern versions, at least for 3.4-4 and 4.1-1)
  • The layout of your array if using RAID5, RAID6, or RAID10 (generally left-symmetric by default for RAID5 and 6; it can be near=2 for RAID10 if your 4 members are A, A, B, B)
  • The data offset of your array (it depends on the size of the members used at the initial array creation - even if members have since been replaced by bigger ones and grow --size=max was used! It also depends on the version of mdadm you were using at the initial creation)
  • The position of the different members in the array (if it changed on the motherboard since the initial creation, you may need to try several orders. If not, you probably used alphabetical order)
  • Whether any of your members is missing or has had its Data area overwritten by something like a wrong command, and which one, because it should be declared as "missing" when running --create. If you don't know which one, you will have to make more attempts.
  • --assume-clean will give you the ability to make several attempts without having the last member rebuilt

If nothing changed on your system (member sizes, mdadm version...) and you used default parameters, then letting mdadm choose the defaults again should be fine and apply the right parameters.

Data-offset

If you changed mdadm version since the initial array creation, the data offset is likely to have changed. I have seen 8192s when creating with tiny members (5000 MiB) on mdadm 3.4-4 (while 4.1-1 defaults to 10240s for the same size, and 18432s for 10000 MiB), and 262144s for 8TB drives with mdadm 3.4-4. I didn't try creating with 8TB members on mdadm 4.1-1. Maybe a formula can be found in the different code versions?
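
If you can run the mdadm version that originally created the array (for instance in a virtual machine), one hedged way to find the default data offset it would have picked is to create a throwaway array on sparse files of the same member size and read the value back; the sizes, paths and the /dev/md9 name below are placeholders:

truncate -s 8000G /var/tmp/m1.img /var/tmp/m2.img /var/tmp/m3.img    # sparse files, ideally the exact size of the original members
l1=$(losetup --find --show /var/tmp/m1.img)
l2=$(losetup --find --show /var/tmp/m2.img)
l3=$(losetup --find --show /var/tmp/m3.img)
mdadm --create /dev/md9 --level=5 --raid-devices=3 --assume-clean $l1 $l2 $l3
mdadm --examine $l1 | grep -i 'data offset'    # e.g. "Data Offset : 262144 sectors"
mdadm --stop /dev/md9
losetup -d $l1 $l2 $l3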

Example commands

The easy one

Using Debian 9 with mdadm 3.4-4, a RAID array was created with 3 x 8TB members, using:

mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1

If nothing changed and I'm still using the same mdadm version, typing the same command again (but appending --assume-clean for safety) will be fine:

mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1 --assume-clean

Gnome-disks shows the filesystem again and I can mount it (trying read-only first: mount -o ro,noload /dev/mdXX /media/raid-volume): everything is fine and every member is in the right place in the array. Then do not forget to run a filesystem check and repair once you have ensured that you are using the correct parameters, in case the array previously suffered a brutal interruption (see the example below).
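
For example, assuming an ext4 filesystem on /dev/md0 (adapt the tool and device to your own setup), a read-only check first, then the real repair once the parameters are confirmed:

fsck.ext4 -n -f /dev/md0    # dry run: report problems, change nothing
fsck.ext4 -f /dev/md0       # actual repair, only once the parameters are confirmed correct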

A more precise one

To create the array again after upgrading from mdadm 3.4-4 to mdadm 4.1-1, the data offset has to be specified, because the value automatically selected for a given member size changed between these versions.

Also, the disk positions changed on my motherboard. After a careful analysis of the previous version's default parameters for this array, the working command that finally found my array back was:

mdadm --create /dev/md0 --level=5 --chunk=512K --metadata=1.2 --layout=left-symmetric --data-offset=262144s --raid-devices=3 /dev/sdd1 /dev/sde1 /dev/sdb1 --assume-clean

One non-mandatory member is missing

If the second disk is missing, type missing instead of its block device name, as shown below. On a redundant RAID array, the data will still be available. Of course, once your array is working again, you will sooner or later have to --add the missing member back for safety, but only after you have ensured that the parameters are OK and your data is available. The member you add will be rebuilt with consistent data.
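
Continuing the example above (device names are placeholders), with the second member declared missing, then re-added once everything has been verified:

mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 missing /dev/sdd1 --assume-clean
# later, once the parameters and the data have been checked:
mdadm --add /dev/md0 /dev/sdc1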

The array was created with 2 disks, and grown to 3

Having the array initially created with 2 disks, then the 3rd one added and the array grown, gives exactly the same final arrangement as a 3-disk array creation (see the sketch below).
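
For reference, a hedged sketch of the creation-then-grow sequence that leads to this arrangement (device names are placeholders); when recovering such an array, you can therefore --create it directly with --raid-devices=3 as in the previous examples:

mdadm --create /dev/md0 --level=5 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --add /dev/md0 /dev/sdd1
mdadm --grow /dev/md0 --raid-devices=3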

About accidental rebuild

If you forget the --assume-clean parameter, in the case of RAID 5 for example, the last RAID member given as a parameter will be rewritten by reading and processing the data of the other given members.

In case of RAID 5:

  • Even with the wrong order, the wrong chunk size, or any wrong RAID5 or RAID4-like layout, if the other members' data is fine and the number of members is correct, the parity calculation will still be correct, although the computer may not interpret the data in the right manner (which bit is parity and which is data) until you find the right parameters.
  • But if you forget the --assume-clean parameter in other cases (like an incorrect number of disks, a wrong RAID level, or reading a lost member to calculate and rewrite over a good one), consider the overwritten disk as missing.

If correct data seems to have been erased from one of your members and that member wasn't mandatory (the remaining good members are enough), you will be able to resync it later by --adding it as a new one, but only once your correct parameters are found and your array is working again. If the overwritten member was mandatory, hope you ran mdadm --stop in time. Please find a way to verify whether the data is really gone, and if it is, go to the "Unlucky situations" section.

Help needed, no combination is giving back access to the data

If you have 3 members and you are sure that the data is still available on all of them, try every order: 123 132 213 231 312 321.

If it's not working, you should find a way to make sure the data offset (and maybe other values) is correct, by recreating your original array creation environment to find what default values mdadm probably used (as sketched in the Data-offset section above). If your values are good, your data is back. If one of the members has its data missing and you know which one (if not, it takes longer), check every combination:

Member 1/3 is missing: X23, X32, 2X3, 23X, 3X2, 32X

Member 2/3 is missing: 1X3, 13X, X13, X31, 31X, 3X1

Member 3/3 is missing: 12X, 1X2, 21X, 2X1, X12, X21

The numbers stand for the different members' block device names, and X stands for the word "missing", representing the parameter order in the --create command.
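
As a hedged sketch (assuming a 3-member RAID5 whose other parameters are already known, overlays set up as in the introduction, and placeholder names and values), the six orders can be tried in a loop, keeping the ones where a filesystem is recognised:

members="/dev/mapper/sdb1-overlay /dev/mapper/sdc1-overlay /dev/mapper/sdd1-overlay"
for a in $members; do for b in $members; do for c in $members; do
  [ "$a" = "$b" ] || [ "$a" = "$c" ] || [ "$b" = "$c" ] && continue    # skip non-permutations
  mdadm --create /dev/md9 --run --level=5 --raid-devices=3 --assume-clean \
        --chunk=512K --data-offset=262144s "$a" "$b" "$c"
  fsck -n /dev/md9 >/dev/null 2>&1 && echo "Candidate order: $a $b $c"
  mdadm --stop /dev/md9
done; done; done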

Unlucky situations

The RAID array is split into 2 arrangements

  • Did the reshape/conversion actually start? If you are sure that the answer is no (stuck at 0% with no disk activity, the backup file is there but contains nothing, nothing started and it's completely stuck), then things will be easy: find the parameter set of the unchanged arrangement.
  • If the reshape/conversion actually started and moved some data, this kind of interruption can normally be restarted just by assembling the array again, then mounting/using it as if nothing happened. If that doesn't work, see if there are parameters to force things to continue: don't follow this guide unless the RAID information of your members has already been erased, or you're sure that nothing else worked.
  • If a member failed during reshaping and your array is redundant, don't play --create! You can add a new member: the reshape will finish in degraded mode, and the new member will be integrated and rebuilt just after.
  • If you have no other choice than playing --create with 2 different arrangements over the 2 halves of the array, unless someone has a better idea, you're falling into forensics: you'll need to find 2 sets of parameters, and will probably need to use a file recovery scanner tool on the array with each parameter set. Be brave! The second set of parameters will probably be the same as the first one, with 1 more member (and the raid-devices count set to one more).

Mandatory member lost

So, your RAID is not redundant (or not redundant enough), and a mandatory member (or the data area on it) is screwed.

  • Is your RAID member partially screwed/overwritten? If mdadm --create (or anything else) started to overwrite it, but you ran mdadm --stop before it was entirely overwritten (or stopped whatever was overwriting its data), some of the data is still available. Find your correct parameters back (you won't see any ext4 partition, so it will be harder to know whether you have the right ones) and use a file recovery scanner tool over the resulting array. You may still have the surprise of finding some files that aren't completely erased.
  • Also, remember that having mdadm --create accidentally overwrite one of the disks should be avoided, but if it happens, in some cases the calculated data written to the disk may be the same as the data that was already on it. In that case, there is a chance that your data is totally fine. You can always try to use the member as if it were OK: if it is, your filesystem will be made available again.
  • If one of the disks physically failed, but the data is still enclosed in it, see whether the raw data can be rescued and cloned by a specialised company, or whether the readable sectors can be rescued with ddrescue (see the example after this list). See Replacing a failed drive for more detailed information about drive failure, including failure in non-redundant arrays.
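
A hedged example with GNU ddrescue, cloning the readable sectors of a failing member onto a replacement disk of at least the same size (device and file names are placeholders; the map file makes the copy resumable):

ddrescue -f /dev/sdb /dev/sde sdb-rescue.map       # first pass: copy the easy areas, record the bad ones in the map file
ddrescue -f -r3 /dev/sdb /dev/sde sdb-rescue.map   # second pass: retry the bad areas up to 3 times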

Making sure that none of my arrays will ever be lost again in the future

You can't, because it won't always be your fault. Whether or not you succeeded in your recovery, you are probably interested in clever backup approaches to secure the recovered or future data.

Indeed, a defective power supply can destroy some of your disks. A violent shock to the case containing all the disks can cause several of them to fail, and so can a fire. The server can be stolen (with the disks inside it). Ransomware with access to part of the filesystem, even through network file sharing, can also destroy all of the writable files. A wrong dd or shred, or swapping the wrong "failed" disk and opening it to play with what's inside before realizing it wasn't the failed one. A memory module malfunction during a rebuild can screw the data of one of your members without letting you know what happened. And maybe other things.

A free open source versioned continuous copy system like Syncthing (in case files are being encrypted, the previous version is kept and timestamped) isn't officially meant to protect you in this kind of case, but still does it pretty well. Conventional backup systems and solutions can also be used. In case of a malicious attack, remember to use a backup system that is different enough from the system you are trying to back up (not the same password, not the same config, not the same network, not the same place: otherwise it would be like a RAID1 on 2 partitions of the same disk).
