Devices

Software RAID devices are so-called "block" devices, like ordinary disks or disk partitions. A RAID device is "built" from a number of other block devices - for example, a RAID-1 could be built from two ordinary disks, or from two disk partitions (on separate disks - please see the description of RAID-1 for details on this).

There are no other special requirements to the devices from which you build your RAID devices - this gives you a lot of freedom in designing your RAID solution. For example, you can build a RAID from a mix of IDE and SCSI devices, and you can even build a RAID from other RAID devices (this is useful for RAID-0+1, where you simply construct two RAID-1 devices from ordinary disks, and finally construct a RAID-0 device from those two RAID-1 devices).

Therefore, in the following text, we will use the word "device" as meaning "disk", "partition", or even "RAID device". A "device" in the following text simply refers to a "Linux block device". It could be anything from a SCSI disk to a network block device. We will commonly refer to these "devices" simply as "disks", because that is what they will be in the common case.

However, there are several roles that devices can play in your arrays. A device could be a "spare disk", it could have failed and thus be a "faulty disk", or it could be a normally working and fully functional device actively used by the array.

In the following we describe two special types of devices; namely the "spare disks" and the "faulty disks".

To avoid confusion it is worth mentioning the existence of MD Faulty - this is a special debugging type of RAID that uses a normal device and simulates faults.

Spare disks

Spare disks are disks that do not take part in the RAID set until one of the active disks fail. When a device failure is detected, that device is marked as "bad" and reconstruction is immediately started on the first spare-disk available.

Thus, spare disks add a nice extra safety to especially RAID-5 systems that perhaps are hard to get to (physically). One can allow the system to run for some time, with a faulty device, since all redundancy is preserved by means of the spare disk.

You cannot be sure that your system will keep running after a disk crash though. The RAID layer should handle device failures just fine, but SCSI drivers could be broken on error handling, or the IDE chipset could lock up, or a lot of other things could happen.

Also, once reconstruction to a hot-spare begins, the RAID layer will start reading from all the other disks to re-create the redundant information. If multiple disks have built up bad blocks over time, the reconstruction itself can actually trigger a failure on one of the "good" disks. This will lead to a complete RAID failure. If you do frequent backups of the entire filesystem on the RAID array, then it is highly unlikely that you would ever get in this situation - this is another very good reason for taking frequent backups. Remember, RAID is not a substitute for backups.

Faulty disks

When the RAID layer handles device failures just fine, crashed disks are marked as faulty, and reconstruction is immediately started on the first spare-disk available.

Faulty disks still appear and behave as members of the array. The RAID layer just treats crashed devices as inactive parts of the filesystem.

(Need comments on the UUU_ syntax and removing them)

Devices

Devices

Spare disks

Faulty disks

Views

Personal tools

Navigation

Search

Tools