Why RAID?

From Linux Raid Wiki

Revision as of 12:26, 3 October 2006


Why RAID?

There can be many good reasons for using RAID. A few are the ability to combine several physical disks into one larger "virtual" device, performance improvements, and redundancy.

It is, however, very important to understand that RAID is not a general substitute for good backups. Some RAID levels will make your systems immune to data loss from one or two disk failures, but RAID will not allow you to recover from an accidental "rm -rf /". RAID will also not help you preserve your data if the server holding the RAID itself is lost in one way or another (theft, flooding, earthquake, Martian invasion, etc.).

RAID will generally allow you to keep systems up and running, in case of common hardware problems (disk failure). It is not in itself a complete data safety solution. This is very important to realize.


Device and filesystem support

Linux RAID can work on most block devices. It doesn't matter whether you use SATA, USB, IDE or SCSI devices, or a mixture. Some people have also used the Network Block Device (NBD) with success.

Since a Linux Software RAID device is itself a block device, the above implies that you can actually create a RAID of other RAID devices. This in turn makes it possible to support RAID-10 (RAID-0 of multiple RAID-1 devices), simply by using the RAID-0 and RAID-1 functionality together. Other more exotic configurations, such as RAID-5 over RAID-5 "matrix" configurations, are equally supported.
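
As a rough sketch of the layering described above, a RAID-10 could be assembled by hand from two RAID-1 pairs with a RAID-0 on top. The device names (/dev/sda1 through /dev/sdd1 and /dev/md0 through /dev/md2) are assumptions chosen for illustration. The script only prints the mdadm commands by default, because creating an array destroys existing data on the member devices.

```shell
#!/bin/sh
# Dry-run helper: print each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

plan_raid10() {
    # Two mirrored pairs (hypothetical partitions).
    run mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    run mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
    # A stripe across the two mirrors: md devices are ordinary block devices,
    # so they can be members of another array.
    run mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1
}

plan_raid10
```

Running the script with DRY_RUN=0 as root would execute the commands instead of printing them; the resulting /dev/md2 could then be given a filesystem like any other block device.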

The RAID layer has absolutely nothing to do with the filesystem layer. You can put any filesystem on a RAID device, just like any other block device.

Performance

Often RAID is employed as a solution to performance problems. While RAID can indeed often be the solution you are looking for, it is not a silver bullet. There can be many reasons for performance problems, and RAID is only the solution to a few of them.

See "The RAID levels" section of the Introduction for a mention of the performance characteristics of each level.


Swapping on RAID

There's no reason to use RAID to improve swap performance. The kernel itself can stripe swap across several devices if you give them the same priority in the /etc/fstab file.

A nice /etc/fstab looks like:

 /dev/sda2       swap           swap    defaults,pri=1   0 0
 /dev/sdb2       swap           swap    defaults,pri=1   0 0
 /dev/sdc2       swap           swap    defaults,pri=1   0 0
 /dev/sdd2       swap           swap    defaults,pri=1   0 0
 /dev/sde2       swap           swap    defaults,pri=1   0 0
 /dev/sdf2       swap           swap    defaults,pri=1   0 0
 /dev/sdg2       swap           swap    defaults,pri=1   0 0


This setup lets the machine swap in parallel on seven SCSI devices. No need for RAID, since this has been a kernel feature for a long time.
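
Since the striping depends on every swap entry carrying the same priority, it can help to list the priorities in one place. The following is a minimal sketch that does so for an fstab-style file; the function name swap_priorities and the two-disk sample input are made up for illustration.

```shell
#!/bin/sh
# Print "device priority" for every swap entry in fstab-style input.
# Entries without an explicit pri= option are reported as "unset".
# Reads the files given as arguments, or standard input if there are none.
swap_priorities() {
    awk '$1 !~ /^#/ && $3 == "swap" {
        pri = "unset"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^pri=/) pri = substr(opts[i], 5)
        print $1, pri
    }' "$@"
}

# Sample input (hypothetical layout); on a real system one would run
#   swap_priorities /etc/fstab
swap_priorities <<'EOF'
/dev/sda1       /              ext3    defaults         0 1
/dev/sda2       swap           swap    defaults,pri=1   0 0
/dev/sdb2       swap           swap    defaults,pri=1   0 0
EOF
```

If the second column is not the same for every device, the kernel will prefer the higher-priority devices rather than spreading pages across all of them. The priorities actually in effect on a running system are listed in /proc/swaps.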

One good reason to use RAID for swap is high availability. If you set up a system to boot from, e.g., a RAID-1 device, the system should be able to survive a disk crash. But if the system has been swapping on the now faulty device, it will almost certainly go down. Swapping on a RAID-1 device solves this problem.

There has been a lot of discussion about whether swap is stable on RAID devices. The debate continues because the answer depends heavily on other aspects of the kernel as well. As of this writing, swapping on RAID appears to be perfectly stable; you should, however, stress-test the system yourself until you are satisfied with its stability.

You can set up swap in a file on a filesystem on your RAID device, or you can use a RAID device directly as a swap partition, as you see fit. As usual, the RAID device is just a block device.
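
For the second option, a mirrored swap device might be prepared roughly as follows. This is only a sketch: /dev/md1, /dev/sda2 and /dev/sdb2 are assumed names, and the commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

plan_mirrored_swap() {
    # Mirror two swap-sized partitions (hypothetical names).
    run mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
    # Initialise the mirror as swap space and enable it.
    run mkswap /dev/md1
    run swapon /dev/md1
}

plan_mirrored_swap

# A matching /etc/fstab entry would look like:
#   /dev/md1        swap           swap    defaults         0 0
```

With this layout the failure of either disk leaves the swap area intact, at the cost of writing every swapped page twice.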


Why mdadm?

mdadm is now the standard software RAID management tool for Linux. The older raidtools package is deprecated, and its code does not appear to have been worked on since January 2003. Although raidtools will still work for basic use, bugs have been reported where it does not handle new features correctly.

The mdadm tool was written by Neil Brown, a software engineer at the University of New South Wales and a kernel developer. See http://www.kernel.org/pub/linux/utils/raid/mdadm/ANNOUNCE for the latest version.

The main differences between mdadm and raidtools are:

  • mdadm can diagnose, monitor and gather detailed information about your arrays
  • mdadm is a single centralized program rather than a collection of separate programs, so every RAID management command shares a common syntax
  • mdadm can perform almost all of its functions without a configuration file and does not use one by default
  • if a configuration file is needed, mdadm will help manage its contents
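
To give a feel for the points in the list above, here is a hedged sketch of the commands one might use to inspect and monitor arrays. The array /dev/md0, the member /dev/sda1 and the mail recipient root are assumptions for illustration, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print each command unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

show_array_commands() {
    # Kernel's own summary of all arrays and any resync in progress.
    run cat /proc/mdstat
    # Detailed state of one array (hypothetical /dev/md0).
    run mdadm --detail /dev/md0
    # RAID superblock stored on one member device (hypothetical /dev/sda1).
    run mdadm --examine /dev/sda1
    # Describe the running arrays as ARRAY lines in configuration-file syntax.
    run mdadm --detail --scan
    # Watch all arrays in the background and mail alerts to root.
    run mdadm --monitor --scan --daemonise --mail=root
}

show_array_commands
```

All five operations go through the same program with one option syntax, and the output of the --detail --scan form can be reviewed and copied into the configuration file if one is wanted.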