Hardware issues

From Linux Raid Wiki
Latest revision as of 11:20, 20 September 2016


Hardware issues

This section covers some of the hardware concerns involved in running software RAID. References to IDE and SCSI have been removed, since all recent drives are SATA.

If you are after high performance, you should use SSDs (or hybrid drives) and make sure the performance of the drives matches the performance of the bus. Many motherboards come with 6 SATA connectors, so setting up a RAID is easy and affordable.
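As a rough illustration of matching drives to the bus, the check below compares the combined sequential throughput of a set of drives with what the link to the controller can carry. The function name and every figure in the usage line are made up for the example; substitute numbers measured on your own hardware.

```shell
#!/bin/sh
# bus_headroom DRIVES PER_DRIVE_MBS BUS_MBS
# Prints the aggregate drive throughput and whether the bus can carry it.
# All three values are supplied by the caller; nothing is measured here.
bus_headroom() {
    drives="$1"; per_drive="$2"; bus="$3"
    total=$((drives * per_drive))
    if [ "$total" -le "$bus" ]; then
        echo "drives: ${total} MB/s, bus: ${bus} MB/s: bus is sufficient"
    else
        echo "drives: ${total} MB/s, bus: ${bus} MB/s: bus is the bottleneck"
    fi
}
```

For example, `bus_headroom 6 150 600` (six hypothetical drives at 150 MB/s each behind a 600 MB/s link) reports the bus as the bottleneck, because 900 MB/s exceeds 600 MB/s.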

See also the section on bottlenecks.

Drive Selection

Desktop and Enterprise drives

Disk drives now tend to come in two varieties: desktop drives, from which most of the features needed for a decent RAID have been removed, and enterprise drives, which have the features but are designed to run 24/7. So if you want to run RAID on a desktop system, it is rather difficult to find a suitable drive.

TLER and SCT/ERC

TLER (Time Limited Error Recovery) is a Western Digital (WD) creation that limits the time a drive spends on error recovery, so that the drive reports back within 7 seconds. Having introduced it, WD subsequently disabled it on most desktop drives, although it is enabled by default on enterprise drives.

SCT/ERC is the generic specification implemented by TLER.

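If this feature is available, it needs to be enabled. If it isn't enabled or available, the drive may spend longer on its own error recovery than Linux's default command timeout of 30 seconds allows. The Linux defaults then interact badly with the drive, and a single drive failure will usually take down the array.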
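A minimal sketch of how this is commonly handled at boot: try to set SCT ERC to 7 seconds with `smartctl -l scterc,70,70`, and if that fails, raise the kernel's command timeout for that drive instead. The sketch assumes smartctl exits with a non-zero status when the drive does not accept the command. The function name, the 180-second fallback, and the `SMARTCTL` and `SYS_BLOCK` override variables are choices made for this example.

```shell
#!/bin/sh
# SMARTCTL and SYS_BLOCK can be overridden for dry runs; by default the
# real smartctl binary and the real sysfs tree are used.
SMARTCTL="${SMARTCTL:-smartctl}"
SYS_BLOCK="${SYS_BLOCK:-/sys/block}"

fix_timeout() {
    dev="$1"             # e.g. /dev/sda
    name="${dev##*/}"    # e.g. sda
    # 70 means 7.0 seconds (units of 100 ms) for both reads and writes.
    if "$SMARTCTL" -l scterc,70,70 "$dev" >/dev/null 2>&1; then
        echo "$dev: SCT ERC set to 7 seconds"
    else
        # Assumed fallback value: give the drive 180 seconds before the
        # kernel gives up on a command.
        echo 180 > "$SYS_BLOCK/$name/device/timeout"
        echo "$dev: no SCT ERC, kernel timeout raised to 180 seconds"
    fi
}
```

Run as root, it would typically be called once per array member, for example `for d in /dev/sd?; do fix_timeout "$d"; done`. Neither setting survives a reboot, and some drives also forget the SCT ERC value after a power cycle, so the call belongs in a boot script.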

smartctl -x

This command tells you what the drive is capable of. If possible, look at its output for the drive(s) you are thinking of buying. The following is the output from my laptop's Toshiba drive. Note especially the line saying that SCT Error Recovery Control is supported.

crappit:/home/anthony # smartctl -x /dev/sda
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.1.27-27-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     ST2000LM003 HN-M201RAD
Serial Number:    S321J9DG805231
LU WWN Device Id: 5 0004cf 2106b38eb
Firmware Version: 2BC10001
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Sep 20 00:05:59 2016 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Disabled
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (22740) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 379) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
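When checking several drives, the capability line can be picked out of the output automatically. The helper below is a sketch that relies only on the wording shown in the output above; the function name is made up for the example, and other smartctl versions may phrase the line differently.

```shell
#!/bin/sh
# Reads `smartctl -x` output on stdin and succeeds (exit status 0) if the
# SCT Error Recovery Control capability line is present.
has_scterc() {
    grep -q 'SCT Error Recovery Control supported'
}
```

Typical use would be `smartctl -x /dev/sda | has_scterc && echo "sda supports SCT ERC"`.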


SATA Configuration (2011)

SATA is beginning to support a new feature called "port multipliers", which effectively multiplex several SATA disks onto the same host SATA port. This can reduce cabling concerns. It's also fairly common to see multi-port SATA controllers, which put 4 ports onto the connector originated by InfiniBand; this makes it possible to create 24-port SATA controllers, for instance.

Hot Swap (2011)

Note: for a description of Linux RAID hotplug support, see the Hotplug page.

Hot-swapping with SATA/SAS

Hotplug support is required by the SATA/SAS specifications, so SATA/SAS is the platform where hotplug should be least problematic. Even so, you can fall into non-compliance pitfalls, so read on before you start experimenting!

Hotplug support in mainboard/disk controller chipsets

Newer mainboard and disk controller chipsets and their drivers usually support hotplug.

If the chipset is AHCI-compliant, it will probably be able to use the ahci kernel module, which provides hotplug and power management support. The ahci module has been present in the Linux kernel since 2.6.19.

Still, not all chipsets support hotplug. Also, some chipsets that could in theory support hotplug (but are not AHCI-compliant) don't have the necessary support in the Linux kernel. For more information on the status of SATA drivers, see http://ata.wiki.kernel.org/index.php/SATA_hardware_features.
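One way to see whether a controller is actually bound to the ahci driver is to look for the "Kernel driver in use" line that `lspci -k` prints for each device. The helper below is only a sketch with a made-up name; it reports whether any device in the listing uses ahci, not which disk hangs off which controller.

```shell
#!/bin/sh
# Reads `lspci -k` output on stdin and succeeds if at least one device
# reports ahci as its kernel driver.
ahci_in_use() {
    grep -q 'Kernel driver in use: ahci'
}
```

Typical use would be `lspci -k | ahci_in_use && echo "an AHCI controller is active"`.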

Hotplug support in SATA/SAS disks

All current SATA and SAS drives that have the 15-pin SATA power connector are hotplug-ready. There may be some very old SATA disks with a 4-pin Molex power connector and no 15-pin SATA power connector. Such old drives should never be hotplugged directly (without a hotswap bay), otherwise you risk damaging them.

Hotplug support by SATA/SAS cables

To protect the disk circuitry during hotplug, the 15-pin SATA/SAS power connector on the cable side must have 2 pins (pins 4 and 12) longer than the others.

Explanation:

  • On the cable/backplane connector ("receptacle") side, pins 4 and 12 are longer and are called "staggered pins". These pins bring GND to the disk before the other pins make contact, ensuring that no sensitive circuitry is connected before there is a reliable system ground.
  • On the device side, pins 3, 7 and 13 are the staggered pins. These pins bring the 3.3V, 5V and 12V power to the precharge power electronics in the disk before the other power pins are attached.

Important warning: the normal 15-pin SATA power cable receptacle, found in ordinary power supplies or computer cases, does not have pins 4 and 12 staggered! In fact, it is quite hard to find a hotplug-compatible SATA power receptacle. At first sight the difference is subtle; see the pictures of several SATA receptacle types at http://www.circuitassemblyonline.info/brochure/SATA%20brochure.pdf before you start playing hotplug games with your drive!

Please remember that without the staggered GND pins on the SATA power cable receptacle, you risk damaging your disk when doing hotplug/hot-unplug!

A hotplug-compatible SATA power receptacle must be present in all SAS/SATA hotswap cages.

If you don't have a hotswap cage but do have a 15-pin hotplug-compatible SATA power receptacle, this should be the correct sequence for plugging and unplugging the disk (see http://www.asrock.com/support/qa/TSDQA-14.pdf):

For hotplug:

  1. connect the 15-pin power receptacle to the disk
  2. connect the 7-pin data cable

For hot-unplug:

  1. unplug the data cable from the disk
  2. unplug the power cable
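The physical steps above have a software counterpart: the kernel should be told to let go of a disk before it is unplugged, and may need to be asked to rescan after a new one is plugged in. The sketch below uses the sysfs files `device/delete` and `scsi_host/hostN/scan`; the helper names and the `SYS_ROOT` override are made up for the example, and the host number has to be looked up on your own system.

```shell
#!/bin/sh
# SYS_ROOT can be overridden for dry runs; by default the real sysfs is used.
SYS_ROOT="${SYS_ROOT:-/sys}"

# Tell the kernel to drop a disk before it is physically unplugged.
detach_disk() {
    name="$1"    # e.g. sdb
    echo 1 > "$SYS_ROOT/block/$name/device/delete"
}

# Ask one SCSI/SATA host to rescan all its channels, targets and LUNs
# after a new disk has been plugged in.
rescan_host() {
    host="$1"    # e.g. host2
    printf '%s\n' '- - -' > "$SYS_ROOT/class/scsi_host/$host/scan"
}
```

A disk that is still a member of an array should first be marked faulty and removed with `mdadm -f` and `mdadm -r`, as in the SCA procedure on this page, and only then detached with `detach_disk sdb`. With a working AHCI hotplug setup the rescan is often unnecessary, because the new disk is detected automatically.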

Hot-swapping with SCA

With SCA, it is possible to hot-plug devices. Unfortunately, this is not as simple as it should be, but it is both possible and safe.

Replace the RAID device, disk device, and host/channel/id/lun numbers with the appropriate values in the example below:


  • Dump the partition table from the drive, if it is still readable:
    sfdisk -d /dev/sdb > partitions.sdb


  • Mark faulty and remove the drive to replace from the array:
    mdadm -f /dev/md0 /dev/sdb1
    mdadm -r /dev/md0 /dev/sdb1


  • Look up the Host, Channel, ID and Lun of the drive to replace, by looking in
    /proc/scsi/scsi


  • Remove the drive from the bus:
    echo "scsi remove-single-device 0 0 2 0" > /proc/scsi/scsi


  • Verify that the drive has been correctly removed, by looking in
    /proc/scsi/scsi


  • Unplug the drive from your SCA bay, and insert a new drive
  • Add the new drive to the bus:
    echo "scsi add-single-device 0 0 2 0" > /proc/scsi/scsi


(this should spin up the drive as well)

  • Re-partition the drive using the previously dumped partition table:


    sfdisk /dev/sdb < partitions.sdb


  • Add the drive to your array:
    mdadm -a /dev/md0 /dev/sdb1


The arguments to the "scsi remove-single-device" commands are: Host, Channel, Id and Lun. These numbers are found in the "/proc/scsi/scsi" file.
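To avoid copying the four numbers by hand, they can be extracted from the "Host:" lines of /proc/scsi/scsi. The helper below is a sketch with a made-up name; it assumes the usual layout of that file, where each device starts with a line such as "Host: scsi0 Channel: 00 Id: 02 Lun: 00".

```shell
#!/bin/sh
# Reads text in /proc/scsi/scsi format on stdin and prints one
# "host channel id lun" line per attached device.
scsi_addresses() {
    awk '/^Host:/ {
        host = $2
        sub(/^scsi/, "", host)    # "scsi0" becomes "0"
        # Adding 0 drops the leading zeros from each field.
        print host + 0, $4 + 0, $6 + 0, $8 + 0
    }'
}
```

Running `scsi_addresses < /proc/scsi/scsi` lists the address of every attached device in the order the remove-single-device and add-single-device commands expect. Match each line against the vendor and model details in the file before using it, since the helper does not say which address belongs to which drive.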

The above steps have been tried and tested on a system with IBM SCA disks and an Adaptec SCSI controller. If you encounter problems or find easier ways to do this, please discuss this on the linux-raid mailing list.
