Hotplug

From Linux Raid Wiki

Latest revision as of 04:45, 16 April 2010

HW issues of the disk hotplugging are described in the Hotswap chapter of the Hardware issues page.

The Linux RAID supports hotplug operations fully from Hot-unplug branch of the mdadm version 3.1.2.

mdadm versions < 3.1.2

In older versions of mdadm, hotplug & hot-unplug support is present, but for fully automatic functionality we need to employ some bits of scripting. First of all, look at what mdadm provides by manually trying its features from the command line:

Hot-unplug from command line

  • If the physical disk is still alive:
mdadm --fail /dev/mdX /dev/sdYZ
mdadm --remove /dev/mdX /dev/sdYZ 

Do this for all RAIDs containing partitions of the failed disk. Then the disk can be hot-unplugged without any problems.

  • If the physical disk is dead or unplugged, just do
mdadm /dev/mdX --fail detached --remove detached
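The per-partition fail/remove steps above can be scripted. Below is a minimal sketch, not from the wiki: it parses /proc/mdstat and only prints the mdadm commands needed to detach every partition of a failing disk, so you can review them before running anything. The helper name and the dry-run design are my assumptions.

```shell
#!/bin/sh
# Sketch: print the mdadm commands needed to detach every partition of a
# failing disk from its arrays. Review the output, then pipe it to sh.
list_detach_cmds() {
    disk=$1
    mdstat=${2:-/proc/mdstat}      # second argument exists only for testing
    grep "${disk}[0-9]" "$mdstat" | while read -r md _colon _state _level rest; do
        for comp in $rest; do      # comp looks like "sda1[0]" or "sda2[2](F)"
            part=${comp%%"["*}     # strip the "[...]" suffix -> "sda1"
            case $part in
                "$disk"*) printf 'mdadm /dev/%s --fail /dev/%s --remove /dev/%s\n' \
                              "$md" "$part" "$part" ;;
            esac
        done
    done
}
# usage (after reviewing the output):  list_detach_cmds sda | sh
```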

Fully automated hotplug and hot-unplug using UDEV rules

If you need fully automatic handling of hot-plug and hot-unplug events, the UDEV "add" and "remove" events can be used for this.

Note: the following code has been validated on Debian 5 (Lenny), with kernel 2.6.26 and udevd version 125.

Important notes:

  • The rule for the "add" event MUST be placed in a file positioned after the "persistent_storage.rules" file, because it uses the ENV{ID_FS_TYPE} condition, which is produced by persistent_storage.rules during "add" event processing.
  • The rule for the "remove" event can reside in any file in the UDEV rules chain, but let's keep it together with the "add" rule :-)

For this reason, in Debian Lenny I placed the mdadm hotplug rules in the file /etc/udev/rules.d/66-mdadm-hotplug.rules. This is the content of the file:

SUBSYSTEM!="block", GOTO="END_66_MDADM"
ENV{ID_FS_TYPE}!="linux_raid_member", GOTO="END_66_MDADM"
ACTION=="add",  RUN+="/usr/local/sbin/handle-add-old $env{DEVNAME}"
ACTION=="remove", RUN+="/usr/local/sbin/handle-remove-old $name"
LABEL="END_66_MDADM"

(these rules are based on the UDEV rules contained in the hot-unplug patches by Doug Ledford)

And here are the scripts which are called from these rules:

#!/bin/bash
# This is /usr/local/sbin/handle-add-old
MDADM=/sbin/mdadm
LOGGER=/usr/bin/logger
mdline=`$MDADM --examine --scan "$1"`  # mdline contains something like "ARRAY /dev/mdX level=raid1 num-devices=2 UUID=..."
mddev=${mdline#* }                     # delete "ARRAY " and return the result as mddev
mddev=${mddev%% *}                     # delete everything behind /dev/mdX
$LOGGER "$0 $1"
if [ -n "$mddev" ]; then
   $LOGGER "Adding $1 into RAID device $mddev"
   log=`$MDADM -a "$mddev" "$1" 2>&1`
   $LOGGER "$log"
fi

#!/bin/bash
# This is /usr/local/sbin/handle-remove-old
MDADM=/sbin/mdadm
LOGGER=/usr/bin/logger
$LOGGER "$0 $1"
mdline=`grep "$1" /proc/mdstat`  # mdline contains something like "md0 : active raid1 sda1[0] sdb1[1]"
mddev=${mdline% :*}              # delete everything from " :" till the end of line; leaves the array name, e.g. "md0"
$LOGGER "$0: Trying to remove $1 from $mddev"
log=`$MDADM /dev/$mddev --fail detached --remove detached 2>&1`
$LOGGER "$log"
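The parameter expansions these scripts rely on can be tried in isolation. A short sketch (the sample lines are illustrative; the UUID is made up):

```shell
#!/bin/sh
# Demonstrate the ${var#...} / ${var%...} parsing used by the scripts.

# Parsing `mdadm --examine --scan` output (UUID is made up):
mdline="ARRAY /dev/md0 level=raid1 num-devices=2 UUID=d8b8b4e5"
mddev=${mdline#* }     # strip the first word ("ARRAY ")
mddev=${mddev%% *}     # keep only the next word
echo "$mddev"          # prints /dev/md0

# Parsing a /proc/mdstat line:
statline="md0 : active raid1 sda1[0] sdb1[1]"
array=${statline% :*}  # strip " : ..." from the end
echo "$array"          # prints md0
```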

mdadm versions >= 3.1.2

The hot-unplug support introduced in mdadm version 3.1.2 removes the need for the scripting shown above. If your Linux distribution ships this or a later version of mdadm, you should have fully automatic hotplug and hot-unplug without any hassles.
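To decide which part of this page applies to you, compare your mdadm version against 3.1.2. A hedged sketch: the version_ge helper is hypothetical, and the "mdadm - vX.Y.Z - date" output format of `mdadm --version` is an assumption to verify on your system.

```shell
#!/bin/sh
# Sketch: decide whether the installed mdadm is >= 3.1.2.
version_ge() {   # version_ge A B: succeeds when dotted version A >= B
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -t. -k1,1n -k2,2n -k3,3n | tail -n1)" = "$1" ]
}
# ASSUMPTION: mdadm prints "mdadm - vX.Y.Z - ..." (often on stderr).
ver=$(mdadm --version 2>&1 | sed -n 's/^mdadm - v\([0-9][0-9.]*\).*/\1/p')
if [ -n "$ver" ] && version_ge "$ver" 3.1.2; then
    echo "mdadm $ver: built-in hotplug support"
else
    echo "mdadm ${ver:-unknown}: use the scripts above"
fi
```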

Examples of behavior WITHOUT the automatic hotplug/hot-unplug

Let's have the following RAID configuration:

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      3903680 blocks [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      224612672 blocks [2/2] [UU]

The md0 contains the system, md1 is for data (but is not used yet).

Hot-unplug

If we hot-unplug the disk /dev/sda, the /proc/mdstat will show:

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[2](F) sdb1[1]
      3903680 blocks [2/1] [_U]

md1 : active raid1 sda2[0] sdb2[1]
      224612672 blocks [2/2] [UU]

We see that sda1 has role [2]. Since RAID1 needs only 2 components - [0] and [1] - role [2] means "spare disk". And it is marked as (F)ailed.

But why does the system think that /dev/sda2 in /dev/md1 is still OK? Because my system hasn't tried to access /dev/md1 yet (I have no data on /dev/md1). /dev/sda2 will be marked as faulty automatically as soon as I try to access /dev/md1:

# dd if=/dev/md1 of=/dev/null bs=1 count=1
1+0 records in
1+0 records out
1 byte (1 B) copied, 0.0184819 s, 0.1 kB/s
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[2](F) sdb1[1]
      3903680 blocks [2/1] [_U]

md1 : active raid1 sda2[2](F) sdb2[1]
      224612672 blocks [2/1] [_U]

At any point after the disk has been unplugged, we can remove its partitions from an array with this single command:

# mdadm /dev/md0 --fail detached --remove detached
mdadm: hot removed 8:1

Hotplug

(to be finished: an example of how the kernel assigns a new drive letter to the same old disk we have just unplugged, because it still considers /dev/sda to be in use ...)
