RAID Boot

From Linux Raid Wiki
Revision as of 04:53, 9 July 2008 by NjqGyv (Talk | contribs)


Historically, when the kernel booted, it used a mechanism called 'autodetect' to identify partitions which are used in RAID arrays: it assumed that all partitions of type 0xfd are so used. It then attempted to automatically assemble and start these arrays.

This approach can cause problems in several situations (imagine moving part of an old array onto another machine before wiping and repurposing it: reboot and watch in horror as the piece of dead array gets assembled as part of the running RAID array, ruining it); kernel autodetect is correspondingly deprecated.

The recommended approach now is to use the initramfs system.

This system is documented in more detail than you're likely to care about in the file Documentation/filesystems/ramfs-rootfs-initramfs.txt in the kernel source tree, but in brief it allows you to store in the kernel image a nonswappable in-memory filesystem (the 'rootfs') which is uncompressed as the root filesystem as the kernel boots; the kernel runs /init and leaves it to find the root filesystem, chroot into it, and execute the real /sbin/init. It's sort of like the old initrd system, only your image never gets out of sync with the kernel, it's much easier to build the image (the kernel build system can put it together for you), the kernel can always find it, and there's no overcomplicated scheme for switching to the real root filesystem: you can just chroot.
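To make the boot flow concrete, here is a minimal sketch of what an /init has to do, assuming a busybox-based image. The names `run`, `mini_init`, `DRY_RUN` and the mount point /new-root are hypothetical, chosen only for illustration. The hand-off uses busybox's switch_root rather than a bare chroot, for the memory reasons discussed further down this page.

```shell
#!/bin/sh
# Sketch only: 'run', 'mini_init', DRY_RUN and /new-root are illustrative
# names, not part of any real tool.

# Print the command instead of executing it when DRY_RUN is set, so the
# sequence can be inspected safely on a running system.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 is the block device holding the real root filesystem.
mini_init() {
    run mount -t proc proc /proc
    run mount -t sysfs sysfs /sys
    run mount -t tmpfs tmpfs /dev
    run mdev -s                       # populate /dev from /sys (busybox)
    run mount -o ro "$1" /new-root    # find and mount the real root
    run umount /sys
    run umount /proc
    # Replace this process (PID 1) with the real init.
    run exec switch_root /new-root /sbin/init
}
```

Calling `DRY_RUN=1 mini_init /dev/md0` prints the command sequence without touching the system. A real /init would end with a plain call such as `mini_init /dev/md0`, and would need error handling that this sketch omits.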

This approach provides a great deal of flexibility: you can get your root filesystem from LVM layered over a RAID array stored on a dozen network block devices on machines in Guatemala, San Diego, and Tokyo if you really need to (although that particular combination might be a bit slow without careful use of 'write-mostly'). I've even heard that some people have a C compiler on there, and recompile third-party modules for the running kernel on the fly from the source code!

But this flexibility comes at a price, and getting the thing working in the first place is a bit tricky. If init isn't PID 1, you're in trouble; if you've left anything in the rootfs before chrooting, you've lost the memory it was in forever; if you mess up population you've got a useless kernel image; and populating it is sort of like working on an embedded system, because unless you want a 20MB kernel image you'd better use small tools, like busybox, and preferably a small libc, like uClibc.
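One way to populate the image is to hand the kernel build system a file list in the format read by usr/gen_init_cpio and point CONFIG_INITRAMFS_SOURCE at it. The sketch below prints such a list. The function name `gen_initramfs_list` and every source path under /usr/src/initramfs are assumptions made for illustration, and a real image will need whatever libraries your binaries link against.

```shell
#!/bin/sh
# Sketch only: gen_initramfs_list and the /usr/src/initramfs paths are
# hypothetical. Line formats follow usr/gen_init_cpio:
#   dir   <name> <mode> <uid> <gid>
#   nod   <name> <mode> <uid> <gid> <type> <major> <minor>
#   file  <name> <source> <mode> <uid> <gid>
#   slink <name> <target> <mode> <uid> <gid>
gen_initramfs_list() {
    cat <<'EOF'
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
dir /proc 0755 0 0
dir /sys 0755 0 0
dir /bin 0755 0 0
dir /sbin 0755 0 0
dir /etc 0755 0 0
dir /new-root 0755 0 0
file /bin/busybox /usr/src/initramfs/busybox 0755 0 0
slink /bin/ash busybox 0777 0 0
slink /bin/sh busybox 0777 0 0
slink /sbin/mdev ../bin/busybox 0777 0 0
file /sbin/mdadm /usr/src/initramfs/mdadm 0755 0 0
file /etc/mdadm.conf /usr/src/initramfs/mdadm.conf 0644 0 0
file /init /usr/src/initramfs/init 0755 0 0
EOF
}
```

For example, `gen_initramfs_list > /usr/src/initramfs/list` writes the list to a file, which can then be named in CONFIG_INITRAMFS_SOURCE. Statically linked busybox and mdadm binaries keep the list this short.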

But it's useful if you want to mount RAID arrays before booting: e.g., examining the partitions and assembling arrays with a defined UUID, while leaving you with enough emergency repair facilities to figure out what's wrong if assembly fails.

mdadm comes with such a script, but choice is good, so here's another. It has a number of improvements over the mdadm variation:

  • It handles LVM2 as well as md (obviously if you boot off RAID you still have to boot off RAID1, but /boot can be a RAID1 filesystem of its own now, with / in LVM, on RAID, or both at once; you don't even need md on the machine anymore)
  • You can leave lvm or mdadm off the image if you don't need them, so you can use the same initramfs for many machines, only some of which use RAID or LVMed root filesystems
  • It fscks / before mounting it
  • If anything goes wrong, it drops you into an emergency shell in the rootfs, where you have all the power of ash with hardly any builtin commands, lvm and mdadm to diagnose your problem!
  • It supports a number of arguments: 'rescue', to drop into /bin/ash instead of init after mounting the real root filesystem, 'emergency', to drop into a shell on the initramfs before doing anything, and 'trace', to turn on shell tracing early in the init script execution, so if something's failing with a bizarre error message, you can tell what it is. It also supports numeric arguments 1 to 5, `single' and `-b', which just get passed down to init.
  • It supports root= and init= arguments, although for arcane reasons to do with LILO suckage you need to pass the root argument as `root=LABEL=/dev/some/device', or LILO will helpfully transform it into a device number, which is rarely useful if the device name is, say, /dev/emergency-volume-group/root. It gets the default volume group name from a file `vgname' which you have to arrange to put on the initramfs (sticking it in the usr/ subdirectory in the kernel tree will do). It also supports root-type= and root-options= arguments, so you can mount root with noatime or force filesystem detection should you need to.

The default VG name, root device name, mount options and filesystem type are all derived from the entry for / in /etc/fstab. This will generally do the right thing, unless your root filesystem isn't on LVM and you use a name like /dev/disk/by-label/root: the initramfs doesn't have udev on it and so won't understand such labels. You could fix this by passing the root= argument or by having a second fstab used just for the initramfs: both will work.

  • It doesn't waste memory. initramfs isn't like initrd: if you just chroot into the new root filesystem, the data in the initramfs stays around, in nonswappable kernel memory. And it's not gzipped by that point, either!
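The root=, root-type= and root-options= arguments described in the list above are split with cut, and a small worked example shows why the field numbers differ. This is a sketch only: `param_value` is a hypothetical helper, not part of the init script.

```shell
#!/bin/sh
# Sketch only: param_value is an illustrative helper.
# root=LABEL=<device> carries two '=' before the value, so the value starts
# at field 3; the other arguments carry one, so it starts at field 2.
# The trailing '-' in -f3- / -f2- keeps any further '=' inside the value.
param_value() {
    case "$1" in
        root=LABEL=*) echo "$1" | cut -d= -f3- ;;
        *=*)          echo "$1" | cut -d= -f2- ;;
    esac
}

param_value 'root=LABEL=/dev/emergency-volume-group/root'
param_value 'root-type=ext3'
param_value 'root-options=noatime,data=ordered'
```

This prints /dev/emergency-volume-group/root, ext3 and noatime,data=ordered on three lines. The last case is the reason for the open-ended field range: mount options may themselves contain an equals sign.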

The downsides:

  • It needs busybox 1.2 or later, and a 2.6.12 kernel with sysfs and hotplug support; this is because it populates /dev with the `mdev' mini-udev tool inside busybox, and switches root filesystems with the `switch_root' tool, which chroots only after erasing the entire contents of the initramfs (taking great care not to recurse off that filesystem!)
  • If you link against uClibc you'll need mdadm 2.5.2 or later: earlier versions will crash.
  • If you link against uClibc (recommended), you need a CVS uClibc too (i.e., one newer than 0.9.27).
  • It doesn't try to e.g. set up the network: changing the script to do that isn't likely to be terribly difficult.
  • You need an /etc/mdadm.conf (if using md) and an /etc/lvm/lvm.conf, both taken by default from the system you built the kernel on: personally I'd recommend a really simple one whose ARRAY lines carry no devices= entries, like
DEVICE partitions
ARRAY /dev/md0 UUID=some:long:uuid:here
ARRAY /dev/md1 UUID=another:long:uuid:here
ARRAY /dev/md2 UUID=yetanother:long:uuid:here
...

(I might change this to use --homehost in future, whereupon you'd only need to provide a file giving the hostname; but --homehost isn't widely-enough available yet, and I haven't tried it myself.)
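A file of that shape can be derived from the output of `mdadm --examine --scan`, which prints one ARRAY line per array it finds. The filter below is a sketch: `make_mdadm_conf` is a hypothetical helper name, and it assumes each ARRAY line names the device second and carries a UUID= word somewhere after it.

```shell
#!/bin/sh
# Sketch only: make_mdadm_conf is an illustrative helper.
# Reads ARRAY lines on stdin and writes a minimal mdadm.conf on stdout,
# keeping only the array device name and its UUID.
make_mdadm_conf() {
    echo "DEVICE partitions"
    sed -n 's/^ARRAY  *\([^ ]*\) .*\(UUID=[^ ]*\).*$/ARRAY \1 \2/p'
}
```

Typical use would be `mdadm --examine --scan | make_mdadm_conf > mdadm.conf`, run on the machine whose arrays you want assembled. Check the result by hand before building it into the image.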

Here's the init script:

#!/bin/ash
#
# init --- locate and mount root filesystem
#          By Nix <nix@esperi.org.uk>.
#
#          Placed in the public domain.
#

export PATH=/sbin:/bin

/bin/mount -t proc proc /proc
/bin/mount -t sysfs sysfs /sys
CMDLINE=`cat /proc/cmdline`

# Populate /dev from /sys

/bin/mount -t tmpfs tmpfs /dev
/sbin/mdev -s

# Locate the root filesystem's fstab entry; collapse spaces and tabs in it:
# extract its significant components. (There are three raw tabs in the next
# line, each next to a single space.)

FSENT=`sed -n '/[ 	]\/[ 	]/ { s,[ 	][ 	]*, ,g; p; }' < /etc/fstab`
ROOT="`echo $FSENT | tr ' ' '\n' | sed -n '1p'`"
TYPE="`echo $FSENT | tr ' ' '\n' | sed -n '3p'`"
OPTS="`echo $FSENT | tr ' ' '\n' | sed -n '4p'`"

# Parse arguments, engaging trace mode or dropping to rescue or emergency shells
# as needed. If there is a forced init program, root filesystem, root fs type or
# root fs options, accept the forcing.

INIT_ARGS=

for param in $CMDLINE; do
    case "$param" in
        init=*) eval "$param";;
        -b|single|s|S|[1-5]) INIT_ARGS="$INIT_ARGS $param";;
        trace) echo "Tracing init script.";
               set -x;;
        rescue) echo "Rescue boot mode: invoking ash.";
                init=/bin/ash;
                INIT_ARGS="-";;
        emergency) echo "Emergency boot mode. Dropping to a minimal shell.";
                   echo "Reboot with Ctrl-Alt-Delete.";
                   exec /bin/sh;;
        root=LABEL=*) ROOT=$(echo "$param" | cut -d= -f3-);;
        root-type=*) TYPE=$(echo "$param" | cut -d= -f2-);;
        root-options=*) OPTS=$(echo "$param" | cut -d= -f2-);;
    esac
done

# Assemble the RAID arrays. We enable all that we can find, because we can't
# be sure which of them are needed to assemble the VG on which the root
# filesystem is located (if any). If you have RAID arrays which span devices
# which are not yet accessible, you'll probably want to add --no-degraded here,
# or build the initramfs with an mdadm.conf that does not mention the arrays
# you don't want assembled at this point.
#
# Perhaps we want to avoid starting degraded arrays no matter what, but I'd
# prefer my system to boot even if a drive fails.

if [ -x /sbin/mdadm ]; then
    /sbin/mdadm --assemble --scan --auto=md
fi

# If there are two slashes in the root filesystem location after the
# leading slash (e.g. /dev/raid/root), we assume that the middle
# component is the name of the volume group. Otherwise, we assume that
# no VG is involved.

VGNAME=
if [ "`echo $ROOT | sed 's,^/,,' | tr '/' '\n' | wc -l`" -eq 3 ]; then
    VGNAME="`echo $ROOT | sed 's,^/,,' | tr '/' '\n' | sed -n '2p'`"
fi

FAILED=

# Scan for volume groups. We activate only the group on which the
# root filesystem is stored; the other groups may span devices which
# are not yet accessible.

if [ -x /sbin/lvm ] && [ -n "$VGNAME" ]; then
    /sbin/lvm vgscan --ignorelockingfailure --mknodes