RAID and filesystems

From Linux Raid Wiki

Revision as of 21:10, 14 January 2018

Raid layout

In order to work efficiently, file systems need to understand the structure of the disks they are running on. File system tuning is something of an arcane art, and the different variants of mkfs usually contain code to optimise the file system layout for the underlying disks. If the file system sits on top of LVM and/or RAID, that code inspects the setup and optimises the layout accordingly.

The important figures to note are the stripe unit and the stripe width. The stripe unit is the amount of data written to one disk before moving on to the next. It is usually thought of as a multiple of 512 bytes, as that was traditionally the size of a single block on a disk; a single disk block is now often 4K. A typical stripe unit, however, will now be of the order of a megabyte. The stripe width is the number of data blocks (stripe units) in a stripe.
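As a rough illustration of how these two figures combine, the sketch below derives the amount of data in one full stripe and expresses the stripe unit in file system blocks. The function name `stripe_geometry` and the default 4K file system block size are assumptions made for this example, not something any particular mkfs exposes.

```python
def stripe_geometry(stripe_unit_bytes, stripe_width, fs_block_bytes=4096):
    """Return (full_stripe_bytes, stripe_unit_in_fs_blocks, full_stripe_in_fs_blocks).

    stripe_unit_bytes: data written to one disk before moving to the next
    stripe_width:      number of data blocks (stripe units) in a stripe
    fs_block_bytes:    file system block size (assumed 4K here)
    """
    if stripe_unit_bytes % fs_block_bytes != 0:
        raise ValueError("stripe unit is not a whole number of file system blocks")
    unit_blocks = stripe_unit_bytes // fs_block_bytes
    full_stripe_bytes = stripe_unit_bytes * stripe_width
    return full_stripe_bytes, unit_blocks, unit_blocks * stripe_width


# Illustrative values: 512 KiB stripe unit, stripe width of 3, 4K blocks.
print(stripe_geometry(512 * 1024, 3))  # (1572864, 128, 384)
```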

For RAID 5 and RAID 6, the stripe width is the number of disks minus one and the number of disks minus two respectively.

For RAID 1, the stripe width is 1.

For RAID 10, the picture seems rather more complicated.
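The rules above for levels 1, 5 and 6 can be written down as a small helper. This is only a sketch: the function name `stripe_width` is made up for this example, and RAID 10 is deliberately left unhandled because its stripe width depends on the layout.

```python
def stripe_width(level, disks):
    """Number of data blocks per stripe for the RAID levels described above."""
    if level == 1:
        width = 1            # every disk holds the same data
    elif level == 5:
        width = disks - 1    # one block per stripe holds parity
    elif level == 6:
        width = disks - 2    # two blocks per stripe hold parity
    else:
        raise ValueError(f"stripe width for RAID level {level} is not covered here")
    if width < 1:
        raise ValueError("not enough disks for this RAID level")
    return width


print(stripe_width(5, 4))  # 3
print(stripe_width(6, 6))  # 4
print(stripe_width(1, 2))  # 1
```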

Optimal Stripe Unit

Choosing the optimal stripe unit is a trade-off. Smaller stripe units mean a read can be split across multiple drives, with the resulting increase in bandwidth. But this then collides with read-ahead, where the OS or drive may retrieve more than was requested in the expectation that it will be requested soon. Larger stripe units may help writes by reducing the number of disks that need to be written to.
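To make the trade-off concrete, the sketch below counts how many data disks a single request touches for a given stripe unit. It assumes a simplified model in which consecutive stripe units are placed on consecutive data disks and parity placement is ignored; the function name `disks_touched` is hypothetical.

```python
def disks_touched(offset, length, stripe_unit, data_disks):
    """Count the data disks a request of `length` bytes at `offset` touches.

    Simplified model: stripe unit k lives on data disk k % data_disks,
    and parity rotation is ignored.
    """
    if length <= 0:
        return 0
    first_unit = offset // stripe_unit
    last_unit = (offset + length - 1) // stripe_unit
    units = last_unit - first_unit + 1
    return min(units, data_disks)


KiB = 1024
# A 256 KiB request on an array with 4 data disks:
print(disks_touched(0, 256 * KiB, 64 * KiB, 4))    # 4 (spread over every disk)
print(disks_touched(0, 256 * KiB, 1024 * KiB, 4))  # 1 (fits in one stripe unit)
```

With the small stripe unit the request is served by all four disks in parallel; with the large one it stays on a single disk, leaving the others free for other requests.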

Many file systems have a block size which they use to allocate disk space. Files tend to accumulate at the start of these blocks, so you want to choose a stripe unit such that the stripe width and block sizes do not fit neatly into each other. This helps ensure that the files are spread evenly across the disks and reduces the risk of "hot spots" where data accumulates on some drives and empty space on others.
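The hot-spot effect can be illustrated by working out which data disk the start of each allocation block lands on. This is a sketch under the same simplified model (stripe units laid out round-robin across the data disks, parity ignored); the function name `start_disk_histogram` and the 128 MiB allocation block size are illustrative assumptions.

```python
from collections import Counter


def start_disk_histogram(alloc_block_bytes, stripe_unit, data_disks, blocks=64):
    """For each allocation block, find the data disk holding its first byte.

    Returns a list with one count per data disk.
    """
    counts = Counter()
    for b in range(blocks):
        unit = (b * alloc_block_bytes) // stripe_unit
        counts[unit % data_disks] += 1
    return [counts[d] for d in range(data_disks)]


MiB = 1024 * 1024
# 128 MiB allocation blocks, 512 KiB stripe unit.
print(start_disk_histogram(128 * MiB, 512 * 1024, 4))  # [64, 0, 0, 0]
print(start_disk_histogram(128 * MiB, 512 * 1024, 3))  # [22, 21, 21]
```

With four data disks the allocation block is an exact multiple of the full stripe, so every block starts on the same disk; with three data disks the sizes do not divide evenly and the starts are spread almost uniformly.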

File System impact

BTRFS

EXT

XFS
