Timeout Mismatch

From Linux Raid Wiki

Revision as of 00:47, 9 August 2020

Back to Asking for help Forward to What's all this with USB?

Most cheap modern desktop drives do not support any form of managed error recovery. This seems to have started when the typical drive size hit 1TB, so for drives over 1TB you should buy drives that are explicitly suitable for RAID. Most drives of 1TB or less are okay, but you should check first. RAID-rated drives aren't that much more expensive. (For some strange reason, every 2 1/2" laptop drive I've come across does support it!)

Also, in 2019, a new technology called shingled magnetic recording (SMR) started becoming mainstream. Whereas the workload limits on conventional (CMR) drives are advisory, the limits on sustained writes to SMR drives are enforced by the drive itself, and they interfere with RAID operation. While all manufacturers have been quietly introducing SMR on their desktop lines, WD unfortunately also introduced it on their "suitable for NAS/RAID" WD Red drives. Combining SMR and RAID is not a good idea, with many reports of new WD Reds simply refusing to be added to an existing array.

For a quick summary of the problem, when the OS tries to read from the disk, it sends the command and waits. What should happen is that the drive returns the data successfully.

The proper sequence of events when something goes wrong is that the drive can't read the data, and it returns an error to the OS. The raid code then calculates what the data should be, and writes it back to the disk. Glitches like this are normal and, provided the disk isn't failing, this will correct the problem.

Unfortunately, desktop drives can take over two minutes to give up, while the Linux kernel will give up after 30 seconds. SMR drives are even worse - there are reports of them stalling for over 10 minutes as the drive shuffles everything around to make space. When the kernel gives up, the RAID code recomputes the block and tries to write it back to the disk. The disk is still trying to read the data and fails to respond, so the raid code assumes the drive is dead and kicks it from the array. This is how a single error with these drives can easily kill an array.
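The kernel's side of this mismatch lives in sysfs, at /sys/block/sdX/device/timeout (that path is real; everything else in this sketch - the helper function and the 180-second figure - is just an illustration of raising the timeout past a desktop drive's worst-case recovery time):

```shell
# Sketch: raise the kernel's per-device command timeout (default is 30 seconds).
# Substitute your actual drive for sdX in the usage example below.
set_scsi_timeout() {
    local sysfs_file=$1 seconds=$2
    # The kernel reads this file as a decimal number of seconds
    echo "$seconds" > "$sysfs_file"
}

# Typical use, as root:
#   set_scsi_timeout /sys/block/sdX/device/timeout 180
```

Note that this setting does not survive a reboot, which is why the script further down is meant to run at every boot.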

To check whether this is your problem, look at the output you got from smartctl, and see whether SCT Error Recovery Control is supported. If it isn't, you have desktop drives that are timing out on you. On WD disks it may be called TLER. To look at just this parameter, you can use the command

smartctl -l scterc /dev/sdx
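On a drive that supports ERC, the output shows the read and write recovery timeouts in tenths of a second; the exact lines below are illustrative of typical smartctl output, not taken from a specific drive:

```
SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)
```

A drive without support prints something like "SCT Error Recovery Control command not supported" instead.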

For SMR drives, the drive should report that the trim command is supported. Unfortunately, some (many?) cheaper SMR drives do not, and due to the nature of SMR, drives that don't support trim will have problems, leading to exactly the grief many have reported - the drive stalling almost forever as it has to rewrite masses of data. Note, however, that SMR drives come in at least three types: DM (device managed), which may or may not support trim; HA (host aware); and HM (host managed), which shouldn't be a problem as they leave it to the computer to sort out. Drives that do support trim report it in their IDENTIFY data, for example:

69     14          1   Deterministic data after trim supported
69      5          1   Trimmed LBA range(s) returning zeroed data supported

Unfortunately, some drives do not. The reason for this may be down to the ATA specification. If I have my facts right, version 4 of the ATA specification postdates these problematic drives, but is required for reporting these capabilities. The drives in question may stick to the v3 specification rather than the provisional v4 spec. The smartctl output shows which version a drive claims, for example:

ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)

Unfortunately, it now seems that if you want to run an array, you can NOT use cheap drives made in 2020 or later. As of mid-2020, WD has said that all of the WD Red line is now SMR; to get RAID-suitable CMR you need to buy Red Plus or Red Pro. You should never have been using Seagate Barracudas anyway, but these have now pretty much all moved over to SMR (and been renamed BarraCuda). Seagate have said that their IronWolf and IronWolf Pro lines will remain CMR, and the FireCuda line seems all CMR at the moment (I guess these will be a bit like the Red Pros, the CMR equivalent of the BarraCuda).

The following script was posted to the mailing list by Brad Campbell. Make sure it runs on every boot - the cheaper drives especially forget any settings you may make when the system is shut down. It will increase the timeout for all non-ERC drives. It also sets the timeout for ERC drives as many older desktop drives that do support it have inappropriate settings. Note that it does nothing for SMR drives.

#!/bin/bash
# For every SATA drive: enable SCT ERC where supported, otherwise
# raise the kernel's command timeout so the drive isn't kicked.
for i in /dev/sd? ; do
    # Try to set the ERC timeout to 7 seconds (70 deciseconds)
    if smartctl -l scterc,70,70 $i > /dev/null ; then
        echo -n $i " is good "
    else
        # No ERC: tell the kernel to wait out the drive's own recovery
        echo 180 > /sys/block/${i/\/dev\/}/device/timeout
        echo -n $i " is  bad "
    fi;
    # Report the drive model, and bump the read-ahead while we're here
    smartctl -i $i | egrep "(Device Model|Product:)"
    blockdev --setra 1024 $i
done 
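How you arrange for the script to run at every boot is distribution-specific. As one hypothetical example (the unit name and the script path are assumptions, not from this wiki), a systemd service could look like:

```
[Unit]
Description=Set drive ERC timeouts / kernel command timeouts
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/set-drive-timeouts.sh

[Install]
WantedBy=multi-user.target
```

Enable it once with "systemctl enable" and it will run on each boot. On systems without systemd, an rc.local entry does the same job.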

WARNING: This does not work for all drives, although it seems to be older 2010-era ones that fail. The smartctl command attempts to set the ERC timeout to 7 seconds. This should either succeed and return 0, or fail and return an error code. Unfortunately, for drives that do not support SCT at all, the attempt to set ERC fails but returns 0, fooling the script. Whenever you get a new drive, you should make sure it behaves as expected.
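Since the exit code cannot be trusted on those drives, one way to check a new drive is to set ERC and then read the value back with a second smartctl call and compare. The little parser below assumes the output format shown earlier (a number in deciseconds after "Read:"); adjust it if your smartctl prints something different:

```shell
# Sketch: extract the ERC read timeout (in deciseconds) from
# `smartctl -l scterc /dev/sdX` output on stdin, so it can be
# compared against the value we tried to set.
erc_read_value() {
    # Print the second field of the first line containing "Read:"
    awk '/Read:/ {print $2; exit}'
}

# Typical use, as root (sdX is a placeholder):
#   smartctl -l scterc,70,70 /dev/sdX > /dev/null
#   [ "$(smartctl -l scterc /dev/sdX | erc_read_value)" = "70" ] || echo "ERC did not take!"
```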

[TODO: Discuss assembling a broken array with --force]

The following links-to-email have been collected by Phil Turmel as background reading to the problem. Read the entire threads if you have time.

http://marc.info/?l=linux-raid&m=139050322510249&w=2
http://marc.info/?l=linux-raid&m=135863964624202&w=2
http://marc.info/?l=linux-raid&m=135811522817345&w=1
http://marc.info/?l=linux-raid&m=133761065622164&w=2
http://marc.info/?l=linux-raid&m=132477199207506
http://marc.info/?l=linux-raid&m=133665797115876&w=2
http://marc.info/?l=linux-raid&m=142487508806844&w=3
http://marc.info/?l=linux-raid&m=144535576302583&w=2

[TODO: link to gmane if/when it comes back]

External links about SMR drives

* https://www.extremetech.com/computing/309730-western-digital-comes-clean-shares-which-hard-drives-use-smr
* https://blocksandfiles.com/2020/04/15/shingled-drives-have-non-shingled-zones-for-caching-writes/
* https://blocksandfiles.com/2020/04/15/seagate-2-4-and-8tb-barracuda-and-desktop-hdd-smr/
* https://blocksandfiles.com/2020/04/16/toshiba-desktop-disk-drives-undocumented-shingle-magnetic-recording/

* https://blog.westerndigital.com/wd-red-nas-drives/
* https://www.seagate.com/gb/en/internal-hard-drives/cmr-smr-list/
* https://toshiba.semicon-storage.com/ap-en/company/news/news-topics/2020/04/storage-20200428-1.html
