Sunday 7 October 2012

Known Linux issues re: Lifetime of low-power USB hard drives

[I originally posted this on the Raspberry Pi forum.]

This could affect anyone with an ARM computer running *nix and a low-power USB hard drive.  Here's what my drive looked like when I realized I needed to check it. It's been spun up for about half a year:

$ MY_DRIVE_OPTS="-d sat /dev/sda"
$ sudo smartctl -a $MY_DRIVE_OPTS | grep Load_Cycle_Count
193 Load_Cycle_Count        0x0032   109   109   000    Old_age   Always       -       273664


For those not familiar with smartctl, some explanation may be needed.



The normalized SMART value (109 in this case) is limited to the range 0-255. I recorded the SMART attributes when I bought the drive, and the initial value was 200. Lower values are worse. "000" is the threshold: the normalized value at or below which the attribute is reported as failed. "Old_age" is the attribute's type, meaning it tracks wear rather than imminent failure. From this we can guess that the drive has been worn almost half-way to its threshold in six months - not good.

273664 is the raw value, literally how many times the drive heads have been loaded and unloaded. Online sources confirm that many drives are rated for a lifetime of 600000 load cycles. (This is distinct from spinning the drive up/down, which happens less often and is represented by a different SMART attribute.)

This issue was widely responsible for shortening the life of laptop hard drives under Linux. As you'd hope, it was fixed quite nicely - but it seems only for laptop drives on x86.

From personal experience on Debian, and from a quick look at the umbrella Launchpad bug, the fix for excessive rates of increase in Load_Cycle_Count was implemented in the acpi-support package. Currently no Linux ARM platforms use ACPI, and it looks like the acpi-support package can't be installed on ARM.

Note that configuring and running smartd isn't enough to warn you about an excessive rate of increase in Load_Cycle_Count. You need to run smartctl manually and look at how fast it's changing over time.
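As a minimal sketch of tracking that by hand, the helpers below pull the raw value out of smartctl's attribute table and append it to a log with a timestamp. The function names (load_cycle_raw, log_load_cycles) and the log path in the usage comment are made up for illustration, and the parsing assumes the raw value is the tenth column, as in the output shown earlier.

```shell
#!/bin/sh
# Read `smartctl -A` style output on stdin and print the raw
# Load_Cycle_Count. Assumes the usual 10-column attribute table,
# where the raw value is the last (10th) column.
load_cycle_raw() {
    awk '$2 == "Load_Cycle_Count" { print $10 }'
}

# log_load_cycles LOGFILE SMARTCTL_ARGS...
# Append "UTC-timestamp raw-value" to LOGFILE (hypothetical helper).
log_load_cycles() {
    logfile=$1; shift
    raw=$(sudo smartctl -A "$@" | load_cycle_raw) || return 1
    [ -n "$raw" ] || return 1
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$raw" >> "$logfile"
}

# Example (needs a real drive, so it is left commented out):
# log_load_cycles "$HOME/load_cycles.log" -d sat /dev/sda
```

Running the logging function from cron once an hour or so, then comparing the first and last lines of the log, would give the rate of increase.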

What would be a reasonable rate? Well, the kernel wiki suggests I shouldn't complain. If I saw the equivalent rate on a laptop being used 12 hours a day, it might last 2 years before exceeding the rated figure. That may be a useful starting point, but I'm not happy with the idea of a 1 year rated life for my 24h drive.
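The arithmetic behind that "1 year" figure can be sketched as follows. It assumes "about half a year" means 182 days, that the load-cycle rate stays constant, and that the drive really is rated for 600000 cycles; days_until_rated is just an illustrative name.

```shell
#!/bin/sh
# days_until_rated CYCLES_SO_FAR DAYS_SO_FAR RATED_CYCLES
# Integer estimate of the total days needed to reach the rated
# load-cycle count, assuming the rate seen so far stays constant.
days_until_rated() {
    echo $(( $3 * $2 / $1 ))
}

# 273664 cycles in an assumed 182 days, against a 600000-cycle rating.
days_until_rated 273664 182 600000   # prints 399
```

That is roughly 13 months in total for a drive running 24 hours a day, or a little over 2 years at 12 hours a day.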

So I edited Debian's /etc/hdparm.conf. (This should work well if you're booting off the drive; otherwise you need to worry about boot order, i.e. whether the hdparm script runs before the drive has been detected.)

/dev/disk/by-id/usb-WD_My_Passport_0740_575838314141315837363139-0:0 {
        apm = 254
}


I think 254 is supposed to work for most drives. 255 might be necessary for a few. And I think the Launchpad bug implies some might be OK with 128, but I haven't tried that yet. Some drives may be hopeless and not take notice of any setting.
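To experiment with different values without touching the config file, hdparm's -B option sets the APM level directly (and with no value, reports the current one). Here is a small sketch of a wrapper that checks the level is in hdparm's accepted 1-255 range first. The set_apm name is made up, and whether a given USB bridge passes the command through to the drive is something you would have to test.

```shell
#!/bin/sh
# set_apm LEVEL DEVICE
# Apply an APM level immediately with `hdparm -B` (hypothetical wrapper).
# hdparm accepts 1-255; 255 asks the drive to disable APM altogether.
set_apm() {
    level=$1
    dev=$2
    case $level in
        ''|*[!0-9]*)
            echo "APM level must be a number" >&2
            return 2 ;;
    esac
    if [ "$level" -lt 1 ] || [ "$level" -gt 255 ]; then
        echo "APM level must be between 1 and 255" >&2
        return 2
    fi
    sudo hdparm -B "$level" "$dev"
}

# Example (needs a real drive, so it is left commented out):
# set_apm 254 /dev/sda
# sudo hdparm -B /dev/sda    # query the current setting
```

A value set this way does not survive a power cycle of the drive, so the hdparm.conf entry is still needed for a permanent fix.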

Rebooting to apply the new setting would have been annoying, so I took a shortcut.

$ sudo /etc/init.d/hdparm restart

and now it's been spinning for several hours with the raw value of Load_Cycle_Count holding steady at 273683. So I'm much happier about it (though I haven't compared the power consumption).

1 comment:

sourcejedi said...

Another alternative might be to configure laptop-mode. See

https://wiki.debian.org/DebianDesktopHowTo

(and search for laptop-mode)