I’m convinced now that the Western Digital hard disk drives I have are faking their SMART data to make their health look better than it is (and reduce warranty returns).
I have three WD10EACS 1TB Caviar Green drives. Within a few hours out of the box, one of them (purchased in late 2007) started making a repetitive sound, a rhythmic chatter of the heads seeking to various parts of the disk, which I can only guess was the reallocation of bad sectors. All of this happened while the drive was “offline”, not in use by the computer (and even with the SATA data cable disconnected). It would go on for hours, then finally quiet down and sit idle.
Connected to the computer, the drive seemed to function fine. It passed all of its SMART self-tests, and during every one of them (short, long and conveyance) it ran quietly, without the rhythmic chatter. None of the SMART attributes showed anything wrong; even the raw values were 0, apart from non-indicative items such as temperature and power-on hours.
I don’t trust the drive at all. I’ve switched its role from primary drive to external, non-critical backup drive. Over the years, it has started chattering again from time to time. After a trip in the back of the car, it chattered for about 5 hours. It’s obvious to me that all is not well with the drive, but more troubling is that the SMART data and status reveal nothing. The drive is still under warranty for a few more months, but I’m not inclined to return it. The $80 replacement cost is well worth paying to learn something from this. I’m tempted to open the drive, create a surface defect, run the extended self-test and see what I get.
The other two drives have exhibited the same problem, but to a much lesser degree. A few days ago, I was notified of an increased Seek Error Rate in the SMART data of one of them. Yet it vanished as soon as the drive was power cycled. Even the “Worst” value has reset to 200, when I’m pretty sure the notification said it had dropped to 100. The drive claims to save SMART data before entering a power saving mode, which I assume includes powering down, so the value should have survived.
At first I suspected a software glitch in ActiveSmart, which I have since stopped using. It’s ridiculous to keep paying money for new versions and bug fixes. I’ve started using the free smartmontools. It’s not quite as user friendly as ActiveSmart, but after digesting the documentation and configuring it, it’s doing exactly what is needed. For my Windows system, the smartd.conf file has this line in it:
DEVICESCAN -a -I 194 -I 9 -m msgbox
For now this is my starting point: it ignores changes in temperature (attribute 194) and power-on hours (attribute 9). I’ll probably add more “ignore” (-I) attributes as time goes on.
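Since smartmontools also ships smartctl, one way to check whether a “Worst” value really resets is to save the attribute table before and after a power cycle and compare the two. Here is a rough Python sketch of that idea. It assumes `smartctl -A` prints its usual ten-column attribute table; the function names are my own, and the sample rows are made up for illustration, not real readings from these drives.

```python
import re
from typing import Dict, NamedTuple, Tuple


class Attr(NamedTuple):
    name: str
    value: int   # normalized current value
    worst: int   # lowest normalized value the drive says it has recorded
    thresh: int  # failure threshold
    raw: str     # vendor-specific raw value, kept as text


# One row of the table printed by `smartctl -A`:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(.+?)\s*$"
)


def parse_attributes(text: str) -> Dict[int, Attr]:
    """Parse attribute rows out of saved `smartctl -A` output, keyed by ID."""
    attrs: Dict[int, Attr] = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs[int(m.group(1))] = Attr(
                m.group(2), int(m.group(3)), int(m.group(4)),
                int(m.group(5)), m.group(6),
            )
    return attrs


def worst_increases(before: Dict[int, Attr],
                    after: Dict[int, Attr]) -> Dict[int, Tuple[int, int]]:
    """Return {id: (old_worst, new_worst)} where WORST went up.

    WORST is supposed to be the lowest value ever seen, so a rise between
    two snapshots suggests the stored history was reset.
    """
    return {
        attr_id: (before[attr_id].worst, after[attr_id].worst)
        for attr_id in before
        if attr_id in after and after[attr_id].worst > before[attr_id].worst
    }


# Made-up sample snapshots (illustrative only, not data from a real drive).
SAMPLE_BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  7 Seek_Error_Rate         0x002e   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1234
"""

SAMPLE_AFTER = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1235
"""

if __name__ == "__main__":
    before = parse_attributes(SAMPLE_BEFORE)
    after = parse_attributes(SAMPLE_AFTER)
    for attr_id, (old, new) in worst_increases(before, after).items():
        print(f"{attr_id} {after[attr_id].name}: WORST rose from {old} to {new}")
```

To try it on a real drive, redirect the output of `smartctl -A` for that drive to a file before and after the power cycle (the device name depends on the system) and pass the contents of each file to `parse_attributes`. Keeping the snapshots as plain text files also leaves a record that does not depend on the drive's own bookkeeping.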
I’ve been building computers for 20 years, as well as fixing other people’s computers. Across all of the hard drive failures I’ve dealt with, I’ve never had a good experience with Western Digital. But then I haven’t had one with many of the other manufacturers either, with the possible exception of Samsung.
ETA: Here is something interesting I read a few days ago: “Failure Trends in a Large Disk Drive Population”.
These things always seem to come in clusters for me. I’ll be fine for a couple of years, then, within a few months, two or three drives go.