Hard Drive Tools

With respect to hard drives, the acronym “SMART” stands for Self-Monitoring, Analysis and Reporting Technology. It is built into most ATA-3 and later ATA/IDE drives and into SCSI-3 drives, so basically any drive made after about 2005 should have it.

The tools for working with SMART come in the smartmontools package. To install it on Ubuntu/Debian:

sudo apt-get install smartmontools

CentOS/Fedora/RH:

sudo yum install smartmontools

Gentoo:

sudo emerge sys-apps/smartmontools
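If the same setup script has to run on several distributions, the choice between the commands above can be automated. This is only a minimal sketch: the function names install_cmd_for and detect_pkg_manager are made up for illustration, and it knows only the three package managers listed above.

```shell
#!/bin/sh
# Print the install command (as listed above) for a given package manager.
# Returns non-zero for anything it does not know about.
install_cmd_for() {
    case "$1" in
        apt-get) echo "sudo apt-get install smartmontools" ;;
        yum)     echo "sudo yum install smartmontools" ;;
        emerge)  echo "sudo emerge sys-apps/smartmontools" ;;
        *)       return 1 ;;
    esac
}

# Print the first of the three package managers found on PATH.
detect_pkg_manager() {
    for pm in apt-get yum emerge; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Only print the suggested command; nothing is installed here.
if pm=$(detect_pkg_manager); then
    echo "Install with: $(install_cmd_for "$pm")"
else
    echo "No known package manager found; install smartmontools by hand." >&2
fi
```

The script deliberately prints the command instead of running it, so nothing gets installed without you seeing it first.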

Wiki: http://sourceforge.net/apps/trac/smartmontools/wiki

smartctl

The program smartctl talks to the SMART features in the drive firmware. Here are a couple of easy commands to get started with (note that some versions do not have the --scan option):


$ smartctl --scan -d ata
/dev/hda -d ata # /dev/hda, ATA device
/dev/hdc -d ata # /dev/hdc, ATA device
$ sudo smartctl --info /dev/hdc
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.33.1-xedvia] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus
Device Model:     ST3160023A
Serial Number:    5JS9MDKW
Firmware Version: 8.01
User Capacity:    160,041,885,696 bytes [160 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Thu Feb  7 09:27:18 2013 PST
SMART support is: Available - device has SMART capability.
SMART support is: Disabled

Note that “SMART support” is listed as available but disabled. To enable full diagnostic checking, turn it on with something like this:


$ sudo smartctl --smart=on --offlineauto=on --saveauto=on /dev/hdc
=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.
SMART Attribute Autosave Enabled.
SMART Automatic Offline Testing Enabled every four hours.

In principle this only needs to be done once, because the drive should remember these settings across power cycles. The saveauto option tells the drive to save its attribute values automatically, and offlineauto makes it run automatic offline testing every four hours. The drive is supposed to wait “nicely” if it is already busy, so performance should not be seriously impacted.
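With more than one drive it gets tedious to read the --info output by eye. The sketch below classifies that output by looking at the “SMART support is:” lines shown earlier. The function names smart_state, scan_devices and report_all are hypothetical, and the parsing assumes the output format of the smartctl version used in this article; other versions may word things differently.

```shell
#!/bin/sh
# Classify `smartctl --info` output read on stdin as
# "enabled", "disabled" or "unavailable" (no Enabled/Disabled line seen).
smart_state() {
    awk '
        /^SMART support is:[ \t]*Enabled/  { state = "enabled" }
        /^SMART support is:[ \t]*Disabled/ { state = "disabled" }
        END { print (state == "" ? "unavailable" : state) }
    '
}

# Extract device paths from `smartctl --scan` output read on stdin.
# Assumes the device path is the first field of each line.
scan_devices() {
    awk '$1 ~ /^\/dev\// { print $1 }'
}

# Report the SMART state of every scanned drive. Needs root and a
# smartctl that supports --scan; defined here but not called.
report_all() {
    for dev in $(sudo smartctl --scan | scan_devices); do
        echo "$dev: $(sudo smartctl --info "$dev" | smart_state)"
    done
}
```

Calling report_all on a real machine prints one line per drive; any drive reported as disabled can then be switched on with the --smart=on command shown above.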

Testing

Here’s a way to run a “short” off-line test. It checks the electrical and mechanical performance of the drive and also does some read testing.


$ sudo smartctl --test=short /dev/hda
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Thu Feb  7 10:13:19 2013
Use smartctl -X to abort test.

$ sudo smartctl --log=selftest /dev/hda
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     43398        -

$ sudo smartctl --log=selftest /dev/hdc
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     37994         7234643

The first command starts the test and tells you how long to wait (about a minute here). The other two commands query each drive's self-test log to see whether anything bad came up. In this case hda was fine (“Completed without error”), but hdc reported a very serious “read failure” at LBA 7234643. Replace that drive ASAP!
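The same check can be scripted, for example from a cron job. The following is a rough sketch, not a complete monitor: failed_selftests is a made-up name, and it assumes that failed entries contain the word “failure” in the Status column, as in the hdc log above. Other failure wordings would slip through.

```shell
#!/bin/sh
# Read `smartctl --log=selftest` output on stdin and print every log
# entry (lines starting with "# <number>") whose text contains "failure".
# Exit status is 1 if at least one such entry was found, 0 otherwise.
failed_selftests() {
    awk '
        /^# *[0-9]+ / && /failure/ { print; bad = 1 }
        END { exit (bad + 0) }
    '
}

# Example use on a real system (needs root), left as a comment:
#   sudo smartctl --log=selftest /dev/hdc | failed_selftests \
#       || echo "self-test failure on /dev/hdc, replace the drive"
```

Because the function prints the offending log lines, the LBA of the first error stays visible in whatever mail or log the job produces.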