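With respect to hard drives, the acronym “SMART” stands for Self-Monitoring, Analysis and Reporting Technology. It is built into many ATA-3 and later ATA, IDE and SCSI-3 hard drives, so basically any drive made after about 2005 should have it. The smartmontools package provides the tools for working with it; install it with the command for your distribution: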
Ubuntu/Debian:
sudo apt-get install smartmontools
CentOS/Fedora/RH:
sudo yum install smartmontools
Gentoo:
sudo emerge sys-apps/smartmontools
Wiki: http://sourceforge.net/apps/trac/smartmontools/wiki
smartctl
The program smartctl is used to interface with the SMART features in the drive firmware. Here are a couple of easy things to get started with (note that some versions do not have the --scan option):
$ smartctl --scan -d ata
/dev/hda -d ata # /dev/hda, ATA device
/dev/hdc -d ata # /dev/hdc, ATA device
$ sudo smartctl --info /dev/hdc
smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.33.1-xedvia] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus
Device Model:     ST3160023A
Serial Number:    5JS9MDKW
Firmware Version: 8.01
User Capacity:    160,041,885,696 bytes [160 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Thu Feb 7 09:27:18 2013 PST
SMART support is: Available - device has SMART capability.
SMART support is: Disabled
Note that the “SMART support” is listed as available but disabled. To enable full diagnostic checking turn it on with something like this:
$ sudo smartctl --smart=on --offlineauto=on --saveauto=on /dev/hdc
=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.
SMART Attribute Autosave Enabled.
SMART Automatic Offline Testing Enabled every four hours.
In theory this only needs to be done once, because the drive should remember these settings across power cycles. The saveauto option tells the drive to save its attribute values automatically, and offlineauto causes automatic off-line testing every four hours. The drive is supposed to wait “nicely” if it is already busy, so performance should not be seriously impacted.
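To see from a script whether a drive still needs that step, the two “SMART support is:” lines can be picked out of the --info output. Here is a minimal sketch, assuming the lines look like the ones shown above. The name smart_state is a made-up helper, not part of smartmontools, and the “Unavailable” pattern is an assumption about what a drive without SMART reports.

```shell
#!/bin/sh
# smart_state: hypothetical helper (not part of smartmontools).
# Reads `smartctl --info` text on stdin and prints one word:
# enabled, disabled, unavailable, or unknown.
smart_state() {
    awk '
        # Assumption: drives without SMART print "Unavailable" here.
        /^SMART support is: *Unavailable/ { state = "unavailable" }
        # These two match the status line shown in the sample output.
        /^SMART support is: *Enabled/     { state = "enabled" }
        /^SMART support is: *Disabled/    { state = "disabled" }
        END {
            if (state == "") state = "unknown"
            print state
        }
    '
}

# Example use (needs a real drive, so it is left commented out):
#   sudo smartctl --info /dev/hdc | smart_state
```

Fed the --info output shown earlier for /dev/hdc, this prints disabled, because the “Available” line matches none of the patterns and the “Disabled” line does.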
Testing
Here’s a way to run a “short” off-line test. This tests electrical and mechanical performance of the drive and does read testing.
$ sudo smartctl --test=short /dev/hda
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Thu Feb 7 10:13:19 2013
Use smartctl -X to abort test.
$ sudo smartctl --log=selftest /dev/hda
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        43398            -
$ sudo smartctl --log=selftest /dev/hdc
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed: read failure  90%        37994            7234643
The first command starts the test and tells you to come back in about a minute. The second command queries the drive's self-test log to see if anything bad came up. In this case hda was fine (“Completed without error”) but hdc reported a very important “read failure”, with the address of the first bad sector in the LBA_of_first_error column. Replace that drive ASAP!
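The same check can be scripted. Here is a rough sketch, assuming failed entries in the self-test log always have a status starting with “Completed:” the way the “Completed: read failure” line above does. The name selftest_failed is a made-up helper, not something smartmontools provides.

```shell
#!/bin/sh
# selftest_failed: hypothetical helper (not part of smartmontools).
# Reads `smartctl --log=selftest` text on stdin, prints every log entry
# whose status contains "Completed: " (for example "Completed: read
# failure"), and returns 1 if at least one such entry was found, else 0.
selftest_failed() {
    awk '
        # Log entries start with "#" and an entry number.
        /^# *[0-9]+ / && /Completed: / { print; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}

# Example use (needs a real drive, so it is left commented out):
#   sudo smartctl --log=selftest /dev/hdc | selftest_failed \
#       || echo "self-test failure logged on /dev/hdc"
```

Entries that say “Completed without error” are not printed, so a clean log produces no output and a zero exit status.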