Suspend idle hard disks and monitor the suspension

Goal

Suspend idle hard disks and monitor the suspension state

Intro

To save energy, modern disks can spin down when they are not in use. Depending on the disk, this saves between 3 and 10 watts. The hdparm tool can set a spin-down timeout on each disk and can also report the current power state of a disk. There is, however, no automatic mechanism that sends a notification when a disk changes from idle to suspended or vice versa, so this blog post presents another solution. The idea is to query the disks for their state periodically and save the value to a database.

The problem is that the state cannot be tracked between two measurements. For this reason smartctl, a tool from the smartmontools package, is used to read the SMART value “Start Stop Count” from each disk, and its value is saved to the database as well. This is important because a spin-down timeout that is set too low can result in frequent spin-downs and spin-ups, which may shorten the lifetime of a hard drive.

hdparm

With hdparm [1] the spin down time can be set using the -S flag:

hdparm -S 120 /dev/sdb

This would set the timeout value to 10 minutes (120 times 5 seconds) for the second hard drive.

The timeout value can range from 0 to 255. A value of 0 disables automatic spin-down. Values from 1 to 240 set a timeout of the value multiplied by 5 seconds, which yields timeouts from 5 seconds to 20 minutes. The meaning of the other values is described in the man page of hdparm.
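The conversion described above can be sketched as a small shell function. This is only an illustration: the function name spindown_seconds is made up for this example, and only the ranges explained in the text (0 and 1 to 240) are handled.

```shell
# Convert an hdparm -S value into the resulting timeout in seconds.
# spindown_seconds is a hypothetical helper, not part of hdparm.
spindown_seconds() {
    local value=$1
    if [ "$value" -eq 0 ]; then
        # 0 disables automatic spin-down
        echo "disabled"
    elif [ "$value" -ge 1 ] && [ "$value" -le 240 ]; then
        # 1..240 are multiples of 5 seconds
        echo $(( value * 5 ))
    else
        # 241..255 have special meanings, see the hdparm man page
        echo "value $value: see the hdparm man page" >&2
        return 1
    fi
}
```

Calling spindown_seconds 120 prints 600, which matches the 10 minutes from the example above.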

To query the current state of a hard drive hdparm can be called with the -C flag.

hdparm -C /dev/sdb

This would output something similar to

/dev/sdb:
drive state is:  active/idle

if the disk is idle or

/dev/sdb:
drive state is:  standby

if the disk has been suspended.
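For use in a script, the state has to be extracted from this output. A minimal sketch, assuming the output format shown above; the function name drive_state is chosen for this example only:

```shell
# Print the power state reported by `hdparm -C`, read from stdin,
# e.g. "active/idle" or "standby".
# drive_state is a hypothetical helper name.
drive_state() {
    # Keep only the "drive state is:" line and strip its label.
    sed -n 's/^[[:space:]]*drive state is:[[:space:]]*//p'
}

# Example (needs root privileges and a real device):
#   hdparm -C /dev/sdb | drive_state
```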

The spin-down time setting is lost on system shutdown and must be set again on system startup. On Debian systems this can be done through the hdparm configuration file, located at /etc/hdparm.conf, which is applied during boot. See the man page for specific instructions. The configuration file itself is also well documented.

smartctl

Calling smartctl [2] with the -a flag outputs many lines of information about the given hard drive, including its health status and many other values.

smartctl -a /dev/sdb

would output the state of sdb.

The meaning of the output is described in the smartctl man page. For now, only one line in the “Vendor Specific SMART Attributes” section is relevant. The line beginning with “4 Start_Stop_Count” shows how many times the hard disk has powered up its disk motor. The column labeled RAW_VALUE holds the absolute number.

The above command spins up the hard drive if it is in standby. To prevent this, the -n flag with the value standby can additionally be passed to smartctl:

smartctl -a -n standby /dev/sdb
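The raw Start Stop Count can then be picked out of the smartctl output. The following is a sketch that assumes the usual layout of the attribute table, in which RAW_VALUE is the tenth column; the function name start_stop_count is made up for this example.

```shell
# Print the raw Start_Stop_Count value from `smartctl -a` output on stdin.
# Assumption: the attribute line has the usual ten columns
# (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED,
# WHEN_FAILED, RAW_VALUE), so the raw value is field 10.
start_stop_count() {
    awk '$1 == 4 && $2 == "Start_Stop_Count" { print $10 }'
}

# Example (needs root privileges and a real device):
#   smartctl -a -n standby /dev/sdb | start_stop_count
```

If the disk is in standby, smartctl with -n standby skips the query, so no value is printed in that case.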

Putting it all into a bash script

At the bottom of this blog post a bash script is attached which gathers information from hdparm and smartctl, saves the values into a database and generates a report of the saved data.

To work, the script needs sqlite, hdparm and smartmontools (smartctl).

The script can be called with several commands: create, clean, collect, generate and prune. Providing no command outputs a short usage description. The commands have the following meaning:

  • create: creates the database file and creates the database table schema. The location of the database file can be changed by altering the DB_FILE variable at the top of the script. Also the table name can be changed. The corresponding variable is named TABLE.
  • clean: drops the database table and therefore removes all values.
  • collect: queries hdparm and smartctl to gather the current power state and the Start Stop Count value for the specified hard drives. The hard drives to be monitored can be configured by setting the DRIVES variable, which is a whitespace-separated list. The script does not power up hard drives that are in standby mode. To automate the collection of data, a cron job can be set up. Calling the script each minute gathers 1440 data sets for each drive per day. This takes approximately 100 kB of disk space per day per hard drive.
  • generate: queries the database and outputs a report for the last day, week and month. This can be adjusted by altering the DAYS variable, which is a whitespace-separated list of integers. Each integer specifies a period of time in days. The generate command could be called in a daily cron job that sends the output to the machine administrator via email.
  • prune: removes old entries to avoid endless growth of the database. Entries older than the number of days given by the KEEP_DAYS variable are deleted; the default value is 31 days. It is recommended to call this command in a daily cron job as well.

The script also has variables for the location of the sqlite, hdparm and smartctl executables, which makes it easier to adapt to different environments.
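To give an impression of how the create, collect and prune commands might map to SQL, here is a rough sketch. It is not taken from the attached script: the schema, the column names, the default values and the function names are assumptions made for this example. Only the variable names DB_FILE, TABLE and KEEP_DAYS come from the description above.

```shell
# Hypothetical defaults; the attached script may use other values.
DB_FILE="${DB_FILE:-/var/tmp/disk-suspend-state.db}"
TABLE="${TABLE:-disk_state}"
KEEP_DAYS="${KEEP_DAYS:-31}"

# SQL for "create": one row per drive and measurement (assumed schema).
sql_create() {
    printf '%s\n' "CREATE TABLE IF NOT EXISTS $TABLE (ts TEXT DEFAULT CURRENT_TIMESTAMP, drive TEXT, state TEXT, start_stop INTEGER);"
}

# SQL for one "collect" sample: drive, power state and Start Stop Count.
# The count is stored as NULL when it could not be read (disk in standby).
sql_collect() {
    printf '%s\n' "INSERT INTO $TABLE (drive, state, start_stop) VALUES ('$1', '$2', ${3:-NULL});"
}

# SQL for "prune": delete rows older than KEEP_DAYS days.
sql_prune() {
    printf '%s\n' "DELETE FROM $TABLE WHERE ts < datetime('now', '-$KEEP_DAYS days');"
}

# Example: sql_create | sqlite3 "$DB_FILE"
```

The functions only print SQL, so the statements can be inspected before they are piped to the sqlite command line tool. A crontab line such as "* * * * * /usr/local/bin/disk-suspend-state.sh collect" would run the collection every minute; the path is an assumption.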

Further improvements

The script could be extended to generate graphs with rrdtool and send them via email. It could also be used as a basis for monitoring software like Munin, Cacti or Nagios.

Update 24.03.2011

The original script contained two bugs. First, the line endings in the downloadable file were not correct. The new version can be found below in the attachments section. The other bug concerns the usage of sqlite: when multiple processes try to access the database at the same time, one of them fails with an error message, but the database is left in a consistent state.
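One possible workaround for the failing process, shown here only as a sketch and not taken from the attached script, is to retry a failed database command a few times. The helper name retry and its arguments are assumptions made for this example.

```shell
# Run a command up to <attempts> times, waiting <delay> seconds between tries.
# Usage: retry <attempts> <delay> <command> [args...]
# Returns 0 as soon as one attempt succeeds, 1 if all attempts fail.
retry() {
    local attempts=$1 delay=$2 n=1
    shift 2
    while [ "$n" -le "$attempts" ]; do
        "$@" && return 0
        n=$(( n + 1 ))
        # Only wait if another attempt follows.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (database file and statement are placeholders):
#   retry 5 1 sqlite3 "$DB_FILE" "$SQL_STATEMENT"
```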

Attachments

disk-suspend-state.sh.gz
disk-suspend-state.sh.gz with corrected line endings

Resources
