Uninterruptible Power Supply

Network UPS Tools (NUT) is a client/server monitoring system that allows computers to share uninterruptible power supply (UPS) and power distribution unit (PDU) hardware. Clients access the hardware through the server and are notified whenever the power status changes.

Topology

In the following scenario, the server acts as the master. It monitors and controls the UPS via a USB data cable and keeps the slaves updated about the current situation.

Figure: UPS Diagram

The router and the NAS act as slaves. They get status updates about the UPS battery state and power supply from the master.

If the UPS notifies the master that its battery is getting close to depletion, the master will instruct the slaves to shut down.

After it has confirmation from all the slaves that they are shutting down, the master will start its own shutdown procedure and instruct the UPS to cut the power.

The Wi-Fi AP has no Network UPS Tools installed, so it is not aware of the current situation. It will, however, be shut down by the master via a remote SSH command if the battery runs low during a power outage.

The Ethernet switch also has no knowledge of the UPS. It will shut down uncontrolled when the UPS battery is depleted or when the master orders the UPS to cut the power.

Shutdown Plan

Here is what happens step-by-step in case of main power loss:

  1. Main power failure occurs:

    1. UPS device switches power to battery.
    2. UPS device notifies master with an “On Battery” event message.
    3. Master notifies slaves with an “On Battery” event message.
  2. UPS battery is getting close to depletion:

    1. UPS device notifies master with a “Battery Low” event message.

    2. Master issues “Forced Shutdown” command message to all slaves.

    3. Master issues remote shutdown commands by SSH to any unmanaged devices.

    4. Unmanaged devices start their shutdown procedure.

    5. Slaves receive the “Forced Shutdown” command message.

    6. Slaves may issue “Shutdown” notification message to their users.

    7. Slaves wait the set “Final Delay” time. This is to process the above notifications.

    8. Slaves notify the master with a “Notify Shutdown” event message.

    9. Slaves start their shutdown procedure:

      1. Ends all running processes.
      2. Unmounts all file systems.
      3. Remounts file systems as read-only.
      4. Halts the system (but doesn’t power off).
    10. Master waits until it has received “Notify Shutdown” event messages from all slaves.

    11. Master issues a “Shutdown” notification message to its users.

    12. Master waits the set “Final Delay” time. This is to process the above notifications.

    13. Master starts its shutdown procedure:

      1. Sets the “Killpower” flag
      2. Ends all running processes.
      3. Unmounts all file systems.
      4. Remounts file systems as read-only.
      5. Looks for the “Killpower” flag.
      6. Issues the “Kill Power” command to the UPS device.
      7. Halts the system (but doesn’t power off).
    14. UPS device receives the “Kill Power” command from the master:

      1. UPS waits for the “Shutdown Delay” time to pass. This is to give all systems enough time to properly shut down.
      2. UPS device cuts power on all outlets.
    15. All connected systems lose power.

  3. Main power supply has been restored:

    1. UPS device starts to recharge its battery.
    2. UPS device waits for the “Startup Delay” time to pass. This is to recharge the battery to a safe minimum level.
    3. UPS device restores power on all outlets.
    4. All connected systems start up.

Installation

$ sudo apt install nut

Configuration

nut.conf

This file tells the installed Network UPS Tools in which mode it should run. Depending on this setting the required modules are then started.

/etc/nut/nut.conf

# IMPORTANT NOTE:
# This file is intended to be sourced by shell scripts. You MUST NOT use
# spaces around the equal sign!
#
# Required. Recognized values are none, standalone, netserver and netclient.
# Defaults to none.
MODE=netserver

See the nut.conf(5) manpage for more possible options.
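
Because nut.conf uses shell syntax, a script can source it to find out which mode is configured. The following is only a minimal sketch and not part of NUT: the function name nut_mode is hypothetical, and the demo uses a temporary file as a stand-in for /etc/nut/nut.conf.

```sh
#!/bin/sh
# Print the MODE set in a nut.conf-style file.
# Falls back to "none", the documented default, when MODE is not set.
nut_mode() {
    conf_file=$1
    (
        # Source inside a subshell so the file's variables do not leak out.
        MODE=
        . "$conf_file"
        echo "${MODE:-none}"
    )
}

# Demo with a temporary stand-in for /etc/nut/nut.conf.
demo_conf=$(mktemp)
printf '# sample nut.conf\nMODE=netserver\n' > "$demo_conf"
nut_mode "$demo_conf"
rm -f "$demo_conf"
```

On a real system the call would be nut_mode /etc/nut/nut.conf. Sourcing only works while the file follows the "no spaces around the equal sign" rule from the comment above.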

ups.conf

This file is read by the driver controller. It tells Network UPS Tools what kind of UPS device it has to work with and holds settings that control communication with the device. Some UPS device parameters can also be overridden here.

/etc/nut/ups.conf

#
# ups.conf - UPS definitions for Network UPS Tools
#
# Set global directives, and individual UPS device options for the NUT UPS
# device drivers.
#

# Wait 45 seconds for the driver to finish starting.
maxstartdelay = 45

# Try 3 times to start the driver, before giving up.
maxretry = 3

# Wait 5 seconds between attempts to start the driver.
retrydelay = 5

# Run the drivers as user "nut".
user = nut

#
# Our UPS device(s)
#
[apc]
    # Back-UPS RS 900G FW:879.L4 .I USB FW:L4
    driver = usbhid-ups
    port = auto
    desc = "APC Back-UPS RS 900G"
    offdelay = 120
    ondelay = 240


[ups]
    # Pretend to be a Synology NAS, so other DiskStations will connect here.
    driver = dummy-ups
    port = apc@localhost
    desc = "Synology UPS server"

See the ups.conf(5) and usbhid-ups(8) manpages for more possible options.
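
The section names in brackets are the UPS names that clients later refer to, as in apc@localhost. As a quick sanity check, the names can be listed from the file. This is a small sketch under assumptions: list_ups_names is a hypothetical helper, and the demo works on a shortened temporary copy rather than the real /etc/nut/ups.conf.

```sh
#!/bin/sh
# Print the name of every [section] found in a ups.conf-style file.
list_ups_names() {
    sed -n 's/^[[:space:]]*\[\([^]]*\)][[:space:]]*$/\1/p' "$1"
}

# Demo with a shortened stand-in for /etc/nut/ups.conf.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
maxretry = 3
[apc]
    driver = usbhid-ups
[ups]
    driver = dummy-ups
EOF
list_ups_names "$demo_conf"
rm -f "$demo_conf"
```

For the configuration above this prints apc and ups, one per line.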

upsd.conf

Here we control access to the server and set some other miscellaneous configuration values.

/etc/nut/upsd.conf

#
# upsd.conf - Configuration for Network UPS Tools upsd
#

# Interfaces to listen on for TCP connections from clients.
# If no port is specified, the default port is 3493.
# This is only read at startup of upsd. If you make changes here,
# you'll need to restart upsd; a reload will have no effect.
#
LISTEN 127.0.0.1
LISTEN ::1
LISTEN 192.0.2.10
LISTEN 2001:db8:c0de::10

See the upsd.conf(5) manpage for more possible options.

upsd.users

Administrative commands such as setting variables and the instant commands are powerful, and access to them needs to be restricted. This file defines who may access them, and what is available.

Each user gets its own section. The fields in that section set the parameters associated with that user’s privileges. The section begins with the name of the user in brackets, and continues until the next user name in brackets or EOF. These users are independent of /etc/passwd.

/etc/nut/upsd.users

#
# upsd.users - User definitions for NUT upsd
#

[adminuser]
    
    # Administrative user
    password = ********
    
    # Allow changing values of certain variables in the UPS.
    actions = SET

    # Allow setting the "Forced Shutdown" flag in the UPS.
    actions = FSD

    # Allow all instant commands
    instcmds = ALL


[server]
    
    # The localhost, master server
    password = ********

    # Allow required instant commands to act as master.
    upsmon = master


[router]
    # OpenWRT router
    password = ********
    
    # Allow required actions to act as slave.
    upsmon = slave


[monuser]

    # Pretend to be a Synology NAS, so other DiskStations will connect here.
    password = secret

    # Allow required actions to act as slave.
    upsmon = slave

See the upsd.users(5) manpage for more possible options.

upsmon.conf

This file’s primary job is to define the systems that upsmon(8) will monitor and to tell it how to shut down the system when necessary.

Additionally, other optional configuration values can be set in this file.

/etc/nut/upsmon.conf

#
# upsmon.conf - Configuration for Network UPS Tools upsmon
#

# Drop privileges to the following user profile after startup.
RUN_AS_USER nut

# List of UPS devices to monitor.
MONITOR apc@localhost 1 server ******* master

# Number of power supplies receiving power to keep this system running.
MINSUPPLIES 1

# Command to run to shut down this system.
SHUTDOWNCMD "/bin/systemctl halt"

# Command to run on +EXEC events.
NOTIFYCMD "/sbin/upssched"

# Change behavior of upsmon on certain events.
#
# Possible values for the flags:
#
# SYSLOG - Write the message in the syslog
# WALL   - Write the message to all users on the system
# EXEC   - Execute NOTIFYCMD (see above) with the message
# IGNORE - Don't do anything
#
# If you use IGNORE, don't use any other flags on the same line.
#
# NOTIFYFLAG <notify type> <flag>[+<flag>][+<flag>] ...
#
NOTIFYFLAG ONLINE   SYSLOG+WALL+EXEC
NOTIFYFLAG ONBATT   SYSLOG+WALL+EXEC
NOTIFYFLAG LOWBATT  SYSLOG+WALL+EXEC
NOTIFYFLAG FSD      SYSLOG+WALL+EXEC
NOTIFYFLAG COMMOK   SYSLOG+WALL+EXEC
NOTIFYFLAG COMMBAD  SYSLOG+WALL+EXEC
NOTIFYFLAG SHUTDOWN SYSLOG+WALL+EXEC
NOTIFYFLAG REPLBATT SYSLOG+WALL+EXEC
NOTIFYFLAG NOCOMM   SYSLOG+WALL+EXEC
NOTIFYFLAG NOPARENT SYSLOG+WALL


# Poll the UPS every 5 seconds.
POLLFREQ 5

# If the UPS is on battery, poll it every 5 seconds.
POLLFREQALERT 5

# Wait no more than 15 seconds for "Notify Shutdown" messages from slaves.
HOSTSYNC 15

# Wait no more than 15 seconds before considering an unreachable UPS dead.
DEADTIME 15

# Location of the flag-file to make UPS turn itself off.
POWERDOWNFLAG /etc/killpower

# Warn every 12 hours if battery needs to be replaced.
RBWARNTIME 43200

# Warn every 5 minutes, if UPS is unreachable.
NOCOMMWARNTIME 300

# Wait 5 seconds before starting to shut down.
FINALDELAY 5

See the upsmon.conf(5) manpage for more possible options.
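
A MONITOR line consists of the fields system, power value, username, password and type, in that order. The user name has to match a section in upsd.users, so it can be handy to print the fields for review. The helper below is a sketch; show_monitors is a hypothetical name, and the password is deliberately not printed.

```sh
#!/bin/sh
# Print system, power value, user and type of every MONITOR line.
# The password (fifth field) is left out on purpose.
show_monitors() {
    awk '$1 == "MONITOR" && NF >= 6 {
        printf "system=%s powervalue=%s user=%s type=%s\n", $2, $3, $4, $6
    }' "$1"
}

# Demo with a temporary stand-in for /etc/nut/upsmon.conf.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
# List of UPS devices to monitor.
MONITOR apc@localhost 1 server not-a-real-password master
MINSUPPLIES 1
EOF
show_monitors "$demo_conf"
rm -f "$demo_conf"
```

On the master this would be run as show_monitors /etc/nut/upsmon.conf, with sufficient rights to read the file.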

upssched.conf

This file controls the operations of upssched(8), the timer-based helper program for upsmon(8).

Here we can define our own script, which will be executed on certain events.

/etc/nut/upssched.conf

#
# Network UPS Tools - upssched.conf file
#

# The command script to run
CMDSCRIPT /usr/local/bin/ups-scheduled-tasks

# Command pipe and lock-file
PIPEFN /run/nut/upssched.pipe
LOCKFN /run/nut/upssched.lock

# The UPS is back on line.
# Cancel any running "On Battery" timer, then execute the "Online" command.
AT ONLINE apc@localhost CANCEL-TIMER onbatt online

# The UPS is on battery.
# Start a 10-second timer, then execute the "On Battery" command.
AT ONBATT apc@localhost START-TIMER onbatt 10

# The UPS battery is low (as determined by the driver).
# Execute the "Low Battery" command immediately.
AT LOWBATT apc@localhost EXECUTE lowbatt

# The UPS has been commanded into the "Forced Shutdown" mode.  
# Execute the "Forced Shutdown" command immediately.
AT FSD apc@localhost EXECUTE fsd

# Communication with the UPS has been established.
# Cancel any running "Communications Lost" timer, then execute the
# "Communications Restored" command.
AT COMMOK apc@localhost CANCEL-TIMER commbad commok

# Communication with the UPS was just lost.
# Start a 15-second timer, then execute the "Communications Lost" command.
AT COMMBAD apc@localhost START-TIMER commbad 15

# The local system is being shut down.    
# Execute the "Notify Shutdown" command immediately.
AT SHUTDOWN apc@localhost EXECUTE shutdown

# The UPS needs to have its battery replaced.
# Start a 5-minute timer, then execute the "Replace Battery" command.
AT REPLBATT apc@localhost START-TIMER replbatt 300

# The UPS can’t be contacted for monitoring.
# Start a 15-second timer, then execute the "No Communications" command.
AT NOCOMM apc@localhost START-TIMER nocomm 15

See the upssched.conf(5) manpage for more possible options.

Securing Configuration Files

Secure the configuration files to protect the various access controls and credentials.

$ sudo chown -R root:nut /etc/nut
$ sudo chmod 0770 /etc/nut
$ sudo chmod 0640 /etc/nut/*
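
To verify the result, we can look for files that are still accessible to users outside the owner and group. The following is a sketch; world_accessible_files is a hypothetical helper, and the demo runs on a temporary directory instead of /etc/nut.

```sh
#!/bin/sh
# List regular files below a directory that grant any permission
# (read, write or execute) to "other" users.
world_accessible_files() {
    find "$1" -type f \( -perm -004 -o -perm -002 -o -perm -001 \)
}

# Demo with a temporary directory standing in for /etc/nut.
demo_dir=$(mktemp -d)
touch "$demo_dir/upsd.users" "$demo_dir/upsd.conf"
chmod 0640 "$demo_dir/upsd.users"
chmod 0644 "$demo_dir/upsd.conf"
world_accessible_files "$demo_dir"   # lists only the upsd.conf path
rm -rf "$demo_dir"
```

An empty result for world_accessible_files /etc/nut means that no configuration file is readable by other users.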

Scheduled Command Script

We use this to send remote commands by SSH to systems whose power lines are connected to the UPS, but which don’t have Network UPS Tools installed and so can’t know by themselves when they should shut down.

In the following example this will be a MikroTik device called “ap.example.net”.

We assume that there is a user profile called nut on the remote device and that it has a properly installed SSH public key to allow password-less logins.

We create the script for sending out remote commands:

/usr/local/bin/ups-scheduled-tasks

#!/bin/sh
#
# SSH connection settings
SSH_HOST='ap.example.net'
SSH_USER='nut'
SSH_KEY='/etc/nut/ssh/id_rsa'

# $1 is the timer or command name passed by upssched.
case $1 in
    onbatt)
        message="Power Failure on UPS ${UPSNAME}!"
        echo "Warning: UPS $UPSNAME experienced a power failure and is now running on battery!" \
        | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    online)
        message="Power restored on UPS $UPSNAME"
        echo "Power on UPS $UPSNAME has been restored." \
        | mail -s "$message" root
        remote_cmd="log info message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    lowbatt)
        message="Low battery on UPS ${UPSNAME}!"
        echo "Warning: UPS $UPSNAME is low on battery! All connected systems will be shut down soon." \
        | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    fsd)
        message="Forced Shutdown from UPS ${UPSNAME}!"
        echo "Warning: All systems connected to UPS $UPSNAME will be shut down now!" \
        | mail -s "Warning: $message" root
        remote_cmd="log error message=\"${message}\" ; beep 0.5 ; delay 4000ms ; beep 0.5 ; system shutdown!"
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    commok)
        message="Communications restored with UPS $UPSNAME"
        echo "Communications with UPS $UPSNAME have been restored." \
        | mail -s "$message" root
        remote_cmd="log info message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    commbad)
        message="Lost communications with UPS ${UPSNAME}!"
        echo "Warning: Lost communications with UPS ${UPSNAME}!" \
        | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    shutdown)
        # $HOST is not set in plain sh, so ask the system for its name.
        message="System $(hostname) is shutting down now!"
        echo "Warning: $message" \
        | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    replbatt)
        message="Replace battery on UPS ${UPSNAME}!"
        echo "Warning: The UPS $UPSNAME needs to have its battery replaced!" \
        | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    nocomm)
        message="The UPS $UPSNAME can’t be contacted for monitoring!"
        echo "Warning: The UPS $UPSNAME can’t be contacted for monitoring!" \
        | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -i "$SSH_KEY" -l "$SSH_USER" "$SSH_HOST" "$remote_cmd"
        ;;
    *)
        logger -t ups-scheduled-tasks "Unrecognized command: $1"
        ;;
esac

Make the script executable by the nut system user:

$ sudo chown nut:nut /usr/local/bin/ups-scheduled-tasks
$ sudo chmod 740 /usr/local/bin/ups-scheduled-tasks

Powering Off the UPS

After the server has terminated all processes, unmounted file-systems and re-mounted them read-only, it is safe to cut off the power.

When placed in the directory /lib/systemd/system-shutdown, the following script is executed by systemd-halt.service(8) just before the CPU is turned off.

/lib/systemd/system-shutdown/nut.shutdown

#!/bin/sh

# Paths assume the Debian-style layout used elsewhere in this guide
# (/sbin for the NUT binaries, /etc/nut for the configuration).

# Test for the shutdown flag
if /sbin/upsmon -K >/dev/null 2>&1; then

    # Check if a power race workaround has been configured
    wait_delay=$(/bin/sed -ne 's#^ *POWEROFF_WAIT= *\(.*\)$#\1#p' /etc/nut/nut.conf)

    # Command the UPS driver(s) to run their shutdown sequence
    /sbin/upsdrvctl shutdown

    if [ -n "$wait_delay" ]; then
        /bin/sleep "$wait_delay"
        # We need to pass --force twice here to bypass systemd and execute the
        # reboot directly ourselves.
        /bin/systemctl reboot --force --force
    fi
fi

exit 0

Make it executable:

$ sudo chmod +x /lib/systemd/system-shutdown/nut.shutdown

See the upsdrvctl(8) manpage for more information.

UPS Device Configuration

Depending on your model, some settings of your UPS device can be changed by writing values to variables stored in the device EEPROM.

This is done with the upsrw command.

To get a list of the writable configuration variables on your model:

$ upsrw apc@localhost

[battery.charge.low]
Remaining battery level when UPS switches to LB (percent)
Type: STRING
Maximum length: 10
Value: 10

[battery.runtime.low]
Remaining battery runtime when UPS switches to LB (seconds)
Type: STRING
Maximum length: 10
Value: 120

[input.sensitivity]
Input power sensitivity
Type: STRING
Maximum length: 10
Value: medium

[input.transfer.high]
High voltage transfer point (V)
Type: STRING
Maximum length: 10
Value: 294

[input.transfer.low]
Low voltage transfer point (V)
Type: STRING
Maximum length: 10
Value: 176

[ups.delay.shutdown]
Interval to wait after shutdown with delay command (seconds)
Type: STRING
Maximum length: 10
Value: 20
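
For a quick overview, the listing can be condensed to one name=value line per variable. The filter below is a sketch that assumes the output layout shown above (a bracketed name followed later by a "Value:" line); upsrw_values is a hypothetical helper name, and the demo feeds it two entries copied from the listing instead of calling upsrw.

```sh
#!/bin/sh
# Condense upsrw-style output (read from stdin) to name=value lines.
upsrw_values() {
    awk '
        # Remember the variable name from a "[name]" line.
        /^\[.*\]$/ { name = substr($0, 2, length($0) - 2) }
        # Print the remembered name together with its value.
        /^Value:/  { sub(/^Value:[ \t]*/, ""); print name "=" $0 }
    '
}

# Demo with two entries taken from the listing above.
upsrw_values <<'EOF'
[battery.charge.low]
Remaining battery level when UPS switches to LB (percent)
Type: STRING
Maximum length: 10
Value: 10

[battery.runtime.low]
Remaining battery runtime when UPS switches to LB (seconds)
Type: STRING
Maximum length: 10
Value: 120
EOF
```

With a running server the filter would be used as: upsrw apc@localhost | upsrw_values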

Timing

For the above procedure to work as intended, timing is critical.

To be on the safe side:

  1. Every device needs to have enough time to be safely halted.
  2. Additional processing time and delays have to be added.
  3. The UPS device needs to send its “Low Battery” notification when there is just about enough battery time left for the whole process to complete.

Figure: UPS Forced Shutdown Timeline

Power Down Time

For every device (the master, the slaves and any remote-controlled device):

  1. Have a stopwatch ready.
  2. Start the stopwatch and initiate a full power down of the device.
  3. Stop the stopwatch when the device has powered down.
  4. Note the time of the slowest device.
  5. Add configured delays from upsmon.conf like HOSTSYNC and FINALDELAY.

Device power down times:

Device                        Power Down  Delays  Total Time
Server (master)               20 sec      20 sec  40 sec
NAS (slave, slowest)          90 sec      5 sec   95 sec
Router (slave)                10 sec      5 sec   15 sec
Wi-Fi AP (remote controlled)  10 sec      5 sec   15 sec
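
The totals in the table are simply power-down time plus delays, and the slowest device is the largest total. A small sketch of that arithmetic, using the measured values from the table; total_time and max_of are hypothetical helper names.

```sh
#!/bin/sh
# total_time POWER_DOWN DELAYS: print the sum in seconds.
total_time() {
    echo $(( $1 + $2 ))
}

# max_of N...: print the largest of the given non-negative integers.
max_of() {
    max=0
    for n in "$@"; do
        if [ "$n" -gt "$max" ]; then
            max=$n
        fi
    done
    echo "$max"
}

# Values from the table above (seconds).
server=$(total_time 20 20)
nas=$(total_time 90 5)
router=$(total_time 10 5)
ap=$(total_time 10 5)

echo "slowest device needs $(max_of "$server" "$nas" "$router" "$ap") sec"
```

With the numbers from the table this reports 95 seconds, the total of the NAS.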

“Low Battery” Time

The “Low Battery” time of the UPS must be at least the time needed to shut down the slowest device.

We can set this by programming the EEPROM of the UPS device with the upsrw command:

$ upsrw -s battery.runtime.low=120 -u adminuser apc@localhost
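
Whether a chosen value is large enough can be checked with a one-line comparison. This is only a sketch: lowbatt_time_ok is a hypothetical helper, and the two numbers are the 120 seconds set above and the 95 seconds measured for the slowest device.

```sh
#!/bin/sh
# lowbatt_time_ok RUNTIME_LOW SLOWEST_TOTAL
# Succeeds when the "Low Battery" runtime (seconds) covers the
# shutdown time of the slowest device (seconds).
lowbatt_time_ok() {
    [ "$1" -ge "$2" ]
}

if lowbatt_time_ok 120 95; then
    echo "battery.runtime.low is sufficient"
else
    echo "battery.runtime.low is too short"
fi
```

The value currently stored in the device can be read with upsc apc@localhost battery.runtime.low and passed in as the first argument.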

“Shutdown Delay” Time

If the slowest device is a slave, the master will tell the UPS to cut the power before that slave has completed its shutdown.

The UPS must therefore delay the power-off until the last slave has completed its shutdown:

Slowest slave time - master time = “Shutdown Delay” time

95 sec - 40 sec = 55 sec

Luckily our UPS device can be programmed to delay the power-off. Here the computed 55 seconds are rounded up to 60:

$ upsrw -s ups.delay.shutdown=60 -u adminuser apc@localhost
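
The same calculation as a small sketch. shutdown_delay and round_up are hypothetical helper names; rounding up to the next multiple of 10 is an assumption made here to add a little margin, not a requirement of NUT or of the UPS.

```sh
#!/bin/sh
# shutdown_delay SLOWEST_SLAVE_TOTAL MASTER_TOTAL
# Print the required delay in seconds, never less than zero.
shutdown_delay() {
    delay=$(( $1 - $2 ))
    if [ "$delay" -lt 0 ]; then
        delay=0
    fi
    echo "$delay"
}

# round_up VALUE STEP: round VALUE up to the next multiple of STEP.
round_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# Totals from the timing table: NAS 95 sec, server (master) 40 sec.
delay=$(shutdown_delay 95 40)
echo "computed delay: $delay sec"
echo "value to program: $(round_up "$delay" 10) sec"
```

If the master turns out to be the slowest device, the computed delay is zero and the UPS only needs its own minimum delay.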

“Power-On Delay” Time

When the power comes back, the UPS should recharge the battery to a safe level before turning the power back on. By safe level, I mean: the battery must be charged enough that all devices can fully boot and there is still enough power left for the whole configured “Low Battery” time.

Without an appropriate recharging delay, the power could come back for just a short time (which is often the case in power failures); the devices would start to boot, only to be powered off again uncontrolled, either during boot or during shutdown.

Unfortunately this feature is not supported on my UPS device model.

There might be features that provide similar functionality:

  • ups.delay.start
  • “Reboot Delay” time
  • “Minimum Charge” to return online
  • “Load On Delay” time (load.on.delay)

If I could, I would set my UPS to charge for roughly double the “Low Battery” time:

$ upsrw -s ups.delay.start=240 -u adminuser apc@localhost