Uninterruptible Power Supply

Network UPS Tools (NUT) is a client/server monitoring system that allows computers to share uninterruptible power supply (UPS) and power distribution unit (PDU) hardware. Clients access the hardware through the server, and are notified whenever the power status changes.

The package nut-upsmon provides the client program, which is responsible for the most important part of UPS monitoring: shutting down the system when the power goes out.

The upsmon command can call out to other helper programs for notification purposes during power events. upsmon can monitor multiple systems using a single process. Every UPS that is defined in the upsmon.conf configuration file is assigned a power value and a type (slave or master).

Installation

These are the OpenWrt software packages to install:

  • nut-server: NUT server or standalone; only for the host attached directly to the UPS. Note that it requires a ‘nut-driver-xxx’ driver package to actually connect to the UPS.

  • nut-driver-usbhid-ups: Driver for most USB attached UPS devices.

  • nut-driver-dummy-ups: Provides a virtual UPS as an alias of a real one.

  • nut-upsmon: Monitors the UPS and triggers the shutdown (client mode; it can also run on the server and is in fact recommended on all hosts).

  • nut-upssched: Schedules script actions to run some time after a UPS event.

router$ opkg update
router$ opkg install nut-driver-dummy-ups nut-driver-usbhid-ups nut-server \
    nut-upsmon nut-upssched luci-app-nut collectd-mod-nut
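
To confirm that the packages listed above actually ended up on the router, something like the following sketch could be used. The helper name missing_packages is our own and not part of opkg or NUT; it only assumes that opkg list-installed prints one "name - version" line per package.

```shell
#!/bin/sh
# Packages from the list above that the NUT setup depends on.
REQUIRED="nut-server nut-driver-usbhid-ups nut-driver-dummy-ups nut-upsmon nut-upssched"

# missing_packages: read `opkg list-installed` output on stdin and
# print every required package that does not appear in it.
missing_packages() {
    installed="$(cut -d' ' -f1)"
    for pkg in $REQUIRED; do
        printf '%s\n' "$installed" | grep -qxF "$pkg" || printf '%s\n' "$pkg"
    done
}

# On the router:
#   opkg list-installed | missing_packages
```

No output means that all required packages are present.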

Topology

In our scenario, our UPS is connected directly to our router via USB.

                      |--------- USB -----------|
                      |                         |
                    |-----|                 |--------|
                    |     |... power-line...| Router |---WAN---
                    |     |                 |--------|
                    |     |                      |
                    |     |                     LAN
                    |     |                      |
                    |     |                 |--------|
                    | UPS |... power-line...| Switch |
                    |     |                 |--------|
                    |     |                      |
                    |     |                     LAN
                    |     |                      |
                    |     |                 |--------|
|AC|...power-line...|     |... power-line...| Server |
                    |-----|                 |--------|

We therefore connect servers, NAS devices and other equipment to the NUT daemon running on the router, so that they get status information about the power supply and receive shutdown commands.

NUT Server

The NUT server connects directly to the UPS device via USB cable. It then provides the UPS services to local and remote clients.

Edit the file /etc/config/nut_server:

#
# OpenWrt Luci configuration for the Network UPS Tools Daemon
#

# NUT device driver for APC Back-UPS
# provides the 'apc' UPS device to clients
config driver 'apc'
	option driver 'usbhid-ups'
	option port 'auto'
	option vendorid '051d'
	option productid '0002'
	option serial '000000000000'
	option maxreport 'true'
	option desc '"APC Back-UPS RS 900G"'

# A virtual copy of the 'apc' device
# which emulates a Synology NAS with attached UPS
config driver 'ups'
	option port 'apc@localhost'
	option driver 'dummy-ups'
	option desc '"Emulated Synology UPS"'

# Administration user profile for control and configuration
# of the connected UPS device
config user
	option username 'admin'
	option password '********'
	option upsmon 'master'
	option actions 'set fsd'
	list instcmd 'beeper.disable'
	list instcmd 'beeper.enable'
	list instcmd 'beeper.mute'
	list instcmd 'beeper.off'
	list instcmd 'beeper.on'
	list instcmd 'load.off'
	list instcmd 'load.off.delay'
	list instcmd 'shutdown.reboot'
	list instcmd 'shutdown.stop'
	list instcmd 'test.battery.start.deep'
	list instcmd 'test.battery.start.quick'
	list instcmd 'test.battery.stop'
	list instcmd 'test.panel.start'
	list instcmd 'test.panel.stop'

# User profile for the monitor on this local OpenWrt router
# Note: As a "master" it will tell other clients and the UPS device itself when
# to shut down
config user
	option username 'router'
	option password '********'
	option upsmon 'master'

# User profile for the OpenWrt collectd statistics package
config user
	option username 'collectd'
	option password '********'
	option upsmon 'slave'

# Sample user profile for a remote server connecting here
config user
	option username 'server'
	option password '********'
	option upsmon 'slave'

# Hardwired user profile and password used by all Synology NAS devices to
# connect with the emulated Synology NAS defined above
config user
	option username 'monuser'
	option password 'secret'
	option upsmon 'slave'

# Run the driver as root, to avoid various NUT USB driver issues.
config driver_global 'driver_global'
	option user 'root'

# Listen on all interfaces (local and remote)
config listen_address
	option address '0.0.0.0'

config upsd 'upsd'

# -*- mode: TXT; indent-tabs-mode: nil; tab-width: 4; -*-
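
Once the server has been restarted with this configuration, the driver can be queried with the upsc client. The following is a minimal sketch that assumes the nut-upsc package, which is not part of the install list above, is also installed. The helper ups_var is a hypothetical name of ours; it relies only on upsc printing one "name: value" pair per line.

```shell
#!/bin/sh
# ups_var NAME: read "name: value" lines (as printed by upsc) on stdin
# and print the value that belongs to variable NAME.
ups_var() {
    awk -v k="$1" 'index($0, k ": ") == 1 { print substr($0, length(k) + 3) }'
}

# On the router (the init script name is an assumption about the
# OpenWrt package):
#   /etc/init.d/nut-server restart
#   upsc apc@localhost | ups_var ups.status
#   upsc ups@localhost | ups_var battery.charge
```

Querying both 'apc' and 'ups' is a quick way to see whether the dummy-ups copy mirrors the real device.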

NUT Monitor

The NUT monitoring client connects to the local NUT daemon as a master and decides which actions to take based on the condition of the power supply.

Edit the file /etc/config/nut_monitor:

#
# OpenWrt Luci configuration for the Network UPS Tools Monitor
#

# Set various defaults
config upsmon 'upsmon'
    option runas nutmon
    option minsupplies '1'
    option shutdowncmd '/usr/sbin/nutshutdown'
    option notifycmd '/usr/sbin/upssched'
    list defaultnotify 'SYSLOG'
    list defaultnotify 'EXEC'
    option pollfreq '5'
    option pollfreqalert '5'
    option hostsync '15'
    option deadtime '15'
    option powerdownflags '/var/run/killpower'
    option finaldelay '5'

# Master user profile to use while connecting to the local NUT daemon
config master
    option upsname 'apc'
    option powervalue '1'
    option hostname 'localhost'
    option username 'router'
    option password '********'

# -*- mode: TXT; indent-tabs-mode: nil; tab-width: 4; -*-
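
The condition the monitor reacts to is the ups.status string reported by the driver, a space-separated list of flags such as OL (on line power), OB (on battery) and LB (low battery). upsmon treats a UPS that is both on battery and low on battery as critical. The sketch below illustrates that rule only; is_critical is a hypothetical helper of ours and not how upsmon is implemented.

```shell
#!/bin/sh
# is_critical STATUS: succeed when the status string contains both the
# OB (on battery) and the LB (low battery) flag.
is_critical() {
    case " $1 " in *" OB "*) ;; *) return 1 ;; esac
    case " $1 " in *" LB "*) return 0 ;; esac
    return 1
}

# Example:
#   is_critical "OB DISCHRG LB" && echo "shutdown would be started"
```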

Notify Command Configuration

In the above /etc/config/nut_monitor file we set the option notifycmd to /usr/sbin/upssched and added “EXEC” to the defaultnotify list of actions to be taken.

The upssched program needs additional configuration, which is not provided by OpenWrt’s Luci interface and is therefore set up directly in the NUT configuration file /etc/nut/upssched.conf:

# Network UPS Tools - upssched.conf
#
# ============================================================================
#
# CMDSCRIPT <scriptname>
#
# This script gets called to invoke commands for timers that trigger.
# It is given a single argument - the <timername> in your
# AT ... START-TIMER defines.
#
# *** This must be defined *before* the first AT line.  Otherwise the
#     program will complain and exit without doing anything.
#
# A shell script with a big case..esac construct should work nicely for this.
# An example has been provided to help you get started.
#
# ***** Note for OpenWrt Users ****
# Use your own script in a custom location, as modifications to the sample file
# provided in /usr/bin/upssched-cmd will be lost on updates.
#
#CMDSCRIPT /usr/bin/upssched-cmd
CMDSCRIPT /root/upssched-cmd

# ============================================================================
#
# PIPEFN <filename>
#
# This sets the file name of the FIFO that will pass communications between
# processes to start and stop timers.  This should be set to some path where
# normal users can't create the file, due to the possibility of symlinking
# and other evil.
#
# Note: if you are running Solaris or similar, the permissions that
# upssched sets on this file *are not enough* to keep you safe.  If
# your OS ignores the permissions on a FIFO, then you MUST put this in
# a protected directory!
#
# Note 2: by default, upsmon will run upssched as whatever user you have
# defined with RUN_AS_USER in upsmon.conf.  Make sure that user can
# create files and write to files in the path you use for PIPEFN and
# LOCKFN.
#
# My recommendation: create a special directory for upssched, make it
# owned by your upsmon user, then use it for both.
#
# This is commented out by default to make you visit this file and think
# about how your system works before potentially opening a hole.
#
# ***** Note for OpenWrt Users ****
# The directory /var/run is re-created after every reboot, so the directory is
# not available, create and use your own custom location instead.
#
# PIPEFN /var/run/nut/upssched/upssched.pipe
PIPEFN /srv/nut/upssched/upssched.pipe

# ============================================================================
#
# LOCKFN <filename>
#
# REQUIRED.  This was added after version 1.2.1.
#
# upssched needs to be able to create this filename in order to avoid
# a race condition when two events are dispatched from upsmon at nearly
# the same time.  This file will only exist briefly.  It must not be
# created by any other process.
#
# You should put this in the same directory as PIPEFN.
#
# LOCKFN /var/run/nut/upssched/upssched.lock
LOCKFN /srv/nut/upssched/upssched.lock

# ============================================================================
#
# AT <notifytype> <upsname> <command>
#
# Define a handler for a specific event <notifytype> on UPS <upsname>.
#
# <upsname> can be the special value * to apply this handler to every
# possible value of <upsname>.
#
# Run the command <command> via your CMDSCRIPT when it happens.
#
# Note that any AT that matches both the <notifytype> and the <upsname>
# for the current event will be used.

# ============================================================================
#
# Possible AT commands
#
# - START-TIMER <timername> <interval>
#
#   Start a timer called <timername> that will trigger after <interval>
#   seconds, calling your CMDSCRIPT with <timername> as the first
#   argument.
#
#   Example:
#   Start a timer that'll execute when any UPS (*) has been gone 10 seconds
#
#   AT COMMBAD * START-TIMER upsgone 10

#   -----------------------------------------------------------------------
#
# - CANCEL-TIMER <timername> [cmd]
#
#   Cancel a running timer called <timername>, if possible. If the timer
#   has passed then pass the optional argument <cmd> to CMDSCRIPT.
#
#   Example:
#   If a specific UPS (myups@localhost) comes back online, then stop the
#   timer before it triggers
#
#   AT COMMOK myups@localhost CANCEL-TIMER upsgone

#   -----------------------------------------------------------------------
#
# - EXECUTE <command>
#
#   Immediately pass <command> as an argument to CMDSCRIPT.
#
#   Example:
#   If any UPS (*) reverts to utility power, then execute
#   'ups-back-on-line' via CMDSCRIPT.
#
#   AT ONLINE * EXECUTE ups-back-on-line


# The UPS is back on online power.
# Cancel any running "On Battery" timer, then execute the "Online" command.
AT ONLINE apc CANCEL-TIMER onbatt online

# Online power failure: The UPS is running on battery
# Start a 10 seconds timer, then execute the "On Battery" command.
AT ONBATT apc START-TIMER onbatt 10

# The UPS battery is low (as determined by the driver).
# Execute the "Low Battery" command immediately.
AT LOWBATT apc EXECUTE lowbatt

# The UPS has been commanded into the "Forced Shutdown" mode.
# Execute the "Forced Shutdown" command immediately.
AT FSD apc EXECUTE fsd

# Communications with the UPS has been (re-)established.
# Cancel any running "Communications Lost" timer, then execute the
# "Communications Restored" command.
AT COMMOK apc CANCEL-TIMER commbad commok

# Communication with the UPS was just lost.
# Start a 15 seconds timer, then execute the "Communications Lost" command.
AT COMMBAD apc START-TIMER commbad 15

# The local system is being shut down.
# Execute the "Notify Shutdown" command immediately.
AT SHUTDOWN apc EXECUTE shutdown

# The UPS needs to have its battery replaced.
# Start a 5 minutes timer, then execute the "Replace Battery" command.
AT REPLBATT apc START-TIMER replbatt 300

# The UPS can’t be contacted for monitoring.
# Start a 15 seconds timer, then execute the "No Communications" command.
AT NOCOMM apc START-TIMER nocomm 15
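
Every timer or command name used in an AT line has to be handled by the CMDSCRIPT. As a rough cross-check, the names can be extracted from the configuration file and compared with the case labels of the script. The helper upssched_commands below is a hypothetical sketch of ours, not a NUT tool.

```shell
#!/bin/sh
# upssched_commands: read an upssched.conf on stdin and print, sorted
# and without duplicates, every name that can reach CMDSCRIPT as its
# first argument. Commented-out AT lines are ignored.
upssched_commands() {
    awk '$1 == "AT" && ($4 == "START-TIMER" || $4 == "EXECUTE") { print $5 }
         $1 == "AT" && $4 == "CANCEL-TIMER" && NF >= 6          { print $6 }' |
        sort -u
}

# On the router:
#   upssched_commands < /etc/nut/upssched.conf
```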

NUT Monitor runs under the nutmon user profile. This user needs to be able to read the upssched configuration file:

$ chown nutmon:nutmon /etc/nut/upssched.conf

Create the directory for the FIFO pipe and lockfile in a custom persistent location and set its owner to the user that runs the NUT monitor on OpenWrt:

$ mkdir -p /srv/nut/upssched
$ chown nutmon /srv/nut/upssched

Notify Command Script

Copy the very minimal sample script provided in /usr/bin/upssched-cmd to your own location, e.g. /root/upssched-cmd, and customize it to your needs:

#! /bin/sh
#
# This script should be called by upssched via the CMDSCRIPT directive.
#
# Here is a quick example to show how to handle a bunch of possible
# timer names with the help of the case structure.
#
# This script may be replaced with another program without harm.
#
# The first argument passed to your CMDSCRIPT is the name of the timer
# from your AT lines.

case $1 in

online)
    logger -t upssched-cmd "The UPS is back on online power."
    message="$(printf "Note: Power supply to UPS %s has been restored." "$UPSNAME")"
    /usr/bin/create_notification -s news "${message}"
    /usr/bin/notifier
    ;;

onbatt)
    logger -t upssched-cmd "Online power failure. The UPS $UPSNAME is running on battery!"
    message="$(printf "Warning: UPS %s experienced a power failure and is now running on battery!" "$UPSNAME")"
    /usr/bin/create_notification -s error "${message}"
    /usr/bin/notifier
    ;;

lowbatt)
    logger -t upssched-cmd "The UPS battery is running low!"
    message="$(printf "Critical: The battery of UPS %s is running low!" "$UPSNAME")"
    /usr/bin/create_notification -s error "${message}"
    /usr/bin/notifier
    ;;

fsd)
    logger -t upssched-cmd "UPS is being shutdown by the master."
    message="$(printf "Warning: The UPS %s has been forced to shutdown now!" "$UPSNAME")"
    /usr/bin/create_notification -s restart "${message}"
    /usr/bin/notifier
    ;;

commok)
    logger -t upssched-cmd "Communications with the UPS has been (re-)established."
    message="$(printf "Note: Communications with the UPS %s has been (re-)established." "$UPSNAME")"
    /usr/bin/create_notification -s news "${message}"
    /usr/bin/notifier
    ;;

commbad)
    logger -t upssched-cmd "Communications with the UPS has been lost."
    message="$(printf "Warning: Communications with the UPS %s has been lost!" "$UPSNAME")"
    /usr/bin/create_notification -s error "${message}"
    /usr/bin/notifier
    ;;

shutdown)
    logger -t upssched-cmd "The system is being shutdown!"
    message="$(printf "Critical: The system is shutting down now!")"
    /usr/bin/create_notification -s restart "${message}"
    /usr/bin/notifier
    ;;

replbatt)
    logger -t upssched-cmd "The UPS battery is failing and needs to be replaced!"
    message="$(printf "Warning: The UPS battery in the UPS %s is failing and needs to be replaced!" "$UPSNAME")"
    /usr/bin/create_notification -s error "${message}"
    /usr/bin/notifier
    ;;

nocomm)
    logger -t upssched-cmd "The UPS is not responding!"
    message="$(printf "Warning: The UPS %s is not responding!" "$UPSNAME")"
    /usr/bin/create_notification -s error "${message}"
    /usr/bin/notifier
    ;;

*)
    logger -t upssched-cmd "Unrecognized command: $1"
    ;;
esac
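
Since every branch of the script repeats the same steps, they could be folded into one helper. The following is a sketch under a few assumptions: notify and handle_event are names of our own, only two of the events are shown, the tools are expected to be found on the PATH rather than called by absolute path, and the Turris notification is skipped when create_notification is not installed.

```shell
#!/bin/sh
# notify SEVERITY MESSAGE
# Log MESSAGE to syslog and, where the Turris tools exist, raise a
# notification with the given severity (news, error or restart).
notify() {
    severity="$1"
    message="$2"
    logger -t upssched-cmd "$message"
    if command -v create_notification >/dev/null 2>&1; then
        create_notification -s "$severity" "$message"
        notifier
    fi
}

# handle_event NAME: map a timer or command name from upssched.conf
# to a notification.
handle_event() {
    case "$1" in
        online)
            notify news "Note: Power supply to UPS $UPSNAME has been restored." ;;
        onbatt)
            notify error "Warning: UPS $UPSNAME experienced a power failure and is now running on battery!" ;;
        *)
            logger -t upssched-cmd "Unrecognized command: $1" ;;
    esac
}

# In a real CMDSCRIPT the last line would be:
#   handle_event "$1"
```

A branch can then be tried by hand, for example with UPSNAME=apc /root/upssched-cmd onbatt, before waiting for a real power event.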

Turris OS notifications

If you happen to be the lucky owner of a Turris device, you can use their built-in notification system.

The notifier displays important messages on the homepage of their reForis web interface. They are preserved across reboots until acknowledged and deleted by the user.

If set up correctly, notifications will also be sent out by mail.

Testing

Testing the whole chain of events, notifications and actions is crucial in a network with multiple devices, some of which are unaware of the UPS (e.g. Ethernet switches). You can be almost sure that something will not work as expected.

Things to look for while testing:

  • Can master and slaves still communicate while on battery power?

  • Do all slaves receive the shutdown command from the master when battery power is low?

  • Does the master wait long enough for all the slaves to react to the shutdown command?

  • Do all devices have enough time (default is 2 minutes) to power down?

  • Do all devices start up again when power returns?

  • What happens when power returns in the middle of the shutdown procedure?
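
For the first point of the list, the clients currently connected to the server can be listed with upsc -c while the UPS runs on battery. The sketch below again assumes that the nut-upsc package is installed; missing_clients is a hypothetical helper of ours and the addresses in the usage line are placeholders for your own slaves.

```shell
#!/bin/sh
# missing_clients EXPECTED...
# Read the connected client addresses (one per line, as printed by
# `upsc -c`) on stdin and print each expected address that is absent.
missing_clients() {
    connected="$(cat)"
    for host in "$@"; do
        printf '%s\n' "$connected" | grep -qxF "$host" || printf '%s\n' "$host"
    done
}

# On the router, with placeholder addresses:
#   upsc -c apc@localhost | missing_clients 192.168.1.10 192.168.1.20
```

Any address printed belongs to a slave that is no longer talking to the master, often because a switch in between has lost power.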

Testing the Shutdown Sequence

The first step is to see how upsdrvctl will behave without actually turning off power. To do so, use the -t argument:

On the master:

router$ upsdrvctl -t shutdown
Network UPS Tools - UPS driver controller 2.7.2
*** Testing mode: not calling exec/kill
   0.000000
...
   0.000690 Shutdown UPS: ups
   0.000711 exec:  /lib/nut/usbhid-ups -a ups -k

The second step is to let the master actually tell the UPS to turn off the power:

router$ upsmon -c fsd

The master and the slaves should then start their shutdown procedure as if the battery had gone critical, including the UPS turning off the power at the end.

This is much easier on your UPS equipment, and it beats crawling under a desk to find the plug.

References