Uninterruptible Power Supply
Network UPS Tools (NUT) is a client/server monitoring system that allows computers to share uninterruptible power supply (UPS) and power distribution unit (PDU) hardware. Clients access the hardware through the server and are notified whenever the power status changes.
Topology
In the following scenario, the server acts as the master. It monitors and controls the UPS via a USB data cable and keeps the slaves updated about the current situation.
The router and the NAS act as slaves. They get status updates about the UPS battery state and power supply from the master.
If the UPS notifies the master that its battery is getting close to depletion, the master instructs the slaves to shut down.
After all slaves have confirmed that they are shutting down, the master starts its own shutdown procedure and instructs the UPS to cut the power.
The Wi-Fi AP has no Network UPS Tools installed, so it is not aware of the current situation. It will, however, be shut down by the master via a remote SSH command if the battery runs low during a power outage.
The Ethernet switch also has no knowledge of the UPS. It will shut down uncontrolled when the UPS battery is depleted or when the master orders the UPS to cut the power.
Shutdown Plan
Here is what happens step-by-step in case of main power loss:
Main power failure occurs:
UPS device switches power to battery.
UPS device notifies master with an “On Battery” event message.
Master notifies slaves with an “On Battery” event message.
UPS battery is getting close to depletion:
UPS device notifies master with a “Battery Low” event message.
Master issues “Forced Shutdown” command message to all slaves.
Master issues remote shutdown commands by SSH to any unmanaged devices.
Unmanaged devices start their shutdown procedure.
Slaves receive the “Forced Shutdown” command message.
Slaves may issue a “Shutdown” notification message to their users.
Slaves wait for the configured “Final Delay” time, so that the above notifications can be processed.
Slaves notify the master with a “Notify Shutdown” event message.
Slaves start their shutdown procedure:
Ends all running processes.
Unmounts all file systems.
Remounts file systems as read-only.
Halts the system (but doesn’t power off).
Master waits until it has received “Notify Shutdown” event messages from all slaves.
Master issues a “Shutdown” notification message to its users.
Master waits for the configured “Final Delay” time, so that the above notifications can be processed.
Master starts its shutdown procedure:
Sets the “Killpower” flag
Ends all running processes.
Unmounts all file systems.
Remounts file systems as read-only.
Looks for the “Killpower” flag.
Issues the “Kill Power” command to the UPS device.
Halts the system (but doesn’t power off).
UPS device receives the “Kill Power” command from the master:
UPS waits for the “Shutdown Delay” time to pass. This is to give all systems enough time to shut down properly.
UPS device cuts power on all outlets.
All connected systems lose power.
Main power supply has been restored:
UPS device starts to recharge its battery.
UPS device waits for the “Startup Delay” time to pass. This is to recharge the battery to a safe minimum level.
UPS device restores power on all outlets.
All connected systems start up.
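The phases of this plan show up as status flags in the ups.status variable reported by NUT, for example OL (on line power), OB (on battery), LB (low battery) and FSD (forced shutdown). As a minimal sketch, the hypothetical helper describe_status below translates such a flag string into plain text. The function name and the message wording are assumptions made for illustration and are not part of NUT.

```sh
#!/bin/sh
# Translate a NUT ups.status flag string (e.g. "OB LB") into plain text.
# describe_status is a hypothetical helper, not a NUT command.
describe_status() {
    # $1 is intentionally unquoted so that it is split into single flags.
    for flag in $1; do
        case $flag in
            OL)  echo "on line power" ;;
            OB)  echo "on battery" ;;
            LB)  echo "battery low" ;;
            FSD) echo "forced shutdown" ;;
            *)   echo "other flag: $flag" ;;
        esac
    done
}

# Example with a fixed flag string, as seen shortly before the shutdown starts.
describe_status "OB LB"
```

On a system with a running NUT server, the flag string could be taken from `upsc apc@localhost ups.status` instead of the fixed example value.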
Installation
sudo apt install nut
Configuration
nut.conf
This file tells the installed Network UPS Tools in which mode it should run. Depending on this setting the required modules are then started.
# IMPORTANT NOTE:
# This file is intended to be sourced by shell scripts. You MUST NOT use
# spaces around the equal sign!
#
# Required. Recognized values are none, standalone, netserver and netclient.
# Defaults to none.
MODE=netserver
See the nut.conf(5) manpage for more possible options.
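Because nut.conf is written to be sourced by shell scripts, the configured mode can be read with a few lines of shell. The following is a minimal sketch; nut_mode is a hypothetical helper name, and the path /etc/nut/nut.conf is the one used throughout this guide.

```sh
#!/bin/sh
# Print the MODE configured in a nut.conf-style file.
# nut_mode is a hypothetical helper. The file is sourced in a subshell,
# so its variables do not leak into the calling script.
nut_mode() {
    conf_file="$1"
    (
        MODE=none    # documented default when MODE is not set
        if [ -r "$conf_file" ]; then
            . "$conf_file"
        fi
        printf '%s\n' "$MODE"
    )
}

# Example: report the mode of the local installation.
nut_mode /etc/nut/nut.conf
```

This also shows why spaces around the equal sign are not allowed: a line such as `MODE = netserver` would be run as a command named MODE instead of assigning the variable.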
ups.conf
This file is read by the driver controller. It tells Network UPS Tools what kind of UPS device it has to work with and contains settings that control communication with the device. Some of the UPS device parameters can also be overridden here.
#
# ups.conf - UPS definitions for Network UPS Tools
#
# Set global directives, and individual UPS device options for the NUT UPS
# device drivers.
#

# Wait 45 seconds for the driver to finish starting.
maxstartdelay = 45

# Try 3 times to start the driver, before giving up.
maxretry = 3

# Wait 5 seconds between attempts to start the driver.
retrydelay = 5

# Run the drivers as user "nut".
user = nut

#
# Our UPS device(s)
#
[apc]
    # Back-UPS RS 900G FW:879.L4 .I USB FW:L4
    driver = usbhid-ups
    port = auto
    desc = "APC Back-UPS RS 900G"
    offdelay = 120
    ondelay = 240


[ups]
    # Pretend to be a Synology NAS, so other DiskStations will connect here.
    driver = dummy-ups
    port = apc@localhost
    desc = "Synology UPS server"
See the ups.conf(5) and usbhid-ups(8) manpages for more possible options.
upsd.conf
Here we control access to the server and set some other miscellaneous configuration values.
#
# upsd.conf - Configuration for Network UPS Tools upsd
#

# Interfaces to listen for TCP connections from clients.
# If not specified, the default port is 3493.
# This will only be read at startup of upsd. If you make changes here,
# you'll need to restart upsd, reload will have no effect.
#
LISTEN 127.0.0.1
LISTEN ::1
LISTEN 192.0.2.10
LISTEN 2001:db8:c0de::10
See the upsd.conf(5) manpage for more possible options.
upsd.users
Administrative commands such as setting variables and the instant commands are powerful, and access to them needs to be restricted. This file defines who may access them, and what is available.
Each user gets its own section. The fields in that section set the parameters
associated with that user’s privileges. The section begins with the name of
the user in brackets, and continues until the next user name in brackets or
EOF. These users are independent of /etc/passwd.
#
# upsd.users - User definitions for NUT upsd
#

[adminuser]

    # Administrative user
    password = ********

    # Allow changing values of certain variables in the UPS.
    actions = SET

    # Allow setting the "Forced Shutdown" flag in the UPS.
    actions = FSD

    # Allow all instant commands
    instcmds = ALL


[server]

    # The localhost, master server
    password = ********

    # Allow required instant commands to act as master.
    upsmon = master


[router]
    # OpenWRT router
    password = ********

    # Allow required actions to act as slave.
    upsmon = slave


[monuser]

    # Pretend to be a Synology NAS, so other DiskStations will connect here.
    password = secret

    # Allow required actions to act as slave.
    upsmon = slave
See the upsd.users(5) manpage for more possible options.
upsmon.conf
This file’s primary job is to define the systems that upsmon(8) will monitor and to tell it how to shut down the system when necessary.
Additionally, other optional configuration values can be set in this file.
#
# upsmon.conf - Configuration for Network UPS Tools upsmon
#

# Drop privileges to the following user profile after startup.
RUN_AS_USER nut

# List of UPS devices to monitor.
# The user name must match a section in upsd.users (here: [server]).
MONITOR apc@localhost 1 server ******* master

# Number of power supplies receiving power to keep this system running.
MINSUPPLIES 1

# Command to run to shut down this system.
SHUTDOWNCMD "/bin/systemctl halt"

# Command to run on +EXEC events.
NOTIFYCMD "/sbin/upssched"

# Change behavior of upsmon on certain events.
#
# Possible values for the flags:
#
# SYSLOG - Write the message in the syslog
# WALL   - Write the message to all users on the system
# EXEC   - Execute NOTIFYCMD (see above) with the message
# IGNORE - Don't do anything
#
# If you use IGNORE, don't use any other flags on the same line.
#
# NOTIFYFLAG <notify type> <flag>[+<flag>][+<flag>] ...
#
NOTIFYFLAG ONLINE SYSLOG+WALL+EXEC
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
NOTIFYFLAG FSD SYSLOG+WALL+EXEC
NOTIFYFLAG COMMOK SYSLOG+WALL+EXEC
NOTIFYFLAG COMMBAD SYSLOG+WALL+EXEC
NOTIFYFLAG SHUTDOWN SYSLOG+WALL+EXEC
NOTIFYFLAG REPLBATT SYSLOG+WALL+EXEC
NOTIFYFLAG NOCOMM SYSLOG+WALL+EXEC
NOTIFYFLAG NOPARENT SYSLOG+WALL


# Poll the UPS every 5 seconds.
POLLFREQ 5

# If the UPS is on battery, poll it every 5 seconds.
POLLFREQALERT 5

# Wait no more than 15 seconds for "Notify Shutdown" messages from slaves.
HOSTSYNC 15

# Wait no more than 15 seconds to consider an unreachable UPS as dead.
DEADTIME 15

# Location of the flag-file to make UPS turn itself off.
POWERDOWNFLAG /etc/killpower

# Warn every 12 hours if battery needs to be replaced.
RBWARNTIME 43200

# Warn every 5 minutes, if UPS is unreachable.
NOCOMMWARNTIME 300

# Wait 5 seconds before starting to shut down.
FINALDELAY 5
See the upsmon.conf(5) manpage for more possible options.
upssched.conf
This file controls the operations of upssched(8), the timer-based helper program for upsmon(8).
Here we can define our own script, which will be executed on certain events.
#
# Network UPS Tools - upssched.conf file
#

# The command script to run
CMDSCRIPT /usr/local/bin/ups-scheduled-tasks

# Command pipe and lock-file
PIPEFN /run/nut/upssched.pipe
LOCKFN /run/nut/upssched.lock

# The UPS is back on line.
# Cancel a running "On Battery" timer. If that timer has already expired,
# execute the "Online" command.
AT ONLINE apc@localhost CANCEL-TIMER onbatt online

# The UPS is on battery.
# Start a 10-second timer, then execute the "On Battery" command.
AT ONBATT apc@localhost START-TIMER onbatt 10

# The UPS battery is low (as determined by the driver).
# Execute the "Low Battery" command immediately.
AT LOWBATT apc@localhost EXECUTE lowbatt

# The UPS has been commanded into the "Forced Shutdown" mode.
# Execute the "Forced Shutdown" command immediately.
AT FSD apc@localhost EXECUTE fsd

# Communication with the UPS has been established.
# Cancel a running "Communications Lost" timer. If that timer has already
# expired, execute the "Communications Restored" command.
AT COMMOK apc@localhost CANCEL-TIMER commbad commok

# Communication with the UPS was just lost.
# Start a 15-second timer, then execute the "Communications Lost" command.
AT COMMBAD apc@localhost START-TIMER commbad 15

# The local system is being shut down.
# Execute the "Notify Shutdown" command immediately.
AT SHUTDOWN apc@localhost EXECUTE shutdown

# The UPS needs to have its battery replaced.
# Start a 5-minute timer, then execute the "Replace Battery" command.
AT REPLBATT apc@localhost START-TIMER replbatt 300

# The UPS can't be contacted for monitoring.
# Start a 15-second timer, then execute the "No Communications" command.
AT NOCOMM apc@localhost START-TIMER nocomm 15
See the upssched.conf(5) manpage for more possible options.
Securing Configuration Files
Secure the configuration files, to protect the various access controls and credentials.
$ sudo chown -R root:nut /etc/nut
$ sudo chmod 0770 /etc/nut
$ sudo chmod 0640 /etc/nut/*
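To verify the result, the directory can be searched for files that are still readable by everyone. This is a small sketch using find; world_readable_files is a hypothetical helper name. Since /etc/nut is only accessible to root and the nut group after the commands above, the check has to be run with sufficient privileges.

```sh
#!/bin/sh
# List regular files below a directory that are readable by "other" users.
# An empty result means that no file is world-readable.
# world_readable_files is a hypothetical helper name.
world_readable_files() {
    find "$1" -type f -perm -004 2>/dev/null
}

# Example: check the NUT configuration directory used in this guide.
if [ -d /etc/nut ]; then
    world_readable_files /etc/nut
fi
```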
Scheduled Command Script
We use this to send remote commands by SSH to systems whose power lines are connected to the UPS but which don’t have Network UPS Tools installed, and so can’t know by themselves when they should shut down.
In the following example this will be a MikroTik device called “ap.example.net”.
We assume that there is a user profile called nut on the remote device and that it has a properly installed SSH public key to allow password-less logins.
We create the script for sending out remote commands:
/usr/local/bin/ups-scheduled-tasks
#!/bin/sh
#
# Called by upssched with the timer or command name as first argument.
# UPSNAME is inherited from the environment set up by upsmon.

# SSH connection settings
SSH_HOST='ap.example.net'
SSH_USER='nut'
SSH_KEY='/etc/nut/ssh/id_rsa'

# Name of the local system, used in the shutdown message
HOST=$(hostname)

case $1 in
    onbatt)
        message="Power Failure on UPS ${UPSNAME}!"
        echo "Warning: UPS $UPSNAME experienced a power failure and is now running on battery!" \
            | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    online)
        message="Power restored on UPS $UPSNAME"
        echo "Power on UPS $UPSNAME has been restored." \
            | mail -s "$message" root
        remote_cmd="log info message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    lowbatt)
        message="Low battery on UPS ${UPSNAME}!"
        echo "Warning: UPS $UPSNAME is low on battery! All connected Systems will be shut down soon." \
            | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    fsd)
        message="Forced Shutdown from UPS ${UPSNAME}!"
        echo "Warning: All Systems connected to UPS $UPSNAME will be shut down now!" \
            | mail -s "Warning: $message" root
        remote_cmd="log error message=\"${message}\" ; beep 0.5 ; delay 4000ms ; beep 0.5 ; system shutdown!"
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    commok)
        message="Communications restored with UPS $UPSNAME"
        echo "Communications with UPS $UPSNAME have been restored." \
            | mail -s "$message" root
        remote_cmd="log info message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    commbad)
        message="Lost communications with UPS ${UPSNAME}!"
        echo "Warning: Lost communications with UPS ${UPSNAME}!" \
            | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    shutdown)
        message="System $HOST is shutting down now!"
        echo "Warning: System $HOST is shutting down now!" \
            | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    replbatt)
        message="Replace battery on UPS ${UPSNAME}!"
        echo "Warning: The UPS $UPSNAME needs to have its battery replaced!" \
            | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    nocomm)
        message="The UPS $UPSNAME cannot be contacted for monitoring!"
        echo "Warning: The UPS $UPSNAME cannot be contacted for monitoring!" \
            | mail -s "Warning: $message" root
        remote_cmd="log warning message=\"${message}\""
        ssh -l "$SSH_USER" -i "$SSH_KEY" "$SSH_HOST" "$remote_cmd"
        ;;
    *)
        logger -t ups-scheduled-tasks "Unrecognized command: $1"
        ;;
esac
Make the script executable by the nut system user:
$ sudo chown nut:nut /usr/local/bin/ups-scheduled-tasks
$ sudo chmod 740 /usr/local/bin/ups-scheduled-tasks
Powering Off the UPS
After the server has terminated all processes, unmounted file-systems and re-mounted them read-only, it is safe to cut off the power.
The following script will be executed by systemd-halt.service(8) just before turning off the CPU, by placing it in the directory /lib/systemd/system-shutdown.
/lib/systemd/system-shutdown/nut.shutdown
#!/bin/sh

# Test for the shutdown flag
if /sbin/upsmon -K >/dev/null 2>&1; then

    # Check if a power race workaround has been configured
    wait_delay=$(/bin/sed -ne 's#^ *POWEROFF_WAIT= *\(.*\)$#\1#p' /etc/nut/nut.conf)

    # Command the UPS driver(s) to run their shutdown sequence
    /sbin/upsdrvctl shutdown

    if [ -n "$wait_delay" ] ; then
        /bin/sleep "$wait_delay"
        # We need to pass --force twice here to bypass systemd and execute the
        # reboot directly ourself.
        /bin/systemctl reboot --force --force
    fi
fi

exit 0
Make it executable:
$ sudo chmod +x /lib/systemd/system-shutdown/nut.shutdown
See the upsdrvctl(8) manpage for more possible options.
UPS Device Configuration
Depending on your model, some settings of your UPS device can be changed by writing new values to variables stored in the device EEPROM.
This is done with the upsrw command.
To get a list of the writable configuration variables on your model:
$ upsrw apc@localhost
[battery.charge.low]
Remaining battery level when UPS switches to LB (percent)
Type: STRING
Maximum length: 10
Value: 10
[battery.runtime.low]
Remaining battery runtime when UPS switches to LB (seconds)
Type: STRING
Maximum length: 10
Value: 120
[input.sensitivity]
Input power sensitivity
Type: STRING
Maximum length: 10
Value: medium
[input.transfer.high]
High voltage transfer point (V)
Type: STRING
Maximum length: 10
Value: 294
[input.transfer.low]
Low voltage transfer point (V)
Type: STRING
Maximum length: 10
Value: 176
[ups.delay.shutdown]
Interval to wait after shutdown with delay command (seconds)
Type: STRING
Maximum length: 10
Value: 20
Timing
For the above procedure to work as intended, timing is critical.
To be on the safe side:
Every device needs to have enough time to be safely halted.
Additional processing time and delays have to be added.
The UPS device needs to send its “Low Battery” notification, when there is just about enough battery time left for the whole process to complete.
Power Down Time
For every device (the master, the slaves and any remote-controlled device):
Have a stopwatch ready.
Start the stopwatch and initiate a full power-down of the device.
Stop the stopwatch when the device has powered down.
Note the time of the slowest device.
Add the configured delays from upsmon.conf, such as HOSTSYNC and FINALDELAY.
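The delays that the master adds can be read directly from upsmon.conf. The sketch below uses two hypothetical helpers, conf_value and master_delays. It assumes that each directive appears at the start of a line followed by its value, and it falls back to 15 and 5 seconds (assumed here to be the upsmon defaults) when a directive is missing.

```sh
#!/bin/sh
# Print the first value of a directive in an upsmon.conf-style file.
# Usage: conf_value FILE DIRECTIVE   (hypothetical helper)
conf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Print HOSTSYNC + FINALDELAY in seconds for the given file.
# The fallback values 15 and 5 are assumptions for missing directives.
master_delays() {
    hostsync=$(conf_value "$1" HOSTSYNC)
    finaldelay=$(conf_value "$1" FINALDELAY)
    echo $(( ${hostsync:-15} + ${finaldelay:-5} ))
}

# Example: use the configuration file of this guide, if it is readable.
if [ -r /etc/nut/upsmon.conf ]; then
    master_delays /etc/nut/upsmon.conf
fi
```

With the upsmon.conf shown earlier (HOSTSYNC 15, FINALDELAY 5) this gives 20 seconds for the master.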
Device power down times:
Device | Power Down | Delays | Total Time |
---|---|---|---|
Server (master) | 20 sec | 20 sec | 40 sec |
NAS (slave, slowest) | 90 sec | 5 sec | 95 sec |
Router (slave) | 10 sec | 5 sec | 15 sec |
Wi-Fi AP (remote controlled) | 10 sec | 5 sec | 15 sec |
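The totals in this table are simply the power-down time plus the delays, and the slowest device is the one with the largest total. As a sketch, the hypothetical helper slowest_device below does this calculation for a list of measurements. The short device names in the example are placeholders chosen for illustration.

```sh
#!/bin/sh
# Read lines of "<name> <power-down seconds> <delay seconds>" from stdin and
# print "<name> <total seconds>" for the device with the largest total.
# slowest_device is a hypothetical helper name.
slowest_device() {
    max_name=''
    max_total=-1
    while read -r name down delay; do
        [ -n "$name" ] || continue
        total=$((down + delay))
        if [ "$total" -gt "$max_total" ]; then
            max_name=$name
            max_total=$total
        fi
    done
    printf '%s %s\n' "$max_name" "$max_total"
}

# Measurements from the table above; prints "nas 95".
slowest_device <<'EOF'
server 20 20
nas 90 5
router 10 5
wifi-ap 10 5
EOF
```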
“Low Battery” Time
The minimum required “Low Battery” time of the UPS is at least the time needed to shut down the slowest device.
We can set this by programming the EEPROM of the UPS device with the upsrw command:
$ upsrw -s battery.runtime.low=120 -u adminuser apc@localhost
“Shutdown Delay” Time
If the slowest device is a slave, the master will tell the UPS to cut the power before that slave has completed its shutdown.
The UPS must therefore delay the power-off until the last slave has completed its shutdown:
Slowest slave time - master time = “Shutdown Delay” time
95 sec - 40 sec = 55 sec
Luckily our UPS device can be programmed to delay the power-off:
$ upsrw -s ups.delay.shutdown=60 -u adminuser apc@localhost
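The subtraction above can be written as a small shell function, which is handy when the measured times change. This is only a sketch; shutdown_delay is a hypothetical helper, and clamping a negative result to zero is an assumption for the case where the master is slower than every slave.

```sh
#!/bin/sh
# Print the "Shutdown Delay" in seconds:
# total time of the slowest slave minus total time of the master.
# A negative result is clamped to 0 (assumption: no delay is needed then).
shutdown_delay() {
    delay=$(( $1 - $2 ))
    if [ "$delay" -lt 0 ]; then
        delay=0
    fi
    printf '%s\n' "$delay"
}

# Totals from the power-down table: NAS 95 s, server 40 s; prints 55.
shutdown_delay 95 40
```

The value of 60 seconds programmed above is this result rounded up, which leaves a small safety margin.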
“Power-On Delay” Time
When the power comes back, the UPS should recharge the battery to a safe level before turning the power back on. By safe level, I mean that the battery must be charged enough for all devices to boot fully, with enough power left for the whole configured “Low Battery” time.
Without an appropriate recharging delay, the power could come back for just a short time (which is often the case in power failures). The devices would then start to boot, only to be powered off again uncontrolled, either during boot or during shutdown.
Unfortunately this feature is not supported on my UPS device model.
There might be features that provide similar functionality:
ups.delay.start
“Reboot Delay” time
“Minimum Charge” to return online
“Load On Delay” time (load.on.delay)
If I could, I would set the power-on delay of my UPS to roughly double the “Low Battery” time:
$ upsrw -s ups.delay.start=240 -u adminuser apc@localhost