We make heavy use of Nagios and the Remote Plugin Executor, NRPE, at work (yeah I know, I know, but it's not my decision. Yet.). It runs as a service on remote servers and allows execution of scripts etc. Unfortunately NRPE does crash from time to time, roughly once a month, on one of our older machines. Kinda annoying of course, since losing contact triggers critical alarms in Nagios. Not having the time to look into why the service is failing, I decided to set up systemd to restart NRPE in case of failure. Surely this is the best way of resolving the issue…
Start by copying
/etc/systemd/system/. Then add the following to the
Reload the configuration and restart the service:
systemctl daemon-reload
systemctl restart nrpe
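The directives that make systemd restart a service live in the unit's `[Service]` section. As a rough sketch of that kind of configuration, the function below writes them to a drop-in file instead of editing a copied unit file, which is a different technique from the copy described here. The function name `write_restart_dropin`, the file name `restart.conf`, and the values `on-failure` and `30` are assumptions for illustration, not the settings used on our machine.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that restarts a service after a crash.
# The file name and the RestartSec value are illustrative assumptions.

write_restart_dropin() {
    dropin_dir="$1"   # e.g. /etc/systemd/system/nrpe.service.d (needs root)
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
# Restart when the process exits uncleanly (non-zero exit, fatal signal, timeout)
Restart=on-failure
# Wait 30 seconds before each restart attempt
RestartSec=30
EOF
}
```

Something like `write_restart_dropin /etc/systemd/system/nrpe.service.d`, run as root before the `daemon-reload`, would put it in place. The same two directives could just as well go into the `[Service]` section of a copied unit file.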
One way to verify that this works is to gracefully kill the nrpe process and then watch it restart itself after some time.
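That check can be scripted. The sketch below is a minimal version, assuming the unit is called `nrpe` and a systemd recent enough to support `systemctl show --value`. The function name `wait_for_restart` and the timeout are made up for this example. It sends SIGKILL rather than a graceful SIGTERM, because with `Restart=on-failure` systemd treats a clean SIGTERM exit as success and does not restart the service. With `Restart=always` a graceful kill would trigger a restart too.

```shell
#!/bin/sh
# Sketch: kill a unit's processes and wait for systemd to bring it back.
# The unit name and timeout passed in are up to the caller.

wait_for_restart() {
    unit="$1"
    timeout="${2:-60}"

    old_pid=$(systemctl show "$unit" --property=MainPID --value)
    # SIGKILL makes the exit count as a failure for Restart=on-failure.
    systemctl kill --signal=SIGKILL "$unit"

    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        new_pid=$(systemctl show "$unit" --property=MainPID --value)
        # MainPID is 0 while the service is down.
        if [ "$new_pid" != "0" ] && [ "$new_pid" != "$old_pid" ]; then
            echo "restarted: $old_pid -> $new_pid"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "not restarted within ${timeout}s" >&2
    return 1
}
```

Run as root, `wait_for_restart nrpe 120` should report the old and new PID once systemd has restarted the service. `systemctl status nrpe` and `journalctl -u nrpe` show the same thing in more detail.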