Running Nagios as a System Service on RancherOS

Nagios is a fantastic monitoring tool, and I wanted to see if I could
get the agent to run as a system container on RancherOS, in order to
monitor the host and any Docker containers running on it. It turned out
to be incredibly easy. In this blog post, I’ll walk through how to
launch the Nagios agent as a system container in RancherOS. Specifically,
I’ll use two Vagrant boxes to cover:

  1. Provisioning a server with the Rancher control plane
  2. Adding a second server running RancherOS
  3. Installing a Nagios agent as a system container on the second server
  4. Connecting the Nagios agent to the Nagios management server

System Containers in RancherOS

First, for anyone who isn’t familiar with RancherOS, it is a minimal
distribution of Linux designed specifically to run Docker. RancherOS
runs a Docker daemon as PID 1, a role that an init system such as
systemd occupies in most distributions. This daemon, called system
docker, runs essential system services like SSH, syslog or NTP as
containers.

A second Docker daemon, called user docker, is launched as a
container. This is where any new containers started by the user are
created, as well as containers placed by Rancher or other management
services.

To give the Nagios agent access to all of the data from the server, as
well as the system and user containers, it should run in the system
docker instance. I will run this setup in 2 Vagrant virtual machines.

Set up Rancher

Even though we could monitor RancherOS with Nagios directly, I’m going
to set up Rancher in this deployment to manage the containers we create.
The Rancher team provides a Vagrantfile to run RancherOS in a VM at
https://github.com/rancher/os-vagrant and another Vagrantfile for
Rancher at https://github.com/rancher/rancher. Since I want both in a
single Vagrant setup, I merged the two Vagrantfiles into one and added
an option to run multiple RancherOS instances.

You can find my new Vagrantfile here:
https://github.com/buster/rancher-tutorial

The first step (after installing Vagrant, of course) is to clone this
repository and edit the Vagrantfile to match your IP addresses in the
lines:

    # The VM index is appended to the following string, so Rancher will be
    # on 192.168.0.200, the first RancherOS instance on 192.168.0.201, etc.
    $rancher_ip_start = "192.168.0.20"
    $rancherui_ip = $rancher_ip_start + "0"
    # The number of RancherOS instances
    $n_rancher = 1

Leave $n_rancher at 1 for now.
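
To make the addressing scheme concrete, here is a small Python sketch of the same string concatenation the Vagrantfile comments describe. It is only an illustration: the function name `vm_addresses` and the label `rancher-server` for the first VM are made up for this example, while `rancher1` matches the VM name used later in this post.

```python
import ipaddress


def vm_addresses(ip_start, n_rancher):
    """Append the VM index to ip_start, as the Vagrantfile comments describe.

    Index 0 is the Rancher VM; indices 1..n_rancher are RancherOS instances.
    "rancher-server" is a hypothetical label used only in this sketch.
    """
    names = ["rancher-server"] + [f"rancher{i}" for i in range(1, n_rancher + 1)]
    addresses = {}
    for index, name in enumerate(names):
        candidate = ip_start + str(index)
        # Raises ValueError if the concatenation is not a valid IP address,
        # e.g. "192.168.0.20" + "10" gives "192.168.0.2010".
        ipaddress.ip_address(candidate)
        addresses[name] = candidate
    return addresses


if __name__ == "__main__":
    for name, ip in vm_addresses("192.168.0.20", 1).items():
        print(f"{name}: {ip}")
```

Because the index is appended as text rather than added as a number, a prefix like `192.168.0.20` only leaves room for single-digit indices, which is worth keeping in mind before raising `$n_rancher`.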

After editing this file, run `vagrant up`.

Vagrant will first set up the Rancher VM: it downloads the VirtualBox
image and starts it, and Docker then downloads and runs the Rancher
Server and the Rancher Agent. Afterwards, the second VM, which hosts our
RancherOS instance, is started and registers itself with the Rancher
Server.

When finished, browse to the Rancher IP (http://192.168.0.200:8080/ in
my case) and observe your shiny new VMs.

Adding a System Container to RancherOS

The next task is to set up the Nagios Agent on the RancherOS instance.

For that you will need to log in to the server, which you do by running
`vagrant ssh rancher1`.

There you will have access to the user docker (by calling `docker`)
and to the system docker by calling `sudo system-docker`.

A system container is no different from your usual Docker container,
except that it is run by the system docker and has no networking by
default. Thus, it needs to inherit the network of the host (the
`--net=host` parameter):

    sudo system-docker run -d --net=host --name nagios-agent buster/nagios-agent

This Nagios agent container comes with a minimal configuration to check
the load on the RancherOS instance (the second VM).
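
Before moving on, it can help to confirm from another machine that the agent is actually listening. The Python sketch below simply attempts a TCP connection. It assumes the agent image runs NRPE on its default port, 5666, which this post does not confirm, and `port_is_open` is a helper name invented for this example.

```python
import socket

# NRPE's default port. That this image uses it is an assumption.
NRPE_PORT = 5666


def port_is_open(host, port=NRPE_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    host = "192.168.0.201"  # the RancherOS VM in this setup
    state = "open" if port_is_open(host) else "closed or unreachable"
    print(f"NRPE port {NRPE_PORT} on {host} is {state}")
```

An open port only shows that something accepts connections there. It does not show that NRPE will answer checks from your Nagios server.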

Deploying the Nagios Server to Rancher

In order for the Nagios agent to make any sense, we will also need a
Nagios Server which polls the Nagios Agent.

This is as easy as any other Rancher deployment: click on “Add
Container” in the Rancher UI.

There we will make use of the existing Nagios Server Docker image
from https://registry.hub.docker.com/u/cpuguy83/nagios/
Also don’t forget to go to the `Ports` tab and map host port 8081 to
container port 80 so that you can log in to Nagios.

Add this container and after a while, the Nagios Server will be up and
running! Browse to
http://192.168.0.200:8081/ and
observe the Nagios UI running. The default username is
nagiosadmin and the password is nagios.

Configure Nagios Server

The Nagios Server only knows itself right now, so we will need to
configure it to poll the Nagios Agent.

This can be done in /opt/nagios/etc/conf.d/rancher1.cfg, for example.

Rancher offers a very nice terminal into the running containers, which
you can reach by clicking on the container and afterwards on the
“execute shell” link.

Now, you can edit the config file by running
`nano /opt/nagios/etc/conf.d/rancher1.cfg`.

Add the following lines to the file:

    define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
    define host{
        use linux-server
        host_name rancher1
        address 192.168.0.201
    }
    define service{
        use generic-service
        host_name rancher1
        service_description Current Users
        check_command check_nrpe!check_users
    }
    define service{
        use generic-service
        host_name rancher1
        service_description Current Load
        check_command check_nrpe!check_load
    }
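
If you later raise `$n_rancher`, each additional RancherOS instance needs its own host and service definitions. As a rough sketch, a few lines of Python can render them from a template. The helper name `render_host_config` is made up for this example, the sketch uses the `generic-service` template for every service, and the `check_nrpe` command is left out because it only needs to be defined once.

```python
HOST_TEMPLATE = """\
define host{{
    use linux-server
    host_name {name}
    address {address}
}}
"""

SERVICE_TEMPLATE = """\
define service{{
    use generic-service
    host_name {name}
    service_description {description}
    check_command check_nrpe!{nrpe_command}
}}
"""

# (service description, NRPE command name) pairs taken from this post.
CHECKS = [("Current Users", "check_users"), ("Current Load", "check_load")]


def render_host_config(name, address, checks=CHECKS):
    """Return Nagios host and service definitions for one monitored host."""
    parts = [HOST_TEMPLATE.format(name=name, address=address)]
    for description, nrpe_command in checks:
        parts.append(
            SERVICE_TEMPLATE.format(
                name=name, description=description, nrpe_command=nrpe_command
            )
        )
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical second RancherOS instance; adjust to your own addresses.
    print(render_host_config("rancher2", "192.168.0.202"))
```

The output can be saved as another file under /opt/nagios/etc/conf.d/, for instance rancher2.cfg, assuming the server picks up every file in that directory in the same way as rancher1.cfg.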

Afterwards you can check if the configuration file format is correct by
running `nagios -v /opt/nagios/etc/nagios.cfg`.

To check that the NRPE server on the second host is running, you can
also run a check manually:

    /opt/nagios/libexec/check_nrpe -H 192.168.0.201 -c check_load
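
Nagios plugins such as `check_nrpe` report their result through the exit code: 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. If you want to script the manual check, a sketch along these lines could work. It assumes a Python 3 interpreter is available wherever the plugin is installed, which this post does not confirm for the Nagios container, and `run_plugin` is a name chosen for this example.

```python
import subprocess

# Standard Nagios plugin exit codes.
STATUS_BY_EXIT_CODE = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=30):
    """Run a Nagios plugin and return (status, first line of its output)."""
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    # Any exit code outside 0-3 is treated as UNKNOWN.
    status = STATUS_BY_EXIT_CODE.get(result.returncode, "UNKNOWN")
    lines = result.stdout.strip().splitlines()
    return status, (lines[0] if lines else "")


if __name__ == "__main__":
    argv = ["/opt/nagios/libexec/check_nrpe", "-H", "192.168.0.201", "-c", "check_load"]
    try:
        status, message = run_plugin(argv)
        print(f"{status}: {message}")
    except FileNotFoundError:
        print("check_nrpe not found; run this inside the Nagios server container")
```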

After you have verified that the Nagios setup works, simply restart the
Nagios Server container by clicking on its restart symbol.

Now you can log in to Nagios again and see the Nagios plugins doing
their work.

Conclusion

Using Nagios to monitor multiple RancherOS servers is as easy as running
a preconfigured publicly available Docker container from
https://registry.hub.docker.com.

Starting a system docker container requires a few additional steps
compared to running a user container, but hopefully we’ve explained
them clearly here.

In the next few weeks RancherOS will ship 0.3, which includes support
for predefined system services. That will make configuration of new
agents in the Nagios server as easy as executing a docker run command.

If you’d like to get started with RancherOS, you can download it from
GitHub here. Also, we’re always demoing new features and answering lots
of questions at each month’s Rancher Meetup, which you can find a link
to below.

Sebastian Schulze is a Technology Consultant from Germany, with
experience in Linux, Solaris, Docker, and Vagrant. You can contact him
via GitHub at:
https://github.com/buster
