Nagios Exercises
PART I
-----------------------------------------------------------------------------
1. Install Nagios
Do this as root.
# apt-get install nagios2
- You will be asked for a password for the nagios admin Web user
- remember it!
Now install the image pack so that we can have pretty icons:
# apt-get install nagios-images
2. Create the Web user password file:
# htpasswd -c /etc/nagios2/htpasswd.users nagiosadmin
New password:
Re-type new password:
3. You should already have a working Nagios!
- Open a browser, and go to
http://localhost/nagios2/
- At the login prompt, log in as:
user: nagiosadmin
pass:
4. Let's look at the interface together, then at the configuration files:
# cd /etc/nagios2/
# ls -l
-rw-r--r-- 1 root root 1598 2007-09-01 00:03 apache2.conf
-rw-r--r-- 1 root root 9573 2006-12-20 22:20 cgi.cfg
-rw-r--r-- 1 root root 4653 2006-12-20 22:20 commands.cfg
drwxr-xr-x 2 root root 4096 2007-09-01 00:03 conf.d
-rw-r--r-- 1 root root 26 2007-09-01 00:05 htpasswd.users
-rw-r--r-- 1 root root 30431 2006-12-20 22:20 nagios.cfg
-rw-r----- 1 root nagios 1293 2006-12-20 22:19 resource.cfg
drwxr-xr-x 2 root root 4096 2006-12-20 22:20 stylesheets
# ls -l conf.d/
-rw-r--r-- 1 root root 1687 2006-12-20 22:19 contacts_nagios2.cfg
-rw-r--r-- 1 root root 413 2006-12-20 22:19 extinfo_nagios2.cfg
-rw-r--r-- 1 root root 1152 2006-12-20 22:19 generic-host_nagios2.cfg
-rw-r--r-- 1 root root 1803 2006-12-20 22:19 generic-service_nagios2.cfg
-rw-r--r-- 1 root root 210 2007-09-01 00:03 host-gateway_nagios2.cfg
-rw-r--r-- 1 root root 976 2006-12-20 22:19 hostgroups_nagios2.cfg
-rw-r--r-- 1 root root 2163 2006-12-20 22:19 localhost_nagios2.cfg
-rw-r--r-- 1 root root 806 2006-12-20 22:19 services_nagios2.cfg
-rw-r--r-- 1 root root 1609 2006-12-20 22:19 timeperiods_nagios2.cfg
PART II
-----------------------------------------------------------------------------
1. According to what we saw in class, let's add a new host
- Pick any PC in the room, i.e. something other than pc10!
# cd /etc/nagios2/conf.d/
# vi pc10.cfg
define host {
use generic-host
host_name pc10
alias PC 10 at intERLab
address _______________ [pc10's IP address here]
}
... Save and quit
2. Let's create a new hostgroup for the occasion, and add our host
to it
- Edit the file hostgroups_nagios2.cfg and add a new group:
# vi hostgroups_nagios2.cfg
define hostgroup {
hostgroup_name interlab-pcs
alias intERLab PCs
members pc10
}
3. Now let's associate some services to that host
# vi services_nagios2.cfg
- Find the section called "check that ssh services are running",
and change the line:
hostgroup_name ssh-servers
to
hostgroup_name ssh-servers, interlab-pcs
4. Verify that your configuration file is OK:
# nagios2 -v /etc/nagios2/nagios.cfg
... You should get:
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the check.
5. Reload Nagios
# /etc/init.d/nagios2 reload
NOTES:
- Reload is the standard way of updating the Nagios configuration.
However, there is a bug in the Ubuntu init script (/etc/init.d/nagios2),
so you should do the following instead:
# /etc/init.d/nagios2 stop
# /etc/init.d/nagios2 start
Do this each time you make changes; otherwise you will end up with
multiple Nagios instances running. If that has already happened,
you can recover with:
# ps auxwww | grep nagios
# killall nagios2
# /etc/init.d/nagios2 start
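The verify step and the stop/start workaround above can be combined so that Nagios is only restarted when the configuration check passes. The following is a minimal sketch: the function name check_and_restart and the overridable variables are illustrative assumptions, while the commands themselves are the ones used in this exercise.

```shell
#!/bin/bash
# Sketch: only restart Nagios when the configuration check passes.
# ASSUMPTION: the function name and the overridable variables below are
# illustrative; the commands are the ones used in this exercise.

NAGIOS_BIN="${NAGIOS_BIN:-nagios2}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios2/nagios.cfg}"
NAGIOS_INIT="${NAGIOS_INIT:-/etc/init.d/nagios2}"

check_and_restart() {
    # Run the pre-flight check quietly; stop here if it reports errors.
    if ! "$NAGIOS_BIN" -v "$NAGIOS_CFG" > /dev/null; then
        echo "Configuration check failed; Nagios left untouched." >&2
        return 1
    fi
    # Use stop + start rather than reload (see the note above).
    "$NAGIOS_INIT" stop
    "$NAGIOS_INIT" start
}
```

Run check_and_restart as root after each change. If it reports a failure, run the "nagios2 -v" command by hand to see the full error messages.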
6. Go to the web interface (http://localhost/nagios2) and check the host
you just added
7. Add ALL the PCs in the room!
- Add all the PCs in the room to the config
- Check HTTP for all PCs in the room
- Remember to verify the configuration file!
- I suggest that you create a single config file called pcs.cfg
to do this.
NOTE:
- This requires a bit of planning, but you should have all the elements
for doing this...
- Think well about the logical structure of the files -- it should be
possible for you to do this without doing too much work!
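One way to avoid typing every host by hand is to generate pcs.cfg with a short script. The sketch below is only an illustration: the function name gen_pcs_cfg, the number of PCs, and the idea that pcN has the address PREFIX.N are assumptions that you must adapt to the real classroom addressing.

```shell
#!/bin/bash
# Sketch: print host definitions for pc1..pcN plus one hostgroup.
# ASSUMPTIONS (not from the exercise): the PC count, the PREFIX.N
# addressing scheme, and the name gen_pcs_cfg are illustrative only.

gen_pcs_cfg() {
    local count="$1"      # number of PCs in the room
    local prefix="$2"     # first three octets of the PC addresses
    local members=""
    local i

    for i in $(seq 1 "$count"); do
        printf 'define host {\n'
        printf '    use        generic-host\n'
        printf '    host_name  pc%s\n' "$i"
        printf '    alias      PC %s at intERLab\n' "$i"
        printf '    address    %s.%s\n' "$prefix" "$i"
        printf '}\n\n'
        # Build the comma-separated member list for the hostgroup.
        members="${members:+$members,}pc$i"
    done

    printf 'define hostgroup {\n'
    printf '    hostgroup_name interlab-pcs\n'
    printf '    alias          intERLab PCs\n'
    printf '    members        %s\n' "$members"
    printf '}\n'
}
```

For example, "gen_pcs_cfg 20 10.10.1 > /etc/nagios2/conf.d/pcs.cfg" would write 20 hosts (both values are placeholders). Nagios rejects duplicate definitions, so remove pc10.cfg and the interlab-pcs group you added earlier to hostgroups_nagios2.cfg before verifying the configuration.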
PART III
-----------------------------------------------------------------------------
1. Now let's create a complete Nagios configuration for our
classroom network.
NOTES:
- This requires more planning. You have switches, routers, and
the gateway server. In addition, the IP addresses that you
use for your network's router, the classroom router, and the
other network's router depend on your position in the network.
- You want to use the internal IP addresses for your network's router
and the gateway router, but the external IP address for the other
network's router.
- Note that the switches are not running Telnet; they are
using ssh. So you should do an ssh check on them.
- The routers, except one, are running ssh.
- We have two unmanaged switches in our network. These will
cause us problems later. Don't worry about them now.
- It is important that you properly define the parent for
devices. Some examples are given below. Devices can have
more than one parent, but in our network none of them do.
2.) Create a file to define the configuration of your network
gateway. This is usually a router box, but in our case
this is the noc box as we are doing NAT in our classroom.
Thus, the noc box is the parent box for our backbone router.
Do this in the file "/etc/nagios2/conf.d/
Sample entry:
# a host definition for the gateway of the default route
define host {
host_name gateway
alias Default Gateway (the noc)
address 10.10.10.1
use generic-host
}
define service {
use generic-service
host_name gateway
service_description Interface
check_command check-host-alive
}
NOTE:
- We have defined a default service for this entry as well: a
simple ping to verify that the host is alive. This is redundant
here, as the gateway box in this case is the box on which Nagios is
running for the classroom. For your own use, however, this makes
sense, as you need to verify that your gateway out of the
classroom network is up.
3.) Create a file to define the configuration for your routers.
Maybe "/etc/nagios2/conf.d/routers.cfg". There should be
three entries in this file.
Sample entry:
define host {
use generic-host
host_name rtr10.10.1
alias router for 10.10.1 on backbone
# address 10.10.1.254
address 10.10.11.1
parents gw-rtr
}
4.) Create a file to define the configuration for your switches.
Maybe "/etc/nagios2/conf.d/switches.cfg". There should be
two entries in this file.
Sample entry:
define host {
use generic-host
host_name switch1
alias switch 1 mgmt@intERLab
address 10.10.1.253
parents rtr10.10.1
}
5.) In the file "/etc/nagios2/conf.d/hostgroups_nagios2.cfg"
create hostgroups for all the routers, switches and
pcs in the classroom.
Sample entry:
# hostgroup definition for AIT intERLab Network Management Workshop
define hostgroup {
hostgroup_name cisco-routers
alias Cisco Routers at AIT intERLab
members gw-rtr,rtr10.10.1,rtr10.10.2
}
6.) In the file "/etc/nagios2/conf.d/services_nagios2.cfg" you
define what groups (not individual devices) will have what
service checks run on them.
Sample entry (yours may not be as complex at first):
# check that ssh services are running
define service {
hostgroup_name ssh-servers,interlab-pcs,cisco-routers,internet-srv-ssh,switches
service_description SSH
check_command check_ssh
use generic-service
notification_interval 0 ; set > 0 if you want to be renotified
}
7.) The file "/etc/nagios2/conf.d/extinfo_nagios2.cfg" defines
details for each device defined. For instance, here are
some sample entries you could use to build prettier Nagios
results for your various devices:
================ extinfo_nagios2.cfg ===================
define hostextinfo {
host_name rtr10.10.1
icon_image cook/router.png
icon_image_alt Router
statusmap_image cook/router.gd2
}
define hostextinfo {
host_name switch1
icon_image cook/network_switch.png
icon_image_alt Network Switch
statusmap_image cook/network_switch.gd2
}
define hostextinfo {
host_name pc1
icon_image base/ubuntu.png
icon_image_alt Debian GNU/Linux
statusmap_image base/ubuntu.gd2
}
================ extinfo_nagios2.cfg ===================
NOTES:
- You don't have the "ubuntu.*" icons by default. If
you get an error about this when restarting Nagios,
then change "ubuntu.*" to be "linux.*".
- We have additional images available for you to use.
You can download these from the Nagios Plugins and
Add Ons Exchange site at:
http://www.nagiosexchange.org/
- To get the Ubuntu icons for nagios you can do the following:
# cd /tmp
# wget http://noc/files/imagepack-ubuntu.tar.gz
# tar xvzf imagepack-ubuntu.tar.gz
# cd logos
# sudo mv * /usr/share/nagios/htdocs/images/logos/base/.
Now you will have the Ubuntu logos available to use in Nagios.
8.) If you have gotten here and are still reading, you can download
an entire set of Nagios configuration files for this network
that will only need a few changes for your machine. These are
available here:
http://noc/conf/etc/nagios2/
You can copy the files using wget or scp. For instance:
# cd /etc/nagios2
# sudo bash
# scp -r inst@noc:/var/www/share/conf/etc/nagios2/* .
would overwrite whatever you have in your /etc/nagios2
directory and sub-directories with these preconfigured files.
9.) You still need to update a few files, including:
/etc/nagios2/conf.d/routers.cfg
/etc/nagios2/conf.d/pcs.cfg
You should make sure that you have the correct IP
addresses defined in routers.cfg for your network view,
and you will want to comment out your PC's entry in
the file pcs.cfg.
Remember to restart Nagios for changes to take effect.
PART IV
-----------------------------------------------------------------------------
1.) Here we will tie in the ability of Nagios and Trac to work
together to help document your network. The concept is
quite simple. First, go to your local Trac project install
page at:
http://localhost/trac/ait
Log in as the admin user so that you can edit the Trac
wiki.
2.) Create an entry for your PC in the wiki. You can do this by
clicking on the "Edit this page" button and entering in a
link like this (example for PC1, use your PC number instead):
[wiki:PC1 PC1] : '''10.10.1.1'''
Save the page.
3.) Click on the PC1 item that's grey with a question mark. Now
create this page. Enter in some text about your PC and save
the page.
4.) In Nagios you need to edit the file:
/etc/nagios2/conf.d/extinfo_nagios2.cfg
and update your PC's entry in this file with a line like this:
notes_url http://localhost/trac/ait/wiki/PC1
You can place this on a line after the "host_name" entry.
Remember to change "PC1" to your PC's number.
5.) Restart Nagios.
6.) If you look in your Nagios Service Detail view there should now be
a new icon next to your machine's entry. This looks like a folder.
Click on this and the URL you entered for the notes_url entry in
the extinfo_nagios2.cfg file will open. You can also click on
the machine's icon in the graph views, then click again, and this
page will open.
PART V
-----------------------------------------------------------------------------
1.) Now we will create a plug-in for Nagios. This plug-in will do the
following:
* Ping a set of (external) servers.
* If one server is down a warning will be generated.
* If two servers are down a critical state will be generated.
This will be part of our scripting session. The instructions for
doing this are here:
http://noc/presos/scripting/bash.html
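As a preview, the logic of such a plug-in might look like the sketch below. It relies on the standard Nagios plug-in exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL). The function names and ping options are illustrative assumptions, the sketch treats two or more servers down as critical, and the version built in the scripting session may differ.

```shell
#!/bin/bash
# Sketch of the plug-in logic. Exit codes follow the Nagios plug-in
# convention: 0 = OK, 1 = WARNING, 2 = CRITICAL.
# ASSUMPTIONS: function names and ping options are illustrative; the
# class scripting session may do this differently.

# Succeed if the host answers a single ping within 2 seconds.
host_is_up() {
    ping -c 1 -W 2 "$1" > /dev/null 2>&1
}

# Ping every server given as an argument and report the combined state.
check_servers() {
    local down=0 down_list="" host
    for host in "$@"; do
        if ! host_is_up "$host"; then
            down=$((down + 1))
            down_list="$down_list $host"
        fi
    done

    if [ "$down" -eq 0 ]; then
        echo "PING OK - all $# servers reachable"
        return 0
    elif [ "$down" -eq 1 ]; then
        echo "PING WARNING - 1 server down:$down_list"
        return 1
    else
        # Two or more servers down.
        echo "PING CRITICAL - $down servers down:$down_list"
        return 2
    fi
}
```

To use this as a plug-in, end the script with a call such as "check_servers server1 server2; exit $?", replacing the placeholder names with the external servers you want to test. Nagios reads the first line of output as the status text and the exit code as the state.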
PART VI
-----------------------------------------------------------------------------
1.) We will update our Nagios contacts definition,
"/etc/nagios2/conf.d/contacts_nagios2.cfg", to add a local user
that will receive alerts for certain conditions.
2.) Next we will add another user for our Trac ticketing system so
that a ticket is automatically generated for specific events.
3.) The first step in this is to update Trac with a new plug-in
called email2trac. You can find this plug-in and more details on
its installation, use and configuration here:
https://subtrac.sara.nl/oss/email2trac
To install the email2trac plug-in do the following:
# sudo bash
# cd /usr/local/src
# wget http://noc/files/email2trac.tar.gz
# tar xvzf email2trac.tar.gz
# cd email2trac-0.13
# ./configure
# make
# make install
This is one of those times when you need to use source to
install software on your system.
4.) Now you need to configure the email2trac plug-in. Edit the file
/usr/local/etc/email2trac.conf and change the lines that read:
project: /data/trac/jouvin
to read:
project: /trac/ait
To better understand what you are doing, and to see all the various
options you can set, read the more complete configuration
documentation available here:
https://subtrac.sara.nl/oss/email2trac/wiki/Email2tracConfiguration
5.) Next we need to create an alias in our email system that will
receive emails for the trac system and then pipe them through
the email2trac plug-in to the trac Project installed on your
machine. To do this edit the file /etc/aliases and add a line
that reads:
trac-tickets: "|/usr/local/bin/run_email2trac --project=bas"
6.) Save your changes. Now, we are going to replace the MTA that is
currently installed on your system with the Postfix MTA. While
a complete understanding of the use of Postfix is a complex
topic, the actual installation is trivial under Ubuntu. To
install Postfix simply do:
# sudo apt-get install postfix
- When prompted for the mail server configuration type, select:
"Internet site:"
- When prompted for your system mail name it should be something
like "pc1", "pc2", etc... (i.e., "pcN"). This is fine. Just
select "Ok".
- When prompted where to deliver mail for "postmaster", "root",
etc enter in:
inst
and select "Ok".
- For the "Other destinations to accept mail" question, accept
what is shown by choosing "Ok".
- For "Force synchronous updates on mail queue" select "No".
- For the network blocks on which your host should relay mail
just accept the default of "127.0.0.0/8" and press "Ok".
- For "Mailbox size limit" select the default of "0" and press
"Ok".
- For "Local address extension character" keep the default of
"+" and press "Ok".
- For Internet protocols to use select "ipv4" and press "Ok".
At this point Postfix should finish installing and then start.
7.) Now you can test to see if Trac is accepting email and creating
new tickets. To do this do the following:
# mail trac-tickets@localhost
Subject: Ticket Test
Type in some text... Then, press ENTER and on a newline
type a single "." and press ENTER again.
Cc:
#
If everything worked you should have just created an email that
went to trac-tickets@localhost, which is really an alias in your
/etc/aliases file and points to the run_email2trac program in your
machine's
/usr/local/bin directory. This takes your email and creates a
Trac ticket. To verify this open a web browser and go to:
http://localhost/trac/ait
And click on the "View Tickets" link, then click on "Active
Tickets". Your ticket should appear if everything worked.
If you like this system, then you would want to become
familiar with what options are available to you for using the
email2trac plug-in.
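The interactive test in step 7 can also be run without the prompts, which is handy when you want to repeat it. This is a minimal sketch that uses the "-s" subject option of the mail command. The function name, default subject, and message body are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: send the same test message without the interactive prompts.
# ASSUMPTION: the function name, default subject, and body text are
# illustrative only.

send_test_ticket() {
    local subject="${1:-Ticket Test}"
    local recipient="${2:-trac-tickets@localhost}"
    # The message body is read from standard input.
    echo "This is a test ticket created by email." | mail -s "$subject" "$recipient"
}
```

Calling send_test_ticket with no arguments sends one message to trac-tickets@localhost. Pass a different subject as the first argument to create several distinguishable tickets.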
PART VII
-----------------------------------------------------------------------------
1.) Now you have all the bits and pieces necessary to have Nagios
automatically send a notification email to the trac-tickets alias
when a service check fails. This, in turn, generates a ticket in the
Trac project ticketing system.
2.) In the file /etc/nagios2/conf.d/contacts_nagios2.cfg you need to
create an entry for the trac-tickets user. This entry goes in the
"Contacts" section of the file. Here is a sample you can use:
define contact{
contact_name trac-tickets
alias Trac
service_notification_period 24x7
host_notification_period 24x7
service_notification_options c ; c = critical. Don't create tickets for other states.
host_notification_options d ; d = down. Don't create tickets for other states.
service_notification_commands notify-by-email
host_notification_commands host-notify-by-email
email trac-tickets@localhost
}
3.) Next you need to create a contact group that contains the trac-
tickets contact. In this case our group will only have one user,
but you can certainly have multiple users in this group. Here
is a sample entry you can use:
define contactgroup{
contactgroup_name tickets
alias email to ticket system for Trac
members trac-tickets
}
4.) Now, before we can define the service whose alerts will
generate this ticket, we need, for this scenario, to create a
hostgroup with just one entry. Edit the file:
/etc/nagios2/conf.d/hostgroups_nagios2.cfg
and add the entry:
# hostgroup definition for AIT intERLab Network Management Workshop
define hostgroup {
hostgroup_name gateway-router
alias Gateway router at AIT intERLab
members gw-rtr
}
5.) Finally, edit the file /etc/nagios2/conf.d/services_nagios2.cfg
and add the following (long) entry:
# check gw-rtr if live
define service {
hostgroup_name gateway-router
service_description PING-RTR
check_command check-router-alive
use generic-service
notifications_enabled 1
check_period 24x7
normal_check_interval 1
retry_check_interval 1
max_check_attempts 3
notification_period 24x7
notification_options w,u,c,r
contact_groups tickets
notification_interval 0 ; set > 0 if you want to be renotified
}
Restart Nagios. In theory, if the gateway router on interface
10.10.10.10 goes down, this should generate a notification email
that will be delivered to trac-tickets@localhost, which, in turn,
will create a new ticket in our Trac project.
Did you notice that the notification options here are set to
"w,u,c,r"? In the file contacts_nagios2.cfg, however, the trac-tickets
contact entry restricts these options to just "c" for critical.
Thus, only a single email should arrive for each hard-state change
to critical for this particular device and service, which will
generate only a single ticket.
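The filtering described above can be pictured as taking the overlap of the two option lists: a state must be allowed by the service's notification_options and by the contact's service_notification_options before an email goes out. The helper below only illustrates that idea. effective_options is a made-up name, not a Nagios tool.

```shell
#!/bin/bash
# Sketch: print the states listed in BOTH the service's
# notification_options and the contact's service_notification_options.
# ASSUMPTION: effective_options is an illustrative name, not a Nagios tool.

effective_options() {
    local service_opts="$1" contact_opts="$2"
    local opt result=""
    local IFS=','          # split the service option list on commas
    for opt in $service_opts; do
        # Keep the option only if the contact list contains it too.
        case ",$contact_opts," in
            *",$opt,"*) result="${result:+$result,}$opt" ;;
        esac
    done
    echo "$result"
}
```

With the values from this exercise, "effective_options w,u,c,r c" prints "c", which is why only critical states reach the trac-tickets contact.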
We'll try simulating this in class. You could try simulating using
a different service to force ticket generation (maybe http service
on a neighbor's box?).