Building a streaming cluster with Icecast, LVS and other cool apps

That's Icecast, LVS (Heartbeat, Ldirectord, Hbagent), Net-SNMP, NRPE and some custom scripts, for monitoring with Nagios and graphing with Cacti.

Index

Introduction / Background / Premise

A while back the author of this howto was approached by a large commercial Danish radio station to discuss building a more scalable streaming cluster. Historically Radio 100FM, the radio station in question, had run a couple of Icecast servers with DNS round-robin between them, and whilst this is a reasonable setup there are some problems associated with it.

Namely:

The goal was to build something that would scale well beyond 10,000 streams. If your requirement is for fewer streams, you may wish to cut some corners. It's worth mentioning that 10,000 streams is not really a problem if you use a 24 or 32kbps codec, but Danish listeners just expect better, so our streams are 128kbps. In other words we are looking to get way past the 1Gbps barrier.

Based on these requirements it was decided to build a Linux Virtual Server (LVS) setup based on the Direct-Routing model. For our setup we have 4 servers:

Our setup looks like this: (diagram 1)

We will use the following IP addresses for the equipment in our setup:

The Directors will be running with a stateful failover mechanism, so should the active director fail, the standby will take over with no loss of service - the streaming client will not notice.

In addition, we will be configuring 802.3ad on the directors - interface bonding, as the Linux people call it. Not because we need the additional bandwidth, but because our director will receive a lot of small packets, and we'd like several of the cores in our multi-core directors to share the load of handling them.

So traffic in the direction from the client to the Icecast servers will flow as illustrated below (ascii diagrams rule...):

+------------------+         +----------+         +----------------+
| Streaming Client |---->----| Director |---->----| Icecast Server |
+------------------+         +----------+         +----------------+

In the other direction from the streaming Icecast servers to the client the traffic flows directly from server to client:

+----------------+         +------------------+
| Icecast Server |---->----| Streaming Client |
+----------------+         +------------------+

So really all our director has to deal with is the small acknowledgement packets from the clients: a high packet load, but low bandwidth usage (compared to the actual audio streams).

In our setup we use the Gentoo distribution throughout. It's nice. It's very nice......

Setting up the directors

To set up the directors we proceed as follows...

Preparing and installing the kernel

First, let's go ahead and verify that the kernel has support for LVS. We can do this in the usual fashion by changing directory to /usr/src/linux and typing "make menuconfig".

If you don't have anything in that directory you may get the code from kernel.org or if you are running Gentoo you will probably run "emerge gentoo-sources" to install the kernel sources.

The relevant section is: Networking --> Networking options --> IP virtual server support

Ensure that this option is marked with an asterisk (star, * or whatever you'd like to call it).
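If you want to double-check from the command line, the option is visible in the kernel .config, and the build itself is the usual affair. A minimal sketch, assuming the standard /usr/src/linux layout:

cd /usr/src/linux
# CONFIG_IP_VS=y (built-in) or CONFIG_IP_VS=m (module) means IPVS support is enabled
grep CONFIG_IP_VS= .config
# Build the kernel and install the modules
make && make modules_install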

On Gentoo, assuming you are using the Grub bootloader, copy the newly compiled kernel from /usr/src/linux/arch/x86/boot/bzImage to /boot. Then you will need to update /boot/grub/grub.conf appropriately to reflect the changes:

title Gentoo Linux 2.6.25.17 with IPVS support
root (hd0,0)
kernel /boot/bzImage root=/dev/cciss/c0d0p5

Compiling and installing a kernel can be tricky. The finer nuances are outside the scope of this howto. Please refer to one of the many kernel compilation howtos for help.

Installing Ipvsadm

Next we'll install ipvsadm. ipvsadm is the tool that will interact with the load balancing capabilities we compiled into the kernel. On Gentoo you can do this by running emerge ipvsadm.

In our setup we used ipvsadm version 1.24.
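Ldirectord (configured later) will drive ipvsadm for us, but it helps to know roughly what it does under the hood. A hand-rolled sketch of the equivalent commands for our setup (using the VIP and real-server addresses that appear later in this howto) might look like this - you don't normally need to type these yourself:

# Add a virtual service on the VIP, weighted least-connection scheduling, 60s persistence
ipvsadm -A -t 10.0.0.1:80 -s wlc -p 60
# Add the two real servers in direct-routing ("gatewaying") mode with weight 100
ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.13:80 -g -w 100
ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.14:80 -g -w 100
# Inspect the resulting virtual server table
ipvsadm -L -n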

Installing Heartbeat

Next, we'll install heartbeat. This is the software that will run on both directors. It ensures that if the primary director dies, the failover director takes over.

There are a couple of options we need to add to get the functionality we require in our setup. You will need to pass them to heartbeat's configure script. Since we are doing this on Gentoo, we will change the relevant ebuild file. We will use heartbeat version 2.0.8.

Change directory to /usr/portage/sys-cluster/heartbeat. Then edit the file heartbeat-2.0.8.ebuild. There's a section in it that's called src_compile(). To all the other "configure" options you should add:

--enable-fatal-warnings=NO \
--enable-snmp-subagent \

After you've made the changes you need to run the command ebuild heartbeat-2.0.8.ebuild manifest

You are now ready to compile and install the heartbeat software. You may do this by running:

USE='ldirectord snmp' emerge '=heartbeat-2.0.8'

Installing Ntpd

To install ntpd simply type: emerge openntpd

Timekeeping is important on any system - don't skip this step....

Installing NRPE

To install nrpe simply type: emerge nagios-nrpe

We will use NRPE (a remote plugin execution agent for Nagios) to monitor the system and send us alerts if things go wrong.

Configuring Heartbeat

Next, let's go ahead and configure heartbeat. We'll start with the file /etc/ha.d/ha.cf:

logfacility local0
keepalive 200ms
deadtime 2000ms
warntime 1000ms
bcast bond0
auto_failback off
node director1 director2

For a detailed explanation of these directives, see: bzcat /usr/share/doc/heartbeat-2.0.8/ha.cf.bz2 | more

Notes:

Next, we'll add a resource that we will be monitoring with heartbeat. We do this in the file /etc/ha.d/haresources:

director1 10.0.0.1 ldirectord

For a detailed explanation of these directives, see: bzcat /usr/share/doc/heartbeat-2.0.8/haresources.bz2 | more

Finally we will add some security to ensure that no "rogue" heartbeat server on the LAN can interfere with our setup. In the file /etc/ha.d/authkeys, add the following lines:

auth 1
1 md5 SecretPasswordOnlyYouKnow

You might wish to use a slightly less lame password - anything will do

For a detailed explanation of these directives, see: bzcat /usr/share/doc/heartbeat-2.0.8/authkeys.bz2 | more
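Heartbeat refuses to start if the authkeys file is readable by anyone other than root, so tighten the permissions:

chmod 600 /etc/ha.d/authkeys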

The active and standby directors should be configured in exactly the same way, so you will want to do this twice. Give the primary director the hostname director1 and the secondary director the hostname director2.

Lastly, go ahead and add the lines below to the file /etc/init.d/heartbeat, somewhere in the start() section under the RunStartStop post-start directive:

sleep 5
/usr/lib/heartbeat/hbagent -d > /dev/null 2>&1 &

This will start the heartbeat snmp sub-agent when heartbeat starts

Configuring Ldirectord

Ok, so now let's configure ldirectord, the program that interfaces with the virtual server functionality in the kernel via ipvsadm. There's only one file we need to set parameters in. That file is /etc/ha.d/ldirectord.cf:

checktimeout=3
checkinterval=1
autoreload=yes
logfile="/var/log/ldirectord.log"
logfile="local0"
quiescent=no

virtual=10.0.0.1:80
	real=10.0.0.13:80 gate 100
	real=10.0.0.14:80 gate 100
	service=http
	scheduler=wlc
	persistent=60
	netmask=255.255.255.255
	protocol=tcp

Notes:

For a detailed explanation of these directives, see: bzcat /usr/share/doc/heartbeat-2.0.8/ldirectord.cf.bz2 | more
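Once heartbeat has brought ldirectord up on the active director, a quick sanity check is to confirm that the virtual service and both real servers have made it into the kernel, and that the health checks are running:

# Should list 10.0.0.1:80 with real servers 10.0.0.13 and 10.0.0.14
ipvsadm -L -n
# Health check activity ends up in the logfile configured above
tail -f /var/log/ldirectord.log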

Configuring the stateful fail-over daemon

Create a file like the one below called /etc/init.d/syncd:


#!/sbin/runscript

opts="start stop"

depend() {
        need net
}

start() {
	ebegin "Starting Server State Sync Daemon"
	einfo "Running in Master mode"
	ipvsadm --start-daemon master --mcast-interface bond0
	eend $?
}

stop() {
	ebegin "Stopping Server State Sync Daemon"
	ipvsadm --stop-daemon master
	eend $?
}

The daemon that shares the state of the TCP connections between the active and standby director runs inside the kernel. The script above starts and stops the daemon. We will add it to the startup sequence later with rc-update. For now, just create the file and don't forget to make it executable: chmod +x /etc/init.d/syncd.

This is necessary because, should the active director fail and the virtual IP address move to the standby director (thus making it active), the new active director must know which real server is serving which client for the client connections to persist.
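To confirm the sync daemon is running and that connection state is actually being shared, ipvsadm can show both (run this on either director; the connection table on the standby should mirror the active director):

# List the sync daemon(s) running on this machine
ipvsadm -L --daemon
# List the current connection table
ipvsadm -L -n -c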

Configuring Net-SNMP

You will remember that when we "emerged" heartbeat we added the "snmp" USE flag. This installed Net-SNMP. A sample configuration file, snmpd.conf.example was put in /etc/snmp as part of this installation. Let's copy that to /etc/snmp/snmpd.conf:

cp /etc/snmp/snmpd.conf.example /etc/snmp/snmpd.conf

Now we need to add a few lines to it to fit our purpose. Add the following lines to the snmpd.conf file:

rocommunity	SecretSnmpCommunity
master		agentx
trap2sink	localhost

The rocommunity will be used as a password by the management system we set up later to fetch snmp information from the snmp daemon. The "master agentx" line turns snmpd into an AgentX master agent, which is what allows the heartbeat snmp sub-agent to connect.

Remember we compiled heartbeat with the snmp sub-agent. This will allow us to monitor which director is active and which one is standby

To make use of the heartbeat snmp sub-agent OIDs please download the mib LINUX-HA-MIB.mib and place it in your snmp mib directory. Typically this is /usr/share/snmp/mibs

Lastly, edit /etc/conf.d/snmpd to include the line:

SNMPD_FLAGS="${SNMPD_FLAGS} -x /var/agentx/master"
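After restarting snmpd (and with hbagent running), the Linux-HA objects should be visible via SNMP. A quick query from the director itself, using the same OIDs we will reuse in the Nagios checks later, might look like this:

snmpwalk -v 2c -c SecretSnmpCommunity localhost LINUX-HA-MIB::LHANodeName
snmpget -v 2c -c SecretSnmpCommunity localhost LINUX-HA-MIB::LHAResourceGroupMaster.1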

Configuring Ntpd

Edit the file /etc/conf.d/ntpd to include the following line:

NTPD_OPTS="-s"

Running ntpd with this flag allows it to set the server's clock regardless of how far off it is when ntpd first starts. If you don't run it with this option, ntpd might fail after a reboot because the hardware clock is too far out of sync.

Configuring NRPE

NRPE installs a default configuration file in /etc/nagios called nrpe.cfg. Amend it to look like the following:


log_facility=daemon
pid_file=/var/run/nrpe.pid

server_port=5666
server_address=10.0.0.1X 
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=10.20.0.1
dont_blame_nrpe=1
debug=0
command_timeout=60
connection_timeout=300

command[check_users]=/usr/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/nagios/libexec/check_disk -w 20% -c 10% -p $ARG1$
command[check_zombie_procs]=/usr/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/nagios/libexec/check_procs -w 150 -c 200 
command[check_proc]=/usr/nagios/libexec/check_procs -c 1:4 -w 1:4 -C $ARG1$

You should amend the server_address IP address to match the IP of the server you are currently configuring.
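Once nrpe is running, it is worth testing it from the monitoring host (10.20.0.1 in our setup) with the check_nrpe plugin - the plugin path below is the one used elsewhere in this howto and may differ on your Nagios server:

/usr/nagios/libexec/check_nrpe -H 10.0.0.1X -c check_load
/usr/nagios/libexec/check_nrpe -H 10.0.0.1X -c check_disk -a /dev/cciss/c0d0p5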

Configuring 802.3ad

To configure 802.3ad interface bonding you will need to edit /etc/conf.d/net as shown below:


config_eth0=( "null" )
config_eth1=( "null" )

slaves_bond0="eth0 eth1"

config_bond0=( "10.0.0.1X netmask 255.255.255.0" )
routes_bond0=( "default via 10.0.0.254" )

This will bond together interface eth0 and eth1 under the bond0 interface. Modify the IP address to match the director you are configuring.

You will of course also need to configure the switch the directors are attached to accordingly. We use a Juniper EX3200 and provide the sample configuration here:


    ge-0/0/12 {
        ether-options {
            802.3ad ae0;
        }
    }
    ge-0/0/13 {
        ether-options {
            802.3ad ae0;
        }
    }
    ge-0/0/14 {
        ether-options {
            802.3ad ae1;
        }
    }
    ge-0/0/15 {
        ether-options {
            802.3ad ae1;
        }
    }

    ae0 {
        aggregated-ether-options {
            minimum-links 2;
            link-speed 1g;
        }
        unit 0 {
            family ethernet-switching {
                port-mode access;
                vlan {
                    members Internet;
                }
            }
        }
    }
    ae1 {
        aggregated-ether-options {
            minimum-links 2;
            link-speed 1g;
        }
        unit 0 {
            family ethernet-switching {
                port-mode access;
                vlan {
                    members Internet;
                }
            }
        }
    }
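With both the directors and the switch configured, you can verify the aggregate from the Linux side (this assumes the bonding driver is loaded and running in 802.3ad mode):

# Shows the bonding mode, LACP negotiation state and the status of each slave
cat /proc/net/bonding/bond0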

Making it all start on startup

We now need to make sure all the services start on startup. To this end we will use rc-update to add the services to the Gentoo startup process.

Execute the following commands to add the relevant scripts in /etc/init.d to the startup sequence:


rc-update add net.bond0 default
rc-update add snmpd default
rc-update add ntpd default
rc-update add nrpe default
rc-update add heartbeat default
rc-update add syncd default

Setting up the Real Servers / Icecast Servers

To setup the Real Servers with Icecast etc., proceed as follows:

Preparing and installing the kernel

First, let's check that we have the correct options in the kernel configuration. See the Directors section above on how to run "make menuconfig".

The real servers will have a dummy interface with the Virtual IP address. While this can be configured on a physical interface, the author prefers a separate loopback interface (a bad choice of words, as we are in fact not using a loopback interface in the Linux sense of the word, but a dummy interface). The dummy interface also has the advantage that it never goes down.

To ensure that we can create a dummy interface, we must have support in the kernel by enabling: Device Drivers --> Network device support --> Dummy net driver support. In our case we are happy with a module so we press space until we see an M.

The Real servers must have an interface with the Virtual IP address (which, remember, is also on the active Director), but we don't want the Real servers to answer ARP requests for that IP address (otherwise several machines would claim the same IP and traffic could bypass the director). To achieve this we patch the kernel with a patch that gives us the option to suppress ARP replies on a particular interface (in our case the "dummy" interface).

The kernel patch is kernel version specific. At the time of writing the relevant ARP suppression patches are available here. Go ahead and download the relevant patch for your kernel version.

Then change the working directory to /usr/src/linux and execute the command: patch -p1 --dry-run < /path/to/patch/hidden-X.X.XX-X.diff. If everything looks good (no errors), then go ahead and run: patch -p1 < /path/to/patch/hidden-X.X.XX-X.diff

Now go ahead and compile and install the new kernel and modules. After you are done, you should add a line saying just: dummy to the end of the file /etc/modules.autoload.d/kernel-2.X. Change X depending on your kernel version.

Now to suppress the ARP requests on the dummy interface add the following lines to the file: /etc/sysctl.conf:


net.ipv4.conf.all.hidden = 1
net.ipv4.conf.dummy0.hidden = 1

In addition, Ryan has found over time that tuning the kernel with the following parameters makes Icecast and the operating system perform better together. You should look these values up and determine the best values for your hardware configuration (or go crazy and just use these values...):


net.core.rmem_max = 524288
net.core.rmem_default = 524288
net.core.wmem_max = 524288
net.core.wmem_default = 524288
net.core.optmem_max = 131072
net.core.somaxconn = 15360
net.ipv4.tcp_wmem = 262144 1048576 4194304
net.ipv4.tcp_rmem = 262144 1048576 4194304
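Once the patched kernel is booted and the dummy module is loaded, you can apply and verify the settings without a reboot:

# Load all settings from /etc/sysctl.conf
sysctl -p
# Both should report 1, meaning ARP replies are suppressed
sysctl net.ipv4.conf.all.hidden
sysctl net.ipv4.conf.dummy0.hidden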

Installing Ntpd

To install ntpd simply type: emerge openntpd

Timekeeping is important on any system - don't skip this step....

Installing NRPE

To install nrpe simply type: emerge nagios-nrpe

We will use NRPE (a remote plugin execution agent for Nagios) to monitor the system and send us alerts if things go wrong.

Aside from the standard nagios plugins we will also be using a small custom PHP script to check if we have the correct number of "established" streams from the encoders. You can download check_encoder here. Put the downloaded file with the rest of the Nagios plugins in /usr/nagios/libexec.

This script requires PHP to run. If you don't have PHP installed you can do so by executing: emerge php

Installing Icecast

To install icecast simply type: emerge icecast

Icecast will be our streaming platform. It will accept streams from our two redundant encoders and send them across the Internet to our end-user streaming clients.

Installing Daemontools

To install daemontools simply type: emerge daemontools

We will use Daemontools to "supervise" our icecast process. If icecast dies, daemontools will restart it and notify the administrator.

Installing Ssmtp

To install ssmtp simply type: emerge ssmtp

We will use Ssmtp to send emails to the admin in case the icecast process dies.

Installing Net-SNMP

To install Net-SNMP simply type: emerge net-snmp

We'll use Net-SNMP to monitor traffic usage, packets per second and other interesting things.

Configuring the dummy and ethernet interfaces

You should edit the /etc/conf.d/net file to contain both the physical and dummy interface addresses like so:


config_eth0=( "10.0.0.1X netmask 255.255.255.0" )
routes_eth0=( "default via 10.0.0.254" )

preup() {
	/sbin/ifconfig dummy0 -arp
	return 0
}

config_dummy0=( "10.0.0.1 netmask 255.255.255.255" )

Now go ahead and symlink the new interface in the usual fashion: ln -s /etc/init.d/net.lo /etc/init.d/net.dummy0. This will allow us to easily start and stop it.
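After bringing the interface up, check that it carries the VIP and that ARP is disabled (the NOARP flag should be shown):

/etc/init.d/net.dummy0 start
ifconfig dummy0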

Configuring NRPE

NRPE installs a default configuration file in /etc/nagios called nrpe.cfg. Amend it to look like the following:


log_facility=daemon
pid_file=/var/run/nrpe.pid

server_port=5666
server_address=10.0.0.1X 
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=10.20.0.1
dont_blame_nrpe=1
debug=0
command_timeout=60
connection_timeout=300

command[check_users]=/usr/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/nagios/libexec/check_disk -w 20% -c 10% -p $ARG1$
command[check_zombie_procs]=/usr/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/nagios/libexec/check_procs -w 150 -c 200 
command[check_proc]=/usr/nagios/libexec/check_procs -c 1:4 -w 1:4 -C $ARG1$
command[check_encoder]=/usr/nagios/libexec/check_encoder -e $ARG1$ -p $ARG2$ -s $ARG3$

You should amend the server_address IP address to match the IP of the server you are currently configuring.

Configuring Ntpd

Edit the file /etc/conf.d/ntpd to include the following line:

NTPD_OPTS="-s"

Running ntpd with this flag allows it to set the server's clock regardless of how far off it is when ntpd first starts. If you don't run it with this option, ntpd might fail after a reboot because the hardware clock is too far out of sync.

Configuring Icecast

The icecast configuration file is /etc/icecast2/icecast.xml.

Our configuration follows, with notes:

First we set some general settings. Refer to the icecast website for a detailed explanation:


<icecast>
    <limits>
        <clients>6000</clients>
        <sources>15</sources>
        <threadpool>50</threadpool>
        <queue-size>262144</queue-size>
        <client-timeout>45</client-timeout>
        <header-timeout>15</header-timeout>
        <source-timeout>10</source-timeout>
        <burst-on-connect>1</burst-on-connect>
        <burst-size>65536</burst-size>
    </limits>

Here we set the password used by our encoders to supply audio to our mountpoints:

    <authentication>
        <source-password>SssshVerySecret</source-password>
        <relay-password>SssshVerySecret</relay-password>

        <admin-user>admin</admin-user>
        <admin-password>DontTellAnyone</admin-password>
    </authentication>

    <hostname>myhost.mydomain.tld</hostname>

Here we set which addresses to listen on. It is essential that you bind icecast to both the ethernet and the dummy interface - hence the two listen-socket blocks:

    <listen-socket>
        <port>80</port>
        <bind-address>10.0.0.1</bind-address>
    </listen-socket>

    <listen-socket>
        <port>80</port>
        <bind-address>10.0.0.1X</bind-address>
    </listen-socket>

Here we create our mount points. We have two (live) streams that come from the encoders. For each primary mount point we configure a backup mount point, so in the event that we lose audio from the primary encoder for whatever reason, all our streaming clients are moved to the backup mount point (which incidentally carries exactly the same audio, so the end-user will never notice if the primary encoder fails):

    <mount>
        <mount-name>/stream1.mp3</mount-name>
        <username>userXYZ</username>
        <password>SssshVerySecret</password>
        <max-listeners>7000</max-listeners>
        <burst-size>65536</burst-size>
        <fallback-mount>/stream1_backup.mp3</fallback-mount>
        <fallback-override>1</fallback-override>
        <public>0</public>
    </mount>

    <mount>
        <mount-name>/stream1_backup.mp3</mount-name>
        <username>userXYZ</username>
        <password>SssshVerySecret</password>
        <max-listeners>7000</max-listeners>
        <burst-size>65536</burst-size>
        <public>0</public>
    </mount>

    <mount>
        <mount-name>/stream2.mp3</mount-name>
        <username>userXYZ</username>
        <password>SssshVerySecret</password>
        <max-listeners>7000</max-listeners>
        <burst-size>65536</burst-size>
        <fallback-mount>/stream2_backup.mp3</fallback-mount>
        <fallback-override>1</fallback-override>
        <public>0</public>
    </mount>

    <mount>
        <mount-name>/stream2_backup.mp3</mount-name>
        <username>userXYZ</username>
        <password>SssshVerySecret</password>
        <max-listeners>7000</max-listeners>
        <burst-size>65536</burst-size>
        <public>0</public>
    </mount>

More general settings. Refer to the icecast website for a detailed explanation:

    <paths>
        <basedir>/usr/share/icecast</basedir>

        <logdir>/var/log/icecast</logdir>
        <webroot>/usr/share/icecast/web</webroot>
        <adminroot>/usr/share/icecast/admin</adminroot>

        <alias source="/" dest="/status.xsl"/>
    </paths>

    <logging>
        <accesslog>access.log</accesslog>
        <errorlog>error.log</errorlog>
      	<loglevel>2</loglevel>
      	<logsize>10000</logsize>
    </logging>

    <security>
        <chroot>0</chroot>
        <changeowner>
            <user>icecast</user>
            <group>nogroup</group>
        </changeowner>
    </security>
</icecast>

Horses for courses. Your icecast configuration may differ depending on what you are trying to do, but you get the message: if you are streaming something live, then get 2 redundant encoders and let icecast fail over in the fashion described here. Even if you don't believe your encoder will fail, it's nice to be able to take an encoder out and service it or change its configuration without affecting your users.
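Once icecast is up and the encoders are connected, it is easy to check things from the command line. A rough sketch, assuming curl is installed and using the admin credentials and root alias from the configuration above:

# The root alias maps / to the public status page
curl http://10.0.0.1X/
# Full server statistics as XML, including listener counts per mount point
curl -u admin:DontTellAnyone http://10.0.0.1X/admin/stats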

Configuring Daemontools

Now let's configure daemontools to restart the icecast process in the rare circumstance that it should crash.

When you "emerged" daemontools it created /service. Go ahead now and create a directory called /service/icecast. Inside this directory create a file called run. Now edit /service/icecast/run to look like the following:


#!/bin/sh
# Mail the admin every time supervise (re)starts icecast
/usr/sbin/ssmtp admin@yourdomain.tld < server-restart.msg
sleep 10
# exec so that supervise controls the icecast process itself, not this wrapper shell
exec /usr/bin/icecast -c /etc/icecast2/icecast.xml

Now create a file called /service/icecast/server-restart.msg that looks something like the following:


To: admin@yourdomain.tld
From: supervise-on-realserver1@youricecastserver.yourdomain.tld
Subject: Icecast on Real Server 1 (re)started

So make yourself useful and find out why if this is unexpected...


/Supervise Daemon

So here we are using ssmtp to send a mail to the admin every time the icecast server is started (including when it's restarted). So let's configure ssmtp next....

Configuring Ssmtp

The ssmtp configuration file is in /etc/ssmtp/ssmtp.conf. It should contain the following lines:


root=postmaster
mailhub=smtp.yourdomain.tld
hostname=icecastserver.yourdomain.tld
FromLineOverride=YES

The mailhub directive should be set to an smtp server that will accept and relay mail to your administrator's email address.
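Before relying on the restart notifications it is worth sending a test message by hand; ssmtp takes the recipient as an argument and reads the message, headers first, from stdin:

printf 'To: admin@yourdomain.tld\nSubject: ssmtp test\n\nIt works.\n' | /usr/sbin/ssmtp admin@yourdomain.tld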

Configuring Net-SNMP

A sample configuration file, snmpd.conf.example was put in /etc/snmp as part of the Net-SNMP installation. Let's copy that to /etc/snmp/snmpd.conf:

cp /etc/snmp/snmpd.conf.example /etc/snmp/snmpd.conf

Now we need to add a few lines to it to fit our purpose. Add the following lines to the snmpd.conf file:


rocommunity	SecretSnmpCommunity

The rocommunity will be used as a password by the management system we set up later to fetch snmp information from the snmp daemon.

Making it all start on startup

We now need to make sure all the services start on startup. To this end we will use rc-update to add the services to the Gentoo startup process.

Execute the following commands to add the relevant scripts in /etc/init.d to the startup sequence:


rc-update add net.eth0 default
rc-update add net.dummy0 default
rc-update add snmpd default
rc-update add ntpd default
rc-update add nrpe default
rc-update add svscan default

It's worth noting that it is "svscan" that starts icecast by executing the run script we created in /service/icecast.
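Once svscan is running, the daemontools utilities can tell you whether icecast is being supervised, and let you restart it cleanly:

# Shows whether the service is up and for how long
svstat /service/icecast
# Sends icecast a TERM; supervise will restart it (and the admin gets a mail)
svc -t /service/icecast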

Configuring Nagios Servers

Now let's look at how to configure Nagios

Configure Nagios Commands

Ok, here it may get a little abstract. We are not going to go through how to set up Nagios from scratch. Other howtos do this better. But when you've set it up, to monitor the directors and the real servers, you should create the following (custom) commands:


define command{
        command_name    check_remote_proc
        command_line    $USER2$/check_nrpe -H $HOSTADDRESS$ -c check_proc -a $ARG1$
        }

define command{
	command_name    check_remote_load  
	command_line    $USER2$/check_nrpe -H $HOSTADDRESS$ -c check_load
	}

define command{
	command_name    check_remote_disk  
	command_line    $USER2$/check_nrpe -H $HOSTADDRESS$ -c check_disk -a $ARG1$
	}

define command{
	command_name    check_remote_procs  
	command_line    $USER2$/check_nrpe -H $HOSTADDRESS$ -c check_total_procs
	}

define command{
	command_name    check_director_name
	command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -o LINUX-HA-MIB::LHANodeName.$ARG1$ -C SecretSnmpCommunity -l "Node $ARG1$"
	}

define command{
	command_name    check_director_active
	command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -o LINUX-HA-MIB::LHAResourceGroupMaster.1 -C SecretSnmpCommunity -l "Active Node"
	}

define command{
	command_name    check_encoder
	command_line    $USER2$/check_nrpe -H $HOSTADDRESS$ -c check_encoder -a $ARG1$ $ARG2$ $ARG3$
	}


So basically we have a series of commands that run checks via NRPE, and two that query the heartbeat snmp sub-agent MIB. (Again, on your Nagios server, download the MIB from the URL mentioned in the Net-SNMP configuration instructions for the Directors above.)
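Before wiring these commands into service definitions, it can save time to run the SNMP check by hand from the Nagios server; plugin paths vary, and 10.0.0.1X should be replaced by the director's address:

/usr/nagios/libexec/check_snmp -H 10.0.0.1X -o LINUX-HA-MIB::LHAResourceGroupMaster.1 -C SecretSnmpCommunity -l "Active Node"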

Configure Nagios Checks for a Director

Put a file in your cfg_dir that looks something like the following per director:


define host{
	use		generic-host	; Inherit default values from a template
	host_name	director1.yourdomain.tld	; The name we're giving to this host
	alias		Director 1 ; A longer name associated with the host
	address		10.0.0.1X	; IP address of the host
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Current Load
	check_command		check_remote_load
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Process: heartbeat
	check_command		check_remote_proc!heartbeat
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Process: hbagent
	check_command		check_remote_proc!hbagent
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Process: sshd
	#check_command		check_remote_proc!sshd
	check_command		check_ssh
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Process: snmpd
	check_command		check_remote_proc!snmpd
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Process: ntpd
	check_command		check_remote_proc!ntpd
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Total processes
	check_command		check_remote_procs
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Partition: /
	check_command		check_remote_disk!/dev/cciss/c0d0p5 # <--- Change to whatever your disk is...
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	LVS: Node 1 Name
	check_command		check_director_name!1
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	LVS: Node 2 Name
	check_command		check_director_name!2
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	LVS: Active Node
	check_command		check_director_active
	}


define service{
	use			generic-service
	host_name		director1.yourdomain.tld
	service_description	Ping
	check_command		check_ping!50.0,80%!100.0,100%
	}


You may customise as you see fit, but this is a good start

Configure Nagios Checks for a Real / Icecast server

Put a file in your cfg_dir that looks something like the following per icecast server:



define host{
	use		generic-host	; Inherit default values from a template
	host_name	real1.yourdomain.tld	; The name we're giving to this host
	alias		Real Server 1 ; A longer name associated with the host
	address		10.0.0.1X	; IP address of the host
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Current Load
	check_command		check_remote_load
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Process: icecast
	check_command		check_remote_proc!icecast
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Process: svscan
	check_command		check_remote_proc!svscan
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Process: sshd
	#check_command		check_remote_proc!sshd
	check_command		check_ssh
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Process: snmpd
	check_command		check_remote_proc!snmpd
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Process: ntpd
	check_command		check_remote_proc!ntpd
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Total processes
	check_command		check_remote_procs
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Partition: /
	check_command		check_remote_disk!/dev/cciss/c0d0p5  # <--- Change to whatever your disk is...
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Partition: /usr
	check_command		check_remote_disk!/dev/cciss/c0d0p6  # <--- Change to whatever your disk is...
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Partition: /var
	check_command		check_remote_disk!/dev/cciss/c0d0p7  # <--- Change to whatever your disk is...
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Ping
	check_command		check_ping!50.0,80%!100.0,100%
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Streams: Encoder 1
	check_command		check_encoder!10.30.0.1!80!2
	}


define service{
	use			generic-service
	host_name		real1.yourdomain.tld
	service_description	Streams: Encoder 2
	check_command		check_encoder!10.30.0.2!80!2
	}


Again, you may customise as you see fit, but this is a good start

Configuring Cacti Server

Configuring Cacti

We will use Cacti to graph a number of parameters. There are other, better howtos that detail this, but we will make a few notes here.

We use Cacti to monitor the number of streams to a particular Icecast server. To achieve this we use this script.

In order to configure Cacti to collect this information, we recommend that you read this howto on the Cacti documentation website. It's very comprehensive and easy to follow.

Conclusion

This completes our streaming cluster howto. Good luck and feel free to drop us a line if you feel anything in this howto can be improved.

Ryan Graakjær
Lasse L. Johnsen
2008