Ever wanted to monitor the status of your Jenkins slaves? As I’ve said before, Jenkins is one of my favorite tools.
At SEP, we program all the things, so we need many different build environments. Jenkins slaves to the rescue!
We have several slaves that run the following environments:
Occasionally there is a hiccup and one will go down, or not stay connected to the master. Jenkins doesn’t necessarily tell you that a node is down, so we cooked up a way for Jenkins to tell us.
There are typically two types of node failures:
ping is really good at telling me whether a machine is on or not. So, I created a Jenkins job that is based on ping to tell me if the slave is up or not:
Make a build step for “execute shell” with the following contents (replace slave-hostname with the hostname of your slave):
ping -c 4 slave-hostname
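If you have several slaves, the same idea can be wrapped in a small function so one job checks them all. This is only a rough sketch, not what we actually run: the host names, the SLAVES and PING_CMD variables, and the check_slaves function are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical host list; replace with your own slave hostnames.
SLAVES="${SLAVES:-slave-one slave-two}"
# PING_CMD is an assumption of this sketch, so the probe can be swapped or stubbed.
PING_CMD="${PING_CMD:-ping -c 4}"

# Print each host that did not answer; return non-zero if any host failed.
check_slaves() {
    failed=0
    for host in "$@"; do
        # PING_CMD is left unquoted on purpose so "ping -c 4" splits into words.
        if ! $PING_CMD "$host" > /dev/null 2>&1; then
            echo "DOWN: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling check_slaves $SLAVES as the last line of the build step should make the job fail whenever any listed host is unreachable, and the console output names the ones that are down.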
Jenkins obviously knows that a slave isn’t connected… but doesn’t give us a great way to monitor it.
This information is available in the Jenkins computer API.
So…
Make a build step for “execute shell” with the following contents (replace slave-hostname with the hostname of your slave):
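ENDPOINT="http://jenkins.sep.com/computer/api/xml?xpath=computerSet/computer\[displayName='slave-hostname'\]/offline"
# Fetch only the <offline> element for this node; discard curl's progress output
ENDPOINT_RESULT=$(curl "$ENDPOINT" 2> /dev/null)
# Quote the variable so an empty or multi-word response fails cleanly
if [ "$ENDPOINT_RESULT" = "<offline>false</offline>" ] ; then
    exit 0
fi
exit 1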
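The same check can be split into small functions so the node name becomes a parameter. Treat this as a sketch under assumptions: jenkins.example.com is a placeholder URL, and JENKINS_URL, is_online, fetch_offline_status, and check_node are hypothetical names, not part of Jenkins. It uses curl's --globoff option instead of backslash-escaping the brackets in the XPath.

```shell
#!/bin/sh
# Placeholder base URL; replace with your own Jenkins master.
JENKINS_URL="${JENKINS_URL:-http://jenkins.example.com}"

# Succeeds only when the API response says the node is not offline.
is_online() {
    [ "$1" = "<offline>false</offline>" ]
}

# Fetch the <offline> element for the node named in $1.
fetch_offline_status() {
    # --globoff stops curl from treating [ ] in the XPath as a URL range;
    # --fail makes curl return non-zero on HTTP errors.
    curl --silent --fail --globoff \
        "$JENKINS_URL/computer/api/xml?xpath=computerSet/computer[displayName='$1']/offline"
}

# Returns 0 if the node is connected, non-zero otherwise.
check_node() {
    result=$(fetch_offline_status "$1") || return 1
    is_online "$result"
}
```

Ending the build step with check_node slave-hostname should fail the job both when the node is offline and when the API itself cannot be reached.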
Once I have Jenkins jobs monitoring my slaves, I can hook them up to any notification system I want. For example, I have the Twilio plugin set up to send me a text message any time a build node goes down, so I can go hook it back up.