Server Uptime Monitoring with Python

We had a blackout over the weekend while I was away, so I decided to add a few more tools to keep me informed. See, when the power goes out, the server can’t exactly let me know that it’s down, can it?

Fortunately, I have a slice at SliceHost which never goes down, so I decided to use that to monitor our in-house server. I put together a simple Python script to do the job. It sends a HEAD request for the home page of each of our sites, and if it gets anything other than a 200 status code, it emails me and sends a text message to my cellphone.

The script itself is run by the cron daemon every 15 minutes, but it won’t overwhelm my phone because the script has a built-in message delay (MESSAGE_DELAY): once an alert goes out, no further alerts are sent until that delay has passed.
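The message delay works off the modification time of a lock file. Here is a minimal sketch of just that idea, separate from the monitoring itself. The helper names `should_send` and `record_send` are made up for illustration and do not appear in my script, and the optional `now` argument is only there to make the check easy to try out.

```python
import os
import time


def should_send(lock_file, delay_minutes, now=None):
    """Return True if no alert was recorded within the last delay_minutes."""
    if now is None:
        now = time.time()
    if not os.path.exists(lock_file):
        # No alert has ever been recorded, so send right away.
        return True
    # The lock file's modification time marks when the last alert went out.
    return now - os.path.getmtime(lock_file) > delay_minutes * 60


def record_send(lock_file):
    """Create the lock file if needed and set its modification time to now."""
    with open(lock_file, "a"):
        pass
    os.utime(lock_file, None)
```

With a 15 minute cron interval and a delay of 24 * 60 minutes, `should_send` would return True at most about once a day while a site stays down.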


#!/usr/bin/env python
# Written for Python 2 (httplib, urlparse, print statement).

import httplib
import os
import smtplib
import time
from datetime import datetime, timedelta
from urlparse import urlparse

# Modify your settings here
SITES = ("http://www.site.com",)  # Sites to monitor
MAIL_FROM = "admin@site.com"
MAIL_TO = ["Twoway.*********@messaging.nextel.com", "ben@site.com"]
MESSAGE_DELAY = 24 * 60  # The time between alerts in minutes
LOCK_FILE = "remote_mon_lock"  # Lock file used for alert staggering


def send_alert(status, site):
    """
    Send an email alert to MAIL_TO if the status is not 200.
    Use a lock file to ensure that messages don't get sent more
    frequently than the MESSAGE_DELAY, to prevent large TXT bills.
    """
    if status != 200:
        if not os.path.exists(LOCK_FILE):
            # First alert: create the lock file and backdate it so the
            # delay check below passes immediately.
            old_mod_date = datetime.now() - timedelta(minutes=MESSAGE_DELAY + 1)
            old_timestamp = time.mktime(old_mod_date.timetuple())
            f = open(LOCK_FILE, 'w')
            f.close()
            os.utime(LOCK_FILE, (old_timestamp, old_timestamp))

        mod_date = datetime.fromtimestamp(os.path.getmtime(LOCK_FILE))
        next_send_time = mod_date + timedelta(minutes=MESSAGE_DELAY)
        if datetime.now() > next_send_time:
            s = smtplib.SMTP()
            s.connect()  # Defaults to the SMTP server on localhost
            s.sendmail(MAIL_FROM, MAIL_TO, "%s - %s" % (str(status), site))
            s.close()
            os.utime(LOCK_FILE, None)  # Record when this alert was sent


for site in SITES:
    url = urlparse(site)
    error = ""
    try:
        conn = httplib.HTTPConnection(url[1])
        # Use a HEAD request to get the status code
        conn.request("HEAD", url[2])
        status = conn.getresponse()
        error = status.status
        print "%d : %s : %s" % (status.status, site, datetime.now())
    except Exception:
        # Any failure to connect or read the response counts as "down"
        print "Connection Failed : %s : %s" % (site, datetime.now())
        error = "Connection Failed"

    send_alert(error, site)
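The script above uses Python 2 module names. For anyone on Python 3, where `httplib` became `http.client` and `urlparse` moved into `urllib.parse`, the status check might look roughly like the sketch below. The function name `head_status`, the `timeout` parameter, and the HTTPS handling are my own assumptions for illustration; the original script only speaks plain HTTP and sets no timeout.

```python
import http.client
from urllib.parse import urlparse


def head_status(site, timeout=10):
    """Return the status code of a HEAD request, or None if the connection fails."""
    url = urlparse(site)
    # Assumption: pick the connection class from the URL scheme.
    if url.scheme == "https":
        conn = http.client.HTTPSConnection(url.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(url.netloc, timeout=timeout)
    try:
        # Fall back to "/" when the URL has no path component.
        conn.request("HEAD", url.path or "/")
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts, and malformed responses all count as "down".
        return None
    finally:
        conn.close()
```

A caller would treat anything other than 200, including `None`, as a reason to send an alert.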

One response to “Server Uptime Monitoring with Python”

  1. You know, it’s kind of difficult to understand python script without proper indentation.
