Monitoring BackgrounDRb with God

Update: The :clean_unix_socket behaviour was merged into God and has now been released in God 0.7.8, so you don't need to add that big block of code to the top of your god.conf file to make it work! :-)

God is a very neat piece of software, frequently used by Rails developers to monitor Mongrel servers and restart them when they crash or use up too many system resources.

Its use isn't limited purely to monitoring web servers though; you can monitor pretty much anything you like. Let's take a look at how to monitor the Ruby job processing daemon, BackgrounDRb.

In normal use BackgrounDRb will create a Unix socket, typically in your /tmp directory. Mine is called /tmp/backgroundrbunix_localhost_2000 as it's running on port 2000. If BackgrounDRb crashes it won't always remove the socket, which is a problem as BackgrounDRb won't start up again if the socket file already exists.
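
To see why a leftover socket file is a problem, here is a minimal sketch in plain Ruby. It is not BackgrounDRb's own code, and the helper name stale_socket_demo and the file name demo.sock are made up for the demonstration. It only shows the general behaviour of Unix sockets: closing a server leaves the file on disk, and binding to the same path again fails until the file is deleted.

```ruby
require "socket"
require "tmpdir"

# Hypothetical demo helper (not part of BackgrounDRb or God).
# Returns a hash describing what happened at each step.
def stale_socket_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, "demo.sock")

    server = UNIXServer.new(path) # binding creates the socket file
    server.close                  # closing does NOT remove the file
    left_behind = File.socket?(path)

    rebind =
      begin
        UNIXServer.new(path).close
        :ok
      rescue Errno::EADDRINUSE
        # The stale file blocks a second bind to the same path.
        :address_in_use
      end

    File.delete(path)             # the cleanup step
    UNIXServer.new(path).close    # binding works again
    File.delete(path)

    { left_behind: left_behind, rebind: rebind, after_cleanup: :ok }
  end
end

p stale_socket_demo if __FILE__ == $PROGRAM_NAME
```

Deleting the file before starting is the same fix we are about to teach God.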

So the trick is to remove the Unix socket with god. But how? As god is written in Ruby we can easily teach it new tricks.

Have a read of this block of God configuration, which tells it how to monitor BackgrounDRb:

# Assumes rails_root was set earlier in this config file to the path of your Rails app.
God.watch do |w|
  w.uid = 'app'
  w.gid = 'app'
  w.name = 'myapp-backgroundrb'
  w.interval = 30.seconds
  w.start = "#{rails_root}/script/backgroundrb start"
  w.stop = "#{rails_root}/script/backgroundrb stop"
  w.start_grace = 10.seconds
  w.restart_grace = 10.seconds

  w.pid_file = "#{rails_root}/log/backgroundrb.pid"
  w.behavior(:clean_pid_file)

  w.unix_socket = "/tmp/backgroundrbunix_localhost_2000"
  w.behavior(:clean_unix_socket)
end

Everything in there is your normal run-of-the-mill God config, with the exception of the last two lines. Before version 0.7.8, God won't let you assign a value to your watch for unix_socket, and it doesn't come with a behaviour called :clean_unix_socket either.

On those older versions, add this code to the top of your God config and your Unix sockets will be cleaned up just before BackgrounDRb is started or restarted:

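# Let a watch carry the path of the Unix socket it should clean up.
God::Watch.class_eval do
  attr_accessor :unix_socket
end

module God
  module Behaviors
    class CleanUnixSocket < Behavior
      # The behaviour is only valid if the watch has a unix_socket set.
      def valid?
        valid = true
        if self.watch.unix_socket.nil?
          valid &= complain("Attribute 'unix_socket' must be specified", self)
        end
        valid
      end

      # Runs before the start command; the returned string is the log message.
      def before_start
        File.delete(self.watch.unix_socket)
        "deleted unix socket"
      rescue
        # Any error (usually a missing file) is treated as nothing to delete.
        "no unix socket to delete"
      end
    end
  end
end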
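
If you want to try the idea without installing god, here is a rough stand-alone sketch. FakeWatch and clean_unix_socket are hypothetical names invented for this example and are not part of God. One deliberate difference: the sketch rescues only Errno::ENOENT (file not found), whereas the behaviour above uses a bare rescue, so here something like a permissions error would surface instead of being reported as "no unix socket to delete".

```ruby
require "tmpdir"

# Hypothetical stand-in for God::Watch, so this runs without the god gem.
class FakeWatch
end

# Same trick as in the config: reopen the class and add the attribute.
FakeWatch.class_eval do
  attr_accessor :unix_socket
end

# Hypothetical helper mirroring before_start from the behaviour.
# Returns a message describing what happened.
def clean_unix_socket(watch)
  File.delete(watch.unix_socket)
  "deleted unix socket"
rescue Errno::ENOENT
  "no unix socket to delete"
end

if __FILE__ == $PROGRAM_NAME
  Dir.mktmpdir do |dir|
    watch = FakeWatch.new
    watch.unix_socket = File.join(dir, "stale.sock")
    File.write(watch.unix_socket, "") # plain file standing in for a leftover socket
    puts clean_unix_socket(watch)     # file exists, so it gets deleted
    puts clean_unix_socket(watch)     # file is gone, so there is nothing to do
  end
end
```

Whether to keep the bare rescue or narrow it is a judgment call: the bare rescue never stops a start attempt, while the narrow one makes unexpected failures visible.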

This was discussed recently on the god mailing list. Glenn Gillen (who I was pairing with when we came up with this) deserves a good slice of the credit.