Band Saw: Log monitoring and alerting for the GNOME Desktop

Band Saw was a syslog monitoring program that I wrote for the GNOME desktop, back in 2004.

At the time I was leading the development team at a company that was deploying its software to hundreds of laptops and servers. Each of these computers was dedicated to running a bespoke application that we'd developed in-house, consisting of seven separate interacting services.

This was long before microservices became cool.

Each computer was also synchronising its local database with several other computers in the network, and the laptops were doing it over a GPRS connection.

Think "mesh network", but at the database layer, over a low bandwidth mobile network.

It would be fair to say that it was quite a complicated distributed system.

How do you do "observability" in a system like that? Bear in mind that this was quite a few years before cloud computing was a thing, and before SaaS applications for monitoring your software became an option.

In order to keep tabs on the behaviour of this network of autonomous nodes, we decided to rely heavily on the Unix syslog daemon. Each of the seven services on each computer logged its activity to its local syslog daemon, which was configured to forward any error messages to a central syslog server.
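To make the "forward any error messages" rule a little more concrete: every syslog packet starts with a priority value in angle brackets, which encodes `facility * 8 + severity`, and lower severity numbers are more serious (3 is `err`). Here is a minimal Python sketch of decoding that value and deciding whether a message would be forwarded. The function names and the "err or worse" threshold are assumptions made for illustration; they aren't taken from our actual syslog configuration.

```python
import re

# Syslog severity names, indexed by their numeric value (0 is most severe).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# The priority is a decimal number in angle brackets at the start of a packet.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def decode_priority(packet):
    """Return (facility, severity) from the leading <PRI> of a syslog packet."""
    match = PRI_PATTERN.match(packet)
    if match is None:
        raise ValueError("packet has no <PRI> prefix")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    return divmod(int(match.group(1)), 8)


def should_forward(packet, threshold="err"):
    """Hypothetical rule: forward messages at `threshold` severity or worse."""
    _facility, severity = decode_priority(packet)
    return severity <= SEVERITIES.index(threshold)
```

With this sketch, a packet starting `<11>` (facility 1, severity 3) would be forwarded, while one starting `<14>` (facility 1, severity 6, i.e. `info`) would stay on the local machine.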

That central syslog server was able to forward any messages relating to a potential incident to the desktop of one of the support staff (the dev team were basically "3rd line support").

We all ran Linux on our desktops (so we all had a local syslog server that was capable of receiving these messages) and we all ran the GNOME desktop.

Our fleet of laptops were running a graphical app written in Python. We'd built that app with GTK+ and the GNOME libraries, so writing a graphical monitoring tool for our own use was really no big deal.

With Band Saw, all the developers and support staff were able to browse the messages that had come in from the company's network of computers.

As syslog packets from our customers' computers arrived over the network, they were piped to a Unix socket, where Band Saw picked them up and displayed them in its window. We could then enter a search string to find the logs that we were interested in.

Band Saw's main window
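The flow described above (messages arriving on a Unix socket, then being searched by string) can be sketched in a few lines of Python. This is not Band Saw's code: the function names are hypothetical, and I'm assuming a datagram socket and a case-insensitive substring match purely for the sake of the example.

```python
import socket


def receive_messages(sock, count, bufsize=4096):
    """Read `count` datagrams from a Unix socket and decode them as text."""
    messages = []
    for _ in range(count):
        data = sock.recv(bufsize)
        messages.append(data.decode("utf-8", errors="replace").rstrip("\n"))
    return messages


def search(messages, needle):
    """Return the messages containing `needle`, ignoring case."""
    needle = needle.lower()
    return [message for message in messages if needle in message.lower()]


if __name__ == "__main__":
    # A connected pair of Unix datagram sockets stands in for the syslog
    # daemon (the writer) and the log viewer (the reader).
    writer, reader = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    writer.send(b"sync: connection to peer timed out\n")
    writer.send(b"ui: user logged in\n")
    received = receive_messages(reader, 2)
    print(search(received, "TIMED OUT"))
    writer.close()
    reader.close()
```

A real viewer would read from the socket inside the GUI toolkit's event loop rather than blocking on `recv`, but the shape of the problem is the same.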

We could also configure Band Saw to alert us to messages that we felt warranted an immediate interruption.

Configuring Band Saw's filters

Whenever a message arrived that we considered "serious", one of our filters would match it, and Band Saw's notification icon (in the corner of our screens) would blink.

Band Saw's flashing notification
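The filter matching described above boils down to checking each incoming message against a list of rules and raising an alert on the first hit. Here's a rough Python sketch of that idea. The `Filter` class, its fields, and the use of regular expressions are assumptions for illustration, not a description of how Band Saw stored or evaluated its filters.

```python
import re
from dataclasses import dataclass


@dataclass
class Filter:
    """A named alert rule (hypothetical structure)."""

    name: str
    pattern: str

    def matches(self, message):
        # Case-insensitive regular expression search anywhere in the message.
        return re.search(self.pattern, message, re.IGNORECASE) is not None


def first_match(filters, message):
    """Return the first filter that matches `message`, or None."""
    for candidate in filters:
        if candidate.matches(message):
            return candidate
    return None


# Example rules; the names and patterns are made up.
filters = [
    Filter("Database sync failures", r"sync.*failed"),
    Filter("Anything critical", r"\bcritical\b"),
]

alert = first_match(filters, "dbsync: sync with peer failed")
if alert is not None:
    print(f"ALERT ({alert.name}): start blinking the notification icon")
```

Stopping at the first match keeps each message down to a single alert, even when several filters would have caught it.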

Clicking the icon would display the error message (I clearly lacked imagination when creating this screenshot; "Something has gone wrong!" is the actual error message):

Band Saw's alert dialog

We had always been very hot on investigating, tracking down, and fixing errors, but real-time alerting changed the game overnight. If a colleague found a bug, we were often able to rock up at their desk to explain what we were doing about it before they noticed anything had gone wrong! :-)

The source for Band Saw was originally released on SourceForge.net. I recently imported all my SourceForge projects to GitHub (it's at gma/bandsaw), and that prompted me to do this little write-up.
