Monday, August 10, 2009

The Buildbot Experience..

I've been pushing myself to submit a patch to buildbot and then write a post about it, but days, weeks and months have passed and I haven't even read the source code (let alone written some of my own).. well, what can I say, I've been incredibly lazy (the word "busy" can be substituted here.. but that would be a lie). Let me spread some buildbot love rather than just be yet another leech.
Now what is buildbot? According to the website:
The BuildBot is a system to automate the compile/test cycle required by most software projects to validate code changes. By automatically rebuilding and testing the tree each time something has changed, build problems are pinpointed quickly, before other developers are inconvenienced by the failure. The guilty developer can be identified and harassed without human intervention.
It's a build/test automation system that can run on a variety of platforms. There are plenty of free and commercial apps out there that do the same. I chose it because it is relatively lightweight and written in Python (an impressive list of clientèle was a factor too).

As the relatively small company where I work grew (note "small" here just means employee strength), software maintenance, integration and the task of porting the code to all the platforms and then testing it became incredibly arduous. We already had a semi-agile system in place and were using version control, but it was just not enough. In a small firm, even a single dev day spent on anything other than coding/design is a day wasted (and we devs are known to be lazy and don't really like doing the same things again and again).

Our product is supported on many different platforms, ranging from the ancient (linux 2.2, mac 10.2, Sun Sparc 5.8 etc) to relatively new ones (freebsd7, macosx-universal, windows 2008 etc), totaling about 13 different platforms. Commercial systems were out of the question because none of them supported all the platforms that we had. The .NET-based ones were out too (not possible for *nix systems), and we preferred a Python-based system over a Java one because our test bed was completely in Python and we just didn't think that adding another language to the mix would be a good idea in the long run. Setting up buildbot on the newer systems was a breeze (just apt-get/yum/portage etc was good enough in most cases, with a rare recompile on some others). The older systems and Windows were a bit messy (big surprise there!). Some of the systems didn't even have Python (or had a really old version), let alone Twisted (a buildbot dependency). After 3 days of hushed cursing, head banging and a lot of ugly hacks, I got buildbot to run on all of the *nix platforms. Windows made me feel completely handicapped because once the install failed, I had absolutely no idea how to get around the situation. I ended up setting up a proxy Linux system for Windows (credit for the idea goes to one of my colleagues), which identified itself as a Windows machine and just ran the needed compilation on the Windows box over ssh (using VS2008).
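To give a rough idea of the proxy trick, here is a minimal sketch of how the Linux box could hand a build off to the Windows machine over ssh. It is not the actual setup described above: the user, host and "build.bat" names are hypothetical placeholders, and it assumes key-based ssh login to the Windows machine is already configured.

```python
import subprocess


def build_ssh_command(user, host, remote_command):
    """Return the argv list that runs remote_command on host over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # which matters when no human is watching the build.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote_command]


def run_remote_build(user, host, remote_command):
    """Run the build on the remote machine and return its exit code.

    ssh exits with the remote command's status, so a failed compile on
    the Windows side shows up as a non-zero return code on the proxy.
    """
    result = subprocess.run(build_ssh_command(user, host, remote_command))
    return result.returncode


if __name__ == "__main__":
    # Hypothetical names: "builder", "winbox" and "build.bat" are placeholders.
    print(" ".join(build_ssh_command("builder", "winbox", "build.bat")))
```

Because the proxy only forwards a command and reports its exit status, the build system on the Linux side can treat the Windows build like any other local step.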

Once the crude system was set up, we started adding bells and whistles. Every svn update now triggers a build on all 12 platforms and sends emails to a group of people who want to be notified if something is broken, and to the person who broke the code. Another process triggers a nightly build which updates the code, does a clean build, runs a set of core regression tests, creates a package (tagged with date and revision number) and posts the status on a pretty page. The status page (html and css) was also hacked to list the tests that failed on particular machines. The nightly build also uploads the package to a separate repository, which the QA team can then use. Later some more hacks were done to maintain just a small history of packages (nightly builds, not the production builds) and then some (idiosyncratic to the product)... Check out Google Chrome's buildbot page.. now imagine that for 12 different platforms!!
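The package tagging and the "small history" hack can be sketched roughly like this. This is an illustration rather than our actual scripts: the naming scheme, the ".tar.gz" extension and the function names are assumptions, and it relies on there being one nightly package per day so that file names sort chronologically.

```python
import datetime
from pathlib import Path


def package_name(product, revision, build_date):
    """Build a package file name tagged with date and revision number."""
    # The YYYYMMDD date comes first so that names sort in build order.
    return f"{product}-{build_date:%Y%m%d}-r{revision}.tar.gz"


def prune_old_packages(directory, keep):
    """Delete all but the `keep` newest packages; return the names removed."""
    packages = sorted(Path(directory).glob("*.tar.gz"))
    # With keep <= 0 everything is stale; otherwise drop all but the last `keep`.
    stale = packages[:-keep] if keep > 0 else packages
    for path in stale:
        path.unlink()
    return [path.name for path in stale]


if __name__ == "__main__":
    today = datetime.date.today()
    # "product" and revision 1234 are placeholder values.
    print(package_name("product", 1234, today))
```

Running prune_old_packages on the nightly upload directory after each build keeps the disk usage bounded, while production packages live elsewhere and are never touched.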

We did get the obvious benefits:
  • no integration downtime.
  • no downtime to port the product to different OSes.
  • continuous integration made the development process more robust/agile.
There were some not-so-obvious benefits too:
  • Since creating a package on all the platforms just involved clicking a button on the web interface, the debugging->packaging->testing cycle became a lot faster. QA no longer had to wait for a developer to create the package (developers would normally try to squeeze every pending fix into a package before creating it), which means faster feedback on the remaining issues/builds, which means a happy PM :)
  • Automating the build across different platforms meant that all the platform-specific hacks had to be cleaned up, which meant a more elegant build process, which led to faster build times and pointed to some issues that were being overlooked.
  • The green color signals a "pass" on the buildbot status page. Surprisingly, we (all the devs) find it rewarding to see green on the status page with our names beneath it. I think I can safely say that our productivity has gone up and our favorite color is "buildbot green" :)
The whole system works so well and has relieved me of so much repetitive/boring work that I felt guilty about using it for free. And since I am a poor developer, all I could do was offer some CPU cycles on my home machine (did that for about 4-5 months, until the summer heat forced me to switch off my PC) and offer my help to write some code (which I'll get around to doing one of these days).

On a totally unrelated note, I'll be taking a trip to Europe (Paris, Zurich, Munich, Prague, Brussels, Brugge, Amsterdam), so drop me a line if you've been to any of these places and would like to recommend something that I should not miss.