We can automate away drudgery, repetitiveness, unthinking work.

We see real value in automation. We can automate away drudgery, repetitiveness, unthinking work. This leaves us all free to actually think about creating well-written, easy-to-read, robust and secure code. It affords us the space to grok the problems we are approaching, and crucially enjoy our work.

For those who may not recognise this term (or buzzword, if you’re cynical): IT automation allows people to simplify workflows by getting different systems to talk to one another with the least human effort. It is a key strategy for saving money, producing better work (for us, that’s software), and keeping people happy and engaged with their work.

Automation as a concept in IT has existed for some time. Automation is de-facto a part of DevOps and Infrastructure as Code, and there are a number of mature tools available: Chef’s first version was released in 2009, Salt in 2011 and Ansible in 2012.

The logic

When you do exactly the same thing often, why not put in some more work at the beginning, make a script that does it for you, and then reap the benefits later? A real-life nerd dream.

Some may even say that doing that thing more than once qualifies; there is, of course, a relevant XKCD.

For now let’s just agree that when your dev team has more than a couple of projects that are maintained and released quite often, you need some kind of automation software to speed the development process up. At the very least, it should handle tasks such as provisioning a new server and the deployment process.
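As a back-of-the-envelope version of that trade-off, here is a minimal Python sketch. The function names and the sample numbers are made up for illustration and are not measurements from our projects. It also assumes, for simplicity, that the automated run costs no human time at all.

```python
def automation_payoff(minutes_per_run, runs_per_month, months, minutes_to_automate):
    """Return the net minutes saved by automating a task (negative means a loss)."""
    # Time that would have been spent doing the task by hand over the period.
    manual_total = minutes_per_run * runs_per_month * months
    return manual_total - minutes_to_automate


def break_even_runs(minutes_per_run, minutes_to_automate):
    """Return how many runs it takes before the script has paid for itself."""
    # Ceiling division using integers only, so partial runs round up.
    return -(-minutes_to_automate // minutes_per_run)


if __name__ == "__main__":
    # Hypothetical numbers: a 20-minute manual deploy, done 8 times a month,
    # versus one working day (480 minutes) spent writing the automation.
    print(automation_payoff(20, 8, 12, 480))
    print(break_even_runs(20, 480))
```

With numbers like these the script pays for itself within a few months, but the result swings a lot with how often the task really gets run, which is the hard part to guess up front.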

Early automation at Omni-Digital

For a long time Omni-Digital used fabric - a quite popular solution in the Python community (8000 stargazers on GitHub) that allows you to create simple Python functions that can be run directly from the command line. It is likely that this simplicity (no need to learn a new syntax, vocabulary and structure - it’s just Python!) and a very early release date (version 0.0.1 was released in Jan 2008) contributed a lot to fabric’s popularity.

Our fabric scripts worked OK and mostly did what they were supposed to do, but the more we used fabric, the more things we noticed that we couldn’t do easily and that other automation software had built into its core. The lack of Python 3 support was the final deal-breaker.

The choices we then considered were the three mentioned above - Chef, Salt and Ansible. After some consideration we chose the Python-powered Ansible and haven’t really looked back. So far, so good.

Our original distinction between regularly-released and one-off projects

At the beginning we maintained a distinction between projects for our clients - regularly released, with multiple environments (on multiple servers) deployed in the exact same way - and tools we self-host and use internally like Sentry and Taiga. The former were first integrated with Fabric and then ported to Ansible; the latter were never done in Fabric and were not originally added to Ansible.

Leaving software we did not write out of Ansible made sense to us at the beginning. We don’t control the codebase and the deployment/upgrade is usually either done via ‘git pull’ (Taiga), ‘pip install foobar’ (Sentry) or something similar. And once it’s up and running you’ll probably only visit the server when there’s an upgrade that does some cool stuff you like. All that is even more valid if you’re just trying something out, like we originally did with Taiga - no point in putting your time into automating something that may very well never be actually used.

So… there’s not really a big ‘BUT WAIT!’ coming up, and everything listed above - including the XKCD graph - is valid. But over the course of the last couple of months, when we needed to upgrade Taiga to a newer version (the sweet, sweeeeet comment editing) and then create a new Sentry instance (R.I.P. the old one), we found that using Ansible was actually very beneficial in both places and arguably should have been implemented sooner. It is worth detailing our experience of each.

Upgrading Taiga

As mentioned previously, Taiga was originally deployed just to test it, see if it would work with our workflow and if it could replace Asana as our main project management tool. Well, it could, and it quickly did, making our test instance the de facto production instance, which as you know (or maybe you don’t know yet) isn’t the ideal situation. It worked well so there was no reason to provision a new server, especially since it wouldn’t be as easy as running ‘$ ansible-playbook roles/taiga/playbook.yml -i projects/taiga/inventories/production’.

But the minor issues were quietly piling up - new SSH keys could only be added by hand, as could any config changes; updating Taiga to a newer version required some time reserved in case of any problems and couldn’t be considered bulletproof and stress-free (not that anything can, but you get the point); finally, each new feature that was added to our Ansible stack as a whole (like better firewall configuration, automatic installation of security updates, etc.) had to be applied here manually or (more often) was not applied at all.

The list could probably go on, with some issues more valid, some less. I’m also pretty sure there are problems that would come up in the future (my money would be on SSL certificate renewal not being painless).

I want to stress here that I’m not criticising the decision to create the test instance quickly, and the moment when it changes from demo to production isn’t always set and obvious. The ‘Is it worth it?’ line is also really hard to see, while being arguably the most important one.

Ultimately we agreed to add Taiga to our Ansible stack, both as a role and as a project. It’s a bit weird, because you’re really just parsing the official installation instructions and copying the commands. Not very creative, but then again - what automation stuff is?

Don’t worry though - you’ll be in geek heaven at the end anyway. Just press enter, sit back and drink your coffee.

ansible-playbook roles/taiga/playbook.yml -i projects/taiga/inventories/production

Sentry

The Sentry situation was a bit worse - when a (human) error borked the production server, we were faced with no Ansible scripts to quickly get it up and running and no time to do it manually because of business deadlines. Oops.

Getting Sentry up and running again became a priority, and learning from our mistakes meant that doing it all with the help of Ansible was kind of a no-brainer at this point. We don’t intend to bork it up again, but if we do, it’s good to know that a database backup and a couple of minutes to run the Ansible scripts is all it takes to get it back on its feet. Plus everything I listed in the Taiga section above applies here as well - adding an SSH key for your new colleague/machine manually, while being in the middle of doing something completely different, is a PITA plain (pun intended) and simple.

And we hadn’t even realised that we were one major release behind, a release which brought really sexy UI changes. I like geek heaven so much.

The takeaway

So. Doing every little project or proof of concept in Ansible from the start is really a case of premature optimisation most of the time, but we do now think that the ‘Is it worth it?’ line comes much sooner than expected.

‘Use your best judgement’ seems like such a silly way to summarise all this - but that’s basically what it comes down to. We could’ve just said that at the beginning and ended it there, but that wouldn’t be fun, would it?