Engineering problems are hard. Not NP-complete hard (although sometimes they are), but technologically hard. For that very reason, we need to solve them with an engineering approach. Code smells, poor performance and badly automated processes, among other problems, are good candidates to be tackled with an engineering mindset.

There’s no denying that solving these problems is necessary and, in some cases, critical. But how do we know when we’re overengineering them? Here’s a real-life example.

Problem: We want to have a unified deployment script that works for all of our applications.
Thought process: We already have a script that does most of the work, but we want to use it for new application A. The script, however, needs to support deploying application A with curl instead of scp. We should augment the existing script to make it more powerful by supporting application A.
The result: A generic, catch-all script that effectively supports the new project, but looks kind of like this:

deployment.script -version=1.0 -user=admin -password=admin -url=http://url.com/target -file1=/path/to/file1 -file2=/path/to/file/2 -anotherPath=/path/to/another/path -yetAnotherPathThatIsNeededForApplicationAOnly=/path/for/application/a -someRandomLegacyParameter=scxpr55 -webServer=apache -include=includeExpression -exclude=ExcludeExpression …

You get the idea.

Without even looking at the file, you already know it’s going to be full of if/else statements or some other switching directive, which complicates matters. There’s even an urban legend about a big software company (I don’t know which one) that had to patch and recompile the Linux kernel to support more command-line parameters than the shell allowed. It’s mentioned in Jez Humble’s Continuous Delivery book.

What if we modularized the script a little? We don’t necessarily have to be coding in an object-oriented language, but if we can slice the logic into different sub-operations (configure, package, push), then clients can grab only the pieces they need and run with them. We can still share the common logic while splitting out and decoupling the unrelated parts.

Proposed solution: Have multiple, smaller scripts that perform well-defined actions:

  • package.script --target=/target/path
  • test.script #no parameters
  • deploy.script --artifact-to-deploy=/path/to/file --target-server=target
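A minimal sketch of how these pieces might share common logic (all names and layout are hypothetical, not a prescribed structure): shared helpers live in one file, and each verb script stays a thin wrapper around them.

```shell
# common.sh -- shared logic sourced by every verb script (hypothetical layout)
log() { echo "[deploy-tools] $*" >&2; }   # logging goes to stderr

# body of package.script: one well-defined action, nothing else
package() {
  local target=$1
  log "packaging into $target"
  # a real script would tar/zip/jar the build output here
  echo "packaged:$target"
}

# body of deploy.script: scp for most apps; application A can ship its own
# wrapper that swaps this push step for curl, without touching the rest
push_artifact() {
  local artifact=$1 server=$2
  log "pushing $artifact to $server"
  echo "deployed:$artifact:$server"
}

result=$(package /tmp/build)
```

Each client composes only the verbs it needs, so supporting application A means overriding one small piece instead of threading another flag through a monolith.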

Obviously, this is a very simplistic example, but I hope the point isn’t lost. Just because we’re cleverly using parameters and flags doesn’t mean we’re engineering the correct solution; it just means we’re trying to modify the minimum amount of code, which doesn’t always scale.

Actually, it’s not about the flags and parameters (those apply to this example only); it’s about not trying to solve everything with code. If the first thing you do when you see a problem is think about the lines of code you need to modify, then you’re probably going to end up overengineering it. Step back, look at the full picture and think carefully before making a quick fix, especially if you’ve touched the same portion of the code several times to accommodate new features.



Build once, deploy many

I’ve heard the phrase “We need a build to X environment” or “We’ll be doing a build to production on X date” many times, when in fact we should substitute build with push, deploy, install or some other verb that indicates we have already generated the artifact we’re going to go live with.

In most cases this is accurate, as we’re compiling or otherwise generating the file that we’ll then move to the specified target. But why are we building this artifact just before pushing it to an environment? In software, we certainly don’t enjoy the benefits of an on-demand, just-in-time philosophy the way other industries do.

So, why do we insist on building and packaging every time? I think it’s often a bad habit that teams acquire while their codebase is small. Compiling and/or packaging is relatively cheap at that point (no more than a few seconds), so the cost is neglected. And since it’s easier to re-generate the artifact than to create an infrastructure that persists these files and lets the deployment scripts use them, teams usually skip this step in favor of short-term gains. But as the codebase grows, the cost can no longer be ignored; I’ve seen it take over 50% of the time of the overall deployment process. I’ve also seen (and worked with) a team whose build cloned its dependent repositories and built them from scratch, taking at least two hours. So much for continuous integration.

Build once, deploy many is a philosophy that has two advantages:

  1. Our deployment scripts become just a few instructions on how to move the package from one place to another. If your deployment scripts are doing something other than just deploying, that’s a big smell.
  2. We guarantee that the file that we’re deploying is going to be exactly the same across all environments.

The infrastructure for this could be as simple as copying the file to a shared drive and having the deployment scripts pick it up from there, or as sophisticated as using tools like Artifactory or Nexus. These artifact repositories even expose a REST API that lets you perform operations like deploying artifacts directly from the repository, so your scripts don’t even have to download them. Your deployment scripts become even thinner!
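As a sketch of how thin such a script can get (the repository URL and path layout below are assumptions for illustration, not Artifactory’s or Nexus’s actual scheme), deployment reduces to computing where the already-built artifact lives and moving it:

```shell
# Build once, deploy many: the artifact already sits in the repository,
# so deploying is just "locate, fetch, move". URL layout is hypothetical.
REPO_URL=${REPO_URL:-http://artifacts.example.com/releases}

artifact_url() {
  local name=$1 version=$2
  echo "$REPO_URL/$name/$version/$name-$version.war"
}

deploy() {
  local name=$1 version=$2 server=$3
  local url
  url=$(artifact_url "$name" "$version")
  # a real script would do something like:
  #   curl -fsSL "$url" -o /tmp/app.war && scp /tmp/app.war "$server:/opt/apps/"
  echo "deploying $url to $server"
}

msg=$(deploy shop 1.4.2 web01)
```

Note there is no compile or package step anywhere in the script: the same versioned file travels through every environment.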

Another advantage of these tools over simple file copy is that they add a layer of metadata on top of the files that makes file versioning easier.

So you can have an independent process that generates the binaries to be deployed, versions them and pushes them to your artifact repository. Then your deployment scripts pick the artifact up and simply move it across your different environments, making its way towards your production environment. As a former boss put it: “You wrap a gift and just hand it over to the next guy until it gets to the hands of the right person”.

And just for the record, I don’t like the term “production”; I’d rather use the term “live environment”, but that’s a different discussion.

DevOps is a philosophy, not a title

On countless occasions when searching for a job, I’ve seen titles like these:

  • DevOps Engineer
  • DevOps Systems Engineer
  • Website and Cloud DevOps Engineer
  • DevOps Superstar
  • DevOps Lead Developer
  • DevOps Automation Engineer
  • Head of DevOps

But often, the job description includes things like:

  • Monitoring
  • DNS
  • Production systems support
  • Firewall setup
  • Disk images
  • Pager duty

How is any of this related to DevOps? In fact, there is no specific set of skills that a “DevOps Engineer” should have because there is no such thing as a “DevOps Engineer”. Such a title makes as much sense as “Agile Engineer” or “Senior Waterfall Developer”.

DevOps is a methodology, a way of doing things, not a task. Why do you think so many web companies, both big (Google, Facebook) and small (Etsy, Flickr, Intent Media), are successful at this DevOps thing? It’s because they do DevOps, not because they have a DevOps team.

Release Engineer vs Release Manager

I often see these two terms used interchangeably when in reality they aren’t. Release engineers are software engineers first. They know about the development cycle, branching strategy, versioning, etc. Release managers are not necessarily engineers; in fact, release manager is more a role than a title. Someone in this role coordinates the actions that need to be executed for the live release to go out. In this sense, the release engineer can put on the release manager cap, just as much as the project lead or senior architect can.

In an ideal world, a release manager would not be needed. The release engineers would be in charge of setting up an infrastructure such that the steps required to make a live push are automated. To paraphrase Beyoncé: “If you like it, put a button on it”. Etsy’s Deployinator is a very good example of what an automated release manager should be, but your particular solution doesn’t have to be that fancy; it can boil down to a few deployment scripts, cleverly parameterized and executed from any CI tool.

It is understandable that many organizations need a release manager, but this shouldn’t be seen as a permanent requirement. Mixing in an engineering approach (which can come from the development engineers), even part time, will gradually make the deployment process smoother. Even if it sounds counterintuitive, good release engineers should aim to put themselves out of a job by automating every step of the process.

Why developers shouldn’t deploy artifacts

Well, the title is a bit misleading. Developers should be empowered to deploy artifacts, that’s the whole point of the DevOps philosophy; just not from their local environments.

Developer Alice is working on a feature on Module 1 with version 1.0.0-SNAPSHOT. Alice finishes the feature but she hasn’t pushed her changes.

At the same time, Bob is working on a different feature on the same Module 1. He finishes, commits his changes, checks for any incoming changes and finds none. At that point he figures (correctly) that his changes are the latest on the repository, so he pushes them successfully and then runs mvn deploy (or whatever the deployment command is).

Now Alice decides it’s time to push her changes, but she deploys first (thinking there aren’t any other incoming changes) and runs mvn deploy before updating her workspace. Because the version is a SNAPSHOT, the most recently deployed artifact (containing only Alice’s feature) overwrites the previous one (containing only Bob’s feature), leaving it in an inconsistent state. Any downstream projects depending on this artifact risk becoming unstable.

Alice will then try to push her changes and realize she has to merge them with the incoming ones. Most likely, Alice is a disciplined developer and will deploy again to make sure the artifact contains both Alice’s and Bob’s features and is consistent with the state of the repository. But this is still a manual, human (and hence error-prone) process.
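The overwrite itself is easy to simulate, because a SNAPSHOT version behaves like a mutable slot in the repository. In this sketch a plain directory stands in for the remote Maven repository, and the file contents stand in for which features made it into the artifact:

```shell
# A plain directory stands in for the remote Maven repository
repo=$(mktemp -d)

mvn_deploy() {  # stand-in for `mvn deploy`: SNAPSHOT artifacts overwrite in place
  echo "$1" > "$repo/module1-1.0.0-SNAPSHOT.jar"
}

mvn_deploy "bob-feature"     # Bob: pushed his commit, then deployed
mvn_deploy "alice-feature"   # Alice: deployed without pulling Bob's commit first

contents=$(cat "$repo/module1-1.0.0-SNAPSHOT.jar")
# contents now reflects only Alice's workspace; Bob's work is gone
# from the published artifact until someone deploys again
```

No merge conflict, no error message: the artifact silently loses Bob’s feature, which is exactly why the publishing step belongs on a machine with a single canonical workspace.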

If we leave the deployment step to a machine (i.e. the CI tool), we can guarantee that there’s only going to be one workspace off of which the artifact will be packaged, making it consistent across the board. It doesn’t have to be anything fancy (although the Gradle release plugin and Maven release plugin are very helpful), just as long as there’s a unique way for artifacts to be created.

One objection to this approach I once heard was: “But why are you trying to make it harder for developers to deploy?”. Well, yeah, it certainly becomes a barrier, but the reason is pretty much the same as why you lock your front door or put a password on your computer. It’s a necessary hurdle to avoid other risks.

Most SCMs today will allow you to add post-commit hooks such that a deployment can be kicked off every time there’s a change, making it even simpler to publish artifacts since it becomes a hands-off process.
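For example, a server-side git post-receive hook can ping the CI tool on every push. The sketch below wraps the hook body in a function so it can be exercised directly; the CI trigger URL is a made-up placeholder, since Jenkins, Bamboo and friends each have their own trigger mechanisms.

```shell
# Sketch of a git post-receive hook body: git feeds it lines of
# "<old-sha> <new-sha> <refname>" on stdin, one per updated ref.
post_receive() {
  while read -r oldrev newrev refname; do
    if [ "$refname" = "refs/heads/master" ]; then
      echo "triggering build for $newrev"
      # a real hook would notify the CI tool here, e.g. (hypothetical URL):
      #   curl -fsS "https://ci.example.com/trigger?rev=$newrev"
    fi
  done
}

out=$(echo "1111 2222 refs/heads/master" | post_receive)
```

Pushes to other branches fall through without triggering anything, so the deployment pipeline only reacts to the branch you release from.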

Exposing your build system API

Nowadays, there are multiple CI tools: Jenkins/Hudson, Bamboo, TeamCity among others. Some of them are free, some others are licensed and they all offer different sets of features. Any of these tools is a great way for software development organizations to quickly jump on the Continuous Integration/Continuous Delivery train.

This is how the typical first project is organized as far as building and deploying go:

[Figure: one project in source control, checked out by the CI tool through a single build plan that runs the build scripts]

You have your project code in source control, being checked out by your CI tool, by a single build plan which executes a set of build scripts that may be either embedded in the build plan, checked into a separate source control repository or saved in some other way.

This works fine until you have to create a second project; then, the chart looks like this:

[Figure: two projects, one CI tool, two nearly identical build plans sharing the same build scripts]

Two projects, one CI tool, two build plans that are most likely copies of each other, and both use the same set of scripts. One important thing to notice is that the build plans are exactly alike, but they will fall out of sync as soon as the needs of one of them change. Multiple projects lead to build plan duplication, and as the number of projects grows, the mapping of projects <-> build plans gets uglier. Sometimes even the build scripts get duplicated:

[Figure: multiple projects with duplicated build plans and duplicated build scripts]

This is far from ideal.

How can we tackle this with an engineering approach? One thing comes to mind: what if we create a domain-specific language layer on top of our build system? This would expose an API of “verbs” or actions that our build scripts can execute, such as package, build, test, deploy, etc. Then the CI tool would only interact with this API. It would look like this:
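A crude version of such a layer (the verbs and wiring here are hypothetical, just enough to show the shape) is a single dispatcher that the CI tool calls, hiding which scripts actually do the work:

```shell
# build.sh -- the "verb" API the CI tool interacts with.
# Each branch would call out to the real build scripts; echoes stand in here.
build_api() {
  local verb=$1; shift
  case "$verb" in
    package) echo "package step $*" ;;   # would run package.script "$@"
    test)    echo "test step" ;;         # would run test.script
    deploy)  echo "deploy step $*" ;;    # would run deploy.script "$@"
    *)       echo "unknown verb: $verb" >&2; return 1 ;;
  esac
}

out=$(build_api package --target=/tmp/out)
```

The build plan shrinks to a handful of `build.sh <verb>` invocations, and reorganizing the underlying scripts no longer requires touching the CI tool at all.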

[Figure: the CI tool’s build plans invoking the build-system API layer, which drives the underlying build scripts]

We can even take this idea one step further. What if we package this, call it Build System 1.0 (or whatever quirky name you can think of) and publish it to our artifact repository? Then we can incorporate it into our build plans such that all they do is “press the buttons” of the actions exposed.

This build system package can then be bootstrapped into the project. The details will vary depending on the framework being used (Maven, Ant, Gradle) and in some cases might be tricky, but the concept is the same. This step is not absolutely necessary and can be skipped if it becomes too complicated to implement.

At this point our map looks similar but with some key differences:

[Figure: multiple projects and thin build plans, all invoking the packaged build-system API]

We still have multiple projects, but the CI tool becomes a mere button-presser or “dumb” broker of the actions that are being invoked. We also still have multiple build plans but they are thinner and we don’t care if they fall out of sync, because they’ll be small enough that the difference can be ignored. And more importantly, the build plans are interacting with our brand new API. Also, since all the logic is encapsulated in one place and the history is tracked by the SCM, we are consolidating the knowledge of our build process.

In this sense, the build system becomes much like a software module:

  • It has an API.
  • It’s extensible.
  • It’s releasable.
  • And it can be open sourced (either internally or externally, if it’s generic enough).

On top of that, there’s an economic benefit. By reducing your build tool to an on-demand button-presser, there’s no need to spend big bucks on tools that will execute the exact same actions as the free ones.

You’re welcome.

Continuous Integration vs Continuous Delivery vs Continuous Deployment

The word continuous is now used to describe several things in the industry: integration, delivery, deployment, experimentation, monitoring, improvement. There’s nothing wrong with that, but I want to define what I understand by the first three.

I see three degrees of continuity in the world of release engineering.

  • Continuous Integration: The easiest of the three. Set up your Jenkins (or whatever you use) to check out the source, compile and run the tests with every change or periodically. Add a dashboard or push the results via email.
  • Continuous Delivery: This doesn’t necessarily mean deploying with every change, but means being able to do it. It may not be feasible for business reasons, but the application should be releasable at any point. Sadly, in some cases, the only time the software is releasable is when it’s actually released.
  • Continuous Deployment: Deploying every time a change is made to the code base. Yes, this sounds almost impossible, and for many organizations it will never happen because of internal and external constraints. However, in the words of Bruce Lee: “A goal is not always meant to be reached, it often serves simply as something to aim at”. There are a number of best practices and good habits that must be followed to reach this stage of maturity, but mastering even some of them is a step in the right direction.