Photo by Christophe Dion on Unsplash
The title is a bit clickbaity, I know. I couldn’t figure out a better one, so there it is.
What is agile development? This is a very controversial topic. There is the Agile Manifesto, there are strict “Agile” frameworks like Scrum and methodologies like Kanban. I like the simple definition where a team of senior engineers is put into a conference room with a direct line to the client and asked to deliver a releasable program increment every X period of time. Scrum has a lot of additional elements, one of which is the requirement that the program increment should have a previously agreed set of features. One could argue that this limits flexibility, but it allows the development team to focus on delivery and make predictions about where the product could go in the future.
You can change the scope of a sprint during its run, but this is frowned upon, and with good reason. For the team to be able to estimate with any certainty and agree on the sprint scope with the PO during planning, it needs a certain level of predictability in the environment in which it operates. The Continuous Integration (CI) pipeline plays a huge role in that predictability.
A CI pipeline is an automated process that takes a new code change and runs it through a series of steps to verify its correctness: running tests, static analysis tools, and so on. Generally, any modern development team will have a CI pipeline, as it helps make sure a change is correct. The problem arises when the same pipeline is used by multiple teams.
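To make the idea concrete, here is a minimal sketch of what such a pipeline boils down to: run each verification step in order and reject the change on the first failure. The specific tools (flake8, mypy, pytest) are just examples picked for illustration; a real pipeline runs whatever checks your project actually uses, usually on a dedicated build server.

```python
# Minimal sketch of a CI pipeline: run a series of verification steps
# against the new change and stop at the first failure.
# The concrete commands below (flake8, mypy, pytest) are only examples --
# substitute whatever checks your project really runs.
import subprocess
import sys

STEPS = [
    ("static analysis", ["flake8", "."]),
    ("type check", ["mypy", "."]),
    ("unit tests", ["pytest", "-q"]),
]

def run_pipeline() -> int:
    for name, cmd in STEPS:
        print(f"--- {name}: {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"{name} failed, rejecting the change")
            return result.returncode
    print("all checks passed, the change can be merged")
    return 0

if __name__ == "__main__":
    sys.exit(run_pipeline())
```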
What happens when teams share a CI pipeline?
How severe these problems get depends on the pipeline implementation, but every shared pipeline eventually hits these issues:
Queues
When many engineers try to merge their changes, there will be times when the CI becomes overloaded and engineers have to wait a long time for their changes to be verified. The more engineers, the longer the wait. This can be mitigated with automated scaling, but that becomes either expensive or unstable (if low-quality servers or something like AWS Spot Instances are used).
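To get a feel for how quickly the wait grows, here is a back-of-the-envelope calculation with made-up numbers (a single shared runner and a fixed verification time per change):

```python
# Back-of-the-envelope estimate of merge-queue waiting time on a single
# shared CI runner. All numbers are made up for illustration.
PIPELINE_MINUTES = 20     # one full verification run
ENGINEERS_MERGING = 10    # changes submitted at roughly the same time

# With one runner the queue is served sequentially, so the last change
# in line waits for every change ahead of it.
worst_case_wait = (ENGINEERS_MERGING - 1) * PIPELINE_MINUTES
time_to_drain_queue = ENGINEERS_MERGING * PIPELINE_MINUTES

print(f"worst-case wait: {worst_case_wait} minutes")                  # 180
print(f"time until the queue drains: {time_to_drain_queue} minutes")  # 200
```

Double the number of engineers (or the pipeline time) and the wait doubles with it, which is exactly the kind of variance that wrecks sprint estimates.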
Pipeline failures
Having one pipeline makes changes to it difficult and dangerous, just as you would be cautious about changing your product on all client instances simultaneously. One wrong change can cripple the pipeline and leave all your developers unable to work, potentially costing the company millions. Having many deployment pipelines lets you roll out changes independently, e.g. through canary releases.
Bloat
This will be more prevalent with large monorepos, but having one deployment pipeline makes it very tempting to basically build everything every time. As the code base grows, so will the build times, again making developers wait longer and longer. This can be mitigated with appropriate build tools like Bazel (I have never used it personally, but I have heard good things), but it can be costly for a company to change its processes once it reaches the point where this becomes a problem.
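The underlying idea of such tools is to rebuild only what a change actually touches. Here is a deliberately naive sketch of that idea, assuming a monorepo laid out as one top-level directory per package and a list of changed files coming from something like git diff --name-only; real build tools do this far more rigorously, with dependency graphs and caching:

```python
# Naive sketch of selective builds in a monorepo: map changed files to the
# top-level packages they belong to and only build/test those, instead of
# rebuilding everything on every change. Tools like Bazel do this properly
# with dependency graphs and remote caching; this only shows the core idea.
from pathlib import PurePosixPath

def packages_to_build(changed_files: list[str]) -> set[str]:
    """Return the top-level packages touched by the change."""
    packages = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if len(parts) > 1:  # ignore files sitting in the repository root
            packages.add(parts[0])
    return packages

if __name__ == "__main__":
    # In a real pipeline this list would come from e.g.
    # `git diff --name-only main...HEAD`.
    changed = [
        "payments/api/handlers.py",
        "payments/tests/test_handlers.py",
        "docs/build-guide.md",
    ]
    for package in sorted(packages_to_build(changed)):
        print(f"would build and test: {package}")
```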
Instability
Once in a while some team will push something through the CI that causes it to die. Maybe they deploy a Docker image that blocks a vital port (seen that happen, done it myself once or twice) or run a script that kills everything on the given machine (also seen it). Such mistakes happen. They should not block other teams’ merges.
You might ask: so what? Just take that into account when estimating the time needed to deliver the feature. After all, estimates are based on previous sprints, in which those incidents also occurred. The problem is that those issues are unpredictable. One sprint they will not happen, the next they will, derailing the whole sprint. We want to limit uncertainty to increase predictability.
So what to do about it?
It would be perfect if the team were the one maintaining and taking responsibility for its own CI pipeline. This gives the team a high degree of control over its environment, significantly lowering uncertainty (of course there can still be unforeseen circumstances like a network failure, but there is little one can do about that). This approach can be unrealistic (and unnecessary in a company with a strong DevOps culture) in large corporations, for various reasons like security, costs, etc. However, if the company does not have deployment properly figured out, it should let the teams set up their own pipelines and manage them independently. Once the company matures, it can try to unify the processes while keeping the pipelines separate, just like in a microservice architecture. This approach works for any project setup, regardless of whether it is a monolith or microservices, a monorepo or multiple repos (although it is easiest in a multi-repo microservice architecture).
Final Thoughts
Think about this in terms of encapsulation. You try to enforce encapsulation within your services/modules in order to make them loosely coupled, stable, and easy to change. The same applies to CI pipelines. Having a separate CI pipeline allows the team to make better estimations, as there are fewer variables outside its control.
That said, for some products having independent pipelines will be infeasible. It can be too expensive or time-consuming, the necessary skills might be lacking within the teams, etc. There can be many reasons. This, however, means it is also infeasible to expect the development teams to deliver releasable features every sprint. And that is OK, albeit painful. In such cases, maybe use something closer to Kanban than Scrum.