Wednesday, July 17, 2019

Branches and Continuous Integration

Soroush Khanlou (tweet):

A problem presents itself, however. You need to build a feature that takes 1,000 lines of code, but you’d like to merge it in in smaller chunks. How can you merge the code in if it’s not finished?

Broadly, the strategy is called “branch by abstraction”. You “branch” your codebase, not using git branches, but rather branches in the code itself. There is no one way to do branch by abstraction, but many techniques that are all useful in different situations.


Of course, the humble if statement is also a great way to apply this technique; use it liberally with feature flags to turn features on and off. (A feature flag doesn’t have to be complicated. A global constant boolean gets you pretty far. Feature flags don’t have to come from a remote source! However, I would recommend against compile-time #if statements. Code that doesn’t get compiled might as well be dead.)
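As a minimal sketch of the "global constant boolean" idea in Swift: the flag name `newCheckoutEnabled` and the checkout functions below are hypothetical, invented only for illustration. The point is that both paths sit behind an ordinary `if`, so both are compiled and type-checked on every build, which is not true of code excluded by `#if`.

```swift
// Hypothetical flag and function names, for illustration only.
enum FeatureFlags {
    // A plain constant Boolean; flip it to true once the feature is ready.
    static let newCheckoutEnabled = false
}

func legacyCheckoutSummary(total: Int) -> String {
    return "Total: \(total)"
}

func newCheckoutSummary(total: Int) -> String {
    return "Total due today: \(total)"
}

func checkoutSummary(total: Int) -> String {
    // Both branches are compiled on every build, so the unfinished
    // path cannot silently rot the way code hidden behind #if can.
    if FeatureFlags.newCheckoutEnabled {
        return newCheckoutSummary(total: total)
    } else {
        return legacyCheckoutSummary(total: total)
    }
}

print(checkoutSummary(total: 42))
```

With the flag left at `false`, callers keep getting the legacy behavior while the new function is merged and built alongside it.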

Branches are just not very useful for managing features or major releases for which development will take a long time (during which you will keep working on the shipping version). They’re great when you want to make a bug fix release based on an old version, and thereafter plan for the branch to die. But, otherwise, you spend a lot of time merging changes back and forth between two active branches and still end up with a potentially difficult integration at the end. It’s better to use feature flags and potentially extra Info.plist files and Xcode targets to support simultaneous development of multiple versions.


9 Comments

I've been down this road a bunch of times, debated with development teams, and have done large projects with long lived branches and branch by abstraction/feature switches. There's a fantasy that feature switches are an answer to the pain of having long lived branches, but they just end up expressing the pain elsewhere. You have to alter the design of your application, often litter your codebase with alternate implementations or tons of branching logic, have multiple configurations, and you have to provide test coverage not just for multiple behaviors but often multiple combinations of behaviors as the feature switches proliferate. When you work on multiple switches concurrently, developers are forced to reconcile everyone's overlapping features by hand without the help of a version control tool and often with code that's harder to follow. Also, you need to go into cleanup mode whenever a feature goes live and the old behavior dies.

The testing burden is one that people often overlook. If you have five feature branches you have to automate testing for five branches. If you have five concurrent feature switches you have thirty-two different combinations for test coverage. If you have a small codebase with good test automation that's great. If you have a large codebase with expensive testing, you can hit a ceiling on how many features your QA process can support at a time. With the five branches you constantly have to track trunk but you end up deferring trunk merges until later in the process which can sink a release if problems are found too late in the game. Both suck in different ways.
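The thirty-two figure is just 2^5: each switch is independently on or off. A small Swift sketch that enumerates the configurations makes the growth concrete. The flag names are hypothetical, and the sketch assumes the worst case described above, where every combination needs coverage.

```swift
// Hypothetical flag names; only the count matters here.
let flags = ["newCheckout", "darkMode", "syncV2", "redesignedSettings", "fastSearch"]

// Each flag is independently on or off, so n flags give 2^n configurations.
func allConfigurations(of flags: [String]) -> [[String: Bool]] {
    // Start from a single empty configuration.
    var configurations: [[String: Bool]] = [[:]]
    for flag in flags {
        // Every partial configuration splits in two: flag off and flag on.
        configurations = configurations.flatMap { partial -> [[String: Bool]] in
            var off = partial
            off[flag] = false
            var on = partial
            on[flag] = true
            return [off, on]
        }
    }
    return configurations
}

let configurations = allConfigurations(of: flags)
print(configurations.count) // 32, i.e. 2^5
```

Adding a sixth switch doubles the count to 64, while a sixth branch only adds one more thing to test.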

This isn't to say that feature switches are bad - they often make a lot of sense. My point is they don't actually solve the problem of maintaining long running separate behaviors, they just surface the pain in very different ways than branches. Different developers, teams, and projects will tend to work better with a branch or a switch approach, but it's just not true that feature switches are good and branches are bad. My experience managing teams is that developers tend to develop an affinity for one approach or another and really struggle with the alternative. For every team I've seen end up with nightmare merges, I've seen teams with feature switched code that are terrified to ship because they don't have confidence that the application will work correctly with work-in-progress features toggled off.

The pain of maintaining long running feature projects - which is really to say the pain of maintaining multiple concurrent behaviors - is constant. All we get to do is select how that pain is expressed. The real fix for long running feature development is, to the extent possible, to not do it.

@Fred Yes, I completely agree that the essential problems remain. It’s just a question of which way works better for your particular project, team, tools, and workflow. What I like about the non-branching approach is that it makes the pain explicit—you can see where everything is—and you get basic merging/integration for free.

A lone developer can get away with almost any approach. For a team, though, I think that replacing a long running feature git branch with branching logic in the master branch is absolute insanity. Fred is right about the pain of that approach. I would argue that readability is probably the most important criterion for good code, and branching logic for features makes your code much more difficult to read and understand, both for the people writing the new feature and for the people who aren't writing the new feature. Have you ever driven your car through a construction zone? There are cones everywhere, the curbs are non-existent, lanes are closed, there's a danger of debris or even hitting construction workers, and you definitely have to slow down. Well, that's what it's like to put half-baked features in your master branch. Why would you want to turn your master branch into a permanent construction zone? The key to having a long-running feature branch is to continuously merge the master branch into the feature branch. That way, when the feature branch is finally merged into master, the merge should be fairly easy.

If you're doing a major version update, as Michael mentioned, then it might make sense to create a new Xcode target or even a completely new Xcode project to exist side-by-side with the old version in the master branch. But still, in that case you don't want conditional branching logic in your code. The new feature code goes into the new target/project, the old code stays in the old target/project, and the two can share whatever other code is common to both, possibly in a shared framework.

@Michael I have to disagree that basic merging and integration is free with feature switches. On the contrary, while feature switches mean you don't get a big bang merge that can be very painful, you can frequently get painful small merges where some critical sections of the code have to be updated by hand to support two or more behaviors concurrently. You're spreading the pain out, but I've seen projects where the sum of the difficult small merges ended up being more work than the big bang.

@Jeff I'm not saying that feature switches are bad. On the contrary they work out really great for some projects. In the case of having if statements everywhere - yeah this murders readability and I've seen a lot of code like this. On the other hand, I've seen codebases where the structure of the application was changed so that either through dependency injection or some explicit module system the branch-by-abstraction approach means all your conditionals are in how the code is assembled, not in the logic, which cuts down on a lot of the pain of feature switches. Of course this means you're literally changing the design of the application to facilitate a development strategy which may mean making the whole thing harder to understand in other ways.
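To illustrate the "conditionals in how the code is assembled, not in the logic" idea, here is a minimal Swift sketch using a protocol and constructor injection. Every name in it (`SearchEngine`, `SearchScreen`, `makeSearchScreen`, and so on) is hypothetical, and it is one possible shape for this approach, not a description of any particular codebase.

```swift
// All names here are hypothetical.
protocol SearchEngine {
    func search(_ query: String, in items: [String]) -> [String]
}

// Existing behavior: case-sensitive prefix match.
struct LegacySearchEngine: SearchEngine {
    func search(_ query: String, in items: [String]) -> [String] {
        return items.filter { $0.hasPrefix(query) }
    }
}

// Work-in-progress behavior: case-insensitive prefix match.
struct NewSearchEngine: SearchEngine {
    func search(_ query: String, in items: [String]) -> [String] {
        let needle = query.lowercased()
        return items.filter { $0.lowercased().hasPrefix(needle) }
    }
}

// Feature code depends only on the protocol and contains no flag checks.
struct SearchScreen {
    let engine: SearchEngine

    func results(for query: String) -> [String] {
        return engine.search(query, in: ["Apple", "apricot", "Banana"])
    }
}

// The single conditional lives where the app is assembled.
func makeSearchScreen(useNewSearch: Bool) -> SearchScreen {
    let engine: SearchEngine = useNewSearch ? NewSearchEngine() : LegacySearchEngine()
    return SearchScreen(engine: engine)
}

print(makeSearchScreen(useNewSearch: false).results(for: "ap")) // ["apricot"]
print(makeSearchScreen(useNewSearch: true).results(for: "ap"))  // ["Apple", "apricot"]
```

The trade-off raised in the comment still applies: the protocol and factory exist mainly to serve the development strategy, so the design carries that extra indirection even after the old implementation is deleted.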

What I'm saying is no matter how you attempt to support development of multiple long lived versions of the application (or multiple behaviors, etc) it's always painful. To the extent possible, don't do it. The reality is you probably will have to do it at some point in your career, but you have to accept that there's going to be risk and pain and the choice is less about how to remove that pain but how you want to experience and thus attempt to manage it.

There's no good one-size-fits-all solution. There's lots of great examples of "process X works", but really what that means is "process X worked for me, on this project, with this team, and these circumstances".

@Fred Yes, you still have to do the integration work in the tricky cases. What I mean is that the busywork is free. You don’t have to manually VC-copy changes back and forth between branches, setting up problems down the road if you forget some. And all the code is visible at the same time when you need to do a batch find, so you can avoid repeating certain tasks for each branch.

@Jeff A lot of this depends on how isolated the changes are. Depending on how the code is organized, you may not need that many conditionals, and they may not affect the logic much.

@Michael Yes, absolutely. You don't have to track upstream because you're always up-to-date, and the reality is if you're using branches it's difficult to impossible to ever push upstream, which is how you end up with the big bang merge at the end.

@Michael That's fine if you have isolated changes that don't affect the logic much, but the blog post was inspired by a tweet about a patch that had been worked on for 3 years and broke things when finished, and the author talks about using conditionals "liberally" and even making typealiases to have the new and old together, so it's clear that replacing git branches with conditional logic in master is being offered as general advice for pretty much any situation. "Code that isn’t integrated into your team’s mainline branch should be considered a liability at best, and dead at worst." That's a very broad (and dubious) statement.

> and you get basic merging/integration for free

I feel like the effort required to get rid of feature switches tends to be at least as high as, or higher than, the effort caused by merging (particularly for long-running feature switches that are expressed in different places in your code, where things like dependency injection aren't an option). At least there are tools for merging, and modern ones are actually pretty good.
