Six months after the move to Mercurial, and after testing different working modes with this DVCS, the GreenHopper team ended up using the following Mercurial features:
- We use clones for “throw away” spikes and in cases where other teams want to contribute code changes to the GreenHopper code base
- Each feature is developed in a separate branch inside the main repository. Once complete, the changes are merged back to default and the branch closed
- Our main repository is hosted on Bitbucket, and all developers pull/push directly from/to it. This works best for us, even though it is not very dvcs’y
- A forward-compatibility branch (for the next JIRA release) is kept up to date by an automatic nightly merge from default (or a manual one in case of conflicts)
Moving to Mercurial was a huge win for the team, but the transition was not painless.
Six months ago, the GreenHopper team decided to switch from Subversion to Mercurial. One big reason to move to a DVCS was how painful branch management was in Subversion. Working on GreenHopper for JIRA version X (we always develop against the current JIRA release) while maintaining a compatibility branch for JIRA version Y (which the JIRA team uses on their dogfooding instance) caused significant and sometimes painful overhead. Sometimes we “forgot” to merge a commit between branches, which added its share of surprises. Merge conflicts didn’t help either, especially when half the code had changed because of a change in JIRA. Our hope was that a DVCS would greatly ease the pain.
Given the agile nature of the GreenHopper team (we regularly switch between Scrum and Kanban to dogfood our own product), we wanted to develop individual features in isolation and only bring the result back to trunk/default once we felt a feature was complete. Being able to release GreenHopper at any instant was the ultimate goal. While this is hard to do in Subversion (branching is anything but cheap), it turned out to be a walk in the park in Mercurial.
Converting the repository
Two tools were available to perform the actual repository migration: hg convert and hgsubversion. hg convert was easier to use and gave results much quicker, so we tried it first (even though the Bitbucket team recommended hgsubversion). It turned out that the generated repository differed from the Subversion one. Having learned that lesson, we opted to use hgsubversion instead. Later in testing we also discovered that one of our branches had been created incorrectly, and both tools had their fair share of problems processing it. We ended up performing the following steps to get all code migrated:
- Prepare a filemap file describing which files to ignore in the conversion. You will never get these files back, so choose wisely!
- Prepare a naming file describing all users and their email addresses
- Prepare a branches file describing how to rename branches (some had weird names or were overly verbose)
- Prepare a tags file describing which tags had to be renamed (e.g. jira-greenhopper-plugin-5.5-rc5 = del_5)
- Convert the Subversion repository using hgsubversion. The result was a Mercurial repository containing all but the broken branch -> greenhopper-all
- Convert the broken branch using hgsubversion, just so we would never have to go back to svn -> greenhopper-branch-x
- Clean up tags (we removed all del_* tags)
- Create an empty Mercurial repository and import “trunk” into it -> greenhopper-main
- Store all three repositories on Bitbucket. We kept -all and -branch-x for reference and used greenhopper-main from then on to continue development.
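To make the preparation steps above more concrete, here is a rough sketch of what the four mapping files and the conversion call might look like. Apart from the tag entry quoted above, every name, path and URL is invented for illustration. The option names are the ones hgsubversion documents as far as I recall, so check them against "hg help subversion" for your version before relying on them.

```shell
#!/bin/sh
# Sketch only: user names, paths, branch names and the Subversion URL are invented.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Author map: Subversion user name = Mercurial author
cat > authormap.txt <<'EOF'
jdoe = Jane Doe <jane.doe@example.com>
EOF

# File map: files to leave out of the converted history for good
# (path is hypothetical; verify how your hgsubversion version resolves paths)
cat > filemap.txt <<'EOF'
exclude lib/large-test-data
EOF

# Branch map: old branch name = new branch name
cat > branchmap.txt <<'EOF'
greenhopper-jira-compatibility-branch-old-name = jira-compat
EOF

# Tag map: old tag name = new tag name (this entry is the one quoted above)
cat > tagmap.txt <<'EOF'
jira-greenhopper-plugin-5.5-rc5 = del_5
EOF

# Print, rather than run, the conversion command, since the URL is a placeholder.
svn_url="https://svn.example.com/greenhopper"
echo hg clone --authors authormap.txt --filemap filemap.txt \
    --branchmap branchmap.txt --tagmap tagmap.txt \
    "$svn_url" greenhopper-all
```

Keeping the four files under version control alongside the conversion script makes it cheap to repeat the conversion after tweaking a mapping.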
Each team at Atlassian does the conversion slightly differently, depending mainly on its requirements. In our case we mostly work on “trunk” and very seldom release new versions of older GreenHopper releases (basically only for security-related bugs). For us it was more important to have a small repository than one that contained everything. Should we ever need to bring in another branch from greenhopper-all, we can do so without a problem.
Experimenting with Mercurial – clone per feature
Once we had our migration path nailed down, we decided to go for the “thrown in at the deep end” approach. The whole team moved over to Mercurial at once, none of us having much prior experience (agile and all).
We tested different approaches while working on the first few stories in the post-subversion-world. Our first attempt was to create clones for features:
- For each story we created a separate clone on Bitbucket
- All developers who worked on the same story committed to the same story clone
- Once finished, we switched over to the main repository, pulled the changes from the clone and then pushed the changes back to Bitbucket
While this approach kind of worked, we felt the pain pretty quickly. For each story/clone we had to set up the IDE, adapt our scripts to test-run JIRA with the given clone, and delete the clone on Bitbucket once the story was finished and merged into the main repository. Our mean time for a story is a couple of days, so we soon asked ourselves what exactly the advantages of this mode of working were…
Besides the core team, we also had an intern spiking new things and another team providing code changes to GreenHopper. In both cases we actually liked the separation of repositories, as we had the power to decide when to pull these changes into our main repository, if at all. Here the full power of DVCS came into play, as people could work on the code without us having to give them write access to the main repository!
Having had mixed success with clones, we moved to working with named branches instead:
- default contains the “stable” code
- For each feature we create a new named branch, give it a human-recognizable name (we opted against issue keys) and work away on that branch
- Each commit is marked with the key of the JIRA issue representing the story or bug we work on
- Once the feature is complete and tested, we merge the code back to default and then close the branch.
On the console these steps would look like this:
Feature branch development
# show the branch the working directory is currently on
$ hg branch
# start a new branch called "my-new-feature" (it is created with the next commit)
$ hg branch my-new-feature
marked working directory as branch my-new-feature
# commit changes, always state issue key
$ hg commit -m "GHS-1234: Added new super cool feature"
# update back to default branch (a merge always goes from something to "here")
$ hg update default
# merge branch into default
$ hg merge my-new-feature
# commit merge
$ hg commit -m "Merged branch my-new-feature into default"
# update back to the branch in order to close it
$ hg update my-new-feature
# close the branch (-m avoids the commit message editor)
$ hg commit --close-branch -m "Closed branch my-new-feature"
# back to default
$ hg update default
# push everything (we need --new-branch the first time we push, as we added a branch;
# it is required regardless of whether that branch is still open or has already been closed)
$ hg push --new-branch
While this seems like a lot of steps, it is actually quite simple to use. The advantages of branches are numerous:
- Switching between different work branches is a simple “hg update” – the IDE can be kept open.
- Work can be committed into the main repository, yet can be kept separate until finished
- No overhead of creating and deleting repository clones, knowing which clone contained what
- Seeing what is currently in progress is a simple “hg branches” away
- As each feature commit happens on a feature branch, the commit itself records the feature branch name. This makes it pretty simple to link commits to features, both when checking what is currently being worked on and when looking through the history
hg branches example
edit-view-owner 5810:6b849189c8eb (inactive)
rapid-gadget-name 5783:468bfd08ab6d (inactive)
js-performance-layout 5704:0562304c8227 (inactive)
(Screenshot: Bitbucket commit history view)
Branches vs clones
The core team now mostly uses feature branches for development. We still prefer clones for receiving code changes from other developers, or when we work on spikes where it is not clear whether the code will ever make it onto default. Remember, once a commit makes it into the repository, it pretty much stays there forever.
One clone vs distributed clones
We pretty much all pull from and push to the same Bitbucket repository, so in this respect we are not using Mercurial much differently from Subversion. Other teams (such as the Bitbucket team) do all their work in separate clones (each developer has his/her own clone), and only one or two developers pull the changes into the official repository, thus acting as gatekeepers. We don’t really see the point of doing this in our team at the moment, so we stick with the one-repository-to-rule-them-all approach. It works well for our team size and setup (6 developers, all working in the same office). We are looking forward to seeing how the JIRA team will handle that challenge given their team size!
Merging vs rebase
When two people commit concurrently to the same branch, you end up with two heads that have to be brought together. There are two ways to achieve this. The clean way is to merge the heads, thus creating a third commit. Alternatively, a developer can rebase his commit when he pulls in the changes from the other developer. Rebase simply puts the local commit on top of the remote one, thus avoiding the need for a merge. Initially I preferred rebasing, as it gets rid of all these noisy merge commits, but in the meantime I have come to dislike it. The advantage (less noise) does not justify the risk (two commits with the same changes if you rebase after having already pushed), especially as tools improve (e.g. by hiding all merge commits in the history view). I recommend sticking to merges.
Some of us use IntelliJ IDEA, others use Eclipse. Some things are most easily done on the command line though. We also used MacHg as a standalone client, but have since moved to SourceTree. Eclipse really had its share of problems half a year ago, but IDE support has improved heaps since then, reducing the need to use the command line to work with Mercurial more and more. Personally I also use TortoiseHg (on Windows), which works pretty well too.
If you still work with a centralized version control system such as Subversion, think about moving to a DVCS soon. The advantages are numerous and far outweigh the initial learning curve for the team. Tool support has recently caught up as well, so the biggest time investment will be the actual migration of the repository.