In Stash 2.4 we introduced support for fork-based workflows. As part of building the feature, half the Stash team switched to forks for their daily work, both to dogfood the feature and to make sure it worked. The result? Pain. Let me explain.

The pain grew out of two primary shortcomings of forking:

  1. Integrating with upstream is clunky and onerous because you have to maintain 2 remotes and constantly juggle them
  2. Verifying what you’ve written is impossible, because continuous integration (CI) does not understand forks

I’m going to focus on the pain of integrating with upstream in this post. CI and forks remain an open sore, a problem yet to be solved.

The problem

Working in my fork, as 2.4 was developed, I had two remotes: upstream, which referred to STASH/stash (the canonical repository); and origin, which referred to ~BTURNER/stash (my fork). My workflow, repeated in regular cycles, consisted of:

git fetch upstream
git merge upstream/master
Do some work
git push origin HEAD

Note the remote juggling. I’m fetching in from upstream because origin generally contains no useful changes, but I’m pushing to origin because that’s where my changes go so I can open a pull request to get them reviewed and accepted back into upstream.

There are, of course, little twiddles we can do to try and make this “better”. One option would be to leverage Git’s ability to have distinct fetch and push URLs for the same remote, setting up my origin remote to fetch from STASH/stash but push to ~BTURNER/stash. If you only develop on a single machine, that’s not a bad option. Unfortunately, I have ~4 different machines I contribute back to Stash from, between work and home, so sometimes I actually need to be able to fetch from my fork. That means I’m back to multiple remotes again.
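The split fetch/push setup described above can be sketched as follows. This is a minimal illustration, not the exact configuration used on the Stash team: the two bare repositories are local stand-ins for STASH/stash and ~BTURNER/stash, and their directory names are hypothetical.

```shell
#!/bin/sh
# Sketch: one remote that fetches from upstream but pushes to a personal fork.
# upstream.git and fork.git are hypothetical local stand-ins for the real repositories.
set -e
work=$(mktemp -d)
git init --quiet --bare "$work/upstream.git"
git init --quiet --bare "$work/fork.git"

git init --quiet "$work/clone"
cd "$work/clone"

# "origin" fetches from the canonical repository...
git remote add origin "$work/upstream.git"
# ...but pushes are redirected to the fork.
git remote set-url --push origin "$work/fork.git"

# Read both URLs back to confirm the split.
fetch_url=$(git remote get-url origin)
push_url=$(git remote get-url --push origin)
echo "fetch: $fetch_url"
echo "push:  $push_url"
```

With this in place, a plain git fetch reads from upstream while git push origin HEAD writes to the fork, which is exactly why it stops being enough once you need to fetch from the fork on a second machine.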

Another twiddle might be to swap the remotes, and instead have origin refer to STASH/stash and fork or personal or bturner refer to ~BTURNER/stash, but all that does is change up the workflow a little, allowing me to use simple git fetch instead of git fetch upstream. And now I have to remember to push to my fork.

The solution

During Innovation Week, a week-long sprint where we developers are encouraged to innovate without any special project concerns, I built a feature called “ref synchronization” (now referred to as fork synchronization in Stash). A plugin in Stash listens for pushes, and then automatically fetches all of the newly-pushed changes down into my fork, applying any fast-forward updates. If a change upstream isn’t fast-forward, my fork’s changes are left intact (I don’t want to lose my work) and a “Synchronize” button is shown in the Stash UI to allow me to manually resolve the divergence. Right now there are two options available: merge in the latest from upstream, producing a new merge commit on the fork’s branch, or discard the changes, overwriting the ref with whatever upstream had there.

The change to my workflow was immediate and blissfully welcome: I no longer need an upstream remote. Developing on my fork is now identical to developing directly on its origin. Every time someone pushes to STASH/stash, or merges a pull request, or anything else that moves refs, the changes are duplicated in my fork within 2 or 3 seconds. Instantly I’m back to git fetch, git merge origin/master, git push origin HEAD: my non-fork workflow.

Introducing ref synchronization

Ref synchronization duplicates all branch and tag changes made upstream in any subscribed fork. New branches and tags pushed upstream are added automatically, branches and tags deleted upstream are automatically removed, and branches and tags that are updated are automatically updated.

  • Branches are updated if they are fast-forward. If you’ve committed changes of your own on the branch, they will never be overwritten.
    • If you then, for example, open a pull request to get them merged back into upstream and that pull request is accepted, the resulting merge (since it is implicitly fast-forward from your changes) will automatically be fetched back into your fork.
  • Tags are updated only if they match exactly. If you have a tag which points to a different revision than upstream, any changes to the upstream tag are not reflected in your fork.
    • This is largely down to some weaknesses in git’s internals. There is no way to run git fetch that cares about the fast-forwardness of tag updates; you either tell git fetch to update tags or you tell it not to. + on the refspec means nothing in the context of tags.
  • Branches and tags are only deleted if they match exactly, to ensure you do not lose your work.
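The fast-forward rule for branches can be reproduced with plain Git. The sketch below is an analogy only, since this post does not show how the Stash plugin is implemented. It relies on the fact that a fetch refspec without a leading + refuses non-fast-forward branch updates. The repository paths and commit messages are made up for the demo.

```shell
#!/bin/sh
# Sketch: a fetch without "+" on the refspec only applies fast-forward branch updates.
# All paths and commit messages are hypothetical demo values.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Upstream starts with one commit on master.
git init --quiet "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
echo a > file.txt
git add file.txt
git commit --quiet -m "A"

# The fork begins life as a bare copy of upstream.
git clone --quiet --bare "$work/upstream" "$work/fork.git"

# Case 1: upstream moves on and the fork has nothing of its own, so the
# update is fast-forward and the fetch applies it.
git commit --quiet --allow-empty -m "B"
upstream_b=$(git rev-parse master)
cd "$work/fork.git"
git fetch --quiet origin 'refs/heads/*:refs/heads/*'
ff_result=$(git rev-parse master)

# Case 2: the fork gains its own commit, then upstream moves again.
mine=$(git commit-tree 'master^{tree}' -p master -m "fork work")
git update-ref refs/heads/master "$mine"
git -C "$work/upstream" commit --quiet --allow-empty -m "C"
if git fetch --quiet origin 'refs/heads/*:refs/heads/*' 2>/dev/null; then
  outcome=updated
else
  outcome=rejected   # non-fast-forward, so the fork's branch is left alone
fi
after=$(git rev-parse master)
echo "case 2: $outcome"
```

In the second case the fork's master still points at the fork's own commit, which mirrors the "never overwritten" guarantee described in the list above.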

Branches can be in 4 different states:

  • Synchronized
  • Ahead – The fork includes all commits available upstream on the same branch, plus more
  • Diverged – The fork and upstream both include commits the other does not
    • Tags can never be in this state; they’re always considered ahead instead
  • Orphaned – No branch with the same name exists upstream

Ahead and orphaned are both “normal” states and will not produce a “Synchronize” button in the UI. Diverged is the only state that will show the “Synchronize” button. Note that, by implication, this means “Synchronize” will never be shown for tags.
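To make the states concrete, here is a rough sketch of how the same classification could be computed locally by counting the commits on each side. The function name branch_state and the demo branch names are hypothetical, and the extra "behind" result is my own label for a branch that simply has not been fast-forwarded yet. Stash may well compute these states differently.

```shell
#!/bin/sh
# Sketch: classify a fork branch against its upstream counterpart.
# branch_state and all branch names below are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

branch_state() {
  fork_ref=$1
  upstream_ref=$2
  # No matching ref upstream at all: orphaned.
  if ! git rev-parse --verify --quiet "$upstream_ref" >/dev/null; then
    echo orphaned
    return 0
  fi
  ahead=$(git rev-list --count "$upstream_ref..$fork_ref")   # commits only in the fork
  behind=$(git rev-list --count "$fork_ref..$upstream_ref")  # commits only upstream
  if [ "$ahead" -eq 0 ] && [ "$behind" -eq 0 ]; then
    echo synchronized
  elif [ "$behind" -eq 0 ]; then
    echo ahead
  elif [ "$ahead" -eq 0 ]; then
    echo behind      # would be fast-forwarded automatically
  else
    echo diverged    # the only state that needs a manual "Synchronize"
  fi
}

# Build a throwaway repository with one branch per state.
work=$(mktemp -d)
git init --quiet "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/upstream-side
git commit --quiet --allow-empty -m "base"
git branch diverged-fork
git commit --quiet --allow-empty -m "upstream change"
git branch synced-fork
git checkout --quiet -b ahead-fork
git commit --quiet --allow-empty -m "fork change"
git checkout --quiet diverged-fork
git commit --quiet --allow-empty -m "other fork change"

state_synced=$(branch_state synced-fork upstream-side)
state_ahead=$(branch_state ahead-fork upstream-side)
state_diverged=$(branch_state diverged-fork upstream-side)
state_orphaned=$(branch_state ahead-fork no-such-branch)
echo "$state_synced $state_ahead $state_diverged $state_orphaned"
```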

Due to performance issues (as well as some safety concerns), rebase was removed as an option for synchronizing branches. That leaves merging in the latest from upstream or discarding your changes. If the merge fails with conflicts, Stash will tell you which files are conflicted (but, for 2.6, cannot show you the conflicts).
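For comparison, the two remaining strategies look roughly like this when done by hand in a local clone. This is only a sketch under assumed names: fork-a, fork-b and upstream-master are hypothetical branches, and Stash performs the equivalent operations on the server rather than in a working copy.

```shell
#!/bin/sh
# Sketch: resolving a diverged branch by merging, or by discarding local changes.
# Branch names are hypothetical; both fork branches start out diverged from upstream-master.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init --quiet "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/upstream-master
git commit --quiet --allow-empty -m "base"
git branch fork-a
git branch fork-b
git commit --quiet --allow-empty -m "upstream change"
for b in fork-a fork-b; do
  git checkout --quiet "$b"
  git commit --quiet --allow-empty -m "change on $b"
done
upstream_tip=$(git rev-parse upstream-master)

# Option 1: merge upstream in, producing a new merge commit on the fork's branch.
git checkout --quiet fork-a
git merge --quiet --no-edit upstream-master
merge_second_parent=$(git rev-parse 'HEAD^2')

# Option 2: discard the fork's changes and take whatever upstream has.
git checkout --quiet fork-b
git reset --quiet --hard upstream-master
discarded_tip=$(git rev-parse fork-b)
```

After option 1 the branch keeps the fork's commit and gains upstream's as a second parent. After option 2 the fork's commit is no longer reachable from the branch.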

Fork synchronization is available today in the latest release of Stash. For anyone using forks, hopefully it will make your life simpler.

 


Git: Simplifying Forks