git subtree - a better alternative to git submodule

To manage a lot of libraries at work, we once decided to use git submodules. The idea looked simple: create a submodules/ directory in the main repo and check out every library there as a submodule. This had some additional benefits. Every library had its own repository, so we could create podspec files there and build using CocoaPods. We could also open-source some libraries while keeping others private.

But over time there were more and more problems with the submodule approach:

  • conflicts on submodules
  • complex handling in scripts
  • submodules that stop tracking branches
  • a bunch of useless bump commits just to update a submodule
  • complex removal of a submodule
  • GUI tools not handling submodules properly
  • more complex GitHub & Jenkins hooks
  • committing a ref to a submodule branch that is not yet merged
  • teaching everyone in the team about submodules

There are a lot of places where git submodules can go bad, and they do. So we decided to try something different.

git subtree

While git submodule stores a ref to another repo in the main repo, git subtree is built on the idea of subdirectory merging: we merge another repository into a subdirectory of our repository.

When we pull a subtree, all the commits from the remote repository are included in our history. We can also push a subtree, which puts the commits relevant to that subdirectory into the remote repository.

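To set up a subtree, we run the following in our main repository, where subtree/lib is the name of a remote pointing at the library repository, remote_branch is the library branch to track, and lib is the subdirectory it should live in: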

git subtree add --prefix=lib subtree/lib remote_branch
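
The command above assumes that a remote called subtree/lib already exists in the main repository. As a minimal sketch of the whole setup, the script below builds a throwaway library repository and a throwaway main repository in a temporary directory and adds one to the other. The directory names, file names, commit messages and demo identity are made up for illustration; only the remote name, the prefix and the branch name come from the command above. It also assumes the git subtree command is installed, which depends on how git was packaged.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Identity for the demo commits, so the script does not rely on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# 1. A stand-in library repository with one commit on the branch "remote_branch".
git init -q library
cd library
git checkout -q -b remote_branch
echo "hello from lib" > lib.txt
git add lib.txt
git commit -q -m "lib: initial commit"
cd ..

# 2. The main repository. git subtree add needs an existing commit
#    and a clean working tree.
git init -q main
cd main
echo "main project" > README
git add README
git commit -q -m "main: initial commit"

# 3. Register the library as a remote, then merge it into the lib/ directory.
git remote add subtree/lib ../library
git subtree add --prefix=lib subtree/lib remote_branch

# The library's files are now ordinary files in the main repo,
# and the library's commits are part of the main history.
ls lib
git log --oneline
```

Anyone who clones the main repository afterwards gets lib/ as a normal directory, with no extra init or update step.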

To pull:

git fetch subtree/lib remote_branch
git subtree pull --prefix=lib subtree/lib remote_branch

To push:

git fetch subtree/lib remote_branch
git subtree push --prefix=lib subtree/lib remote_branch
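
To see the pull and push directions together, here is a hedged sketch of a full round trip in a temporary directory. It uses a bare repository as a stand-in for the hosted library repo and a second clone ("seed") to play the role of someone committing upstream; both names, along with the file contents and commit messages, are invented for this example. Setting GIT_MERGE_AUTOEDIT=no only keeps git from opening an editor for the merge message, and the script assumes git subtree is installed.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_MERGE_AUTOEDIT=no   # do not open an editor for merge commits

# A bare repo stands in for the hosted library; "seed" is an upstream clone.
git init -q --bare library.git
git clone -q library.git seed
cd seed
git checkout -q -b remote_branch
echo "v1" > lib.txt
git add lib.txt
git commit -q -m "lib: v1"
git push -q origin remote_branch
cd ..

# Main repository with the library added as a subtree under lib/.
git init -q main
cd main
echo "main project" > README
git add README
git commit -q -m "main: initial commit"
git remote add subtree/lib ../library.git
git subtree add --prefix=lib subtree/lib remote_branch

# Someone changes the library upstream...
cd ../seed
echo "v2" > lib.txt
git commit -q -am "lib: v2"
git push -q origin remote_branch
cd ../main

# ...and we pull that change into lib/.
git subtree pull --prefix=lib subtree/lib remote_branch

# We fix something inside lib/ from the main repo and push it back.
echo "fix from main" >> lib.txt.tmp && cat lib.txt.tmp >> lib/lib.txt && rm lib.txt.tmp
git commit -q -am "lib: fix made inside main repo"
git subtree push --prefix=lib subtree/lib remote_branch

# The library repository now contains the commit made in the main repo.
git -C ../library.git log --oneline remote_branch
```

Only the commits that touch lib/ end up in the library repository; the README commit in the main repo stays where it is.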

So far this works much better. Only one person on the team has to know how to use git subtree and update the library repositories. Other team members just see a git repository that already has everything they need.