6 best practices for teams using Git

Work more effectively by using these Git collaboration strategies.

Git is very useful for helping small teams manage their software development processes, but there are ways you can make it even more effective. I've found a number of best practices that help my team, especially as new team members join with varying levels of Git expertise.

Formalize Git conventions for your team

Everyone should follow standard conventions for branch naming, tagging, and coding. Every organization has standards or best practices, and many recommendations are freely available on the internet. What's important is to pick a suitable convention early on and follow it as a team.

Also, different team members will have different levels of expertise with Git. You should create and maintain a basic set of instructions for performing common Git operations that follow the project's conventions.
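
One way to make a convention stick is to check it automatically. The sketch below assumes a hypothetical naming rule (a type prefix such as feature/ or bugfix/, followed by a lowercase description); the function name, the allowed prefixes, and the pattern are illustrative assumptions, not something every team should adopt as is.

```shell
#!/bin/sh
# Hypothetical convention: <type>/<short-description>, e.g. feature/login-form.
# The allowed types and the pattern are assumptions; adapt them to your team's rules.
is_valid_branch_name() {
    case "$1" in
        master|main) return 0 ;;   # long-lived branches are always allowed
    esac
    printf '%s\n' "$1" |
        grep -Eq '^(feature|bugfix|hotfix|release)/[a-z0-9][a-z0-9._-]*$'
}

# Example use inside a repository (for instance, from a pre-push hook):
#   branch=$(git rev-parse --abbrev-ref HEAD)
#   is_valid_branch_name "$branch" || { echo "bad branch name: $branch" >&2; exit 1; }
```

A check like this can live next to the team's written instructions, so the documented convention and the enforced one stay in sync.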

Merge changes properly

Each team member should work on a separate feature branch. But even when separate branches are used, everyone eventually modifies some common files. When merging the changes back into the master branch, the merge will not always be automatic. Human intervention may be needed to reconcile different changes made by two authors to the same part of a file. This is where knowing how to resolve Git merge conflicts pays off.

Modern editors have features to help with Git merge conflicts. They indicate various options for a merge in each part of a file, such as whether to keep your changes, the other branch's changes, or both. It may be time to pick a different code editor if yours doesn't support such capabilities.
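
If you want to see what a conflict looks like without risking real work, here is a minimal sketch that provokes one in a throwaway repository and resolves it from the command line. The file name, branch names, and identity are made up for the demo; in practice you would usually edit the conflicted file rather than take one side wholesale.

```shell
set -e
# Throwaway repository; every name below is an assumption made for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "timeout = 10" > settings.conf
git add settings.conf
git commit -q -m "Add settings"

# Two branches change the same line in different ways.
git checkout -q -b feature-xyz
echo "timeout = 30" > settings.conf
git commit -q -am "Raise timeout to 30"

git checkout -q master
echo "timeout = 20" > settings.conf
git commit -q -am "Raise timeout to 20"

# The merge stops with a conflict; Git writes conflict markers into the file.
if git merge feature-xyz >/dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi

# List the files that still need attention.
conflicted=$(git diff --name-only --diff-filter=U)
echo "Merge result: $merge_result, conflicted files: $conflicted"

# Resolve by keeping the feature branch's version ("theirs"), then conclude the merge.
git checkout --theirs -- settings.conf
git add settings.conf
git commit -q --no-edit
```

The same three steps (find the conflicted files, settle each one, commit) are what an editor's merge tool does for you behind its buttons.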

Rebase your feature branch often

As you continue to develop your feature branch, rebase it against master often. This means executing the following steps regularly:

git checkout master
git pull
git checkout feature-xyz  # name of your hypothetical feature branch
git rebase master  # may need to fix merge conflicts in feature-xyz

These steps rewrite the history of your feature branch (and that's not a bad thing). First, the rebase makes your feature branch look like master, with all the updates made to master up to that point. Then all your commits to the feature branch are replayed on top, so they appear sequentially in the Git log. You may get merge conflicts that you'll need to resolve along the way, which can be a challenge. However, this is the best point to deal with merge conflicts because it only impacts your feature branch.
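
To watch the replay happen, the following sketch rebuilds the situation in a throwaway repository: master gains a commit after feature-xyz was created, and the rebase moves the feature commit on top of it. The file names and commit messages are invented for the demo, and the example is deliberately conflict-free.

```shell
set -e
# Throwaway repository; names and contents are assumptions made for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b feature-xyz
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start feature xyz"

# Meanwhile, master moves on.
git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix on master"

# Replay the feature commit on top of the updated master.
git checkout -q feature-xyz
git rebase -q master

# The feature branch now starts at the tip of master.
git --no-pager log --oneline --graph
fork_point=$(git merge-base master feature-xyz)
master_tip=$(git rev-parse master)
```

After the rebase, the point where feature-xyz forks from master is the current tip of master, which is what makes the later merge straightforward.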

After you fix any conflicts and perform regression testing, if you're ready to merge your feature back into master, do the above rebase steps one more time, then perform the merge:

git checkout master
git pull
git merge feature-xyz

In the interim, if someone else pushes changes to master that conflict with yours, the Git merge will have conflicts again. You'll need to resolve them and repeat the regression testing.
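
One optional safeguard, which is my suggestion rather than part of the steps above, is to ask Git to merge only if a fast-forward is possible. With the --ff-only option, the merge is refused whenever master has moved since your last rebase, which tells you to rebase and retest instead of merging blindly. A sketch in a throwaway repository (all names invented for the demo):

```shell
set -e
# Throwaway repository; names and contents are assumptions made for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b feature-xyz
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature xyz"

git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "Someone else's change on master"

# master has moved since feature-xyz was created, so a fast-forward is impossible.
if git merge --ff-only feature-xyz >/dev/null 2>&1; then
    first_attempt="merged"
else
    first_attempt="refused"
fi
echo "First attempt: $first_attempt"

# Rebase once more, then the merge is a simple fast-forward.
git checkout -q feature-xyz
git rebase -q master
git checkout -q master
git merge -q --ff-only feature-xyz
```

Teams that prefer an explicit merge commit per feature would use a different option here; the point is only that the merge step can be made to fail loudly when the rebase is stale.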

There are other merge philosophies (e.g., without rebasing and only using merge to avoid rewriting history), some of which may even be simpler to use. However, I've found the approach above to be a clean and reliable strategy. The commit history is stacked up as a meaningful sequence of features.

With "pure merge" strategies (without rebasing regularly, as suggested above), the history in the master branch will be interspersed with the commits from all the features being developed concurrently. Such a mixed-up history is harder to review. The exact commit times are usually not that important. It's better to have a history that's easier to review.

Squash commits before merging

When working on your feature branch, it's fine to add a commit for even minor changes. However, if every feature branch produced 50 commits, the resulting number of commits in the master branch could grow unnecessarily large as features are added. In general, there should only be one or a few commits added to master from each feature branch. To achieve this, squash multiple commits into one or a handful of commits with more elaborate messages for each one. This is typically done using a command such as:

git rebase -i HEAD~20  # look at up to 20 commits to consider squashing

When this is executed, an editor opens with a list of commits that you can act upon in several ways, including pick and squash. Picking a commit keeps it as a separate commit. Squashing melds a commit into the one listed before it and lets you combine their messages. Using these and other options, you can combine commits and their messages into one and do some editing and cleanup. It's also an opportunity to get rid of commit messages that aren't important (e.g., a commit message about fixing a typo) while keeping the changes themselves.

In summary, keep all the changes associated with the commits, but combine and edit the associated message text for improved clarity before merging into master. Don't inadvertently drop a commit during the rebase process.
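
When the goal is simply to collapse an entire feature branch into one commit, there is a non-interactive alternative to the editor-driven rebase. The sketch below uses git reset --soft, which is a different technique from the interactive rebase described above; the repository, file names, and messages are invented for the demo.

```shell
set -e
# Throwaway repository; names and contents are assumptions made for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Three small work-in-progress commits on the feature branch.
git checkout -q -b feature-xyz
for step in 1 2 3; do
    echo "step $step" >> feature.txt
    git add feature.txt
    git commit -q -m "WIP step $step"
done
tree_before=$(git rev-parse 'HEAD^{tree}')

# Move the branch back to where it left master, keeping all changes staged...
git reset -q --soft "$(git merge-base master HEAD)"
# ...and record them as a single commit with a more descriptive message.
git commit -q -m "Add feature xyz" -m "Combines three work-in-progress commits."
tree_after=$(git rev-parse 'HEAD^{tree}')

git --no-pager log --oneline master..HEAD
```

The file contents before and after are identical; only the number of commits and their messages change. This approach cannot keep a handful of separate commits, so the interactive rebase remains the tool for selective squashing.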

After performing such a rebase, I like to look at the Git log one last time. If the message of the most recent commit still needs a final edit, amend it:

git commit --amend

Finally, forcing an update to your remote feature branch is necessary, since the Git commit history for the branch has been rewritten:

git push -f
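
A plain forced push overwrites whatever is on the remote branch. If there is any chance that someone else has pushed to your feature branch, the --force-with-lease option is a more cautious variant: it overwrites the remote branch only if it still points to the commit you last fetched. The sketch below demonstrates it with a throwaway bare repository standing in for the remote; all paths and names are invented for the demo.

```shell
set -e
# A throwaway "remote" (bare repository) and a local clone-like repository.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/feature-xyz
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo "one" > file.txt
git add file.txt
git commit -q -m "Frist version"
git push -q -u origin feature-xyz

# Rewrite the published commit (here: fix the typo in its message).
git commit -q --amend -m "First version"

# A plain push is rejected because local and remote histories have diverged.
if git push -q origin feature-xyz 2>/dev/null; then
    plain_push="accepted"
else
    plain_push="rejected"
fi
echo "Plain push: $plain_push"

# Overwrite the remote branch, but only if nobody else has updated it meanwhile.
git push -q --force-with-lease origin feature-xyz
```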

Use tags

After you have finished testing and are ready to deploy the software from the master branch, or if you want to preserve the current state as a significant milestone for any other reason, create a Git tag. While a branch accumulates a history of changes corresponding to commits, a tag is a snapshot of the branch's state at that instant. A tag can be thought of as a history-less branch or as a named pointer to the specific commit that was checked out when the tag was created.

Configuration control is about preserving the state of code at various milestones. Being able to reproduce software source code for any milestone so that it can be rebuilt when necessary is a requirement in most projects. A Git tag provides a unique identifier for such a code milestone. Tagging is straightforward:

git tag milestone-id -m "short message saying what this milestone is about"
git push --tags   # don't forget to explicitly push the tag to the remote

Consider a scenario where software corresponding to a given Git tag is distributed to a customer, and the customer reports an issue. While the code in the repository may continue to evolve, it's often necessary to go back to the state of the code corresponding to the Git tag to reproduce the customer issue precisely to create a bug fix. Sometimes newer code may have already fixed the issue but not always. Typically, you'd check out the specific tag and create a branch from that tag:

git checkout milestone-id        # checkout the tag that was distributed to the customer
git checkout -b new-branch-name  # create new branch to reproduce the bug
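
The two commands above can also be combined by passing the tag as the starting point of the new branch. Here is a small sketch in a throwaway repository, reusing the placeholder names milestone-id and new-branch-name; the file and messages are invented for the demo.

```shell
set -e
# Throwaway repository; names and contents are assumptions made for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Release candidate"
git tag milestone-id -m "First customer release"

# Development continues after the release.
echo "v2" > app.txt
git commit -q -am "Later development"

# Create the bug-fix branch directly at the tagged commit.
git checkout -q -b new-branch-name milestone-id
echo "Working tree now contains: $(cat app.txt)"
```

The working tree is back at the state that was shipped, so the customer's issue can be reproduced against exactly that code.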

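Note that the -m option used in the tagging command above already creates an annotated tag, which records the tagger, date, and message. Beyond this, consider using signed tags (git tag -s) if they may be beneficial to your project.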

Make the software executable print the tag

In most embedded projects, the resulting binary file created from a software build has a fixed name. The Git tag corresponding to the software binary file cannot be inferred from its filename. It is useful to "embed the tag" into the software at build time to correlate any future issues precisely to a given build. Embedding the tag can be automated within the build process. Typically, the string generated by git describe is inserted into the code before compilation so that the resulting executable prints it while booting up. When a customer reports an issue, they can be guided to send you a copy of the boot output.
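
As a rough sketch of that build step, the script below asks git describe for a version string and writes it into a C header that the firmware could include and print at startup. The header name version.h, the macro BUILD_GIT_VERSION, and the tag v1.0 are hypothetical; a real project would run only the last two commands from its build system, inside its existing repository.

```shell
set -e
# Throwaway repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 'int main(void) { return 0; }' > main.c
git add main.c
git commit -q -m "Initial firmware"
git tag v1.0 -m "First release"

# Ask Git for a human-readable identifier of the current commit.
# --always falls back to a commit hash if no tag is reachable;
# --dirty appends a marker if tracked files have uncommitted changes.
version=$(git describe --tags --always --dirty)

# Write it into a header (hypothetical file and macro names) for the build.
printf '#define BUILD_GIT_VERSION "%s"\n' "$version" > version.h
cat version.h
```

Builds made a few commits after a tag get a longer string that includes the commit count and an abbreviated hash, so even untagged builds can be traced back to an exact commit.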

Conclusion

Git is a sophisticated tool that takes time to master. Using these practices can help teams successfully collaborate using Git, regardless of their expertise level.

Ravi is a software engineer in the UAV industry. He has worked on a variety of problems such as UAV sensor software integration, DevOps, digital signal processing, and applying machine learning to problems. He's always on the lookout for new and interesting software tools for his work and hobbies. Some of his favorite topic areas include Python, C#, Unity, and containers.

5 Comments

Squashing as a "best practice" is a really dangerous idea. You run the risk of losing a lot of historical detail - for example, adding a new feature frequently involves changes or refactoring of existing code; often code written by other team members, who might need to "git blame" at some point to view the explanation, which should be in your log for the lines affected. (a definite best practice!)

Another best practice is fixing and improving things you spot as you're working in various areas of the code - this should happen in individual commits, again to keep your team members aware of not only "what" but "why" you changed something. A squash commit tends to become a big ball of "what" with no "why", so this really doesn't work as a default.

Rebasing as a best practice is something I would like to be doing but have never been able to, as it requires permission to force push - and therefore also really only works if you're the only person working on a particular branch. It also doesn't really make sense as a best practice in combination with squashing, since if you're squashing, your pull request is a single commit either way.

We can't generalize too much when teaching this stuff - in my opinion, it should be taught as techniques to be applied under certain specific conditions, and should not be pushed on anyone as best practices; the risk of doing so is "cargo culting", where your inexperienced team members are applying advanced techniques, teaching them to other team members as "best practice", and ultimately no one fully understands why they're doing what they do, which can create serious problems for the whole team.

Thanks for bringing up some important points for discussion!

The workflow discussed in this article, particularly the parts about rebasing and squashing, applies to the use case where each developer works on a separate feature branch and has force push permission for the feature branch. In your Git usage, where there are multiple contributors to one feature branch, I agree that this combination of squashing and rebasing won't apply.

I agree that commits associated with refactoring of existing code or other significant code changes to facilitate the new feature shouldn't normally be squashed together with, say, commits associated with new files added for the feature.

Typically, each feature branch will result in a handful of commits after squashing. A few will be due to the things you mentioned (e.g. refactoring existing code). And one or two commits will be the result of squashing together commits related to new files associated with the new feature.

I'm not sure about your comment regarding the pull request being a single commit. The pull request can be submitted for the handful of commits in the feature branch. The approver can then perform a merge commit that will include the multiple pull request commits, plus the additional commit that represents the merging of the pull request from the feature branch.

I agree with your concern that we shouldn't generalize too much, etc. Indeed, explaining the purpose of specific techniques should be part of onboarding new team members with the understanding that it might take some experience before the concepts sink in. (For instance, I think one has to struggle a bit with some Git logs during bug fixing to really appreciate the value of having a good, well-ordered Git commit history.) I guess I was just thinking about "best practices" in this case as a good workflow for this common use case with Git. This workflow certainly won't apply to every Git-based project.

In reply to Rasmus Schultz

Nice Post, I really enjoyed it. And my favorite one is "Rebase your feature branch often"

Also: rename your default branch, please.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.