Feb 21st, 2020 - Dave Strock

Rebase vs Merge, Part 2: The Powers and Responsibilities of Time Travel

In Rebase vs Merge Part 1, we covered the basics of merge and rebase using the git revision control system as an example. In this article we’ll discuss why you’d want to go mucking about with timelines, even though wiry-haired scientists are always warning against such meddling.

As we learned, one of the main advantages of rebase is that it allows us to travel back in time and alter the timeline of our source code’s development. We can rearrange, and even combine and re-write, commits at will. What we didn’t discuss was why you’d want to do such a thing.

Keeping track of source code changes is sometimes treated as mainly a security and integrity concern. We want to make sure we have a copy of our app’s source code in case something happens, and we want to have old copies so that we can revert to them in case something worse happens. This way of looking at it ignores the story that is being told about the development of your app. Even if you don’t purposely try to tell a story, your code will still tell one. So if a story will be told either way, why not tell the clearest story possible? Why not make the storytelling adapt to the needs of the project?

Using the Merge Strategy

The major advantage of the merge strategy is that it is the simpler of the two. In fact, the rebase strategy usually still includes merging. For simple projects that don’t change frequently and have very few developers, this can be a fine strategy, but as we’ll see, doing so leaves a lot of power on the table.

The merge strategy also has the disadvantage of entwining (Rich Hickey would say “complecting”) the merge process with the conflict resolution process. Many a build has failed due to a bad auto-merge. It can be hard to predict all merge conflicts before the merge, so you can end up having to change code after the main changes have been approved. If you’re a shop that does code reviews, do you review your post-merge code?

Using the Rebase Strategy

As we showed in part 1, one of the biggest advantages of the rebase strategy is how it separates conflict resolution from branch merging. Rebase lets you perform the sometimes complex conflict resolution process at your leisure: you can test thoroughly and even make further changes on your own schedule, without pressure to hurry for fear that new changes on master will introduce additional conflicts. During times of extremely high churn, the target can move so fast that merging a complex change becomes too difficult, and teams resort to so-called “code freezes” or other complex scheduling solutions just to give people room to breathe and get merges done.

Rebase removes almost all of this, by letting developers resolve conflicts, along with the subsequent testing and changes, all on their own branch. This independence was the main reason branch-based development became the norm in the first place, so it makes sense to retain as much of it as possible where the rubber meets the road in making changes to the mainline codebase.
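To make that concrete, here is a minimal sketch of resolving a conflict entirely on the branch. It assumes a throwaway repository created with mktemp; the file contents and commit messages are invented for illustration, and the "manual" resolution is simulated by writing the resolved file directly.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name and path below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

printf '0\n1\n2\n3\n4\n' > file
git add file
git commit -q -m "Count to four"

# Our branch appends one value...
git checkout -q -b add-5
echo 5 >> file
git commit -q -am "Add five"

# ...while master appends a different one at the same spot.
git checkout -q master
echo 4.5 >> file
git commit -q -am "Add 4.5"

# Rebasing stops on the conflict; master itself is untouched meanwhile.
git checkout -q add-5
git rebase master >/dev/null 2>&1 || echo "conflict, resolving on the branch"

# Resolve by hand (here: keep both lines, in order), then continue.
# GIT_EDITOR=true accepts the existing commit message without opening an editor.
printf '0\n1\n2\n3\n4\n4.5\n5\n' > file
git add file
GIT_EDITOR=true git rebase --continue >/dev/null

# master can now simply fast-forward, with no conflict left to resolve.
git checkout -q master
git merge -q --ff-only add-5
tail -n 2 file
```

All of the conflict handling happens while add-5 is checked out, so the resolution can be tested and reviewed on the branch before master moves at all.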

Another minor advantage of rebase is that it leaves the repository in what some would describe as a “cleaner” state, in that it doesn’t require the creation of an additional commit that marks the point at which the branch was merged. Some teams find these merge commits to be useful in visually grouping related commits, and the difference in having merge commits versus not having them is usually negligible, but we mention it for completeness since it’s often one of the first features of rebase that proponents will mention.
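The difference is easy to see in a scratch repository. The sketch below assumes a branch that already sits on top of master (the state a rebase leaves it in) and merges it twice, once with --ff-only and once with --no-ff; the branch names with-ff and without-ff are made up so both results can be compared side by side.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)   # throwaway repository; names are illustrative
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 0 > file
git add file
git commit -q -m "Initial commit"

# A branch that is already based on the tip of master,
# which is the state a rebase leaves it in.
git checkout -q -b feature
echo 1 >> file
git commit -q -am "Add one"

# Two stand-ins for master so both merge styles can be compared.
git branch with-ff master
git branch without-ff master

# Fast-forward: only the branch pointer moves.
git checkout -q with-ff
git merge -q --ff-only feature

# --no-ff: a merge commit is created even though none is needed.
git checkout -q without-ff
git merge -q --no-ff -m "Merge branch 'feature'" feature

ff_merges=$(git rev-list --merges --count with-ff)
noff_merges=$(git rev-list --merges --count without-ff)
echo "merge commits with --ff-only: $ff_merges"
echo "merge commits with --no-ff:   $noff_merges"
```

Teams that like the visual grouping of merge commits can keep it with --no-ff even in a rebase-based workflow, so the choice is largely one of taste.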

Interactive Time Travel

While a clean revision history is nice, even nicer is the ability to go back in time and clean up a revision history. Like all time travel abilities, this can be a very sharp tool that should be used with care, but it can deliver very good results.

Since rebase can be used to go back in time and reorder things, like we altered the branch pointers in part 1, it can also be used to go back in time and modify commits. If you’ll remember, we had a file containing lines of a single counter increasing sequentially from 0 to 4. Let’s say we create a new branch and add a new value to our file:

            
$ cat file
0
1
2
3
4
$ git checkout -b add-5
$ echo "50" >> file
$ git add file
$ git commit -m "Add five"

Upon reviewing our changes, we notice that we made a typo when adding the value, adding 50 instead of 5. So we go fix it:

            
$ emacs file
$ git add file
$ git commit -m "Fixed five"

So we fixed the issue, but now our history looks like this:

            
$ git log --all
* aa8605c    (2 seconds)   <Dave Strock>   (HEAD -> add-5) Fixed five
* 07f56db    (13 minutes)  <Dave Strock>   Add five
* ecd8aa2    (6 minutes)   <Dave Strock>   (master, add-4) Four is more
* 3bd999e    (6 minutes)   <Dave Strock>   Three is what we need
*   02ac03a  (21 minutes)  <Dave Strock>   Merge branch 'add-2'
|\
| * 575b69c  (24 minutes)  <Dave Strock>   (add-2) Even better file!
* | cba8efd  (23 minutes)  <Dave Strock>   Start at zero
|/
* 59e4575    (25 minutes)  <Dave Strock>   The best file!
* 763d779    (27 minutes)  <Dave Strock>   Initial commit

I don’t know about you, but I find “fixed it” commits to be a bit annoying. I’d rather that I did it right the first time, and I don’t see a lot of historical value in looking back at how many times I had to fix a typo after I’d already created a commit.

So surely the lesson to learn is “Be more careful about what you commit”, right?

Why? If you really investigate this question, you’ll start to hit some deep truths about how you view revision control. We’re taught to think of it as the “permanent record”, unchanging except in the rare cases of Herculean efforts to correct titanic problems, but that need not always be the case. There is no real reason that branches have to be treated the same as master, but many of us learned branching long before we had the time-traveling powers of tools like git, so it may not be an obvious thing to question.

Once you realize that the requirement for code on a branch is simply that it be ready for the “permanent record” by the time it is merged to master, you realize you can do whatever you want before that. We can always get up to the point in time right before the merge, and then time travel back to change things on the branch, and we can do this as many times as we need to in order to tell the story we want to tell. This is another way that the rebase strategy gives developers more breathing room.

Let’s go ahead and fix our minor historical blemish by squashing our two commits together into a single commit that looks like we just used the correct value the first time. To do that we’ll use the `--interactive` (or `-i`) flag to tell git that we want to go back in the timeline specifically to interact with the commits and files found there:

            
$ echo $EDITOR
emacs -nw
$ git rebase -i ecd8aa2
pick 07f56db Add five
pick aa8605c Fixed five


# Rebase ecd8aa2..aa8605c onto ecd8aa2 (2 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

The first thing you may wonder is why we’re printing the $EDITOR shell variable, but we’ll get to that after we talk about invoking the interactive rebase. The argument given to interactive rebase effectively tells git how far back in time you want to go: you give it the hash of the commit farthest back in time *that you do not want to change*. This is usually one commit further back in time than the first commit you want to change. In this case, we want to change the two commits on our add-5 branch, so the commit before those is ecd8aa2. Git will allow you to use branch names like master, but we think it’s a better practice to get used to using commit hashes, to be more precise and to avoid mistakes caused by assumptions. Remember: sharp tool.
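If you’d rather not read the hash off the log by eye, git can compute it. The following is a sketch in a scratch repository (commit messages borrowed from the example, everything else invented) that lists the commits about to be rewritten and looks up the last commit that should stay untouched.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)   # throwaway repository; names are illustrative
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

printf '0\n1\n2\n3\n4\n' > file
git add file
git commit -q -m "Four is more"

git checkout -q -b add-5
echo 50 >> file
git commit -q -am "Add five"
printf '0\n1\n2\n3\n4\n5\n' > file
git commit -q -am "Fixed five"

# The commits we intend to rewrite: on add-5 but not on master.
git log --oneline master..add-5

# The newest commit shared with master, i.e. the one to leave alone.
base=$(git merge-base master add-5)
echo "commit to pass to rebase -i: $base"

# With exactly two commits to rewrite, HEAD~2 resolves to the same commit.
relative=$(git rev-parse HEAD~2)
```

The value in $base is what you would then hand to `git rebase -i`. Checking it against the `master..add-5` listing first is a cheap way to confirm you are going back exactly as far as you meant to.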

So once you tell git how far back to go, you’ll be presented with your editor of choice, configured with the $EDITOR shell variable, already filled in with a bunch of text that looks vaguely like the commit history you wanted to change and instructions on how to modify it. For this example, we’ll be using the squash operation, which will take the commit that we mark as “s” and squash it into the commit before it in time, that is, into the commit above it in the list.
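$EDITOR is only one of the places git looks. As far as we know, the lookup order is the GIT_EDITOR environment variable, then the core.editor config setting, then the VISUAL and EDITOR shell variables, and the rebase todo list can be given its own editor through GIT_SEQUENCE_EDITOR or the sequence.editor setting. A small sketch, run in a scratch repository so no global settings are touched, that asks git which editor it would actually launch:

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)   # throwaway repository so no global settings are touched
cd "$repo"
git init -q

# GIT_EDITOR, if set, takes priority over everything configured below.
unset GIT_EDITOR

# core.editor is consulted before the VISUAL and EDITOR shell variables.
git config core.editor "emacs -nw"

# Ask git which editor it would actually launch.
editor=$(git var GIT_EDITOR)
echo "git will launch: $editor"
```

Running `git var GIT_EDITOR` in your real repository before your first interactive rebase avoids the surprise of landing in an editor you don’t know how to exit.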

So let’s squash aa8605c into 07f56db by changing “pick” next to aa8605c to “s”:

            
pick 07f56db Add five
s aa8605c Fixed five


# Rebase ecd8aa2..aa8605c onto ecd8aa2 (2 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# .       create a merge commit using the original merge commit's
# .       message (or the oneline, if no original merge commit was
# .       specified). Use -c <commit> to reword the commit message.
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out

One important thing to note is that any lines starting with ‘#’ will be ignored when the file is saved, and the rebase process will only act on commits that are listed in the file when saving. This means that git’s interactive rebase has a “get out of jail free” card in the event that you mess anything up: just clear out the file (or comment out all of the lines) and save, and git will abort the rebase, reporting that there is nothing to do.
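You can convince yourself of this without risking real work. The sketch below uses a scratch repository and stands in for “clear out the file and save” by setting GIT_SEQUENCE_EDITOR to `cp /dev/null`, which relies on git passing the todo file’s path as the editor’s last argument; the branch and commit names are illustrative.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)   # throwaway repository; names are illustrative
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo 0 > file
git add file
git commit -q -m "Initial commit"
git checkout -q -b add-5
echo 50 >> file
git commit -q -am "Add five"
echo 5 >> file
git commit -q -am "Fixed five"

before=$(git rev-parse add-5)

# Stand-in for "clear out the file and save": git runs the sequence editor
# with the todo file as its argument, so this copies /dev/null over it.
if GIT_SEQUENCE_EDITOR='cp /dev/null' git rebase -i master >/dev/null 2>&1; then
    echo "unexpected: rebase ran"
else
    echo "rebase aborted"
fi

after=$(git rev-parse add-5)
if [ "$before" = "$after" ]; then
    echo "branch unchanged"
fi
```

This escape hatch only applies while the todo list is open. Once a rebase has started and stopped partway, for example on a conflict, `git rebase --abort` is the command that returns the branch to where it began.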

In this example, when we save the file and exit the editor, we will immediately be presented with another editor displaying the commit messages for both commits we’re attempting to squash together:

            
# This is a combination of 2 commits.
# This is the 1st commit message:


Add five


# This is the commit message #2:


Fixed five


# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# Date:      Tue Nov 5 16:58:08 2019 -0600
#
# interactive rebase in progress; onto ecd8aa2
# Last commands done (2 commands done):
#    pick 07f56db Add five
#    squash aa8605c Fixed five
# No commands remaining.
# You are currently rebasing branch 'add-5' on 'ecd8aa2'.
#
# Changes to be committed:
#       modified:   file
#

Just like during the interactive rebase, any lines starting with ‘#’ will be ignored upon save. This is your chance to craft a single commit message that tells the story of the two commits you melded into one. Something like “Add five correctly” should suffice. When you save and exit the editor, git will construct and save your new, single commit. Note the different commit hash and message.

            
[detached HEAD 563b63c] Add five correctly
Date: Tue Nov 5 16:58:08 2019 -0600
1 file changed, 2 insertions(+), 1 deletion(-)
Successfully rebased and updated refs/heads/add-5.
$ git log --all
* 563b63c    (49 minutes)  <Dave Strock>   (HEAD -> add-5) Add five correctly
* ecd8aa2    (23 hours)    <Dave Strock>   (master) Four is more
* 3bd999e    (23 hours)    <Dave Strock>   3 is what we need
*   b5a1877  (23 hours)    <Dave Strock>   Merge branch 'add-2'
|\
| * b18a734  (24 hours)    <Dave Strock>   Even better file
* | f0ee5c2  (23 hours)    <Dave Strock>   Start at 0
|/
* 8785557    (25 hours)    <Dave Strock>   The best file
* 95878fc    (27 hours)    <Dave Strock>   Initial Commit
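For a plain typo fix like this one, git can also arrange the todo list for you. The sketch below is an alternative to the manual edit shown above, not a replacement for understanding it: it records the fix with `git commit --fixup`, then lets `git rebase -i --autosquash` place and mark the fixup automatically. It runs in a scratch repository with invented contents, and sets GIT_SEQUENCE_EDITOR to `true` so the pre-arranged list is accepted without opening an editor.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)   # throwaway repository; names are illustrative
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

printf '0\n1\n2\n3\n4\n' > file
git add file
git commit -q -m "Four is more"

git checkout -q -b add-5
echo 50 >> file
git commit -q -am "Add five"

# Fix the typo, but record the fix as a fixup of the commit it corrects.
printf '0\n1\n2\n3\n4\n5\n' > file
git commit -q -a --fixup=HEAD

# --autosquash pre-arranges the todo list; "true" as the sequence editor
# accepts that list unchanged instead of opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master

# One commit remains on the branch, carrying the original message.
git log --format=%s master..add-5
tail -n 1 file
```

Because fixup discards the second commit’s message, this variant keeps the original “Add five” message rather than prompting for a new one; use squash by hand, as in the walkthrough, when you want to rewrite the message too.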

While we were fixing our typo, another team member merged their commit to master, causing a conflict with our branch, but we already know how to handle that with rebase:

            
$ git log --all
* bd0f5cd    (51 seconds)  <Dave Strock>   (master, add-4.5) Added 4.5
| * 563b63c  (56 minutes)  <Dave Strock>   (HEAD -> add-5) Add five correctly
|/
* ecd8aa2    (23 hours)    <Dave Strock>   Four is more
* 3bd999e    (23 hours)    <Dave Strock>   3 is what we need
*   b5a1877  (23 hours)    <Dave Strock>   Merge branch 'add-2'
|\
| * b18a734  (24 hours)    <Dave Strock>   Even better file
* | f0ee5c2  (23 hours)    <Dave Strock>   Start at 0
|/
* 8785557    (25 hours)    <Dave Strock>   The best file
* 95878fc    (27 hours)    <Dave Strock>   Initial Commit
$ git checkout add-5
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: Add five correctly
Using index info to reconstruct a base tree...
M    file
Falling back to patching base and 3-way merge...
Auto-merging file
CONFLICT (content): Merge conflict in file
$ emacs file
$ git add file
$ git rebase --continue
Applying: Add five correctly
$ git log --all
* 37432bc    (2 minutes)   <Dave Strock>   (HEAD -> add-5) Add five correctly
* bd0f5cd    (57 minutes)  <Dave Strock>   (master) Added 4.5
* ecd8aa2    (23 hours)    <Dave Strock>   Four is more
* 3bd999e    (23 hours)    <Dave Strock>   3 is what we need
*   b5a1877  (23 hours)    <Dave Strock>   Merge branch 'add-2'
|\
| * b18a734  (24 hours)    <Dave Strock>   Even better file
* | f0ee5c2  (23 hours)    <Dave Strock>   Start at 0
|/
* 8785557    (25 hours)    <Dave Strock>   The best file
* 95878fc    (27 hours)    <Dave Strock>   Initial Commit

In part 3 we’ll discuss how to get the most out of these new powers and how to be a good time-traveling citizen who avoids making extra work for your team.