Git Rerere

(git-scm.com)

5 points | by vinhnx 6 hours ago

1 comments

  • goku12 10 minutes ago
    Conflicts are cases where the merge algorithm can't unambiguously determine the correct result from multiple candidates. Normal merge algorithms like git's 'ort' use very simple and naive logic to compute the result. Context-aware (syntax or semantics) merge algorithms like mergiraf [1] can resolve more merge cases than the default algorithms, thus reducing conflicts. (Full elimination is still not possible. I'm not confident that copilots can do that repeatably.)
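
    To make "can't unambiguously determine" concrete, here is a minimal sketch of the classic three-way rule applied to a single hunk. It is not how 'ort' is implemented; the function name merge3 and the one-hunk-at-a-time framing are assumptions made for illustration.

```python
def merge3(base, ours, theirs):
    """Three-way merge of one hunk. Returns (result, conflicted)."""
    if ours == theirs:
        return ours, False    # both sides agree (or neither changed)
    if ours == base:
        return theirs, False  # only "theirs" changed: take it
    if theirs == base:
        return ours, False    # only "ours" changed: take it
    return None, True         # both changed, differently: ambiguous


if __name__ == "__main__":
    print(merge3("x = 1", "x = 1", "x = 2"))  # one side changed
    print(merge3("x = 1", "x = 2", "x = 3"))  # both sides changed
```

    The last branch is the conflict: both candidates are plausible and nothing in the text itself says which one is right.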

    Now, it's important what the VCS tool does when a conflict is encountered. Git doesn't allow the merge or rebase to be recorded (committed) until the conflicts are manually resolved. Thus you end up with a child commit that had conflicts, but doesn't explain how they were actually resolved. If you face the same conflict again (for example, by merging another branch with the same change), you end up having to manually resolve it again. This isn't an issue for successful merges because the knowledge of the resolution is represented by the merge algorithm itself.

    Rerere is a hack designed to solve the problem explained above: having to repeat a manual resolution because the resolution information is missing. When enabled, rerere records the states of the conflicting objects before and after the manual resolution. If you face the same conflict again, you can just ask rerere to identify the conflict pattern from its records and apply the matching resolution. This is very useful in some weird situations where you may end up resolving the same conflict three times or more.
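
    The record-then-replay idea can be sketched in a few lines. This is a simplified model, not git's rerere code: the class ResolutionCache and its methods are hypothetical, it only handles exact repeats of a single hunk, and ordering the two sides before hashing is an assumption made so the ID doesn't depend on merge direction.

```python
import hashlib


class ResolutionCache:
    """Toy model of a 'reuse recorded resolution' cache."""

    def __init__(self):
        self._resolutions = {}  # conflict ID -> resolved text

    @staticmethod
    def conflict_id(ours, theirs):
        # Order the sides so that swapping them yields the same ID.
        first, second = sorted([ours, theirs])
        payload = first.encode() + b"\0" + second.encode()
        return hashlib.sha1(payload).hexdigest()

    def record(self, ours, theirs, resolution):
        """Remember how this conflict was resolved by hand."""
        self._resolutions[self.conflict_id(ours, theirs)] = resolution

    def replay(self, ours, theirs):
        """Return the recorded resolution, or None if this conflict is new."""
        return self._resolutions.get(self.conflict_id(ours, theirs))


if __name__ == "__main__":
    cache = ResolutionCache()
    cache.record("x = 2", "x = 3", "x = 5")
    print(cache.replay("x = 3", "x = 2"))  # same conflict, sides swapped
    print(cache.replay("x = 2", "x = 9"))  # a conflict not seen before
```

    Note that the cache lives outside the history it describes, which is exactly the limitation discussed next.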

    More modern VCS tools like pijul[2] and jujutsu[3] take a different approach. They have what are called 'first class conflicts'. It means that a merge or rebase won't simply fail because there are conflicts. It will always succeed, recording/committing the result along with the conflict information. Note that I'm not talking about committing the files with conflict markers in them. Instead, the conflicts are recorded in the repo as full-fidelity structured information. Any manual conflict resolutions are recorded separately, in relation to this conflict data. Essentially, they're saving the same data as rerere, but directly in the repo with proper connections to their sources, instead of in a separate cache (within the .git directory) like rerere does. This approach has several advantages:

    - Merges and rebases don't fail. You're under no obligation to resolve the conflicts before you're allowed to record the merge. It's less painful.

    - You can postpone the conflict resolution for later. You can carry on with other work on the codebase, even with those unresolved conflicts.

    - You can push the conflicted merge online like a regular push, and ask one or more other devs to handle the conflicts instead.

    - Since a resolved conflict has all the resolution information attached to it, future resolutions of the same conflict are fully automatic. A resolved conflict isn't coming back ever again.
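
    As a rough illustration of the points above, here is a toy model in which a merge always produces a commit and an unresolved conflict is just data stored in it. This is not pijul's or jujutsu's actual storage format; Conflict, Commit, merge and resolve are hypothetical names, and a "file" is reduced to a single string.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Conflict:
    """Structured conflict: the common base plus both candidate sides."""
    base: str
    ours: str
    theirs: str


@dataclass
class Commit:
    message: str
    content: Union[str, Conflict]  # plain text when resolved
    parent: Optional["Commit"] = None

    @property
    def conflicted(self):
        return isinstance(self.content, Conflict)


def merge(base, ours, theirs):
    """Always returns a commit; ambiguity becomes data, not an error."""
    if ours == theirs or theirs == base:
        content = ours
    elif ours == base:
        content = theirs
    else:
        content = Conflict(base, ours, theirs)
    return Commit("merge", content)


def resolve(commit, resolution):
    """Record a resolution as a child commit; the conflict stays in history."""
    if not commit.conflicted:
        raise ValueError("commit has no conflict to resolve")
    return Commit("resolve conflict", resolution, parent=commit)


if __name__ == "__main__":
    m = merge("x = 1", "x = 2", "x = 3")
    print(m.conflicted)             # the merge was recorded anyway
    fixed = resolve(m, "x = 5")
    print(fixed.conflicted, fixed.parent is m)
```

    Because the conflicted commit is an ordinary object in this model, it can be built upon, shared, or resolved later by someone else, and the resolution stays linked to the conflict it answers.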

    Pijul records changes as patches, unlike git, which saves worktree snapshots as commits. In pijul, the conflicts and their resolutions are recorded with respect to the two changes/patches that contain the conflicting code. So the resolutions always go along with the conflicting patches, wherever they go. In the case of jujutsu, I'm not fully sure how this is achieved. But it seems like it stores the conflicts as additional git tree objects under the merge commit object.

    This way of handling conflicts seems natural, and is how VCSs should have been designed. Git and the others can be excused because our understanding of version control is still evolving. However, what's remarkable about jujutsu's solution is that it's implemented in git repos. Jujutsu still uses the git repo format. This means that the solution should be doable in git too. And git has already done much of that work in the form of rerere. It's only a matter of integrating these two solutions to retrofit git with first class conflicts and resolutions as well. I'm not sure if I got something wrong here. But I sure hope someone will seriously consider this. It would be a major advancement of git's ergonomics.

    [1] https://mergiraf.org/conflicts.html

    [2] https://pijul.org/manual/conflicts.html

    [3] https://jj-vcs.github.io/jj/latest/conflicts/