Git tricks

2015-06-26 — 2026-02-26

Wherein sundry Git particulars are gathered, and a merge’s disputed files are set wholesale to ours or theirs by diff piped to checkout, with fish aliases kept for later remembrance.

computers are awful
provenance
workflow
Figure 1

My own notes on git, the source control system. This is one of the more heavily documented tools on the internet. As such, notes here are not intended to be a tutorial because the internet is full of those.

1 Learning Git

See the fastai masterclass for many more helpful tips/links/scripts/recommendations. Learn Git Branching explains the mechanics in a friendly fashion. Steve Bennett’s 10 things I hate about Git is also useful.

Mark Dominus’ master class:

2 Tips

Universally acclaimed classic: git tips.

3 Handy Git config

3.1 Git editor

I often edit in VS Code, so it’s handy to set it as Git editor:

git config --global core.editor "code-insiders --wait"  # insiders

3.2 .gitignore

Basics: ignore macOS .DS_Store files.

echo .DS_Store >> $HOME/.gitignore_global
git config --global core.excludesfile $HOME/.gitignore_global

Galaxy brain version:

gitignore.io — Create useful .gitignore files for our project

I’m fond of the macos,visualstudiocode combo.

4 Handy git commands

4.1 Merging ours/theirs

During a merge, git checkout --ours filename (or --theirs) checks out our (or their) version of a conflicted file. The following sweet hack resolves every conflicted file the same way; the files still need git add afterwards to mark them resolved:

git diff --name-only --diff-filter=U | xargs git checkout --theirs --

I do this a lot and I’ll never remember the details, so here are some aliases for fish that I can use to make it easier:

echo "alias git_theirs 'git diff --name-only --diff-filter=U | xargs git checkout --theirs --'" >  ~/.config/fish/conf.d/git_theirs.fish
echo "alias git_ours   'git diff --name-only --diff-filter=U | xargs git checkout --ours   --'" >> ~/.config/fish/conf.d/git_theirs.fish
chmod a+x ~/.config/fish/conf.d/git_theirs.fish
source ~/.config/fish/conf.d/git_theirs.fish
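
To see what the pipeline does, here is a minimal sketch in a throwaway repository. The branch names (main, feature) and file names are invented for the demo, and it is plain POSIX shell rather than fish.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch explicitly
git config user.name "Demo" && git config user.email "demo@example.com"

# Two files that both sides will change differently
echo base > a.txt; echo base > b.txt
git add . && git commit -q -m "base"

git checkout -q -b feature
echo theirs > a.txt; echo theirs > b.txt
git commit -q -a -m "feature edits"

git checkout -q main
echo ours > a.txt; echo ours > b.txt
git commit -q -a -m "main edits"

# The merge stops with conflicts in both files
git merge feature >/dev/null 2>&1 || true
conflicted=$(git diff --name-only --diff-filter=U)
echo "$conflicted"

# Take "their" side for every conflicted file, then mark them resolved
git diff --name-only --diff-filter=U | xargs git checkout --theirs --
git add -A
git commit -q -m "merge feature, preferring theirs"
```

Swapping --theirs for --ours keeps the main-branch versions instead.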

4.2 Searching

4.2.1 …for a matching file

git grep

4.2.2 …for a matching commit

Easy, except for the weird name: It’s called “pickaxe” and spelled -S.

git log -Sword
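
A small sketch of what the pickaxe matches, in a throwaway repository with invented file contents: -S reports commits that change the number of occurrences of the string, so it finds where a word was introduced or removed, not every commit touching a file that contains it.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

echo "alpha" > notes.txt
git add notes.txt && git commit -q -m "add alpha"
echo "needle" >> notes.txt
git commit -q -a -m "introduce needle"
echo "omega" >> notes.txt
git commit -q -a -m "add omega"

# Only the commit that added the string is reported
hits=$(git log -Sneedle --format=%s)
echo "$hits"    # prints: introduce needle
```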

4.3 Track the history of a file, including renames

Kashyap Kondamudi recommends using the --follow option in git log to view a file’s history.

git log --oneline --find-renames --stat --follow -- src/somefile.ts
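
To make the difference concrete, here is a sketch in a throwaway repository (the file names are made up): without --follow the log stops at the rename, with it the log continues under the old name.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

mkdir src
printf 'line 1\nline 2\nline 3\n' > src/old.ts
git add . && git commit -q -m "create old.ts"
git mv src/old.ts src/somefile.ts
git commit -q -m "rename to somefile.ts"

# Path-limited log: only commits that touch the current name
plain=$(git log --format=%s -- src/somefile.ts)
# --follow keeps going through the rename
followed=$(git log --format=%s --follow -- src/somefile.ts)

echo "without --follow:"; echo "$plain"
echo "with --follow:";    echo "$followed"
```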

4.4 Clone just one branch

git clone --single-branch --branch <branchname> <remote-repo>

4.5 Remove a file from version control without deleting our local copy

git rm --cached blah.tmp
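
A quick sketch in a throwaway repository, showing that the file leaves the index but stays on disk:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

echo scratch > blah.tmp
git add blah.tmp && git commit -q -m "oops, committed a temp file"

# Drop it from the index only; the working copy is untouched
git rm -q --cached blah.tmp
git commit -q -m "stop tracking blah.tmp"

tracked=$(git ls-files)   # empty: nothing is tracked any more
ls blah.tmp               # still there
```

The file now shows up as untracked, so it usually wants a .gitignore entry as well.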

4.6 Delete a remote branch

git push <remote_name> --delete <branch_name>

4.7 Push to a non-obvious branch

git push origin HEAD:refs/heads/backdoor
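
Here is a sketch against a local bare repository standing in for the remote (all paths are temporary and invented for the demo). The local branch name plays no part; whatever HEAD is gets pushed to the named remote ref.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local" && cd "$work/local"
git config user.name "Demo" && git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo hello > file.txt
git add . && git commit -q -m "first"

# Left of the colon: what to send. Right of the colon: the ref to create remotely.
git push -q origin HEAD:refs/heads/backdoor

remote_heads=$(git ls-remote --heads origin)
echo "$remote_heads"
```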

This is almost obvious, except Git’s naming still seems… arbitrary? Why refs/heads/SOMETHING?

Read on.

5 What Git calls things

By which I mean what’s formally denoted as git references. Git references are the canonical description of the mechanics. tl;dr: the most common names are refs/heads/SOMETHING for the branch SOMETHING, refs/tags/SOMETHING for the tag SOMETHING, and refs/remotes/SOMEREMOTE/SOMETHING for the (last known) state of a remote branch.

As alexwlchan explains, these references are friendly names for commits, and should be thought of as pointers to commits.

And yet there is something a little magical going on. How come, if I pull a branch, I get the latest version of that branch, not the earliest commit to use that name? Something else is happening. cf. I wish people would stop insisting that Git branches are nothing but refs.

The uses are (at least partly) convention, and we can use other references too. For example, gerrit uses refs/for/ for code review purposes.
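
A sketch of the pointer view in a throwaway repository. It shows that committing moves the branch ref to the new commit, which is why a branch name always yields the latest commit, and that an arbitrary name under refs/ works too. The ref name refs/for/demo is invented here and has nothing to do with how Gerrit actually uses that namespace.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo" && git config user.email "demo@example.com"

echo one > f.txt && git add f.txt && git commit -q -m "one"
first=$(git rev-parse refs/heads/main)

# Committing again moves refs/heads/main forward to the new commit
echo two >> f.txt && git commit -q -a -m "two"
second=$(git rev-parse refs/heads/main)

# Any name under refs/ can point at a commit
git update-ref refs/for/demo "$first"

git for-each-ref --format='%(refname) %(objectname:short)'
```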

6 Filters

Commands applied to our files on the way in and out of the repository. Keywords: smudge, clean, and .gitattributes. These are a long story, but not so complicated in practice. A useful one is stripping crap from jupyter notebooks.
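
A minimal sketch of the mechanics, assuming a made-up filter called nosecret that redacts a password line. The clean command runs when a file is staged and the smudge command when it is checked out; the filter name, the sed expression and the file names are all illustrative.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

# Define the (hypothetical) filter in this repo's config
git config filter.nosecret.clean  "sed 's/password=.*/password=REDACTED/'"
git config filter.nosecret.smudge cat

# Attach it to matching paths
echo '*.conf filter=nosecret' > .gitattributes

echo 'password=hunter2' > app.conf
git add .gitattributes app.conf
git commit -q -m "add config"

stored=$(git show HEAD:app.conf)   # what the repository holds
on_disk=$(cat app.conf)            # what is still in the working tree
echo "stored:  $stored"
echo "on disk: $on_disk"
```

The filter definition lives in git config, not in the repository, so each clone has to set it up again.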

7 Commit hooks

For doing stuff before we put it in cold storage. e.g. asking: DID YOU REALLY WANT TO INCLUDE THAT GIANT FILE?

Here is a commit hook that does exactly that. I made a slightly modernized version:

curl -L https://gist.github.com/danmackinlay/6e4a0e5c38a43972a0de2938e6ddadba/raw/install.sh | bash

UPDATE: I decided this was a waste of time, so I removed it.

After that installation, we can retrofit the hook into an existing repository like this:

cp -R ~/.git_template/hooks .git/

There are various frameworks for managing hooks, if we have lots. For example, pre-commit is a mini-system for managing Git hooks, written in Python. Husky is a node.js-based one.

Also, it is remarkably labour-intensive to install the dependencies for all these systems, so if you are using heterogeneous systems this becomes tedious. I’m not sure whether hook management systems actually save time overall for us as solo developers, since the kind of person who remembers to install a pre-commit hook is also the kind of person who is less likely to need one.

To skip the pre-commit hook,

git commit --no-verify
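
For a feel of what a giant-file check involves, here is a minimal sketch of a pre-commit hook. It is not the hook linked above; the 1 MB threshold and the file names are assumptions, and the loop does not handle file names containing spaces.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

# Hypothetical hook: refuse any staged file larger than max_bytes
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
max_bytes=1000000
status=0
for f in $(git diff --cached --name-only --diff-filter=AM); do
  size=$(git cat-file -s ":$f")   # size of the staged blob
  if [ "$size" -gt "$max_bytes" ]; then
    echo "refusing to commit $f ($size bytes > $max_bytes)" >&2
    status=1
  fi
done
exit $status
EOF
chmod +x .git/hooks/pre-commit

# A small file passes the hook
echo hello > README.md
git add README.md && git commit -q -m "small file"

# A 2 MB file is rejected
head -c 2000000 /dev/zero > giant.bin
git add giant.bin
if git commit -q -m "add giant file" 2>/dev/null; then
  result=committed
else
  result=rejected
fi
echo "$result"

# --no-verify skips the hook entirely
git commit -q --no-verify -m "add giant file anyway"
```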

8 Subtrees/submodules/subprojects/subdirs/subterranean mole people

Subprojects inside other projects? External ones? The simplest way to integrate external projects seems to be as subtrees. Once this is set up we can mostly ignore them, and things work as expected. Alternatively, there are submodules, which come with various complications. More recently, there’s the Subtrac system, which I haven’t used yet.

8.1 Submodule

We can include external projects as separate repositories within a repository, but I won’t document it here since it’s already well documented elsewhere. I also use it less often because it’s fiddly. We need a bit of discipline to make it work; we have to remember to git submodule init, etc.

8.2 Subtrac

I haven’t tried it yet.

subtrac is a helper tool that makes it easier to keep track of your git submodule contents. It collects the entire contents of the entire history of all your submodules (recursively) into a separate git branch, which can be pushed, pulled, forked, and merged however you want.

8.3 Subtree

Subtree subsumes one Git tree into another in a usually transparent way (no separate checkout as with submodules). It can be used for temporary merges, or for splicing and dicing projects.

8.3.1 Splicing a subtree onto a project

Creatin’:

git fetch remote branch
git subtree add --prefix=subdir remote branch --squash

Updatin’:

git fetch remote branch
git subtree pull --prefix=subdir remote branch --squash
git subtree push --prefix=subdir remote branch

Con: Rebasin’ with a subtree in our repo is slow and involved.

8.3.2 Taking a cutting to make a sub-project

Use subtree split to prise out a chunk. This approach has various wrinkles but is fast and easy.

pushd superproject
git subtree split -P project_subdir -b project_branch
popd
mkdir project
pushd project
git init
git pull ../superproject project_branch

Alternatively, to rewrite history so it excludes everything outside a subdirectory:

git clone superproject subproject
pushd subproject
git filter-branch \
    --subdirectory-filter project_subdir \
    --prune-empty -- \
    --all

8.4 Download a subdirectory from a Git tree

This heinous hack used to work for GitHub, at least, but GitHub removed its Subversion support in January 2024, so treat it as historical.

  1. Replace tree/master with trunk.
  2. Run svn co on the new URL.
svn co https://github.com/buckyroberts/Source-Code-from-Tutorials/trunk/Python
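
A Git-only way to get a similar result is sparse checkout. This is a minimal sketch, assuming Git 2.25 or newer; it runs against a local throwaway repository standing in for the remote, and the directory names Python and Other are invented to mirror the example above.

```shell
set -eu
work=$(mktemp -d)

# Stand-in for the remote repository, with two top-level directories
src="$work/src"
git init -q "$src"
git -C "$src" config user.name "Demo"
git -C "$src" config user.email "demo@example.com"
mkdir "$src/Python" "$src/Other"
echo 'print("hi")' > "$src/Python/hello.py"
echo 'other stuff' > "$src/Other/file.txt"
git -C "$src" add .
git -C "$src" commit -q -m "two directories"

# Clone without checking anything out, restrict the working tree, then check out
git clone -q --no-checkout "$src" "$work/partial"
git -C "$work/partial" sparse-checkout set Python
git -C "$work/partial" checkout -q

ls "$work/partial"
```

Against a real host, adding --filter=blob:none to the clone should also avoid downloading file contents outside the chosen directory, provided the server supports partial clone.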

9 Deleting all the tags

git tag -l | xargs -I %% git push -v origin :refs/tags/%%
git tag -l | xargs git tag -d && git fetch -t

10 Conventions

Many workflows are feasible with Git. Large teams often have elaborate conventions for branch names, and for who gets to merge what with whom. Here are some I have seen in the wild:

Some of these systems have associated helpers; see next.

11 Helpers

Git has various layers of abstraction: a very basic infrastructure of plumbing; then notionally user-friendly, higher-level porcelain commands (which the Git authors say are user-friendly, but many of us regard as “a passable first attempt at best”); and then helper commands that smooth over common workflows and pain points.

11.1 gitflow

Formerly gitflow (source nvie/gitflow), now called git-flow-next

If […] you are building software that is explicitly versioned, or if you need to support multiple versions of your software in the wild, then git-flow may still be as good of a fit to your team as it has been to people in the last 10 years.

Source: gittower/git-flow-next

11.2 git-worktree

This is pretty new and actually useful for vibe-coding; it lets us create multiple mini-clones of the same repo at once, so we can work on multiple branches in parallel.

I always end up using git worktree the same way (create a worktree with a branch of the same name, then start editing), so I wrote a helper function to do that for me.

# ~/.config/fish/functions/mkworktree.fish
function mkworktree --description "Create a git worktree and open in VS Code"
    set branch_name $argv[1]

    if test -z "$branch_name"
        echo "Error: Branch name is required"
        echo "Usage: mkworktree BRANCH_NAME"
        return 1
    end

    set repo_name (basename $PWD)
    set base_dir (dirname $PWD)
    set branches_dir "$base_dir/$repo_name"_branches

    # Create branches directory if it doesn’t exist
    if not test -d "$branches_dir"
        mkdir -p "$branches_dir"
    end

    # Create worktree
    git worktree add -b "$branch_name" "$branches_dir/$branch_name"

    if test $status -eq 0
        # Open in VS Code
        code -n "$branches_dir/$branch_name"
        echo "Worktree created and opened in VS Code: $branches_dir/$branch_name"
    else
        echo "Error creating worktree"
        return 1
    end
end
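
For reference, here is the underlying worktree lifecycle in plain shell, as a sketch in a throwaway repository. It uses the same sibling-directory layout as the fish helper; the branch name experiment is made up, and it needs Git 2.17 or newer for git worktree remove.

```shell
set -eu
work=$(mktemp -d)
git init -q "$work/repo" && cd "$work/repo"
git config user.name "Demo" && git config user.email "demo@example.com"
echo hello > README.md
git add . && git commit -q -m "first"

# Sibling directory "<repo>_branches/<branch>", as in the fish function
branch=experiment
target="$work/repo_branches/$branch"
mkdir -p "$work/repo_branches"
git worktree add -b "$branch" "$target" >/dev/null 2>&1

git worktree list    # the main checkout plus the new one
checked_out=$(git -C "$target" rev-parse --abbrev-ref HEAD)

# When finished: remove the extra checkout; the branch itself survives
git worktree remove "$target"
git branch --list "$branch"
```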

The vibe coding workflow has been formalized; for example, in Crystal: Supercharge Your Development with Multi-Session Claude Code Management/stravu/crystal.

11.3 git-branchless

arxanas/git-branchless: Branchless workflow for Git

The branchless workflow is designed for use in a repository with a single main branch that all commits are rebased onto. It improves developer velocity by encouraging fast and frequent commits, and helps developers operate on these commits fearlessly.

In the branchless workflow, the commits you’re working on are inferred based on your activity, so you no longer need branches to keep track of them. Nonetheless, branches are sometimes convenient, and git-branchless fully supports them. If you prefer, you can continue to use your normal workflow and benefit from features like git sl or git undo without going entirely branchless.

11.4 Git undo

Also: Git undo: we can do better

How is it so easy to “lose” your data in a system that’s supposed to never lose your data?

Well, it’s not that it’s too easy to lose your data — but rather, that it’s too difficult to recover it. For each operation you want to recover from, there’s a different “magic” incantation to undo it. All the data is still there in principle, but it’s not accessible to many in practice.

…To address this problem, I offer git undo

11.5 gerrit

Gerrit is a code review system for Git.

11.6 legit

legit simplifies feature branch workflows.

11.7 rerere

Want to avoid repeating yourself during merges and rebases? git rerere (“reuse recorded resolution”) records how we resolved a conflict and replays that resolution when the same conflict comes up again:

git config --global rerere.enabled true
git config --global rerere.autoupdate true

12 Importing some files from another branch

git checkout my_branch -- my_file/
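
A sketch in a throwaway repository (branch and file names invented): the path is copied from the other branch into both the index and the working tree, and we stay on the current branch.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo" && git config user.email "demo@example.com"

echo v1 > shared.txt
git add . && git commit -q -m "base"

git checkout -q -b my_branch
mkdir my_file && echo "from my_branch" > my_file/data.txt
git add . && git commit -q -m "add my_file/"

git checkout -q main

# Pull just that directory across; nothing else from my_branch comes along
git checkout my_branch -- my_file/

staged=$(git diff --cached --name-only)
current=$(git rev-parse --abbrev-ref HEAD)
echo "on $current, staged: $staged"
```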

13 Garbage collecting

In brief, this will purge a lot of stuff from a constipated repo in emergencies:

git reflog expire --expire=now --all && git gc --prune=now

In-depth explanation here.

14 Editing history

14.1 Cleaning out all big files

Figure 2: Every time I find a good picture of an octopus on the internet I put it on my git blog pages

bfg does that:

git clone --mirror git://example.com/some-big-repo.git
cd some-big-repo.git
git repack
bfg --strip-blobs-bigger-than 10M .
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push -f

14.2 Deleting specific things

I think bfg also does this. There is also native support:

git filter-branch -f \
    --index-filter \
    'git rm -r --cached --ignore-unmatch unwanted_files'

14.3 Exporting a minimal history from some repository

i.e., exporting a branch for a client/collaborator that we can all still work on in Git, but that doesn’t contain any potentially proprietary stuff from the main repo. Ideally, we give them one commit with no history.

If the merge history is clean, we don’t need to be fancy; if I have a branch that has never merged in any secret info, then I can just push it to a new repository and it won’t bring along any of the secret stuff.

OTOH, research code is often unhygienic and chaotic, so we might need to be more careful.

Option 0: Export a zip archive and then forget about Git:

git archive HEAD --format=zip > archive.zip

14.3.1 Option 1: squash the whole thing into a single commit.

I don’t know a faster way to do this than the classic:

git checkout -b temp_branch
git rebase --root -i

Addendum: possibly this bash hack is superior?

git reset $(git commit-tree HEAD^{tree} -m "A new start")
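
A sketch checking what that one-liner does, in a throwaway repository: commit-tree writes a parentless commit holding the current tree, and reset points the branch at it, so the files are unchanged and the history is one commit long.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"

for n in 1 2 3; do
  echo "$n" >> log.txt
  git add log.txt
  git commit -q -m "commit $n"
done
tree_before=$(git rev-parse 'HEAD^{tree}')

# New root commit with the same tree; move the current branch onto it
git reset -q "$(git commit-tree 'HEAD^{tree}' -m "A new start")"

count=$(git rev-list --count HEAD)
tree_after=$(git rev-parse 'HEAD^{tree}')
echo "$count commit(s) in history"
```

The old commits stay reachable through the reflog until it expires, so this alone does not purge anything secret from the local repository.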

14.3.2 Option 2: Create an orphan branch and copy the files over

Here’s an example.

git checkout --orphan temp_branch
git add -A
git commit -am "Initial commit"
git branch -D master
git branch -m master
git push -f origin master

Merging branches created this way can be tedious, though.

14.3.3 Option 3: Serendipitous orphan

Create an orphaned commit that exactly matches an existing commit:

TREE=`git cat-file -p master |sed '1,/^$/s/^tree //p;d;'`
COMMIT=`echo Truncated tree | git commit-tree $TREE`
git branch truncated-master $COMMIT
git branch backup-master-just-in-case-i-regret-it-later master
git push -f origin truncated-master:master

I think this might let us easily cherry-pick commits onto the new tree and then return to the original.

15 Making git work with a filesystem with broken permissions

For example, we’re editing a git repo on NTFS via Linux and things get silly.

git config core.filemode false

16 Detecting whether there are changes to commit

Thomas Nyman:

if output=$(git status --porcelain) && [ -z "$output" ]; then
  # Working directory clean
  echo "clean"
else
  # Uncommitted changes
  echo "dirty"
fi
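
The same test wraps naturally into a function for use in scripts. A sketch in a throwaway repository; the name repo_is_clean is made up. Untracked files count as changes here because git status --porcelain lists them.

```shell
set -eu

# Hypothetical helper: succeeds only if there is nothing to commit
repo_is_clean() {
  output=$(git status --porcelain) && [ -z "$output" ]
}

repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo" && git config user.email "demo@example.com"
echo a > a.txt
git add a.txt && git commit -q -m "first"

if repo_is_clean; then before=clean; else before=dirty; fi

echo b > untracked.txt
if repo_is_clean; then after=clean; else after=dirty; fi

echo "before: $before, after: $after"   # prints: before: clean, after: dirty
```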

17 Emergency commit

Oh crap I’m leaving the office in a hurry and I just need to get my work into Git ASAP so I can continue on another computer. I don’t care about sensible commit messages because I’m on my own private branch and no one else will see them once I squash the pull request.

I put this little script in a file called gitbang to automate this situation.

#!/usr/bin/env bash
# I’m leaving the office. Capture all changes in my private branch and push to server.

if output=$(git status --porcelain) && [ -z "$output" ]; then
  echo "nothing to commit"
else
  git add --all && git commit -m bang
fi

# Determine branch to use: either first argument or default to current branch’s upstream
branch=${1:-$(git rev-parse --abbrev-ref --symbolic-full-name "@{u}" 2>/dev/null | sed 's|^.*/||')}

# If upstream wasn’t set and no argument provided, fall back to local current branch
if [ -z "$branch" ]; then
  branch=$(git rev-parse --abbrev-ref HEAD)
fi

git pull origin "$branch" \
  && git submodule update --init --recursive \
  && git push origin HEAD:"$branch"

Pro tip: if we use VS Code there’s a feature called Cloud Changes that synchronizes our changes to the cloud, so we can pick up where we left off on another computer without arsing about with git—or so it seems.

18 Git hosting

  • GitHub is the gorilla of git hosting
  • GitLab has open-source and self-hosted options as well as a cloud offering
  • Codeberg is a Forgejo host (Forgejo is a fork of Gitea)

We don’t need to use a git hosting service at all; it’s just convenient to have a pre-arranged central meeting place. A classic way to avoid needing such a host is…

19 Git email workflows

Learn to use email with git.

20 Content-specific diffing

Tools such as git-latexdiff provide custom diffing for, in this case, LaTeX code. These need to be found on a case-by-case basis.

21 SSH credentials

Managing SSH credentials in git is not obvious. See SSH.

22 Jupyter

For sanity in git + Jupyter, see jupyter.

23 Decent GUIs

See Git GUIs.

24 Which repo are we in?

For fish and bash in the shell, see bash-git-prompt.

25 Data versioning

See data versioning.

26 Working with normies in the cloud

Working with a colleague who doesn’t like git? Try this git collab trick that lets us work in hostile file systems such as OneDrive. Cf. Trying to convince academics to use git - Juulia Suvilehto.

I really mean “try this” in the sense of “I have not tried this but actually I gave up”.

Create a bare repository somewhere on our local disk, outside cloud-synced folders (or use a remote Git host like GitHub; just avoid putting the repo in a cloud-synced folder):

mkdir -p ~/git-storage/myproject.git
cd ~/git-storage/myproject.git
git init --bare

Prepare the OneDrive workspace:

mkdir ~/OneDrive/myproject
cd ~/OneDrive/myproject

Link the workspace to the bare repo with a .git file that points at it. Git does not expand ~ inside that file, so write the absolute path:

echo "gitdir: $HOME/git-storage/myproject.git" > .git

Create the initial commit via a temporary clone:

cd ~/git-storage
git clone myproject.git temp-work
cd temp-work

touch README.md
git add .
git commit -m "Initial commit"
git push origin main

cd ..
rm -rf temp-work

Check out files in the OneDrive workspace:

cd ~/OneDrive/myproject
git reset --hard HEAD

Regular Git operations should work as normal:

# Make changes
echo "New content" >> file.txt
git add .
git commit -m "Update file"
git push

File structure shows a clean separation:

~/git-storage/myproject.git  # Bare repo (Git data)
~/OneDrive/myproject         # Working files (.git file only)

Key advantages:

  1. No .git folder synced to the cloud
  2. Full Git history stored locally in a bare repo
  3. Works with all Git commands as normal
  4. Easy to relocate Git data without breaking links

For Windows paths, we use doubled backslashes in the .git file: gitdir: C:\\path\\to\\repo.git