Git tricks

2015-06-26 — 2025-04-02

My own notes on git, the source control system. This is one of the more heavily documented tools on the internet. As such, notes here are not intended to be a tutorial because the internet is full of those.

1 Learning git

See the fastai masterclass for many more helpful tips/links/scripts/recommendations. Learn Git Branching explains the mechanics in a friendly fashion. Steve Bennett’s 10 things I hate about Git is also useful.

Mark Dominus’ master class:

2 Tips

Universally-acclaimed classic: git tips.

3 Working with normies in the cloud

Working with a colleague who doesn’t like git? Try this git collab trick that lets you work in hostile file systems such as OneDrive.

I really mean “try this” in the sense of “I have not tried this”.

Create a bare repository outside the cloud-synced folders but on your local disk (or use a remote git host such as GitHub; anything that does not live inside the synced folder):

mkdir -p ~/git-storage/myproject.git
cd ~/git-storage/myproject.git
git init --bare

Prepare OneDrive workspace:

mkdir ~/OneDrive/myproject
cd ~/OneDrive/myproject

Link the workspace to the bare repo with a .git pointer file. Git does not expand ~ inside that file, so write the absolute path:

echo "gitdir: $HOME/git-storage/myproject.git" > .git

Create initial commit through a temporary clone:

cd ~/git-storage
git clone myproject.git temp-work
cd temp-work

touch README.md
git add .
git commit -m "Initial commit"
git push origin HEAD  # works whether the default branch is called main or master

cd ..
rm -rf temp-work

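Check out the files in the OneDrive workspace. The repository was created bare, so first tell git that it now has a working tree:

cd ~/OneDrive/myproject
git config core.bare false
git reset --hard HEAD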

4 Verify Workflow

Regular Git operations work normally:

# Make changes
echo "New content" >> file.txt
git add .
git commit -m "Update file"
git push  # only if you have added a remote; the commit is already safe in ~/git-storage

File structure shows clean separation:

~/git-storage/myproject.git  # Bare repo (Git data)
~/OneDrive/myproject         # Working files (.git file only)

Key advantages:

1. No .git folder synced to cloud
2. Full Git history stored locally in bare repo
3. Works with all Git commands as normal
4. Easy to relocate Git data without breaking links

For Windows paths, use forward slashes in the .git file: gitdir: C:/path/to/repo.git
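A possibly simpler route to the same layout is git init --separate-git-dir, which creates a non-bare repository whose Git data lives outside the workspace and writes the .git pointer file itself. The following is a minimal sketch under that assumption; the sandbox directories are throwaway stand-ins for ~/git-storage and ~/OneDrive, and the demo identity is made up.

```shell
# Throwaway directories standing in for ~/git-storage and ~/OneDrive (assumed names).
sandbox=$(mktemp -d)
store="$sandbox/git-storage"
work="$sandbox/OneDrive/myproject"
mkdir -p "$store" "$work"

# Create a non-bare repo whose Git data lives in $store.
# Git writes $work/.git as a one-line pointer file, not a folder.
git init -q --separate-git-dir="$store/myproject.git" "$work"

cd "$work"
echo "hello" > README.md
git add README.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Initial commit"

cat .git   # prints the "gitdir: ..." pointer
commit_count=$(git rev-list --count HEAD)
```

Only README.md and the small .git pointer file end up inside the synced folder; objects and refs stay in the storage directory.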

5 Handy git config

5.1 git editor

I often do editing in VS code, so it is convenient to set it as git editor:

git config --global core.editor "code-insiders --wait"  # insiders

5.2 .gitignore

Basic level: Ignore macOS .DS_Store files

echo .DS_Store >> "$HOME/.gitignore_global"
git config --global core.excludesfile "$HOME/.gitignore_global"

Galaxy brain version:

gitignore.io- Create Useful .gitignore Files For Your Project

I am fond of the macos,visualstudiocode combo.

6 Handy git commands

6.1 Merging ours/theirs

During a merge, git checkout --theirs filename (or --ours) will checkout respectively their (or our) version. The following sweet hack will resolve all files accordingly:

git diff --name-only --diff-filter=U | xargs git checkout --theirs --

I do this a lot and will never remember the details, so here are some aliases for fish which I can use to make this easier:

echo "alias git_theirs 'git diff --name-only --diff-filter=U | xargs git checkout --theirs --'" >  ~/.config/fish/conf.d/git_theirs.fish
echo "alias git_ours   'git diff --name-only --diff-filter=U | xargs git checkout --ours   --'" >> ~/.config/fish/conf.d/git_theirs.fish
chmod a+x ~/.config/fish/conf.d/git_theirs.fish
source ~/.config/fish/conf.d/git_theirs.fish
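To see what the --theirs one-liner does, here is a small bash sketch that manufactures a conflict in a throwaway repository and resolves it wholesale. The branch name feature, the file f.txt and the demo identity are invented for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Helper so the sketch does not depend on a configured git identity.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo base > f.txt
g add f.txt
g commit -q -m "base"
base_branch=$(git symbolic-ref --short HEAD)   # main or master, depending on config

g checkout -q -b feature
echo theirs > f.txt
g commit -q -am "feature change"

g checkout -q "$base_branch"
echo ours > f.txt
g commit -q -am "base branch change"

# The merge stops with a conflict in f.txt.
g merge feature >/dev/null 2>&1 || true

# Take "their" side for every unmerged file, then conclude the merge.
git diff --name-only --diff-filter=U | xargs git checkout --theirs --
g add f.txt
g commit -q -m "merge feature, taking theirs"

resolved=$(cat f.txt)
echo "$resolved"
```

Note that git checkout --theirs only writes the chosen version to the working tree; the file still has to be added before the merge can be committed.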

6.2 Searching

6.2.1 …for a matching file

git grep

6.2.2 …for a matching commit

Easy, except for the abstruse naming; it is called “pickaxe” and spelled -S.

git log -Sword
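The difference between the two searches is easiest to see side by side. A minimal sketch in a throwaway repository (file name, contents and commit messages are invented): git grep searches the current tracked files, while git log -S lists the commits in which the number of occurrences of the string changed.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo "alpha" > notes.txt
g add notes.txt
g commit -q -m "add alpha"
echo "needle" >> notes.txt
g commit -q -am "add needle"
echo "omega" >> notes.txt
g commit -q -am "add omega"

# Where is the string in the files as they are now?
grep_hits=$(git grep -n needle)

# Which commits introduced or removed the string?
pickaxe_hits=$(git log --format=%s -S needle)

echo "$grep_hits"
echo "$pickaxe_hits"
```

Only the middle commit is reported by the pickaxe, because the later commit leaves the count of "needle" unchanged.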

6.3 track the history of a file including renames

Kashyap Kondamudi advises: Use --follow option in git log to view a file’s history.

git log --oneline --find-renames --stat --follow -- src/somefile.ts

6.4 Clone a single branch

git clone --single-branch --branch <branchname> <remote-repo>

6.5 Remove file from versioning without deleting my copy

git rm --cached blah.tmp

6.6 delete remote branch

git push <remote_name> --delete <branch_name>

6.7 Push to a non-obvious branch

git push origin HEAD:refs/heads/backdoor

This is almost obvious except the git naming of things seems… arbitrary? Why refs/heads/SOMETHING?

Read on.

7 What git calls things

By which I mean that which is formally denoted as git references. git references is the canonical description of the mechanics. tl;dr the most common names are refs/heads/SOMETHING for branch SOMETHING, refs/tags/SOMETHING for tag SOMETHING, and refs/remotes/SOMEREMOTE/SOMETHING for (last known state of) a remote branch.

As alexwlchan explains, these references are friendly names for commits, and should be thought of as pointers to commits.

And yet there is something a little magical going on. How come if I pull a branch, I get the latest version of that branch, not the earliest to use that name? Other stuff is happening.

The uses are (at least partly) convention and other references can be used too. For example gerrit uses refs/for/ for code review purposes.
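Part of the "magic" is simply that a branch reference is moved forward every time a commit is made on it, whereas a tag reference stays where it was put. A small sketch in a throwaway repository, with an invented tag name v0, makes the pointers visible:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

g commit -q --allow-empty -m "first"
branch=$(git symbolic-ref --short HEAD)        # whatever the default branch is called
first=$(git rev-parse "refs/heads/$branch")

g tag v0                                       # lightweight tag at the first commit

g commit -q --allow-empty -m "second"
second=$(git rev-parse "refs/heads/$branch")   # the branch ref has moved
head_now=$(git rev-parse HEAD)
tag_target=$(git rev-parse refs/tags/v0)       # the tag ref has not

# List every reference with the commit it points at.
git for-each-ref --format='%(refname) %(objectname:short)'
```

The same full names work on the command line, which is why HEAD:refs/heads/backdoor is a valid push target.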

8 Filters

Commands applied to your files on the way in and out of the repository. Keywords: smudge, clean, .gitattributes. These are a long story, but not so complicated in practice. A useful one is stripping crap from jupyter notebooks.
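As a toy illustration of the mechanics (not the notebook-stripping setup itself), the sketch below registers a clean filter under the made-up name redact and attaches it to *.env files. The clean command runs when a file is staged, so the repository stores the filtered text while the working copy is left alone.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# Define the filter (hypothetical name "redact") in this repo's config.
git config filter.redact.clean "sed -e 's/^SECRET=.*/SECRET=REDACTED/'"

# Attach it to matching paths.
echo '*.env filter=redact' > .gitattributes

echo 'SECRET=hunter2' > app.env
g add .gitattributes app.env
g commit -q -m "add config"

stored=$(git show HEAD:app.env)   # what went into the repository
on_disk=$(cat app.env)            # what is still in the working tree
echo "stored:  $stored"
echo "on disk: $on_disk"
```

A smudge command, configured the same way as filter.redact.smudge, would run in the opposite direction at checkout.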

9 Commit hooks

For doing stuff before you put it in cold storage. e.g., asking DID YOU REALLY WANT TO INCLUDE THAT GIANT FILE?

Here is a commit hook that does exactly that. I made a slightly modernized version:

curl -L https://gist.github.com/danmackinlay/6e4a0e5c38a43972a0de2938e6ddadba/raw/install.sh | bash

UPDATE: I decided this was a waste of time and removed it.

After that installation you can retrofit the hook to an existing repository thusly:

cp -R ~/.git_template/hooks .git/

There are various frameworks for managing hooks, if you have lots. For example, pre-commit is a mini-system for managing git hooks, based on python. Husky is a node.js-based one.

I am not sure whether hook management systems actually save time overall for a solo developer, since the kind of person who remembers to install a pre-commit hook is also the kind of person who is relatively less likely to need one. Also, it is remarkably labour-intensive to install the dependencies for all these systems, so if you are using heterogeneous systems this becomes tedious.

To skip the pre-commit hook,

git commit --no-verify
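For a sense of what a giant-file check involves, here is a minimal sketch of a pre-commit hook, installed into a throwaway repository. It is not the hook from the gist above; the 1 MB threshold and file names are arbitrary assumptions.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/usr/bin/env bash
# Reject the commit if any added or modified staged file exceeds max_bytes.
max_bytes=1000000
status=0
while IFS= read -r -d '' path; do
  size=$(git cat-file -s ":$path")   # size of the staged blob
  if [ "$size" -gt "$max_bytes" ]; then
    echo "refusing to commit $path ($size bytes > $max_bytes)" >&2
    status=1
  fi
done < <(git diff --cached --name-only -z --diff-filter=AM)
exit $status
EOF
chmod +x .git/hooks/pre-commit

# Try it: a 2 MB file is blocked, unless the hook is skipped.
head -c 2000000 /dev/zero > big.bin
g add big.bin
if g commit -q -m "add big file" 2>/dev/null; then
  hook_result=committed
else
  hook_result=blocked
fi
g commit -q --no-verify -m "add big file anyway"
commit_count=$(git rev-list --count HEAD)
echo "$hook_result, then $commit_count commit(s) after --no-verify"
```

The hook measures the staged blob rather than the file on disk, so it judges exactly what would be committed.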

10 Subtrees/submodules/subprojects/subdirs/subterranean mole people

Sub-projects inside other projects? External projects? The simplest way of integrating external projects seems to be as subtrees. Once this is set up we can mostly ignore them and things work mostly as expected. Alternatively, there are submodules, which have various complications. More recently, there is the subtrac system, which I have not yet used.

10.1 Submodule

Including external projects as separate repositories within a repository is possible, but I won’t document it, since it’s well documented elsewhere, and I use it less often, because it is fiddly. Some discipline is required to make it go; you need to remember to git submodule init, etc.

10.2 Subtrac

Have not yet tried.

subtrac is a helper tool that makes it easier to keep track of your git submodule contents. It collects the entire contents of the entire history of all your submodules (recursively) into a separate git branch, which can be pushed, pulled, forked, and merged however you want.

10.3 Subtree

Subtree subsumes one git tree into another in a usually-transparent way (no separate checkout as with submodules). It can be used for temporary merging or for splicing and dicing projects.

10.3.1 Splicing a subtree onto a project

Creatin’:

git fetch remote branch
git subtree add --prefix=subdir remote branch --squash

Updatin’:

git fetch remote branch
git subtree pull --prefix=subdir remote branch --squash  # bring upstream changes into subdir
git subtree push --prefix=subdir remote branch           # send local changes in subdir back upstream

Con: Rebasin’ with a subtree in your repo is slow and involved.

10.3.2 Taking a cutting to make a sub-project

Use subtree split to prise out one chunk. It has various wrinkles but is fast and easy.

pushd superproject
git subtree split -P project_subdir -b project_branch
popd
mkdir project
pushd project
git init
git pull ../superproject project_branch

Alternatively, to comprehensively rewrite history to exclude everything outside a subdir:

git clone superproject subproject
pushd subproject
git filter-branch \
    --subdirectory-filter project_subdir \
    --prune-empty -- \
    --all

10.4 Download a sub-directory from a git tree

This worked for GitHub until it removed Subversion support in January 2024. I think it should still work on any host that offers an SVN bridge?

Heinous hack

replace tree/master => trunk
svn co the new URL

svn co https://github.com/buckyroberts/Source-Code-from-Tutorials/trunk/Python
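A git-native alternative, sketched under the assumption of a reasonably recent git (2.25 or later) with sparse-checkout: clone without populating most of the tree, then ask for a single directory. The helper name fetch_subdir is made up for this example.

```shell
# fetch_subdir URL SUBDIR DEST
# Shallow, sparse clone of URL into DEST with only SUBDIR
# (plus, in cone mode, top-level files) checked out.
fetch_subdir() {
  local url=$1 subdir=$2 dest=$3
  git clone -q --depth 1 --filter=blob:none --sparse "$url" "$dest" &&
    git -C "$dest" sparse-checkout set "$subdir"
}

# Example with the repository from above (needs network access):
# fetch_subdir https://github.com/buckyroberts/Source-Code-from-Tutorials Python tutorials-python
```

Unlike the svn trick, this leaves a real (shallow) git repository behind; --filter=blob:none is a request that servers without partial-clone support ignore with a warning.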

11 Deleting all tags

git tag -l | xargs -I %% git push -v origin :refs/tags/%%
git tag -l | xargs git tag -d && git fetch -t

12 Conventions

Many possible workflows are feasible with git. Large teams often have elaborate conventions for what to name branches and who gets to merge what with whom. Here are some I have seen in the wild:

Conventional Commits
Trunk-based Development
GitHub flow (which is not the same as gitflow)

Some of these systems have associated helpers, see next.

13 Helpers

Git has various layers of abstraction, from a very basic infrastructure of plumbing, through higher-level porcelain commands which are supposed by the authors of git to be user-friendly but are universally regarded as “a passable first attempt at best”, through to various helper commands that ease various workflows and pain points.

13.1 git-worktree

This is new and in fact useful for vibe-coding; it creates multiple mini-clones of the source tree at the same time so you can work on multiple branches at once.

git-worktree man page is baffling
Git Worktrees and GitButler explains it better
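Stripped of the man-page detail, the basic operation is short. A minimal sketch in a throwaway repository, with an invented branch name experiment and sibling directory project_experiment:

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/project"
cd "$sandbox/project"
git init -q
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial"

# Create a new branch and check it out in a sibling directory.
git worktree add -b experiment ../project_experiment >/dev/null 2>&1

# Both checkouts share one object store and one set of refs.
git worktree list
worktree_count=$(git worktree list | wc -l | tr -d ' ')
experiment_branch=$(git -C ../project_experiment symbolic-ref --short HEAD)
```

When the branch is finished with, git worktree remove ../project_experiment deletes the extra checkout without touching the branch itself.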

I always end up using git worktree in a very similar way (create worktree, with a branch of the same name, then edit it) so here is a helper function to do that for me.

# ~/.config/fish/functions/mkworktree.fish
function mkworktree --description "Create a git worktree and open in VS Code"
    set branch_name $argv[1]

    if test -z "$branch_name"
        echo "Error: Branch name is required"
        echo "Usage: mkworktree BRANCH_NAME"
        return 1
    end

    set repo_name (basename $PWD)
    set base_dir (dirname $PWD)
    set branches_dir "$base_dir/$repo_name"_branches

    # Create branches directory if it doesn’t exist
    if not test -d "$branches_dir"
        mkdir -p "$branches_dir"
    end

    # Create worktree
    git worktree add -b "$branch_name" "$branches_dir/$branch_name"

    if test $status -eq 0
        # Open in VS Code
        code -n "$branches_dir/$branch_name"
        echo "Worktree created and opened in VS Code: $branches_dir/$branch_name"
    else
        echo "Error creating worktree"
        return 1
    end
end

13.2 git-branchless

arxanas/git-branchless: Branchless workflow for Git

The branchless workflow is designed for use in a repository with a single main branch that all commits are rebased onto. It improves developer velocity by encouraging fast and frequent commits, and helps developers operate on these commits fearlessly.

In the branchless workflow, the commits you’re working on are inferred based on your activity, so you no longer need branches to keep track of them. Nonetheless, branches are sometimes convenient, and git-branchless fully supports them. If you prefer, you can continue to use your normal workflow and benefit from features like git sl or git undo without going entirely branchless.

13.3 git-undo

Also: git undo: We can do better

How is it so easy to “lose” your data in a system that’s supposed to never lose your data?

Well, it’s not that it’s too easy to lose your data — but rather, that it’s too difficult to recover it. For each operation you want to recover from, there’s a different “magic” incantation to undo it. All the data is still there in principle, but it’s not accessible to many in practice.

…To address this problem, I offer git undo

13.4 gerrit

Gerrit is a code review system for git.

13.5 legit

legit simplifies feature branch workflows.

13.6 rerere

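Not repeating yourself during merges/rebases? git rerere (“reuse recorded resolution”) remembers how a conflict was resolved and replays that resolution when the same conflict recurs: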

git config --global rerere.enabled true
git config --global rerere.autoupdate true

14 Importing some files across a branch

git checkout my_branch -- my_file/

15 Garbage collecting

In brief, this will purge a lot of stuff from a constipated repo in emergencies:

git reflog expire --expire=now --all && git gc --prune=now

In-depth explanation here.

16 Editing history

16.1 Cleaning out all big files

Figure 2: Every time I find a good picture of an octopus on the internet I put it on my git blog pages

bfg does that:

git clone --mirror git://example.com/some-big-repo.git
cd some-big-repo.git
git repack
bfg --strip-blobs-bigger-than 10M .
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push -f

16.2 Deleting specific things

I think bfg also does this. There is also native support:

git filter-branch -f \
    --index-filter \
    'git rm -r --cached --ignore-unmatch unwanted_files'

16.3 Exporting a minimal history from some repository

i.e. Exporting a branch for a client/collaborator, which they should still operate on in git, but which does not contain all the potentially proprietary stuff in the main repo. Ideally they should see one commit with no history.

If the merge history is clean, there is no need to be fancy; if I have a branch which has never merged in any secret information then I can just push it to a new repository and it won’t bring along any of the secret stuff.

OTOH, research code is often unhygienic and chaotic, so we might need to be more careful.

Option 0: Export an archive (a zip here) and then forget about git:

git archive HEAD --format=zip > archive.zip

16.3.1 Option 1: squash the whole thing onto a single commit.

I don’t know a faster way of doing this than the classic:

git checkout -b temp_branch
git rebase --root -i

Addendum: possibly this bash hack is superior?

git reset $(git commit-tree HEAD^{tree} -m "A new start")

16.3.2 Option 2: create an orphan branch and copy the files over

Here is an example.

git checkout --orphan temp_branch
git add -A
git commit -am "Initial commit"
git branch -D master
git branch -m master
git push -f origin master

Merging with branches created this way can be tedious, however.

16.3.3 Option 3: Serendipitous orphan

create an orphaned commit that exactly matches an existing commit:

TREE=`git cat-file -p master |sed '1,/^$/s/^tree //p;d;'`
COMMIT=`echo Truncated tree | git commit-tree $TREE`
git branch truncated-master $COMMIT
git branch backup-master-just-in-case-i-regret-it-later master
git push -f origin truncated-master:master

I think this possibly allows us to easily cherry-pick commits against the new tree and return to the original.

17 Making git work with a broken-permission FS

e.g. you are editing a git repo on NTFS via Linux and things are silly.

git config core.filemode false

18 Detecting if there are changes to commit

Thomas Nyman:

if output=$(git status --porcelain) && [ -z "$output" ]; then
  echo "Working directory clean"
else
  echo "Uncommitted changes"
fi

19 Emergency commit

Oh crap I’m leaving the office in a hurry and I just need to get my work into git ASAP for continuing on another computer. I don’t care about sensible commit messages because I am on my own private branch and no-one else will see them when I squash the pull request.

I put this little script in a file called gitbang to automate this case.

#!/usr/bin/env bash
# I’m leaving the office. Capture all changes in my private branch and push to server.

if output=$(git status --porcelain) && [ -z "$output" ]; then
  echo "nothing to commit"
else
  git add --all && git commit -m bang
fi

# Determine branch to use: either first argument or default to current branch’s upstream
# (strip only the leading remote name, so branch names containing slashes survive)
branch=${1:-$(git rev-parse --abbrev-ref --symbolic-full-name "@{u}" 2>/dev/null | sed 's|^[^/]*/||')}

# If upstream wasn’t set and no argument provided, fall back to local current branch
if [ -z "$branch" ]; then
  branch=$(git rev-parse --abbrev-ref HEAD)
fi

git pull origin "$branch" \
  && git submodule update --init --recursive \
  && git push origin HEAD:"$branch"

Pro tip: if you use VS code there is a feature called Cloud Changes that synchronises your changes to the cloud, so you can pick up where you left off on another computer without arsing about with git, it seems.

20 Git hosting

GitHub is the gorilla of git hosting
GitLab has open-source and self-hosted options as well as a cloud offering
Codeberg is a Forgejo host (Forgejo being a fork of Gitea)

One doesn’t need to use a git hosting service at all, it’s just convenient to have a pre-arranged central meeting place. A classic way to avoid the need for such a host is…

21 git email workflows

Learn to use email with git.

22 Content-specific diffing

Tools such as git-latexdiff provide custom diffing for, in this case, LaTeX code. These need to be found on a case-by-case basis.

23 SSH credentials

Managing SSH credentials in git is non-obvious. See SSH.

24 Jupyter

For sanity in git+jupyter, see jupyter.

25 Decent GUIs

See Git GUIs.

26 Which repo am I in?

For fish and bash shell, see bash-git-prompt.

27 Data versioning

See data versioning.
