Git tricks

My own git notes. See also the more universally acclaimed classic git tips.

See the fastai masterclass for many more handy tips, links, scripts and recommendations.

Handy git commands

Clone a single branch

git clone --single-branch --branch <branchname> <remote-repo>
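To check what a single-branch clone actually fetches, here is a minimal sketch against a throwaway local repository. The directory names and the branches feature-a and feature-b are placeholders invented for the demo.

```shell
set -eu
# Throwaway upstream repository; every path and branch name is a placeholder.
scratch=$(mktemp -d)
git init -q "$scratch/upstream"
cd "$scratch/upstream"
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "root"
git branch feature-a
git branch feature-b

# Clone only feature-a; feature-b is neither fetched nor tracked.
cd "$scratch"
git clone -q --single-branch --branch feature-a "$scratch/upstream" solo

# List the remote-tracking refs the clone knows about.
git -C solo for-each-ref --format='%(refname)' refs/remotes
```

The clone's fetch refspec is narrowed to that one branch, so later fetches will not pick up the others unless the refspec is widened.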

Remove file from versioning without deleting my copy

git rm --cached blah.tmp
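A small sketch of the effect, run in a throwaway repository (the scratch directory and the demo identity are placeholders): the file leaves the index but stays on disk.

```shell
set -eu
# Throwaway repository; the path and identity below are placeholders.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name demo
git config user.email demo@example.com

echo "scratch data" > blah.tmp
git add blah.tmp
git commit -q -m "accidentally track blah.tmp"

# Drop the file from the index only; the working copy is left alone.
git rm -q --cached blah.tmp
git commit -q -m "stop tracking blah.tmp"

tracked=$(git ls-files blah.tmp)   # empty: git no longer tracks the file
ls blah.tmp                        # the copy on disk is still there
```

The file will show up as untracked afterwards, so it probably wants a line in .gitignore too.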

Ignore macOS .DS_Store files

echo .DS_Store >> .gitignore_global
git config --global core.excludesfile $HOME/.gitignore_global

Delete remote branch

git push <remote_name> --delete <branch_name>

Push to a non-obvious branch

git push origin HEAD:refs/heads/backdoor

This is almost obvious except the git naming of things seems arbitrary. Why refs/heads/SOMETHING? Well…

Git names

By which I mean that which is formally referred to as git references. With that definition I now know what to google for next time. But wait! How about I even link to the things I forget?

git references is the canonical description of the mechanics here. tl;dr the most common names are refs/heads/SOMETHING for branch SOMETHING, refs/tags/SOMETHING for tag SOMETHING, and refs/remotes/SOMEREMOTE/SOMETHING for (last known state of) a remote branch.

As alexwlchan explains, these references are just friendly names for commits. The uses are (at least partly) convention and other references can be used too. For example gerrit uses refs/for/ for code review purposes.
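To see that these names really are just labels for commits, here is a sketch using a throwaway working repository and a local bare repository standing in for the remote. All paths, and the refs/scratch/mine name, are made up for the demo.

```shell
set -eu
# Throwaway repositories; all paths and names here are placeholders.
scratch=$(mktemp -d)
git init -q --bare "$scratch/remote.git"
git init -q "$scratch/work"
cd "$scratch/work"
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "first commit"
git remote add origin "$scratch/remote.git"

# Push the current commit to a branch that need not exist locally.
git push -q origin HEAD:refs/heads/backdoor

head_id=$(git rev-parse HEAD)
# The remote now has refs/heads/backdoor, pointing at our commit.
remote_id=$(git --git-dir="$scratch/remote.git" rev-parse refs/heads/backdoor)

# Fetching records the last known state of that branch under refs/remotes/.
git fetch -q origin
tracking_id=$(git rev-parse refs/remotes/origin/backdoor)

# A ref outside the conventional namespaces is still just a name for a commit.
git update-ref refs/scratch/mine HEAD
custom_id=$(git rev-parse refs/scratch/mine)

# Show every ref in the working repository with the commit it names.
git for-each-ref --format='%(refname) %(objectname:short)'
```

All four ids come out identical: one commit, four names.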

Filters

Commands applied to your files on the way in and out of the repository. Keywords: smudge, clean, .gitattributes. These are a long story, but not so complicated in practice. A useful one is stripping crap from jupyter notebooks.
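A minimal sketch of the mechanism, not the notebook-stripping setup itself: a hypothetical filter called nosecret whose clean step redacts a password line on the way into the repository. The filter name, the file app.conf and its contents are all invented for illustration.

```shell
set -eu
# Throwaway repository; path, identity and file names are placeholders.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name demo
git config user.email demo@example.com

# Hypothetical filter "nosecret": on the way in (clean) it blanks the
# value of any "password=" line; on the way out (smudge) it passes
# content through unchanged.
git config filter.nosecret.clean "sed 's/^password=.*/password=REDACTED/'"
git config filter.nosecret.smudge cat

# .gitattributes says which paths the filter applies to.
echo '*.conf filter=nosecret' > .gitattributes

printf 'user=me\npassword=hunter2\n' > app.conf
git add .gitattributes app.conf
git commit -q -m "add config through the clean filter"

stored=$(git show HEAD:app.conf)   # what the repository holds
on_disk=$(cat app.conf)            # what the working copy holds
```

The filter definition lives in git config, which is not versioned, so every clone has to define it again; only .gitattributes travels with the repository.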

Subtrees/submodules/subprojects/subdirs/subterranean mole people

Sub-projects inside other projects? External projects? The simplest way of integrating external projects is as subtrees. Once this is set up you can mostly ignore them. Alternatively there are submodules, which have various complications.

Alternatively there is the subtrac system, which I have not yet used.

Splicing a subtree onto a project

creatin’:

git fetch remote branch
git subtree add --prefix=subdir remote branch --squash

updatin’:

git fetch remote branch
# pull upstream changes into subdir
git subtree pull --prefix=subdir remote branch --squash
# send local changes in subdir back upstream
git subtree push --prefix=subdir remote branch

Con: Rebasin’ with a subtree in your repo is slow and involved.

Pruning off a sub-project

Use subtree split to prise out one chunk. It has some wrinkles but is fast and easy.

pushd superproject
git subtree split -P project_subdir -b project_branch
popd
mkdir project
pushd project
git init
git pull ../superproject project_branch

Alternatively, to comprehensively rewrite history to exclude everything outside a subdir:

git clone superproject subproject
pushd subproject
git filter-branch \
    --subdirectory-filter project_subdir \
    --prune-empty -- \
    --all

Submodules

Including external projects as separate repositories within a repository is also possible, but I won’t document it here, since it’s well documented elsewhere, and I use it less. NB: much discipline required to make it go.

Subtrac

Have not yet tried.

subtrac is a helper tool that makes it easier to keep track of your git submodule contents. It collects the entire contents of the entire history of all your submodules (recursively) into a separate git branch, which can be pushed, pulled, forked, and merged however you want.

Download a sub-directory from a git tree

This worked for GitHub until it retired its Subversion bridge in January 2024. It may still work on other hosts that expose an svn interface to git repositories.

Heinous hack

  1. replace tree/master => trunk
  2. svn co the new url
svn co https://github.com/buckyroberts/Source-Code-from-Tutorials/trunk/Python

Deleting all tags

git tag -l | xargs -I %% git push -v origin :refs/tags/%%
git tag -l | xargs git tag -d && git fetch -t

Helpers

gerrit

Gerrit is a code review system for git.

legit

legit simplifies feature branch workflows.

rerere

Not repeating yourself during merges? git rerere automates this:

git config --global rerere.enabled true
git config --global rerere.autoupdate true
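To watch rerere at work, here is a sketch in a throwaway repository: one conflict is resolved by hand, the merge is thrown away and redone, and the recorded resolution is replayed. The settings are applied per-repository here rather than with --global, and the file and branch names are placeholders.

```shell
set -eu
# Throwaway repository; path, identity and names are placeholders.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name demo
git config user.email demo@example.com
git config rerere.enabled true
git config rerere.autoupdate true

echo "base" > f.txt
git add f.txt
git commit -q -m "base"
base=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called

# Two branches change the same line in different ways.
git checkout -q -b topic
echo "topic" > f.txt
git commit -q -am "topic change"
git checkout -q "$base"
echo "mainline" > f.txt
git commit -q -am "mainline change"

# First merge conflicts; resolve by hand and commit so rerere records it.
git merge topic >/dev/null 2>&1 || true
echo "resolved" > f.txt
git add f.txt
git commit -q -m "merge topic"

# Throw the merge away and redo it; rerere replays the resolution.
git reset -q --hard HEAD~1
git merge topic >/dev/null 2>&1 || true
replayed=$(cat f.txt)
echo "$replayed"
```

The second merge still stops and waits for a commit, but the file already holds the earlier resolution and, thanks to autoupdate, it is already staged.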

Importing some files across a branch

git checkout my_branch -- my_file/
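A sketch of what this does, in a throwaway repository (the names my_branch and my_file/ follow the command above; everything else is a placeholder): the directory is copied from the other branch into the current one and staged, without switching branches.

```shell
set -eu
# Throwaway repository; path and identity are placeholders.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "root"
base=$(git rev-parse --abbrev-ref HEAD)   # whatever the default branch is called

# my_branch gains a directory that the base branch lacks.
git checkout -q -b my_branch
mkdir my_file
echo "from my_branch" > my_file/notes.txt
git add my_file
git commit -q -m "add my_file/"

# Back on the base branch, copy just that directory across.
git checkout -q "$base"
git checkout my_branch -- my_file/

staged=$(git diff --cached --name-only)      # the imported files are already staged
current=$(git rev-parse --abbrev-ref HEAD)   # still on the base branch
```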

Garbage collecting

In brief, this will purge a lot of stuff from a constipated repo in emergencies:

git reflog expire --expire=now --all && git gc --prune=now

In-depth explanation.
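A sketch of what the purge does, in a throwaway repository (path, identity and file names are placeholders): an abandoned commit survives while the reflog still mentions it, and is gone once the reflog is expired and gc has pruned.

```shell
set -eu
# Throwaway repository; path, identity and file names are placeholders.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "keep me"

# Make a commit, then abandon it; only the reflog still refers to it.
echo "big mistake" > junk.bin
git add junk.bin
git commit -q -m "oops"
orphan=$(git rev-parse HEAD)
git reset -q --hard HEAD~1

# At this point the abandoned commit is still recoverable.
if git cat-file -e "$orphan" 2>/dev/null; then before=present; else before=gone; fi

git reflog expire --expire=now --all && git gc -q --prune=now

# With the reflog entries expired, gc was free to delete the object.
if git cat-file -e "$orphan" 2>/dev/null; then after=present; else after=gone; fi
echo "before: $before, after: $after"
```

This is the same reason the command is for emergencies only: the reflog is the usual safety net for undoing mistakes, and this throws it away.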

Editing history

Cleaning out all big files

bfg does that:

git clone --mirror git://example.com/some-big-repo.git
java -jar bfg.jar --strip-blobs-bigger-than 10M some-big-repo.git
cd some-big-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push

Deleting specific things

I think bfg also does this. There is also native support:

git filter-branch -f \
    --index-filter \
    'git rm -r --cached --ignore-unmatch unwanted_files'

Making it work with a broken-permissioned FS

e.g. you are editing a git repo on NTFS via Linux and things are silly.

git config core.filemode false
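A sketch of the symptom and the cure, assuming a scratch directory on a filesystem that does honour permission bits, so that the breakage can be simulated with chmod. The file name run.sh is a placeholder.

```shell
set -eu
# Throwaway repository; path, identity and file name are placeholders.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
git config user.name demo
git config user.email demo@example.com
echo "echo hi" > run.sh
git add run.sh
git commit -q -m "add run.sh"

# Simulate a filesystem that mangles permissions: flip the executable bit.
git config core.filemode true
chmod +x run.sh
with_filemode=$(git status --porcelain)      # run.sh is reported as modified

# Tell git to ignore the executable bit and the file is clean again.
git config core.filemode false
without_filemode=$(git status --porcelain)   # empty
```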

Detecting if there are changes to commit

Thomas Nyman:

if output=$(git status --porcelain) && [ -z "$output" ]; then
  # Working directory clean
  echo "clean"
else
  # Uncommitted changes (or git status itself failed)
  echo "dirty"
fi

Emergency commit

Oh crap I’m leaving the office in a hurry and I just need to get my work into git ASAP for continuing on another computer. I don’t care about sensible commit messages because I am on my own private branch and no-one else will see them when I squash the pull request.

I put this little script in a file called gitbang to automate the most common case.

#!/usr/bin/env bash
if output=$(git status --porcelain) && [ -z "$output" ]; then
  echo "nothing to commit"
else
  git add --all && git commit -m bang
fi
git pull && git submodule update --init --recursive && git push

Content-specific diffing

Tools such as git-latexdiff provide custom diffing for, in this case, LaTeX code.

Jupyter

For sanity in git+jupyter, see jupyter.

Decent GUIs

See Git GUIs.

Which repo am I in?

For fish and bash shell, see bash-git-prompt.

Data versioning

See data versioning.