- Learning git
- Handy git commands
- What git calls things
- Filters
- Commit hooks
- Subtrees/submodules/subprojects/subdirs/subterranean mole people
- Deleting all tags
- Helpers
- Importing some files across a branch
- Garbage collecting
- Editing history
- Making it work with a broken-permissioned FS
- Detecting if there are changes to commit
- Emergency commit
- Content-specific diffing
- SSH credentials
- Jupyter
- Decent GUIs
- Which repo am I in?
- Data versioning
By Kerryn Wood
My own git notes.
See also the more universally acclaimed classic git tips.
There are better learning resources than this. Some are noted here, in fact.
Learning git
See the fastai masterclass for many more helpful tips/links/scripts/recommendations. Learn Git Branching explains the mechanics in a friendly fashion.
Handy git commands
Merging
During a merge, git checkout --theirs filename (or --ours) will check out their version or mine, respectively.
The following sweet hack will resolve all files accordingly:
grep -lr '<<<<<<<' . | xargs git checkout --theirs
TODO: Surely I can find conflicted files using git natively without grep. Should look that up.
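On that TODO: git can list unmerged paths itself, since `--diff-filter=U` selects files in the unmerged state. A minimal sketch, assuming ordinary filenames without newlines; the function names are made up for illustration, and the `git add` step is my addition to mark each file as resolved.

```shell
#!/bin/sh
# List paths that are still conflicted (unmerged) in the current merge.
conflicted_files() {
  git diff --name-only --diff-filter=U
}

# Resolve every conflicted file in favour of "theirs" and stage the result.
# Hypothetical helper name; swap --theirs for --ours to keep my side instead.
resolve_all_theirs() {
  conflicted_files | while IFS= read -r f; do
    git checkout --theirs -- "$f" && git add -- "$f"
  done
}
```

Unlike the grep version, this does not trip over files that merely contain the text `<<<<<<<`.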
Searching
…for a matching file
…for a matching commit
Easy, except for the abstruse naming: it is called “pickaxe” and spelled -S.
git log -Sword
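For the record, -S finds commits that change the number of occurrences of a literal string, while -G matches a regex against the added and removed lines. A small sketch wrapping both; the function names are my own invention.

```shell
#!/bin/sh
# Commits that add or remove an occurrence of the literal string $1 (pickaxe).
pickaxe() {
  git log --oneline -S"$1"
}

# Commits whose added or removed lines match the regex $1.
pickaxe_regex() {
  git log --oneline -G"$1"
}
```

Usage would be something like `pickaxe some_function_name`, run from inside the repository.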
Clone a single branch
git clone --single-branch --branch <branchname> <remote-repo>
Remove file from versioning without deleting my copy
git rm --cached blah.tmp
Ignore macOS .DS_Store files
echo .DS_Store >> "$HOME/.gitignore_global"
git config --global core.excludesfile "$HOME/.gitignore_global"
Delete remote branch
git push <remote_name> --delete <branch_name>
Push to a non-obvious branch
git push origin HEAD:refs/heads/backdoor
This is almost obvious, except that git’s naming of things seems arbitrary. Why refs/heads/SOMETHING? Well…
What git calls things
By which I mean that which is formally referred to as git references.
git references is the canonical description of the mechanics here.
tl;dr the most common names are refs/heads/SOMETHING for branch SOMETHING, refs/tags/SOMETHING for tag SOMETHING, and refs/remotes/SOMEREMOTE/SOMETHING for (last known state of) a remote branch.
As alexwlchan explains, these references are friendly names for commits.
The uses are (at least partly) convention and other references can be used too.
For example, gerrit uses refs/for/ for code review purposes.
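To see that a branch name really is just shorthand for a ref, something like the following sketch can help. The function names are hypothetical; `git rev-parse` and `git for-each-ref` are the stock commands.

```shell
#!/bin/sh
# Resolve a short branch name ($1) and its full ref name.
# Both lines of output should be the same commit hash.
show_ref_equivalence() {
  git rev-parse "$1"
  git rev-parse "refs/heads/$1"
}

# List every ref git knows about, with the object each one points at.
list_refs() {
  git for-each-ref --format='%(refname) %(objectname:short)'
}
```

Running `list_refs` in a repo with a remote should show the refs/heads/, refs/tags/ and refs/remotes/ namespaces side by side.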
Filters
Commands applied to your files on the way in and out of the repository.
Keywords: smudge, clean, .gitattributes
These are a long story, but not so complicated in practice.
A useful one is stripping crap from jupyter notebooks.
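As a toy illustration of the mechanics (not the jupyter case), here is a sketch of a clean filter that strips trailing whitespace from .txt files as they are staged. The filter name `stripws` and the function name are assumptions for the example; the `filter.<name>.clean` / `filter.<name>.smudge` config keys and the `filter=` attribute are the standard mechanism.

```shell
#!/bin/sh
# Install a hypothetical "stripws" filter in the current repository.
install_stripws_filter() {
  # clean: runs when content goes into the repository (git add).
  git config filter.stripws.clean "sed -e 's/[[:space:]]*\$//'"
  # smudge: runs when content comes out (checkout); here a no-op.
  git config filter.stripws.smudge cat
  # Route all .txt files through the filter.
  echo '*.txt filter=stripws' >> .gitattributes
}
```

The working copy keeps its trailing whitespace; only the stored blob is cleaned. Note that the filter definition lives in local config, so every clone has to install it again.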
Commit hooks
For doing stuff before you put it in cold storage. For me this means, e.g., asking DID YOU REALLY WANT TO INCLUDE THAT GIANT FILE?
Here is a commit hook that does exactly that. I made a slightly modernized version:
curl -L https://gist.github.com/danmackinlay/6e4a0e5c38a43972a0de2938e6ddadba/raw/install.sh | bash
After that installation you can retrofit the hook to an existing repository thusly:
cp -R ~/.git_template/hooks .git/
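For flavour, here is an independent sketch of the giant-file idea. It is not the script from the gist above, and the function name and the 5 MB default are assumptions. It checks the size of each staged blob and refuses if any exceeds the limit.

```shell
#!/bin/sh
# Fail if any staged (added or modified) file is larger than $1 bytes.
# Hypothetical helper; the default limit of 5000000 bytes is arbitrary.
check_staged_sizes() {
  limit=${1:-5000000}
  big=$(git diff --cached --name-only --diff-filter=AM |
    while IFS= read -r f; do
      # ":path" names the staged version of the file in the index.
      size=$(git cat-file -s ":$f")
      if [ "$size" -gt "$limit" ]; then
        echo "$f ($size bytes)"
      fi
    done)
  if [ -n "$big" ]; then
    echo "DID YOU REALLY WANT TO INCLUDE THAT GIANT FILE?" >&2
    echo "$big" >&2
    return 1
  fi
}
```

In a real .git/hooks/pre-commit file one would end with `check_staged_sizes || exit 1`, and bypass it when needed with `git commit --no-verify`.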
There are various frameworks for managing hooks, if you have lots.
For example, pre-commit is a mini-system for managing git hooks, based on python. Husky is a node.js-based one.
I am not sure whether hook management systems actually save time overall for a solo developer, since the kind of person who remembers to install a pre-commit hook is also the kind of person who is relatively less likely to need one. Also it is remarkably labour-intensive to install the dependencies for all these systems, so if you are using heterogeneous systems this becomes tedious.
Subtrees/submodules/subprojects/subdirs/subterranean mole people
Sub-projects inside other projects? External projects? The simplest way of integrating external projects is as subtrees. Once this is set up you can mostly ignore them. Alternatively there are submodules, which have various complications.
Alternatively there is the subtrac system, which I have not yet used.
Splicing a subtree onto a project
Creatin’:
git fetch remote branch
git subtree add --prefix=subdir remote branch --squash
Updatin’:
git fetch remote branch
git subtree pull --prefix=subdir remote branch --squash
git subtree push --prefix=subdir remote branch
Con: Rebasin’ with a subtree in your repo is slow and involved.
Pruning off a sub-project
Use subtree split to prise out one chunk. It has some wrinkles but is fast and easy.
pushd superproject
git subtree split -P project_subdir -b project_branch
popd
mkdir project
pushd project
git init
git pull ../superproject project_branch
Alternatively, to comprehensively rewrite history to exclude everything outside a subdir:
git clone superproject subproject
pushd subproject
git filter-branch \
--subdirectory-filter project_subdir \
--prune-empty -- \
--all
Submodules
Including external projects as separate repositories within a repository is also possible, but I won’t document it here, since it’s well documented elsewhere, and I use it less. NB: much discipline required to make it go.
Subtrac
Have not yet tried.
subtrac is a helper tool that makes it easier to keep track of your git submodule contents. It collects the entire contents of the entire history of all your submodules (recursively) into a separate git branch, which can be pushed, pulled, forked, and merged however you want.
Download a sub-directory from a git tree
This works for github at least. I think anything running git-svn?
- replace tree/master => trunk
- svn co the new url
svn co https://github.com/buckyroberts/Source-Code-from-Tutorials/trunk/Python
Helpers
gerrit
Gerrit is a code review system for git.
legit
legit simplifies feature branch workflows.
rerere
Not repeating yourself during merges? git rerere automates this:
git config --global rerere.enabled true
git config --global rerere.autoupdate true
Importing some files across a branch
git checkout my_branch -- my_file/
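Two variants of the same trick, as a sketch with made-up function names: the first is the command above wrapped up, the second uses `git show branch:path` to pull the other branch's version of a file out under a new name without touching the index.

```shell
#!/bin/sh
# Overwrite path $2 in the working tree and index with its version on branch $1.
import_path() {
  git checkout "$1" -- "$2"
}

# Write the version of file $2 from branch $1 to a new file $3,
# leaving the existing file and the index alone.
import_as() {
  git show "$1:$2" > "$3"
}
```

The second form is handy when I want to compare the two versions by eye before deciding which one wins.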
Garbage collecting
In brief, this will purge a lot of stuff from a constipated repo in emergencies:
git reflog expire --expire=now --all && git gc --prune=now
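To see whether the purge achieved anything, it can be bracketed with `git count-objects`, which reports object counts and on-disk size. A sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Report repository size, purge everything unreachable, then report again.
# Caution: expiring the reflog removes the usual safety net for recovering
# commits that are no longer on any branch.
purge_unreachable() {
  echo "before:"
  git count-objects -vH
  git reflog expire --expire=now --all && git gc --prune=now
  echo "after:"
  git count-objects -vH
}
```

Anything still referenced by a branch, tag or stash survives; only unreachable objects go.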
Editing history
Cleaning out all big files
Every time I find a good picture of an octopus on the internet I put it on my git blog pages, which bloats the repo. bfg cleans that up:
git clone --mirror git://example.com/some-big-repo.git
java -jar bfg.jar --strip-blobs-bigger-than 10M some-big-repo.git
cd some-big-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push
Deleting specific things
I think bfg also does this. There is also native support:
git filter-branch -f \
--index-filter \
'git rm -r --cached --ignore-unmatch unwanted_files'
Making it work with a broken-permissioned FS
e.g. you are editing a git repo on NTFS via Linux and things are silly.
git config core.filemode false
Detecting if there are changes to commit
if output=$(git status --porcelain) && [ -z "$output" ]; then
# Working directory clean
echo "Working directory clean"
else
# Uncommitted changes (or git status itself failed)
echo "Uncommitted changes"
fi
Emergency commit
Oh crap I’m leaving the office in a hurry and I just need to get my work into git ASAP for continuing on another computer. I don’t care about sensible commit messages because I am on my own private branch and no-one else will see them when I squash the pull request.
I put this little script in a file called gitbang to automate this case.
#!/usr/bin/env bash
# I’m leaving the office. Capture all changes in my private branch and push to server.
if output=$(git status --porcelain) && [ -z "$output" ]; then
echo "nothing to commit"
else
git add --all && git commit -m bang
fi
git pull && git submodule update --init --recursive && git push
Content-specific diffing
Tools such as git-latexdiff provide custom diffing for, in this case, LaTeX code.
These need to be found on a case-by-case basis.
SSH credentials
Managing SSH credentials in git is non-obvious. See SSH.
Jupyter
For sanity in git+jupyter, see jupyter.
Decent GUIs
See Git GUIs.
Which repo am I in?
For fish and bash shell, see bash-git-prompt.
Data versioning
See data versioning.