- Learning git
- Handy git config
- Handy git commands
- What git calls things
- Commit hooks
- Subtrees/submodules/subprojects/subdirs/subterranean mole people
- Deleting all tags
- Importing some files across a branch
- Garbage collecting
- Editing history
- Making it work with a broken-permission FS
- Detecting if there are changes to commit
- Emergency commit
- Git hosting
- git email workflows
- Content-specific diffing
- SSH credentials
- Decent GUIs
- Which repo am I in?
- Data versioning
My own git notes, not intended to be a tutorial; there are better learning resources than this. Some are noted here, in fact. See also the more universally acclaimed classic git tips.
Handy git config
git config --global core.editor "code-insiders --wait" # insiders
echo .DS_Store >> "$HOME/.gitignore_global"
git config --global core.excludesfile "$HOME/.gitignore_global"
Handy git commands
During a merge, git checkout --theirs filename (or --ours) will check out their version or my version, respectively.
The following sweet hack will resolve all files accordingly:
grep -lr '<<<<<<<' . | xargs git checkout --theirs --
TODO: Surely I can find conflicted files using git natively without grep. Should look that up.
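Regarding that TODO: as far as I can tell, git diff with --diff-filter=U lists unmerged paths without any grepping. A minimal sketch, assuming a throwaway repository whose file and branch names (file.txt, main, other) are made up for the demonstration:

```shell
set -eu
# Build a scratch repo with a deliberate conflict (all names are hypothetical).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"
echo base > file.txt
git add file.txt
git commit -q -m "base"
git checkout -q -b other
echo theirs > file.txt
git commit -q -am "their change"
git checkout -q main
echo ours > file.txt
git commit -q -am "our change"
git merge other >/dev/null 2>&1 || true   # expected to stop on a conflict

# List conflicted (unmerged) paths natively; sort -u guards against repeats.
conflicted=$(git diff --name-only --diff-filter=U | sort -u)
echo "conflicted: $conflicted"

# Resolve every conflicted path in favour of their side, then stage the result.
git diff --name-only --diff-filter=U -z | xargs -0 git checkout --theirs --
git add -A
resolved=$(cat file.txt)
echo "resolved content: $resolved"
```

The -z / xargs -0 pairing is only there so that filenames with spaces survive the pipe.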
…for a matching file
…for a matching commit
Easy, except for the abstruse naming; it is called “pickaxe” and spelled
git log -Sword
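A small sketch of what the pickaxe does, assuming a scratch repository with an invented file (notes.txt) and an invented search string (needle). It should report only commits that change how many times the string occurs, so commits that merely leave an existing occurrence in place are skipped:

```shell
set -eu
# Scratch repo; the file name and commit messages are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
echo alpha > notes.txt
git add notes.txt
git commit -q -m "add alpha"
echo needle >> notes.txt
git commit -q -am "introduce needle"
echo omega >> notes.txt
git commit -q -am "add omega"

# Pickaxe: subjects of commits that added or removed an occurrence of "needle".
hits=$(git log -Sneedle --format=%s)
echo "$hits"
```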
Clone a single branch
git clone --single-branch --branch <branchname> <remote-repo>
Remove file from versioning without deleting my copy
git rm --cached blah.tmp
delete remote branch
git push <remote_name> --delete <branch_name>
Push to a non-obvious branch
git push origin HEAD:refs/heads/backdoor
This is almost obvious, except that git's naming of things seems arbitrary.
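To see the two push forms above without touching a real server, here is a sketch against a local bare repository standing in for the remote. The directory names (remote.git, work) are assumptions for the demo; the branch name backdoor is the one from the note above.

```shell
set -eu
# A bare repo plays the role of the server; a second repo is the working clone.
demo=$(mktemp -d)
cd "$demo"
git init -q --bare remote.git
git init -q work
cd work
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "first"
git remote add origin ../remote.git

# Push whatever HEAD is to an explicitly named remote branch.
git push -q origin HEAD:refs/heads/backdoor
pushed=$(git ls-remote origin refs/heads/backdoor | cut -f2)
echo "remote now has: $pushed"

# Delete that remote branch again.
git push -q origin --delete backdoor
remaining=$(git ls-remote origin refs/heads/backdoor)
echo "after delete: '${remaining}'"
```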
What git calls things
By which I mean that which is formally referred to as git references.
git references is the canonical description of the mechanics.
tl;dr the most common names are
refs/heads/SOMETHING for branch
refs/remotes/SOMEREMOTE/SOMETHING for (last known state of) a remote branch.
As alexwlchan explains, these references are friendly names for commits.
The uses are (at least partly) convention, and other references can be used too; e.g. Gerrit uses refs/for/ for code review purposes.
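A sketch of how these names can be poked at directly, assuming a scratch repository. The branch name feature and the ref refs/scratch/bookmark are invented for illustration and are not conventions that git or anyone else prescribes.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "first"
git branch feature

# The short branch name expands to a full reference name...
full=$(git rev-parse --symbolic-full-name feature)
# ...which is just a friendly name for a commit id.
via_ref=$(git rev-parse refs/heads/feature)
head=$(git rev-parse HEAD)

# References outside refs/heads work too (hypothetical name).
git update-ref refs/scratch/bookmark HEAD
bookmark=$(git rev-parse refs/scratch/bookmark)

# Show every reference and the commit it points at.
git for-each-ref --format='%(refname) -> %(objectname:short)'
```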
Commands applied to your files on the way in and out of the repository.
These are a long story, but not so complicated in practice.
A useful one is stripping crap from jupyter notebooks.
For doing stuff before you put it in cold storage. For me this means, e.g., asking DID YOU REALLY WANT TO INCLUDE THAT GIANT FILE?
curl -L https://gist.github.com/danmackinlay/6e4a0e5c38a43972a0de2938e6ddadba/raw/install.sh | bash
After that installation you can retrofit the hook to an existing repository thusly
cp -R ~/.git_template/hooks .git/
I am not sure whether a hook management system actually saves time overall for a solo developer, since the kind of person who remembers to install a pre-commit hook is also the kind of person who is less likely to need one. Also, it is remarkably labour-intensive to install the dependencies for all these systems, so if you are using heterogeneous systems this becomes tedious.
To skip the pre-commit hook,
git commit --no-verify
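For flavour, a giant-file check of the kind described above could look roughly like the following. This is my own minimal sketch, not the contents of the gist linked above; the 1 MB threshold, the hook body and the file name giant.bin are all assumptions. The script installs the hook into a scratch repository, shows it blocking a commit, and then bypasses it with --no-verify.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
mkdir -p .git/hooks

# Hypothetical pre-commit hook: refuse staged files above a size threshold.
cat > .git/hooks/pre-commit <<'EOF'
#!/usr/bin/env bash
max_bytes=1000000   # assumed threshold, tune to taste
status=0
while IFS= read -r -d '' path; do
    size=$(git cat-file -s ":$path")   # size of the staged blob
    if [ "$size" -gt "$max_bytes" ]; then
        echo "DID YOU REALLY WANT TO INCLUDE $path ($size bytes)?" >&2
        status=1
    fi
done < <(git diff --cached --name-only -z --diff-filter=AM)
exit "$status"
EOF
chmod +x .git/hooks/pre-commit

# Stage a 2 MB file and try to commit it.
head -c 2000000 /dev/zero > giant.bin
git add giant.bin
if git commit -q -m "add giant file" 2>/dev/null; then
    blocked=no
else
    blocked=yes
fi
echo "blocked by hook: $blocked"

# Skipping the hook lets the same commit through.
git commit -q --no-verify -m "add giant file anyway"
commits=$(git rev-list --count HEAD)
```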
Subtrees/submodules/subprojects/subdirs/subterranean mole people
Sub-projects inside other projects? External projects? The simplest way of integrating external projects is as subtrees. Once this is set up you can mostly ignore them. Alternatively there are submodules, which have various complications. More recently, there is the subtrac system, which I have not yet used.
Submodules include external projects as separate repositories within a repository.
I won’t document it, since it’s well documented elsewhere, and I use it less often
NB: some discipline required to make it go; you need to remember to
git submodule init etc.
Have not yet tried.
subtrac is a helper tool that makes it easier to keep track of your git submodule contents. It collects the entire contents of the entire history of all your submodules (recursively) into a separate git branch, which can be pushed, pulled, forked, and merged however you want.
Subtree subsumes one git tree into another in a usually-transparent way (no separate checkout as with submodules). It can be used for temporary merging or for splicing and dicing projects.
Splicing a subtree onto a project
git fetch remote branch
git subtree add --prefix=subdir remote branch --squash
git fetch remote branch
git subtree pull --prefix=subdir remote branch --squash
git subtree push --prefix=subdir remote branch --squash
Pruning off a sub-project
Use subtree split to prise out one chunk. It has some wrinkles but is fast and easy.
pushd superproject
git subtree split -P project_subdir -b project_branch
popd
mkdir project
pushd project
git init
git pull ../superproject project_branch
Alternatively, to comprehensively rewrite history to exclude everything outside a subdir:
pushd superproject
cd ..
git clone superproject subproject
pushd subproject
git filter-branch \
    --subdirectory-filter project_subdir \
    --prune-empty -- \
    --all
Download a sub-directory from a git tree
This works for github at least. I think anything running
- replace tree/master => trunk
- svn co the new url
svn co https://github.com/buckyroberts/Source-Code-from-Tutorials/trunk/Python
The branchless workflow is designed for use in a repository with a single main branch that all commits are rebased onto. It improves developer velocity by encouraging fast and frequent commits, and helps developers operate on these commits fearlessly.
In the branchless workflow, the commits you’re working on are inferred based on your activity, so you no longer need branches to keep track of them. Nonetheless, branches are sometimes convenient, and git-branchless fully supports them. If you prefer, you can continue to use your normal workflow and benefit from features like git undo without going entirely branchless.
Also: git undo: We can do better
How is it so easy to “lose” your data in a system that’s supposed to never lose your data?
Well, it’s not that it’s too easy to lose your data — but rather, that it’s too difficult to recover it. For each operation you want to recover from, there’s a different “magic” incantation to undo it. All the data is still there in principle, but it’s not accessible to many in practice.
…To address this problem, I offer git undo
Gerrit is a code review system for git.
legit simplifies feature branch workflows.
Not repeating yourself during merges? git rerere automates this:
git config --global rerere.enabled true
git config --global rerere.autoupdate true
Importing some files across a branch
git checkout my_branch -- my_file/
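A sketch of what that checkout does, assuming a scratch repository with one invented file (shared.txt) that differs between main and my_branch. The file should arrive in the working tree already staged, while the current branch stays put.

```shell
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"
git config user.name "Demo"
echo v1 > shared.txt
git add shared.txt
git commit -q -m "v1 on main"
git checkout -q -b my_branch
echo v2 > shared.txt
git commit -q -am "v2 on my_branch"
git checkout -q main

# Import the other branch's version of one path without switching branches.
git checkout my_branch -- shared.txt

content=$(cat shared.txt)                 # the file as it is on my_branch
staged=$(git diff --cached --name-only)   # the import is already staged
branch=$(git symbolic-ref --short HEAD)   # still on the original branch
echo "$branch has staged $staged with content $content"
```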
In brief, this will purge a lot of stuff from a constipated repo in emergencies:
git reflog expire --expire=now --all && git gc --prune=now
Cleaning out all big files
bfg does that:
git clone --mirror git://example.com/some-big-repo.git
cd some-big-repo.git
git repack
bfg --strip-blobs-bigger-than 10M .
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push -f
Deleting specific things
bfg also does this. There is also native support:
git filter-branch -f \
    --index-filter 'git rm -r --cached --ignore-unmatch unwanted_files'
Making it work with a broken-permission FS
e.g. you are editing a git repo on NTFS via Linux and things are silly.
git config core.filemode false
Detecting if there are changes to commit
if output=$(git status --porcelain) && [ -z "$output" ]; then
    # Working directory clean
    echo "clean"
else
    # Uncommitted changes
    echo "uncommitted changes"
fi
Oh crap I’m leaving the office in a hurry and I just need to get my work into git ASAP for continuing on another computer. I don’t care about sensible commit messages because I am on my own private branch and no-one else will see them when I squash the pull request.
I put this little script in a file called gitbang to automate this case.
#!/usr/bin/env bash
# I’m leaving the office. Capture all changes in my private branch and push to server.
if output=$(git status --porcelain) && [ -z "$output" ]; then
    echo "nothing to commit"
else
    git add --all && git commit -m bang
fi
git pull && git submodule update --init --recursive && git push
Or maybe avoid the need for hosting by using…
git email workflows
Managing SSH credentials in git is non-obvious. See SSH.
For sanity in git+jupyter, see
See Git GUIs.
Which repo am I in?
bash shell, see
See data versioning.