Clean up your Git Repo

Running “git gc” reduced my Git repo from 17 GB to 787 MB!

Generally, doing incremental “git gc” is the right approach, and better
than doing “git gc --aggressive”. It’s going to re-use old deltas, and
when those old deltas can’t be found (the reason for doing incremental GC
in the first place!) it’s going to create new ones.
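
To see what a routine “git gc” does to a repository, you can compare the
output of “git count-objects” before and after. What follows is a minimal
sketch and not part of the original mail: it builds a throwaway repository
(the file name, commit count and identity are invented for the demo) so it
is safe to run anywhere.

```shell
#!/bin/sh
set -eu

# Throwaway repository with made-up demo data.
# In practice you would cd into your own repository instead.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email "demo@example.com"
git config user.name "Demo"

# Twenty revisions of a slowly growing file, so there is something to delta.
i=1
while [ "$i" -le 20 ]; do
    seq 1 $((i * 200)) > data.txt
    git add data.txt
    git commit -q -m "revision $i"
    i=$((i + 1))
done

echo "before gc:"
git count-objects -vH    # "count"/"size" are loose objects, "size-pack" is packed data

git gc -q                # plain, incremental gc: packs loose objects, reuses old deltas

echo "after gc:"
git count-objects -vH
```

In a real repository only the last three commands matter, and “size-pack”
is the number to watch.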

On the other hand, it’s definitely true that an “initial import of a long
and involved history” is a point where it can be worth spending a lot of
time finding the *really*good* deltas. Then, every user ever after (as
long as they don’t use “git gc --aggressive” to undo it!) will get the
advantage of that one-time event. So especially for big projects with a
long history, it’s probably worth doing some extra work, telling the delta
finding code to go wild.

So the equivalent of “git gc --aggressive” – but done *properly* – is to
do (overnight) something like

git repack -a -d --depth=250 --window=250

where that depth thing is just about how deep the delta chains can be
(make them longer for old history – it’s worth the space overhead), and
the window thing is about how big an object window we want each delta
candidate to scan.
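
To check what those two knobs actually produced, “git verify-pack -v”
prints a histogram of delta chain lengths at the end of its output. Here is
a small sketch on a throwaway repository. The demo data and identity are
invented; only the repack flags come from the text.

```shell
#!/bin/sh
set -eu

# Throwaway repository with made-up demo data.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email "demo@example.com"
git config user.name "Demo"

i=1
while [ "$i" -le 30 ]; do
    seq 1 $((i * 100)) > data.txt
    git add data.txt
    git commit -q -m "revision $i"
    i=$((i + 1))
done

# --depth  : upper limit on how long a delta chain may get
# --window : how many candidate objects are compared when looking for a delta base
git repack -a -d -q --depth=250 --window=250

# The trailing "chain length = N: M objects" lines show how deep the deltas went.
# "|| true" keeps the script going if the pack happens to contain no deltas.
git verify-pack -v .git/objects/pack/pack-*.idx | grep '^chain length' || true
```

On a tiny repository like this the chains stay short. The limits only start
to matter once the history is long.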

And here, you might well want to add the “-f” flag (which is the “drop all
old deltas” option), since you now are actually trying to make sure that
this one actually finds good candidates.
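
Putting the pieces together, the one-time run could be wrapped in a small
helper like the one below. This is only a sketch: the function name
deep_repack and its argument are made up here, and the flags are the ones
discussed in the text.

```shell
#!/bin/sh
# deep_repack is a hypothetical helper name, not a Git command.
# Usage: deep_repack /path/to/repo
deep_repack() {
    # -a -d : put everything into one new pack and delete the packs it replaces
    # -f    : do not reuse existing deltas, recompute them from scratch
    # --depth / --window : the values suggested in the text
    git -C "$1" repack -a -d -f --depth=250 --window=250
}
```

Run it once on the repository that everybody clones from, then leave the
routine maintenance to plain “git gc”.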

And then it’s going to take forever and a day (i.e. a “do it overnight”
thing). But the end result is that everybody downstream from that
repository will get much better packs, without having to spend any effort
on it themselves.

Linus