Re: [Savannah-hackers-public] FYI, ran "git gc" on all git repositories
From: Sylvain Beucler
Date: Fri, 9 Oct 2009 20:24:57 +0200
User-agent: Mutt/1.5.20 (2009-06-14)
On Fri, Oct 09, 2009 at 07:57:52PM +0200, Jim Meyering wrote:
> Sylvain Beucler wrote:
> > On Fri, Oct 09, 2009 at 04:27:32PM +0200, Jim Meyering wrote:
> >> I did the same thing a few months ago.
> >> For some it made a big difference: emacs.git went from 1.1GB to 155MB.
> >> Active repositories were shrunk to ~20% or even 5% of their original size.
> >>
> >> This is the script I ran:
> >>
> >> #!/bin/bash
> >> # Run "git gc" on every repository and log the size change for each.
> >> log=$(mktemp /tmp/log-repo-gc-XXXXXX)
> >> printf 'Run this to see more detail:\ntail -f %s\n' "$log"
> >> exec >"$log"
> >>
> >> cd /vservers/vcs-noshell/srv/git || exit 1
> >>
> >> for dir in *.git; do
> >>   echo "$dir..." 1>&2
> >>   start_kb=$(du -sk "$dir" | cut -f1)
> >>   printf '%-20s %u KiB->' "$dir" "$start_kb"
> >>   start_sec=$(date +%s)
> >>   git --git-dir="$dir" gc
> >>   end_sec=$(date +%s)
> >>   elapsed=$((end_sec - start_sec))
> >>   end_kb=$(du -sk "$dir" | cut -f1)
> >>   percent_saved=$(echo "scale=2; 100 * ($start_kb - $end_kb) / $start_kb" | bc)
> >>   printf '%s (saved %s%% in %ss)\n' "$end_kb" "$percent_saved" "$elapsed"
> >> done
> >
> > Cool. Nice optimization.
> >
> > I wonder what kind of effects this has though.
>
> It's certainly safe.
> I've done this numerous times on savannah, and on other systems,
> with no ill effects, other than for those unlucky
> enough to have to use git-over-http.
>
> > Possibly HTTP users will have to download a big file (but then
> > shouldn't use http ;))
>
> Right on both counts.
>
> > We should ask Petr Baudis from repo.or.cz; I think there was a
> > discussion a year or two ago, and they weren't sure. Or we could fix
> > git to do it automatically, if that's feasible.
>
> One way or another, it's worth automating.
I think the issue was weird branching, maybe branching from remotes
that were deleted _and_ in forks using repo.or.cz's space-efficient
local forks, but I may be wrong.
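If the concern is about forks that borrow objects from a parent
repository through objects/info/alternates (my assumption about how
the space-efficient forks work, not something confirmed in this
thread), a quick way to see which repositories are involved before a
mass gc could look like the sketch below. The function name
list_alternates is made up for illustration:

```shell
#!/bin/bash
# Sketch: print every repository under a root directory that borrows
# objects from elsewhere via objects/info/alternates.
# "list_alternates" is a hypothetical helper, not part of git.
list_alternates() {
  local root=$1 dir alt line
  for dir in "$root"/*.git; do
    alt=$dir/objects/info/alternates
    # Skip repositories without a non-empty alternates file.
    [ -s "$alt" ] || continue
    while IFS= read -r line; do
      printf '%s borrows from %s\n' "${dir##*/}" "$line"
    done < "$alt"
  done
}
```

For example, "list_alternates /vservers/vcs-noshell/srv/git" would
print one line per borrowed object directory, or nothing if no
repository uses alternates.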
The last time you ran a massive gc was 2009-04-12. By the way, it's
good to mention this in the ChangeLog too :)
--
Sylvain