netbeans-dev mailing list archives

From Emilian Bold <emilian.b...@gmail.com>
Subject Re: Switching to Git was: Version control advice
Date Thu, 24 Nov 2016 19:07:19 GMT
At under 1GB the repository size is not an issue anymore.

It's sad to see we will still have migration problems due to legal
considerations.

Could you provide an estimate of how long it would take to verify and
whitelist the entire codebase Oracle plans on donating?

It's unclear to me how history would be preserved with an incremental
approach.

I would prefer we migrate the whole thing in one piece with history and all.


--emi

On Thu, Nov 24, 2016 at 5:22 PM, Jaroslav Tulach <jaroslav.tulach@oracle.com> wrote:

> Emilian, Jan, Mark, great work.
>
> A smooth migration from Hg to Git is essential for a successful migration to
> Apache. Thanks a lot for investigating how to do that.
>
> My plan (as described in another email) is to prepare the code donation in
> Hg and update it incrementally with code integrated into Hg.
>
> Are your conversion methods ready for incremental updates or do they only
> work as a one-time batch conversion?
>
> -jt
>
> On Thursday, 24 November 2016 10:41:50 CET Jan Lahoda wrote:
> > Interesting. I tried "git gc --aggressive" on Mark's converted
> > repository, and the result is:
> > netbeans-import/.git$ du -hs .
> > 792M    .
> >
> > The original was:
> > netbeans-import.git $ du -hs .
> > 3,5G    .
> >
> > (IIRC Mark was converting http://hg.netbeans.org/main, not releases, so the
> > repository is a little bit smaller than the releases one.)
> >
> > I tried:
> > $ git log -p | sha1sum
> >
> > on both repositories, and the hashes appear to be the same. I also tried to
> > clone the gc-ed repository using git clone --bare --no-local, and the
> > resulting repository is still about the same size. So, this seems good to
> > me, unless there is some downside I don't know about.
> >
> > Jan
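
The comparison Jan describes (hashing the full `git log -p` output of both repositories) can be sketched in Python as below. This is only an illustration under stated assumptions: the helper names `sha1_of_chunks`, `log_patch_chunks` and `same_history` are hypothetical, and the repository paths passed to them are whatever local clones are being compared.

```python
import hashlib
import subprocess
from typing import Iterable, Iterator


def sha1_of_chunks(chunks: Iterable[bytes]) -> str:
    """Return the SHA-1 hex digest of a byte stream, like piping it into sha1sum."""
    digest = hashlib.sha1()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def log_patch_chunks(repo_dir: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the output of `git log -p` for repo_dir in fixed-size chunks.

    Hypothetical helper: streams the output so a multi-gigabyte log is never
    held in memory at once.
    """
    proc = subprocess.Popen(
        ["git", "-C", repo_dir, "log", "-p"], stdout=subprocess.PIPE
    )
    assert proc.stdout is not None
    with proc.stdout:
        for chunk in iter(lambda: proc.stdout.read(chunk_size), b""):
            yield chunk
    if proc.wait() != 0:
        raise RuntimeError(f"git log failed in {repo_dir}")


def same_history(repo_a: str, repo_b: str) -> bool:
    """True if both repositories produce byte-identical `git log -p` output."""
    return sha1_of_chunks(log_patch_chunks(repo_a)) == sha1_of_chunks(
        log_patch_chunks(repo_b)
    )
```

Calling `same_history("netbeans-import", "netbeans-import.git")` would correspond to the two directories shown above. Note that plain `git log` only walks commits reachable from HEAD, so this check says nothing about other branches unless `--all` is added to the command.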
> >
> >
> > On Wed, Nov 23, 2016 at 8:26 PM, Emilian Bold <emilian.bold@gmail.com> wrote:
> > > Actually I don't believe the data loss is that large. (There may also be
> > > mercurial commits that are intentionally ignored by the conversion script,
> > > like commits that only add tags?)
> > >
> > > hg log | grep '^changeset:' | wc -l
> > >
> > >   313209
> > >
> > > git log | grep '^commit ' | wc -l
> > >
> > >   301478
> > >
> > > So there is a difference of 11731 commits (about 4%) but those couldn't
> > > have such a large impact on repository size.
> > >
> > > I hope somebody else is willing to work with me on this so we document
> > > everything and do a reproducible repository conversion.
> > >
> > >
> > >
> > > --emi
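
The arithmetic behind the "about 4%" figure above can be reproduced with a short Python sketch. The function name `commit_gap` is hypothetical; the two counts are the ones quoted in the message. One caveat worth keeping in mind: plain `git log` lists only commits reachable from HEAD, while `hg log` lists every changeset in the repository, so part of the gap may simply be commits on other branches. Counting with `git rev-list --all --count` would be a fairer comparison.

```python
from typing import Tuple


def commit_gap(hg_changesets: int, git_commits: int) -> Tuple[int, float]:
    """Return (missing commits, missing as a percentage of the hg total)."""
    missing = hg_changesets - git_commits
    return missing, 100.0 * missing / hg_changesets


# Counts taken from the `hg log` and `git log` pipelines quoted above.
missing, percent = commit_gap(313209, 301478)
print(f"{missing} changesets not listed by git log ({percent:.2f}%)")
```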
> > >
> > > On Wed, Nov 23, 2016 at 9:10 PM, Emilian Bold <emilian.bold@gmail.com> wrote:
> > > > Well, I dunno what black magic `gc --aggressive` does but the repository
> > > > is 0.85GB now!
> > > >
> > > > I also ran `git reflog expire` first but it didn't change the size at all.
> > > >
> > > > One thing to keep in mind is that I used --force although I had 6 commits
> > > > with the warning "repository has at least one unnamed head". Those were
> > > > probably all close-branch commits (hg commit --close-branch).
> > > >
> > > > So I might have data loss(!) since I believe I read that hg-fast-export.sh
> > > > picks only one unnamed head as the migration winner. I wonder if the gc
> > > > command didn't just purge a lot of valid commits from such an unnamed head
> > > > and that's why the repository became so small.
> > > >
> > > > Could somebody else try a test repository conversion and validate my
> > > > numbers?
> > > >
> > > > git gc --aggressive --prune=now
> > > > Counting objects: 4085031, done.
> > > > Delta compression using up to 8 threads.
> > > > Compressing objects: 100% (2909203/2909203), done.
> > > > Writing objects: 100% (4085031/4085031), done.
> > > > Total 4085031 (delta 2150468), reused 1585934 (delta 0)
> > > > Checking connectivity: 4085031, done.
> > > >
> > > >
> > > >
> > > > --emi
> > > >
> > > > On Wed, Nov 23, 2016 at 7:59 PM, Paul Merlin <paulmerlin@apache.org> wrote:
> > > >> Hi Emilian,
> > > >>
> > > >> > I see hg-fast-export.sh finished at some point.
> > > >> >
> > > > >> > As expected though, git does not have any of the disk space gains. The
> > > > >> > converted git releases/ repository is 3.6GB.
> > > >>
> > > >> Just a thought.
> > > >> Did you try some git cleanups after the conversion?
> > > >>
> > > >> git reflog expire --expire=now --all
> > > >> git gc --aggressive --prune=now
> > > >>
> > > >> Cheers
> > > >>
> > > >> > In case these statistics mean something:
> > > >> >
> > > > >> > git-fast-import statistics:
> > > > >> > ---------------------------------------------------------------------
> > > > >> > Alloc'd objects:    4090000
> > > > >> > Total objects:      4085509 (  40220100 duplicates                  )
> > > > >> >       blobs  :      1036365 (  28386238 duplicates     858087 deltas of     969684 attempts)
> > > > >> >       trees  :      2735935 (  11833862 duplicates    1370606 deltas of    2613480 attempts)
> > > > >> >       commits:       313209 (         0 duplicates          0 deltas of          0 attempts)
> > > > >> >       tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
> > > > >> > Total branches:        1283 (       346 loads     )
> > > > >> >       marks:        1048576 (    313209 unique    )
> > > > >> >       atoms:         124011
> > > > >> > Memory total:        218429 KiB
> > > > >> >        pools:         26711 KiB
> > > > >> >      objects:        191718 KiB
> > > > >> > ---------------------------------------------------------------------
> > > > >> > pack_report: getpagesize()            =       4096
> > > > >> > pack_report: core.packedGitWindowSize = 1073741824
> > > > >> > pack_report: core.packedGitLimit      = 8589934592
> > > > >> > pack_report: pack_used_ctr            =   39000045
> > > > >> > pack_report: pack_mmap_calls          =     733040
> > > > >> > pack_report: pack_open_windows        =          4 /          7
> > > > >> > pack_report: pack_mapped              = 4280730006 / 6950823920
> > > > >> > ---------------------------------------------------------------------
> > > >> >
> > > >> >
> > > >> > --emi
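
As a sanity check on the git-fast-import statistics quoted above, the per-type object counts should add up to the reported total. The following Python sketch parses the relevant lines and verifies that; the names `STATS`, `LINE` and `parse_counts` are hypothetical, and the regular expression assumes the line layout shown in this message rather than any documented output format.

```python
import re
from typing import Dict, Tuple

# Object-count lines copied from the statistics quoted above.
STATS = """\
Total objects:      4085509 (  40220100 duplicates                  )
      blobs  :      1036365 (  28386238 duplicates     858087 deltas of     969684 attempts)
      trees  :      2735935 (  11833862 duplicates    1370606 deltas of    2613480 attempts)
      commits:       313209 (         0 duplicates          0 deltas of          0 attempts)
      tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
"""

# Captures: label, object count, duplicate count.
LINE = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s+(\d+) \(\s*(\d+) duplicates", re.MULTILINE)


def parse_counts(text: str) -> Dict[str, Tuple[int, int]]:
    """Map each label to its (objects, duplicates) pair."""
    return {name: (int(n), int(d)) for name, n, d in LINE.findall(text)}


counts = parse_counts(STATS)
total_objects, total_duplicates = counts.pop("Total objects")
# After the pop, only the per-type rows (blobs, trees, commits, tags) remain.
sum_objects = sum(n for n, _ in counts.values())
sum_duplicates = sum(d for _, d in counts.values())
print("objects consistent:   ", sum_objects == total_objects)
print("duplicates consistent:", sum_duplicates == total_duplicates)
```

The `commits` row also equals the 313209 changesets reported by `hg log` elsewhere in this thread, which suggests that every changeset was imported as a commit, whatever a later `git log` on a single branch shows.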
> > > >> >
> > > > >> > On Fri, Nov 18, 2016 at 1:32 PM, Emilian Bold <emilian.bold@gmail.com> wrote:
> > > > >> >> A releases/ clone which on my system takes 3.8GB is reduced to 1.6GB with
> > > > >> >> the generaldelta and aggressivemergedeltas flags (took about 14 hours).
> > > > >> >> Pretty impressive!
> > > > >> >>
> > > > >> >> Converting to git with hg-fast-export.sh complains that "repository has at
> > > > >> >> least one unnamed head" for about 6 revisions. With --force I'm able to
> > > > >> >> start the conversion but it hasn't finished yet.
> > > > >> >>
> > > > >> >> The git conversion is about 35% done and already using 1.3GB.
> > > > >> >>
> > > > >> >> So... I assume it's going to need about 3.8GB, just like the original
> > > > >> >> repository.
> > > >> >>
> > > >> >> I wonder if git has similar space-saving tricks?
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> --emi
> > > >> >>
> > > > >> >> On Thu, Nov 17, 2016 at 8:46 AM, Emilian Bold <emilian.bold@gmail.com> wrote:
> > > > >> >>> Forgot about this. I've just started the Mercurial repository conversion
> > > > >> >>> which will take a few hours.
> > > >> >>>
> > > >> >>> Will report tomorrow or when it's done.
> > > >> >>>
> > > >> >>>
> > > >> >>> --emi
> > > >> >>>
> > > > >> >>> On Wed, Nov 16, 2016 at 11:18 PM, cowwoc <cowwoc@bbs.darktech.org> wrote:
> > > >> >>>> Hi Emilian,
> > > >> >>>>
> > > >> >>>> Any update on this?
> > > >> >>>>
> > > >> >>>> Thanks,
> > > >> >>>> Gili
> > > >> >>>>
> > > > >> >>>> On 2016-11-11 01:33 (-0500), Emilian Bold <e...@gmail.com> wrote:
> > > > >> >>>>> Thank you for following through with this after we talked on IRC.
> > > > >> >>>>>
> > > > >> >>>>> I will check later the size reduction for the releases/ repo.
>
>
>
