From: Emilian Bold
Date: Fri, 11 Nov 2016 06:33:46 +0000
Subject: Re: Version control advice
To: dev@netbeans.incubator.apache.org
archived-at: Fri, 11 Nov 2016 06:34:06 -0000

Thank you for following through with this after we talked on IRC. I will
check the size reduction for the releases/ repo later.

On Fri, 11 Nov 2016 at 07:45, Gregory Szorc wrote:

> I'm a Mercurial developer who is also responsible for running
> https://hg.mozilla.org/ and supporting Mercurial at Mozilla. I understand
> NetBeans is contemplating its version control future because the ASF only
> supports Subversion and Git. I think I've learned some things that may be
> helpful to you.
>
> First, the NetBeans "main" repo is on the same order of magnitude as (but
> marginally smaller than) the Firefox repository in terms of file count and
> repository data size. So generally speaking, what I have learned supporting
> Firefox can apply to NetBeans.
>
> While I understand Mercurial may not be in your future, I'd like to point
> out that hg.netbeans.org is running a very old and very slow version of
> Mercurial (likely a release from before July 2010). The high volume of
> merge commits in the "main" repo contributes to highly sub-optimal storage
> utilization in old versions of Mercurial. This makes clones and pulls
> significantly slower due to more data to transfer and contributes to
> significant CPU load on the server to read/encode the sub-optimal storage
> encoding. I wouldn't be surprised if you have CPU load issues on the
> server.
>
> As it is stored today, the "main" repository is almost exactly 3 GB. If you
> create a new repository with optimal storage encoding, using Mercurial 3.7
> or newer (so that "generaldelta" is the default storage format) and
> configuring the repository to recalculate optimal deltas, the repository
> size drops to ~1.1 GB.
> This can be done as follows:
>
> $ hg init main-optimal
> $ cd main-optimal
> $ hg --config format.generaldelta=true \
>     --config format.aggressivemergedeltas=true \
>     pull https://hg.netbeans.org/main
>
>
> Now, for your VCS future.
>
> I'm a huge proponent of monorepos for productivity reasons. I've seen
> discussion on this list about splitting the repo. I would discourage that.
> I'd encourage you to read https://danluu.com/monorepo/ and the linked
> articles at the bottom for more on the topic.
>
> Unfortunately, one of the practical concerns about monorepos is they don't
> scale with some version control tools, namely Git. This leads many to let
> deficiencies in tools drive workflow decisions, which is quite unfortunate
> because tools should enhance productivity, not hinder it. If NetBeans uses
> Git and maintains the "main" repo as is, I believe you'll experience the
> following performance issues now or in the future as the repository keeps
> growing:
>
> * You'll constantly be dealing with CPU explosions on the Git server
> caused by clients performing clones and large pulls. GitHub uses a
> server infrastructure that caches certain operations related to packfiles
> to help mitigate this. I'm not sure about the state of ASF's Git server.
>
> * In many cases, shallow clones can require more CPU on the Git server to
> process than full clones. This is because the server essentially has to
> read objects from packs and repack things instead of taking a fast path
> that effectively streams a packfile to a client.
>
> * Garbage collection could be problematic on the server and client.
>
> Now, Git is constantly improving, so these problems may not always
> exist. And as well as GitHub scales - better than a vanilla Git
> install - it isn't a silver bullet. On a few occasions, processes at
> Mozilla have overwhelmed GitHub and resulted in GitHub disabling access to
> repositories!
> That hasn't happened in a while though (partially through
> them scaling better and partially through us learning our lesson and not
> pointing hundreds of machines at large Git repos). I'm not sure what, if
> anything, ASF's Git server has done to mitigate load from large
> repositories.
>
> It's worth noting that while some of the server-side CPU issues exist in
> default Mercurial installations, there are mitigations. The "clonebundles"
> extension allows a server to advertise pre-generated "bundle" files of
> repository content. When a client clones, it downloads a large bundle from
> a static file server, then goes back to the Mercurial server and gets the
> data changed since the bundle was created. If you
> `hg clone https://hg.mozilla.org/mozilla-unified` with a modern Mercurial
> client, your client will grab a 1+ GB file from a CDN and our servers will
> spend maybe 5s of total CPU to service the clone. The clones are faster for
> clients and the server can scale clones nearly infinitely. It is a win
> all around.
>
> Anyway, Mercurial's ability to scale doesn't help you if your choices are
> Subversion or Git :/
>
> Given those choices, I would lean towards Subversion if you want to
> maintain the "main" repo as is. If you use the "main" repo as is with Git,
> you should really do due diligence with the Git server operator to make
> sure they won't be overwhelmed.
>
> If you split the "main" repo, go with Git if your users prefer Git over
> Subversion.
>
> A compromise option would be to keep everything in a monorepo in Subversion
> and have separate Git repositories for specific subdirectories or "views."
> This is often a win-win but requires a bit of tooling to do the syncing.
> Speaking of syncing, it should be unidirectional: bi-directional syncing of
> anything is a hard problem. Take it from someone who has hacked on
> bi-directional VCS syncing: it is not something you want to support.
> Instead, I recommend abstracting the process of "pushing to the canonical
> repo" into something a machine does: have it perform the VCS conversion to
> the canonical repo and do the actual push. For example, landing something
> from Git would have a server fetch that Git ref and replay the commits as
> Subversion commits (or squash and commit to preserve atomicity).
>
> Anyway, I think this wall of text is long enough. Reply if you have any
> questions.
>
> Gregory