Mailing-List: contact dev-help@accumulo.apache.org; run by ezmlm
Reply-To: dev@accumulo.apache.org
MIME-Version: 1.0
From: Christopher
Date: Thu, 04 Aug 2016 16:53:49 +0000
Subject: Re: [DISCUSS] Time for a 1.8.0 release?
To: dev@accumulo.apache.org
Content-Type: multipart/alternative; boundary=94eb2c05602ee72776053941cb86
archived-at: Thu, 04 Aug 2016 16:54:03 -0000

--94eb2c05602ee72776053941cb86
Content-Type: text/plain; charset=UTF-8

Yeah, that's pretty interesting. I didn't know you could do a multi-build
like that, with a separate build for each IT. It does look like we'll have to
keep the list of ITs up-to-date in the job configuration, but that's a simple
`find . -name '*IT.java' -exec basename {} .java \;`

This is probably much more expensive to run overall (a lot of redundant
compiling), but it does seem to make it possible to run these in ASF Jenkins
now. I wish we had had this before now.

We probably don't want to do this for all branches all the time, but
certainly for the ones we're targeting for a release, and for master (or
whatever branch active devel is occurring on) at other times.

On Thu, Aug 4, 2016 at 12:41 PM Michael Wall wrote:

> Sean,
>
> I'm interested. How do I get granted more permissions? I can't see the
> configuration you used, but I can launch a new build.
>
> Mike
>
> On Thu, Aug 4, 2016 at 12:24 PM, Sean Busbey wrote:
>
> > On Wed, Aug 3, 2016 at 5:17 PM, Christopher wrote:
> > > On Wed, Aug 3, 2016 at 5:47 PM Sean Busbey wrote:
> > >
> > >> My understanding was that maintenance releases (aka double dot, e.g.
> > >> 1.7.2) had relaxed criteria because we expected the scope of changes
> > >> in them to be more limited. Even so, the release notes for 1.7.2,
> > >> 1.7.1, and 1.7.0 all claim the ITs passed.
> > >>
> > > Even those releases have periodic IT failure.
> > >
> > >> Is there a reason we can't parallelize the ITs?
> > >
> > > We can. Eric's mrit effort was all intended towards that. But, that's not
> > > the same as CI passing. I don't know what it would take to parallelize them
> > > in a CI server.
> > >
> > >> What's stopping
> > >> builds.a.o from running them?
> > >> Specific requests from projects to asf
> > >> infra can get us resources if that's the problem.
> > >>
> > > I spoke to infra in HipChat about this a few weeks ago, and mentioned a
> > > few things which impact builds on ASF jenkins (builds.apache.org):
> > >
> > > 1. Accumulo has an excessive number of tests to run.
> > > 2. Build timeouts with Jenkins can abort builds.
> > > 3. Tests are timing sensitive, and are affected by VM/host configuration
> > > and contention with other concurrent builds from other projects.
> > > 4. Tests need lots of RAM and storage (at least 4GB RAM, but ideally no
> > > less than 16GB, and at least 6 GB for a workspace)
> > > 5. Tests need specialized system configuration, (increasing ulimits,
> > > optimizing kernel settings for swappiness, etc.)
> > >
> > > What we really need for reliable IT passing in CI, is exclusive use of
> > > dedicated, bare-metal beefy build machines, for 6+ hours per build x 4
> > > branches minimum, plus another 6+ hours for each pull request and other
> > > builds which skipITs, so we can get immediate feedback on unit tests and
> > > compilation errors.
> > >
> >
> > I took a first pass at a nightly (~once per 12 hours) job on asf build for
> > master and it did okay, considering that I haven't spent any time trying to
> > tune anything:
> >
> > https://builds.apache.org/job/Accumulo-master-IT/1/
> >
> > 2 hr 9 min, 7 failures out of 202 tests.
> >
> > I think we can do this; if anyone else is interested I'll start a new thread
> > where we can discuss.
> >
> > --
> > busbey
>
--94eb2c05602ee72776053941cb86--
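
For reference, the `find` one-liner from the reply at the top of this message could be wrapped in a small helper so that the list of ITs comes out sorted and de-duplicated, which may make it easier to diff against whatever is already in the job configuration. This is only a sketch: the function name `list_its` is hypothetical, and it assumes every integration test lives in a file whose name ends in `IT.java`, as the original command does.

```shell
#!/bin/sh
# Hypothetical helper (not part of Accumulo or the Jenkins job): print the
# simple class name of every file ending in IT.java under the given
# directory (default: current directory), one per line, sorted and unique.
list_its() {
  find "${1:-.}" -type f -name '*IT.java' -exec basename {} .java \; \
    | LC_ALL=C sort -u
}
```

Sourcing the snippet and running `list_its .` from the top of a checkout would print one IT name per line; the output could then be pasted into, or compared with, the list kept in the job configuration.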