From: Sean Busbey
Date: Mon, 11 Jan 2016 11:57:56 -0600
Subject: Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)
To: dev

Found the problem (not setting the path to commands in the case where there is a cached install :/ ); have now turned off debug by default.
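
For readers following along, here is a minimal sketch of the kind of check that guards against a cache that is "present but not executable". It is not the actual fix made to the Jenkins job or to Yetus. The function names, the `bin/` layout, and the `install` callback are all hypothetical. The sketch resolves and returns the command path on both the cached and the fresh-install branch, so neither branch can forget to set it.

```python
import os


def usable_cached_tool(path):
    """Return True if `path` is a regular file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def ensure_tool(cache_dir, name, install):
    """Return the path to an executable `name`, reinstalling if the cache is unusable.

    `install(cache_dir, name)` is a hypothetical callback that (re)creates the tool.
    The `bin/` sub-directory is an assumed layout, not something from the thread.
    """
    path = os.path.join(cache_dir, "bin", name)
    if not usable_cached_tool(path):
        # Cache is missing or present-but-not-executable: rebuild it.
        install(cache_dir, name)
    if not usable_cached_tool(path):
        raise RuntimeError(f"{path} is still not executable after install")
    # The same path is returned whether or not the cache was reused.
    return path
```

A caller would pass its own download-and-unpack routine as `install` and use the returned path to invoke the tool.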

On Mon, Jan 11, 2016 at 9:18 AM, Sean Busbey wrote:
> We've had a few precommit jobs fail because the cache for our yetus
> install was present but not executable.
>
> I've turned on debugging so we can try to figure out what's going on
> the next time one happens.
>
> On Fri, Jan 8, 2016 at 7:58 AM, Sean Busbey wrote:
>> FYI, I just pushed HBASE-13525 (switch to Apache Yetus for precommit tests)
>> and updated our jenkins precommit build to use it.
>>
>> Jenkins job has some explanation:
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build/
>>
>> Release note from HBASE-13525 does as well.
>>
>> The old job will stick around here for a couple of weeks, in case we need
>> to refer back to it:
>>
>> https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HBASE-Build-deprecated/
>>
>> If something looks awry, please drop a note on HBASE-13525 while it remains
>> open (and make a new issue after).
>>
>>
>> On Wed, Dec 2, 2015 at 3:22 PM, Stack wrote:
>>
>>> As part of my continuing advocacy of builds.apache.org and that their
>>> results are now worthy of our trust and nurture, here are some highlights
>>> from the last few days of builds:
>>>
>>> + hadoopqa is now finding zombies before the patch is committed.
>>> HBASE-14888 showed "-1 core tests. The patch failed these unit tests:" but
>>> didn't have any failed tests listed (I'm trying to see if I can do anything
>>> about this...). Running our little ./dev-tools/findHangingTests.py against
>>> the consoleText, it showed a hanging test. Running locally, I see same
>>> hang. This is before the patch landed.
>>> + Our branch runs are now near totally zombie and flakey free -- still some
>>> work to do -- but a recent patch that seemed harmless was causing a
>>> reliable flake fail in the backport to branch-1* confirmed by local runs.
>>> The flakeyness was plain to see up in builds.apache.org.
>>> + In the last few days I've committed a patch that included javadoc
>>> warnings even though hadoopqa said the patch introduced javadoc issues (I
>>> missed it). This messed up life for folks subsequently as their patches now
>>> reported javadoc issues....
>>>
>>> In short, I suggest that builds.apache.org is worth keeping an eye on, make
>>> sure you get a clean build out of hadoopqa before committing anything, and
>>> let's all work together to try and keep our builds blue: it'll save us all
>>> work in the long run.
>>>
>>> St.Ack
>>>
>>>
>>> On Tue, Nov 4, 2014 at 9:38 AM, Stack wrote:
>>>
>>> > Branch-1 and master have stabilized and now run mostly blue (give or take
>>> > the odd failure) [1][2]. Having a mostly blue branch-1 has helped us
>>> > identify at least one destabilizing commit in the last few days, maybe two;
>>> > this is as it should be (smile).
>>> >
>>> > Let's keep our builds blue. If you commit a patch, make sure subsequent
>>> > builds stay blue. You can subscribe to builds@hbase.apache.org to get
>>> > notice of failures if not already subscribed.
>>> >
>>> > Thanks,
>>> > St.Ack
>>> >
>>> > 1. https://builds.apache.org/view/H-L/view/HBase/job/HBase-1.0/
>>> > 2. https://builds.apache.org/view/H-L/view/HBase/job/HBase-TRUNK/
>>> >
>>> >
>>> > On Mon, Oct 13, 2014 at 4:41 PM, Stack wrote:
>>> >
>>> >> A few notes on testing.
>>> >>
>>> >> Too long to read, infra is more capable now and after some work, we are
>>> >> seeing branch-1 and trunk mostly running blue. Let's try and keep it this
>>> >> way going forward.
>>> >>
>>> >> Apache Infra has new, more capable hardware.
>>> >>
>>> >> A recent spurt of test fixing combined with more capable hardware seems
>>> >> to have gotten us to a new place; tests are mostly passing now on branch-1
>>> >> and master. Let's try and keep it this way and start to trust our test runs
>>> >> again. Just a few flakies remain. Let's try and nail them.
>>> >>
>>> >> Our tests now run in parallel with other test suites where previously we
>>> >> ran alone. You can see this sometimes when our zombie detector reports
>>> >> tests from another project altogether as lingerers (To be fixed). Some of
>>> >> our tests are failing because a concurrent hbase run is undoing classes and
>>> >> data from under it. Also, let's fix.
>>> >>
>>> >> Our tests are brittle. It takes 75 minutes for them to complete. Many are
>>> >> heavy-duty integration tests starting up multiple clusters and mapreduce
>>> >> all in the one JVM. It is a miracle they pass at all. Usually integration
>>> >> tests have been cast as unit tests because there was nowhere else for them
>>> >> to get an airing. We have the hbase-it suite now which would be a more apt
>>> >> place but until these are run on a regular basis in public for all to see,
>>> >> the fat integration tests disguised as unit tests will remain. A review of
>>> >> our current unit tests weeding the old cruft and the no longer relevant or
>>> >> duplicates would be a nice undertaking if someone is looking to contribute.
>>> >>
>>> >> Alex Newman has been working on making our tests work up on travis and
>>> >> circle-ci. That'll be sweet when it goes end-to-end. He also added in
>>> >> some "type" categorizations -- client, filter, mapreduce -- alongside our
>>> >> old "sizing" categorizations of small/medium/large. His thinking is that
>>> >> we can run these categorizations in parallel so we could run the total
>>> >> suite in about the time of the longest test, say 20-30 minutes? We could
>>> >> even change Apache to run them this way.
>>> >>
>>> >> FYI,
>>> >> St.Ack
>>> >>
>>> >
>>>
>>
>> --
>> Sean
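
The thread mentions running ./dev-tools/findHangingTests.py against a job's consoleText to spot a hanging test. The sketch below shows the general idea only and is not the actual script: it assumes surefire-style console lines ("Running <class>" when a test class starts, "Tests run: ... in <class>" when it finishes), and the real log format may differ. Any class that started but never reported results is treated as a candidate hang.

```python
import re

# Assumed surefire-style console lines (the real log format may differ):
#   Running org.apache.hadoop.hbase.TestFoo
#   Tests run: 3, Failures: 0, ... - in org.apache.hadoop.hbase.TestFoo
STARTED = re.compile(r"^Running (\S+)\s*$")
FINISHED = re.compile(r"^Tests run:.* in (\S+)\s*$")


def find_hanging_tests(console_text):
    """Return, in start order, test classes that started but never reported results."""
    started = []
    finished = set()
    for raw in console_text.splitlines():
        line = raw.strip()
        match = STARTED.match(line)
        if match:
            if match.group(1) not in started:
                started.append(match.group(1))
            continue
        match = FINISHED.match(line)
        if match:
            finished.add(match.group(1))
    return [name for name in started if name not in finished]
```

To use it, save the job's consoleText to a file, read it into a string, and pass that string to `find_hanging_tests`. Aggregate summary lines that do not name a class are ignored because they lack the trailing "in <class>" part.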