From: Allen Wittenauer
Subject: we need a fix: precommit failures correlate to hdfs patches
Date: Sun, 3 May 2015 13:02:02 -0700
To: hdfs-dev@hadoop.apache.org
Cc: common-dev@hadoop.apache.org

So, as some may have noticed, I slammed the Jenkins servers over the weekend to get some recent patch test runs in JIRA for the bug bash this week. I've had a suspicion for a while now that either the long run times of the hadoop-hdfs module unit tests (typically 2+ hours) or the hdfs tests themselves were related to the patch process directory getting removed out from underneath test-patch.

To test the hypothesis, I submitted all of the non-HDFS patches so that they were first in the queue. Let them run for a very long time. Jenkins bounced back and forth between YARN, MR, and HADOOP. No issues encountered. Added HDFS patches into the mix. BOOM. The dreaded "The patch artifact directory has been removed!" started to appear here and there.

This seems to provide some evidence that, yes, hdfs unit tests are directly or indirectly related to the failures.

IMO, I think we need to take a serious look at:

* splitting up the hadoop-hdfs module into multiple modules to reduce unit test run times
* checking to see if the pre-commit hooks in hdfs are different than the rest (I do know that the YARN bits are different and appear to have some bugs as well)
* increasing the timeout for Jenkins job runs

FWIW, I've also found some minor things here and there with the rewritten test-patch.sh. JIRAs have been filed. One critical, one major, and a handful of minor things.