Date: Sat, 6 Oct 2012 01:40:03 +0000 (UTC)
From: "Aaron T. Myers (JIRA)"
To: common-issues@hadoop.apache.org
Message-ID: <939561131.4226.1349487603688.JavaMail.jiratomcat@arcas>
Subject: [jira] [Resolved] (HADOOP-8591) TestZKFailoverController tests time out

     [ https://issues.apache.org/jira/browse/HADOOP-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers resolved HADOOP-8591.
------------------------------------

    Resolution: Invalid
      Assignee: Aaron T. Myers

I looked into this today and realized that it was a problem with a particular Jenkins slave. Whenever a pre-commit test or nightly build was run on hadoop1, it would fail. Whenever it was run anywhere else, it would pass. When I logged in to hadoop1, I noticed that there were a bunch of pre-commit processes and even a nightly build that had been running for weeks or months.
After killing these zombie processes, TestZKFailoverController now passes reliably on hadoop1.

> TestZKFailoverController tests time out
> ---------------------------------------
>
>                 Key: HADOOP-8591
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8591
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: auto-failover, ha, test
>    Affects Versions: 2.0.0-alpha
>            Reporter: Eli Collins
>            Assignee: Aaron T. Myers
>              Labels: test-fail
>
> Looks like the TestZKFailoverController timeout needs to be bumped.
> {noformat}
> java.lang.Exception: test timed out after 30000 milliseconds
> 	at java.lang.Object.wait(Native Method)
> 	at org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:460)
> 	at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:648)
> 	at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58)
> 	at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:593)
> 	at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1334)
> 	at org.apache.hadoop.ha.ZKFailoverController.gracefulFailoverToYou(ZKFailoverController.java:590)
> 	at org.apache.hadoop.ha.TestZKFailoverController.testOneOfEverything(TestZKFailoverController.java:575)
> {noformat}