Date: Wed, 16 Oct 2013 19:41:42 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-9024) TestLogRolling fails/goes zombie

     [ https://issues.apache.org/jira/browse/HBASE-9024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-9024:
-------------------------
    Priority: Critical  (was: Major)

> TestLogRolling fails/goes zombie
> --------------------------------
>
>                 Key: HBASE-9024
>                 URL: https://issues.apache.org/jira/browse/HBASE-9024
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: stack
>            Priority: Critical
>
> TestLogRolling.testLogRollOnPipelineRestart failed on hadoop1 here: https://builds.apache.org/job/hbase-0.95/352/consoleText  It went zombie.
> In the double thread dump on the end:
> {code}
> "pool-1-thread-1" prio=10 tid=0x73f9dc00 nid=0x3a34 in Object.wait() [0x7517d000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
> 	at java.lang.Object.wait(Native Method)
> 	- waiting on <0xcf624ad0> (a java.util.concurrent.atomic.AtomicLong)
> 	at org.apache.hadoop.hbase.client.AsyncProcess.waitForNextTaskDone(AsyncProcess.java:634)
> 	- locked <0xcf624ad0> (a java.util.concurrent.atomic.AtomicLong)
> 	at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:659)
> 	at org.apache.hadoop.hbase.client.AsyncProcess.waitUntilDone(AsyncProcess.java:670)
> 	at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:813)
> 	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1170)
> 	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:753)
> 	at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.doPut(TestLogRolling.java:640)
> 	at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.writeData(TestLogRolling.java:248)
> 	at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnPipelineRestart(TestLogRolling.java:515)
> {code}
> ... we are stuck here.
> The math looks like it could go wonky.  But looking in the output for the test, it seems that when this test ran we got this:
> {code}
> 2013-07-23 01:23:29,560 INFO  [pool-1-thread-1] hbase.HBaseTestingUtility(922): Minicluster is down
> 2013-07-23 01:23:29,574 INFO  [pool-1-thread-1] hbase.ResourceChecker(171): after: regionserver.wal.TestLogRolling#testLogRollOnPipelineRestart Thread=39 (was 31) - Thread LEAK? -, OpenFileDescriptor=312 (was 272) - OpenFileDescriptor LEAK? -, MaxFileDescriptor=40000 (was 40000), SystemLoadAverage=351 (was 368), ProcessCount=144 (was 142) - ProcessCount LEAK? -, AvailableMemoryMB=906 (was 1995), ConnectionCount=0 (was 0)
> {code}
> This test has a history of failures.  See HBASE-5995 where it was fixed and reenabled once.
> Thought was that it was a hadoop2 issue but this cited failure is on hadoop1.

--
This message was sent by Atlassian JIRA
(v6.1#6144)