Message-ID: <1283886002.1234378499943.JavaMail.jira@brutus>
Date: Wed, 11 Feb 2009 10:54:59 -0800 (PST)
From: "Andrew Purtell (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-1196) OOME in HRS IPC server causes infinite client stall
In-Reply-To: <325283571.1234378019764.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HBASE-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672720#action_12672720 ]

Andrew Purtell commented on HBASE-1196:
---------------------------------------

Specifically in the case of my usage pattern, an OOME cascade like the above will damage IPC during a scan, and subsequent writes from the client are what stall forever.

> OOME in HRS IPC server causes infinite client stall
> ---------------------------------------------------
>
>                 Key: HBASE-1196
>                 URL: https://issues.apache.org/jira/browse/HBASE-1196
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Critical
>
> OOME in IPC server handler causes the IPC handler to abort, but the client never learns about this, so it waits and waits and waits... I have seen Heritrix writer threads that have been waiting for 7+ hours. And, the OOME does not take down the HRS, so it stays up in some degraded state. E.g.:
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to java_pid13008.hprof
> Exception in thread "IPC Server handler 5 on 60020" java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2786)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         at org.apache.hadoop.hbase.util.Bytes.writeByteArray(Bytes.java:82)
>         at org.apache.hadoop.hbase.io.Cell.write(Cell.java:162)
>         at org.apache.hadoop.hbase.io.HbaseMapWritable.write(HbaseMapWritable.java:200)
>         at org.apache.hadoop.hbase.io.RowResult.write(RowResult.java:249)
>         at org.apache.hadoop.hbase.io.HbaseObjectWritable.writeObject(HbaseObjectWritable.java:300)
>         at org.apache.hadoop.hbase.io.HbaseObjectWritable.write(HbaseObjectWritable.java:262)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:917)
> Exception in thread "IPC Server handler 7 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 4 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 2 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 3 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 0 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 6 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 9 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 1 on 60020" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server handler 8 on 60020" java.lang.OutOfMemoryError: Java heap space

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.