Message-ID: <2059614202.1232067659557.JavaMail.jira@brutus>
Date: Thu, 15 Jan 2009 17:00:59 -0800 (PST)
From: "stack (JIRA)"
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-1127) OOME running randomRead PE
In-Reply-To: <340889276.1231999559664.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HBASE-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664343#action_12664343 ]

stack commented on HBASE-1127:
------------------------------

Been looking at this
more. I can watch the GC doing Full, Full, Full, but our executor thread that checks the SoftValueMap reference queues is not clearing anything. Then we OOME.

I tried various things, including a thread per instance of BlockFSInputStream that just blocks on the reference queue waiting for the GC to add stuff. Oddly, even in this case we OOME, though we get a bit further. Changing the interval at which our executor thread runs from 10 seconds to 1 second makes the executor clear the reference queues, but again it's not enough. We OOME at about the same place as we do with a thread per BlockFSInputStream instance (a thread per instance won't fly, so this is good).

I'm going to look at this a little more. In times of high memory pressure, it's as though the GC gives up adding items to reference queues, which wouldn't seem to make sense.

Given that we're up against the RC, I am currently thinking that I'll revert having blockcache on by default and instead let users enable it explicitly (with the checker running every second). I'll leave it on in catalog tables so meta content has block cache on.

> OOME running randomRead PE
> --------------------------
>
>                 Key: HBASE-1127
>                 URL: https://issues.apache.org/jira/browse/HBASE-1127
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.19.0
>
>
> Blockcache is misbehaving on TRUNK. Something is broken. We OOME about 20% into the randomRead test. Looking at the heap, it's all soft references. Instrumenting the reference queue, we never clear anything, even while full GC'ing. Something is off.
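
For readers unfamiliar with the mechanism discussed in the comment above, here is a minimal sketch of a map whose values are held through soft references registered with a reference queue, plus a non-blocking purge that removes entries the GC has enqueued. This is not HBase's SoftValueMap or BlockFSInputStream code: the class `SoftValueCache`, its `purge` method, and the `simulateCollection` hook are hypothetical names invented for illustration, and the hook only stands in for the GC so the example behaves deterministically.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration of a soft-value map; not the HBase implementation. */
final class SoftValueCache<K, V> {

    /** Soft reference that remembers its key so the stale map entry can be found. */
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    public void put(K key, V value) {
        map.put(key, new Entry<>(key, value, queue));
    }

    /** Returns the cached value, or null if absent or already cleared by the GC. */
    public V get(K key) {
        Entry<K, V> entry = map.get(key);
        return entry == null ? null : entry.get();
    }

    public int size() {
        return map.size();
    }

    /** Drains the reference queue without blocking; returns how many entries were removed. */
    @SuppressWarnings("unchecked")
    public int purge() {
        int removed = 0;
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            Entry<K, V> entry = (Entry<K, V>) ref;
            // Two-argument remove: only drop the mapping if it still points at this reference.
            if (map.remove(entry.key, entry)) {
                removed++;
            }
        }
        return removed;
    }

    /** Demo-only hook (an assumption of this sketch): enqueue a reference as the GC would. */
    boolean simulateCollection(K key) {
        Entry<K, V> entry = map.get(key);
        return entry != null && entry.enqueue();
    }

    public static void main(String[] args) {
        SoftValueCache<String, byte[]> cache = new SoftValueCache<>();
        cache.put("block-0", new byte[1024]);
        cache.simulateCollection("block-0");
        System.out.println("purged entries: " + cache.purge());
    }
}
```

In a setup like the one described, `purge()` would be called periodically, for example from a `ScheduledExecutorService`. Entries are only removed after the GC has actually enqueued their references, so a shorter interval cannot help if nothing is being enqueued.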