From: "ryan rawson (JIRA)"
To: hbase-issues@hadoop.apache.org
Date: Thu, 22 Apr 2010 14:35:56 -0400 (EDT)
Message-ID: <29470850.145101271961356177.JavaMail.jira@thor>
Subject: [jira] Commented: (HBASE-2236) Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)

[ https://issues.apache.org/jira/browse/HBASE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859946#action_12859946 ]

ryan rawson commented on HBASE-2236:
------------------------------------

seems like we should flush until we are under the
max log count by some percent amount, like 20% perhaps. after all, flushing logs while under load means we are just potentially playing perpetual catchup while more edits come in.

> Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-2236
>                 URL: https://issues.apache.org/jira/browse/HBASE-2236
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.20.5, 0.21.0
>
>
> So hbase-2053 is not aggressive enough. WALs can still overwhelm the upper limit on log count. While the code added by HBASE-2053, when done, will ensure we let go of the oldest WAL, to do it, we might have to flush many regions. E.g.:
> {code}
> 2010-02-15 14:20:29,351 INFO org.apache.hadoop.hbase.regionserver.HLog: Too many hlogs: logs=45, maxlogs=32; forcing flush of 5 regions(s): test1,193717,1266095474624, test1,194375,1266108228663, test1,195690,1266095539377, test1,196348,1266095539377, test1,197939,1266069173999
> {code}
> This takes time. If we are taking on edits at a furious rate, we might have rolled the log again in the meantime, maybe more than once.
> Also, log rolls happen inline with a put/delete as soon as it hits the 64MB (default) boundary, whereas the necessary flushing is done in the background by a single thread, and the memstore can overrun the (default) 64MB size. Flushes needed to release logs will be mixed in with "natural" flushes as memstores fill. Flushes may take longer than the writing of an HLog because they can be larger.
> So, on an RS that is struggling, the tendency would seem to be for a slight rise in WALs. Only if the RS gets a breather will the flusher catch up.
> If HBASE-2087 happens, then the count of WALs gets a boost.
> Ideas to fix this for good would be:
> + Priority queue for queuing up flushes, with those that are queued to free up WALs having priority
> + Improve the HBASE-2053 code so that it will free more than just the last WAL, maybe even queuing flushes so we clear all WALs such that we are back under the maximum WALs threshold again.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
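
For illustration only, here is a rough sketch of how the two ideas in this thread might fit together: a flush queue in which requests made to free up WALs sort ahead of "natural" memstore flushes, and a helper that works out how many old logs to release so the count ends up some percentage under the maximum (the 20% suggested in the comment). Every name here (WalFlushSketch, FlushRequest, Reason, logsToRelease, drainOrder) is hypothetical and not taken from the HBase code base; the real HLog and flusher code may be organized quite differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch; none of these types exist in HBase under these names.
final class WalFlushSketch {

    /** Why a flush was requested. Declared first means served first. */
    enum Reason { WAL_PRESSURE, MEMSTORE_FULL }

    /** One queued flush request for a region (assumed shape). */
    static final class FlushRequest {
        final String region;
        final Reason reason;
        final long seq; // arrival order, used as a FIFO tie-breaker

        FlushRequest(String region, Reason reason, long seq) {
            this.region = region;
            this.reason = reason;
            this.seq = seq;
        }
    }

    /** WAL-freeing flushes first, then arrival order within each reason. */
    static final Comparator<FlushRequest> ORDER =
        Comparator.comparing((FlushRequest r) -> r.reason)
                  .thenComparingLong(r -> r.seq);

    static PriorityBlockingQueue<FlushRequest> newQueue() {
        return new PriorityBlockingQueue<>(11, ORDER);
    }

    /**
     * How many of the oldest logs to release so that the log count falls to
     * maxLogs reduced by headroomPercent. Returns 0 when at or under maxLogs.
     */
    static int logsToRelease(int logCount, int maxLogs, int headroomPercent) {
        if (logCount <= maxLogs) {
            return 0;
        }
        int target = maxLogs * (100 - headroomPercent) / 100; // integer floor
        return logCount - target;
    }

    /** Drains the queue and returns region names in the order served. */
    static List<String> drainOrder(PriorityBlockingQueue<FlushRequest> queue) {
        List<String> served = new ArrayList<>();
        FlushRequest next;
        while ((next = queue.poll()) != null) {
            served.add(next.region);
        }
        return served;
    }

    public static void main(String[] args) {
        // Numbers from the log line quoted in the issue: logs=45, maxlogs=32.
        System.out.println(logsToRelease(45, 32, 20)); // prints 20

        PriorityBlockingQueue<FlushRequest> queue = newQueue();
        queue.add(new FlushRequest("naturalA", Reason.MEMSTORE_FULL, 0));
        queue.add(new FlushRequest("walRegion1", Reason.WAL_PRESSURE, 1));
        queue.add(new FlushRequest("naturalB", Reason.MEMSTORE_FULL, 2));
        queue.add(new FlushRequest("walRegion2", Reason.WAL_PRESSURE, 3));
        // Served order: walRegion1, walRegion2, naturalA, naturalB
        System.out.println(drainOrder(queue));
    }
}
```

With logs=45 and maxlogs=32, a 20% headroom gives a target of 25 logs, so 20 of the oldest logs would need to go rather than just enough to get back to 32. Whether a strict priority like this could starve natural flushes under sustained load is an open question the sketch does not address.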