Return-Path: X-Original-To: apmail-hbase-issues-archive@www.apache.org Delivered-To: apmail-hbase-issues-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 3821A111FF for ; Sat, 14 Jun 2014 05:13:02 +0000 (UTC) Received: (qmail 71345 invoked by uid 500); 14 Jun 2014 05:13:02 -0000 Delivered-To: apmail-hbase-issues-archive@hbase.apache.org Received: (qmail 71300 invoked by uid 500); 14 Jun 2014 05:13:02 -0000 Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list issues@hbase.apache.org Received: (qmail 71290 invoked by uid 99); 14 Jun 2014 05:13:02 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 14 Jun 2014 05:13:02 +0000 Date: Sat, 14 Jun 2014 05:13:02 +0000 (UTC) From: "Lars Hofhansl (JIRA)" To: issues@hbase.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (HBASE-11322) SnapshotHFileCleaner makes the wrong check for lastModified time thus causing too many cache refreshes MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HBASE-11322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031472#comment-14031472 ] Lars Hofhansl commented on HBASE-11322: --------------------------------------- That doesn't seem right, though. I think to be correct we need to keep track of both timestamp and refresh if either has changed. > SnapshotHFileCleaner makes the wrong check for lastModified time thus causing too many cache refreshes > ------------------------------------------------------------------------------------------------------ > > Key: HBASE-11322 > URL: https://issues.apache.org/jira/browse/HBASE-11322 > Project: HBase > Issue Type: Bug > Affects Versions: 0.94.19 > Reporter: churro morales > Assignee: churro morales > Priority: Critical > Fix For: 0.94.21 > > Attachments: HBASE-11322.patch > > > The SnapshotHFileCleaner calls the SnapshotFileCache if a particular HFile in question is part of a snapshot. > If the HFile is not in the cache, we then refresh the cache and check again. > But the cache refresh checks to see if anything has been modified since the last cache refresh but this logic is incorrect in certain scenarios. > The last modified time is done via this operation: > {code} > this.lastModifiedTime = Math.min(dirStatus.getModificationTime(), > tempStatus.getModificationTime()); > {code} > and the check to see if the snapshot directories have been modified: > {code} > // if the snapshot directory wasn't modified since we last check, we are done > if (dirStatus.getModificationTime() <= lastModifiedTime && > tempStatus.getModificationTime() <= lastModifiedTime) { > return; > } > {code} > Suppose the following happens: > dirStatus modified 6-1-2014 > tempStatus modified 6-2-2014 > lastModifiedTime = 6-1-2014 > provided these two directories don't get modified again all subsequent checks wont exit early, like they should. > In our cluster, this was a huge performance hit. The cleaner chain fell behind, thus almost filling up dfs and our namenode heap. > Its a simple fix, instead of Math.min we use Math.max for the lastModified, I believe that will be correct. > I'll apply a patch for you guys. -- This message was sent by Atlassian JIRA (v6.2#6252)