hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11322) SnapshotHFileCleaner makes the wrong check for lastModified time thus causing too many cache refreshes
Date Wed, 18 Jun 2014 22:22:26 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036542#comment-14036542 ]

Lars Hofhansl commented on HBASE-11322:

Now... would this be fixed if we kept separate timestamps for the snapshot and tmp directories,
as in my first suggestion?
(I'm not really familiar with how this works and am a bit pressed for time, hence the
question instead of just checking it out myself... Sorry.)
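
For illustration, the "separate timestamps" idea might look roughly like the sketch below.
This is not the actual SnapshotFileCache code: the class, field, and method names are
hypothetical, and plain long values stand in for the modification times of the snapshot
and tmp directories.

```java
// Hypothetical sketch: remember one timestamp per directory instead of a single combined one.
class SeparateTimestampsSketch {
    // Last modification time seen for each directory; -1 means "never refreshed".
    private long lastDirModifiedTime = -1L;
    private long lastTmpModifiedTime = -1L;

    /**
     * Returns true if either directory is newer than what was recorded at the
     * previous refresh, and records the new modification times in that case.
     */
    boolean needsRefresh(long dirMtime, long tmpMtime) {
        if (dirMtime <= lastDirModifiedTime && tmpMtime <= lastTmpModifiedTime) {
            return false; // neither directory changed since the last refresh
        }
        lastDirModifiedTime = dirMtime;
        lastTmpModifiedTime = tmpMtime;
        return true;
    }

    public static void main(String[] args) {
        SeparateTimestampsSketch cache = new SeparateTimestampsSketch();
        System.out.println(cache.needsRefresh(1L, 2L)); // first call always refreshes
        System.out.println(cache.needsRefresh(1L, 2L)); // nothing changed, so no refresh
        System.out.println(cache.needsRefresh(1L, 3L)); // tmp directory changed
    }
}
```

Because each directory is compared only against its own recorded time, an older timestamp
on one directory cannot force a refresh on every call.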

> SnapshotHFileCleaner makes the wrong check for lastModified time thus causing too many cache refreshes
> ------------------------------------------------------------------------------------------------------
>                 Key: HBASE-11322
>                 URL: https://issues.apache.org/jira/browse/HBASE-11322
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.19
>            Reporter: churro morales
>            Assignee: churro morales
>            Priority: Critical
>             Fix For: 0.94.21
>         Attachments: 11322.94.txt, HBASE-11322.patch
> The SnapshotHFileCleaner asks the SnapshotFileCache whether a particular HFile is part
> of a snapshot.
> If the HFile is not in the cache, we refresh the cache and check again.
> The cache refresh first checks whether anything has been modified since the last cache
> refresh, but this logic is incorrect in certain scenarios.
> The last modified time is computed via this operation:
> {code}
> this.lastModifiedTime = Math.min(dirStatus.getModificationTime(),
>                                      tempStatus.getModificationTime());
> {code}
> and the check to see if the snapshot directories have been modified:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are done
>     if (dirStatus.getModificationTime() <= lastModifiedTime &&
>         tempStatus.getModificationTime() <= lastModifiedTime) {
>       return;
>     }
> {code}
> Suppose the following happens:
> dirStatus modified 6-1-2014
> tempStatus modified 6-2-2014
> lastModifiedTime = 6-1-2014
> Provided these two directories don't get modified again, all subsequent checks won't exit
> early like they should, because tempStatus is always newer than lastModifiedTime.
> In our cluster, this was a huge performance hit. The cleaner chain fell behind, almost
> filling up DFS and our namenode heap.
> It's a simple fix: instead of Math.min we use Math.max for lastModifiedTime. I believe
> that will be correct.
> I'll apply a patch for you guys.
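
The scenario in the description can be reproduced with a small standalone sketch. It is
only an illustration: the class and method names are hypothetical, and the long values 1
and 2 stand in for the 6-1-2014 and 6-2-2014 modification times. Only the early-exit
condition and the Math.min / Math.max choice are taken from the issue text.

```java
// Hypothetical stand-in for the refresh check; not the actual SnapshotFileCache code.
class RefreshCheckSketch {
    /** Mirrors the early-exit condition quoted in the issue description. */
    static boolean canSkipRefresh(long dirMtime, long tmpMtime, long lastModifiedTime) {
        return dirMtime <= lastModifiedTime && tmpMtime <= lastModifiedTime;
    }

    public static void main(String[] args) {
        long dirMtime = 1L; // stands in for dirStatus modified 6-1-2014
        long tmpMtime = 2L; // stands in for tempStatus modified 6-2-2014

        // Value recorded after a refresh, computed the current way and the proposed way.
        long withMin = Math.min(dirMtime, tmpMtime);
        long withMax = Math.max(dirMtime, tmpMtime);

        // Neither directory is modified again, so the next check should exit early.
        System.out.println("min: skip refresh = " + canSkipRefresh(dirMtime, tmpMtime, withMin));
        System.out.println("max: skip refresh = " + canSkipRefresh(dirMtime, tmpMtime, withMax));
    }
}
```

With Math.min the recorded time is 1, so the tmp directory (2) always looks newer and the
check prints false on every call. With Math.max the recorded time is 2 and the check
prints true until one of the directories is modified again.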

This message was sent by Atlassian JIRA
