hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11322) SnapshotHFileCleaner makes the wrong check for lastModified time thus causing too many cache refreshes
Date Sat, 14 Jun 2014 23:02:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031729#comment-14031729

Lars Hofhansl commented on HBASE-11322:

Hmm... True, assuming no races and or clock skew in any way you are correct.
I agree Math.max is fine too. Either way is fine with me. Keeping both times is more explicit
and easier to reason about. Math.max on the other hand is simpler.

+1 on Math.max if you like that better :)

> SnapshotHFileCleaner makes the wrong check for lastModified time thus causing too many
cache refreshes
> ------------------------------------------------------------------------------------------------------
>                 Key: HBASE-11322
>                 URL: https://issues.apache.org/jira/browse/HBASE-11322
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.19
>            Reporter: churro morales
>            Assignee: churro morales
>            Priority: Critical
>             Fix For: 0.94.21
>         Attachments: 11322.94.txt, HBASE-11322.patch
> The SnapshotHFileCleaner calls the SnapshotFileCache if a particular HFile in question
is part of a snapshot.
> If the HFile is not in the cache, we then refresh the cache and check again.
> But the cache refresh checks to see if anything has been modified since the last cache
refresh but this logic is incorrect in certain scenarios.
> The last modified time is done via this operation:
> {code}
> this.lastModifiedTime = Math.min(dirStatus.getModificationTime(),
>                                      tempStatus.getModificationTime());
> {code}
> and the check to see if the snapshot directories have been modified:
> {code}
> // if the snapshot directory wasn't modified since we last check, we are done
>     if (dirStatus.getModificationTime() <= lastModifiedTime &&
>         tempStatus.getModificationTime() <= lastModifiedTime) {
>       return;
>     }
> {code}
> Suppose the following happens:
> dirStatus modified 6-1-2014
> tempStatus modified 6-2-2014
> lastModifiedTime = 6-1-2014
> provided these two directories don't get modified again all subsequent checks wont exit
early, like they should.
> In our cluster, this was a huge performance hit.  The cleaner chain fell behind, thus
almost filling up dfs and our namenode heap.
> Its a simple fix, instead of Math.min we use Math.max for the lastModified, I believe
that will be correct.
> I'll apply a patch for you guys.

This message was sent by Atlassian JIRA

View raw message