hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11360) SnapshotFileCache refresh logic based on modified directory time might be insufficient
Date Thu, 19 Jun 2014 06:25:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037023#comment-14037023 ]

Lars Hofhansl commented on HBASE-11360:

Or more radically: I think the mistake is that we cache snapshots in .tmp at all. These are
by definition temporary, in progress, and changing. So let's *always* refresh those. We can
list the contents of the .tmp directory without enumerating all snapshots, so that should be
efficient. Like so:
# We keep a cache of all snapshots not in .tmp. To detect changes in these we only need to
check the modification time of the snapshots directory.
# We always refresh all snapshots in .tmp, and store them in a separate set.
# When we check whether a snapshot contains a file, we check both.

So we only check the status of the snapshot dir and we always enumerate the .tmp directory
(which should be smallish), so overall this should be efficient.
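
To make the three steps above concrete, here is a rough sketch of the idea, not the actual patch. It uses plain java.nio.file instead of the Hadoop FileSystem API so that it is self-contained, and it treats the name of every regular file under a snapshot directory as a "referenced file", which is a simplification of how real snapshots reference HFiles. The class name SnapshotFileCacheSketch and all method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical sketch: cache completed snapshots, always re-list .tmp.
class SnapshotFileCacheSketch {
    private final Path snapshotDir;  // stands in for /hbase/.hbase-snapshot
    private final Path tmpDir;       // stands in for /hbase/.hbase-snapshot/.tmp
    private final Set<String> completedFiles = new HashSet<>();
    private long lastModified = -1L;

    SnapshotFileCacheSketch(Path snapshotDir) {
        this.snapshotDir = snapshotDir;
        this.tmpDir = snapshotDir.resolve(".tmp");
    }

    // Step 3: a file is in use if either set contains it.
    synchronized boolean contains(String fileName) throws IOException {
        refreshCompletedIfChanged();
        return completedFiles.contains(fileName) || listTmpFiles().contains(fileName);
    }

    // Step 1: completed snapshots are re-read only when the modification
    // time of the snapshot directory itself has changed.
    private void refreshCompletedIfChanged() throws IOException {
        if (!Files.isDirectory(snapshotDir)) {
            completedFiles.clear();
            lastModified = -1L;
            return;
        }
        long modified = Files.getLastModifiedTime(snapshotDir).toMillis();
        if (modified == lastModified) {
            return;
        }
        completedFiles.clear();
        try (DirectoryStream<Path> snapshots = Files.newDirectoryStream(snapshotDir)) {
            for (Path snapshot : snapshots) {
                if (snapshot.equals(tmpDir) || !Files.isDirectory(snapshot)) {
                    continue;  // in-progress snapshots are handled separately
                }
                collectFileNames(snapshot, completedFiles);
            }
        }
        lastModified = modified;
    }

    // Step 2: in-progress snapshots are never cached; .tmp is enumerated
    // on every lookup and the result is kept in a separate set.
    private Set<String> listTmpFiles() throws IOException {
        Set<String> files = new HashSet<>();
        if (!Files.isDirectory(tmpDir)) {
            return files;
        }
        try (DirectoryStream<Path> running = Files.newDirectoryStream(tmpDir)) {
            for (Path snapshot : running) {
                if (Files.isDirectory(snapshot)) {
                    collectFileNames(snapshot, files);
                }
            }
        }
        return files;
    }

    // Simplification: every regular file name under the directory counts
    // as a file referenced by that snapshot.
    private static void collectFileNames(Path dir, Set<String> out) throws IOException {
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.filter(Files::isRegularFile)
                .forEach(p -> out.add(p.getFileName().toString()));
        }
    }
}
```

With this shape, a file copied into .tmp/<snapshot> after the first lookup is still seen by the next call to contains(), because .tmp is listed again each time regardless of its modification time.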

If we agree that that would work, I'll put up a patch. Should fix both this issue and HBASE-11322.

> SnapshotFileCache refresh logic based on modified directory time might be insufficient
> --------------------------------------------------------------------------------------
>                 Key: HBASE-11360
>                 URL: https://issues.apache.org/jira/browse/HBASE-11360
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.19
>            Reporter: churro morales
> Right now we decide whether to refresh the cache based on the lastModified timestamp
> of all the snapshots and of the "running" snapshots, which are located in /hbase/.hbase-snapshot/.tmp/<snapshot>.
> We ran an ExportSnapshot job which takes around 7 minutes between creating the directory
> and copying all the files.
> Thus the modified time for the
> /hbase/.hbase-snapshot/.tmp directory was 7 minutes earlier than the modified time of the
> /hbase/.hbase-snapshot/.tmp/<snapshot> directory.
> Thus the cache refresh happens and doesn't pick up all the files, but thinks it's up to
> date because the modified time of the .tmp directory never changes.
> This is a bug: when the export job starts, the cache never contains the files for the
> "running" snapshot, so the export will fail.
