hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-7245) Recovery on failed snapshot restore
Date Sat, 19 Oct 2013 19:28:04 GMT

     [ https://issues.apache.org/jira/browse/HBASE-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-7245:

    Fix Version/s:     (was: 0.96.0)

> Recovery on failed snapshot restore
> -----------------------------------
>                 Key: HBASE-7245
>                 URL: https://issues.apache.org/jira/browse/HBASE-7245
>             Project: HBase
>          Issue Type: Bug
>          Components: Client, master, regionserver, snapshots, Zookeeper
>            Reporter: Jonathan Hsieh
>            Assignee: Matteo Bertozzi
>             Fix For: 0.96.1
> Restore updates both the file system and meta. It seems that an inopportune failure before meta is completely updated could leave an inconsistent state that would require hbck to fix.
> We should define what the semantics are for recovering from this. Some suggestions:
> 1) Fail forward: see a log entry saying the restore's meta edits were not completed, gather the information necessary to rebuild them from the filesystem, and complete the meta edits.
> 2) Fail backward: see a log entry saying the restore's meta edits were not completed, and delete the incomplete snapshot region entries from meta.
> I think I prefer 1 -- if two processes somehow end up updating meta (the original master didn't die, and a new one started up), the edits would be idempotent. If we used 2, we could still have a race and still be in a bad place.
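The idempotence argument for option 1 can be sketched as follows. This is a minimal illustration, not HBase code: `fs_regions`, `meta`, and `recover_forward` are hypothetical names, with the filesystem modeled as the source of truth and meta as a simple dict.

```python
def recover_forward(fs_regions, meta):
    """Complete meta from filesystem state; safe to run repeatedly.

    Hypothetical sketch of fail-forward recovery: every region found on
    the filesystem is (re-)written into meta. Re-applying an existing
    entry is a no-op, so two recoveries racing (e.g. the original master
    and a newly started one) converge to the same final meta.
    """
    for region, location in fs_regions.items():
        meta[region] = location
    return meta

# Restore crashed after writing r1 but before writing r2 to meta.
fs_regions = {"r1": "loc-a", "r2": "loc-b"}
meta = {"r1": "loc-a"}

once = recover_forward(fs_regions, dict(meta))
twice = recover_forward(fs_regions, dict(once))   # second run changes nothing
```

Under fail-backward (option 2), two racing processes could instead delete entries the other just rebuilt, which is the bad place the comment above warns about.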

This message was sent by Atlassian JIRA
