hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-2834) Deferred deletes
Date Mon, 12 May 2014 00:51:15 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-2834.

    Resolution: Duplicate

> Deferred deletes
> ----------------
>                 Key: HBASE-2834
>                 URL: https://issues.apache.org/jira/browse/HBASE-2834
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
> In a blog post, James Hamilton tangentially mentions deferred deletes:
> {quote}
> If you have an application error, administrative error, or database implementation bug
that losses data, then it is simply gone unless you have an offline copy. This, by the way,
is why I'm a big fan of deferred delete.  This is a technique where deleted items are marked
as deleted but not garbage collected until some days or preferably weeks later.  Deferred
delete is not full protection but it has saved my butt more than once and I'm a believer.
See On Designing and Deploying Internet-Scale Services (http://mvdirona.com/jrh/talksAndPapers/JamesRH_Lisa.pdf)
for more detail.
> {quote}
> (See http://perspectives.mvdirona.com/2010/04/07/StonebrakerOnCAPTheoremAndDatabases.aspx)
> Because deletes -- at least, after the initial write has been flushed from memstore --
are tombstones, deferred delete in HBase could be supported if tombstones could somehow be
invalidated, in effect an undelete operation. This could be accomplished by adding support
for tombstones that cover delete markers (a tombstone for a tombstone). This would complicate
major compaction but otherwise touch little.
A typical use case might be "resurrect any data deleted from _ts1_ to _ts2_", say a period of
4 hours during which an application error was in effect. In this case a new write would be
issued to the table: a tombstone covering any deletes over that period of time. Users would
defer major compactions until safe checkpoint periods.
> Such guarantees could optionally be extended to the memstore by using tombstones there
as well. But it would probably be sufficient to provide guidance that forcing a flush is
necessary to ensure edits are persisted in a way that allows for undeletion.
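
The undelete idea in the description above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not HBase code: `ToyColumn`, `undelete`, and `major_compact` are hypothetical names, a delete tombstone at timestamp `ts` is assumed to mask every put with a timestamp at or below `ts`, and an undelete marker is modeled as a range `(ts1, ts2)` that invalidates any delete tombstone whose timestamp falls inside it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyColumn:
    """Toy history of a single column (hypothetical model, not an HBase API)."""
    puts: List[Tuple[int, str]] = field(default_factory=list)       # (timestamp, value)
    deletes: List[int] = field(default_factory=list)                # delete tombstone timestamps
    undeletes: List[Tuple[int, int]] = field(default_factory=list)  # (ts1, ts2) undelete ranges

    def put(self, ts: int, value: str) -> None:
        self.puts.append((ts, value))

    def delete(self, ts: int) -> None:
        # Assumption: the tombstone masks every put with timestamp <= ts.
        self.deletes.append(ts)

    def undelete(self, ts1: int, ts2: int) -> None:
        # "Tombstone for tombstones": invalidates deletes with ts1 <= ts <= ts2.
        self.undeletes.append((ts1, ts2))

    def _mask(self) -> int:
        # Highest timestamp among deletes not covered by any undelete range.
        active = [d for d in self.deletes
                  if not any(lo <= d <= hi for lo, hi in self.undeletes)]
        return max(active, default=-1)

    def get(self) -> Optional[str]:
        # Newest put that no active delete tombstone masks.
        mask = self._mask()
        visible = [(ts, v) for ts, v in self.puts if ts > mask]
        return max(visible, key=lambda p: p[0])[1] if visible else None

    def major_compact(self) -> None:
        # Physically drop masked puts and all markers; undelete is impossible afterwards.
        mask = self._mask()
        self.puts = [(ts, v) for ts, v in self.puts if ts > mask]
        self.deletes = []
        self.undeletes = []


col = ToyColumn()
col.put(100, "a")
col.delete(200)          # erroneous delete
before = col.get()       # masked by the tombstone
col.undelete(150, 250)   # resurrect anything deleted from ts1=150 to ts2=250
after = col.get()        # the put at 100 is visible again
```

In this model the undelete only works while the masked put still exists, which is why the description suggests deferring major compactions until a safe checkpoint: once `major_compact` has run, a later `undelete` has nothing left to resurrect.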

This message was sent by Atlassian JIRA
