hbase-issues mailing list archives

From "Lars Hofhansl (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4536) Allow CF to retain deleted rows
Date Sun, 16 Oct 2011 05:56:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128355#comment-13128355 ]

Lars Hofhansl commented on HBASE-4536:
--------------------------------------

Fair enough. Was picking up on Stack's suggestion to have this on by default. Just means the
code has to distinguish between minor and major compaction scans, raw scans, and normal user
scans, all of which can be for a store with keep_delete_cells enabled or not.

Thinking about a ScanConfig (or ScanInfo) as a static inner class of Store. That would capture
all immutable scan-relevant information about the Store (min/max versions, family name, ttl,
keep_deletes, comparator). (A MatcherConfig with all information would need to be mutable
and created or changed for every scan.)
And then maybe a ScanType enum to distinguish between compaction scans and user-initiated
scans.
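
A rough sketch of that idea, to make the split concrete. This is not actual HBase code: the
class name StoreSketch, the field names, the getter names and the ScanType constants are all
assumptions made for illustration. The per-scan "raw" flag is deliberately left out, since it
would belong to the individual scan rather than to the store.

```java
import java.util.Comparator;

// Hypothetical stand-in for Store; names below are illustrative, not HBase's API.
class StoreSketch {

    /** Who initiated the scan. The constant names are assumptions. */
    enum ScanType { MINOR_COMPACT, MAJOR_COMPACT, USER_SCAN }

    /** Immutable scan-relevant settings of a store, built once and shared by every scan. */
    static final class ScanInfo {
        private final byte[] family;
        private final int minVersions;
        private final int maxVersions;
        private final long ttl;
        private final boolean keepDeletedCells;
        private final Comparator<byte[]> comparator;

        ScanInfo(byte[] family, int minVersions, int maxVersions, long ttl,
                 boolean keepDeletedCells, Comparator<byte[]> comparator) {
            // Copy the array so later changes by the caller cannot leak in.
            this.family = family.clone();
            this.minVersions = minVersions;
            this.maxVersions = maxVersions;
            this.ttl = ttl;
            this.keepDeletedCells = keepDeletedCells;
            this.comparator = comparator;
        }

        // Getters only: there are no setters, so instances never change after construction.
        byte[] getFamily() { return family.clone(); }
        int getMinVersions() { return minVersions; }
        int getMaxVersions() { return maxVersions; }
        long getTtl() { return ttl; }
        boolean getKeepDeletedCells() { return keepDeletedCells; }
        Comparator<byte[]> getComparator() { return comparator; }
    }
}
```

With something like this, a scanner would receive the shared ScanInfo plus a ScanType per
scan, instead of a mutable config object rebuilt for every scan.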

What about Stack's suggestion in the review to include delete cells in the version count? (The
only strange part with that is that the family markers are *always* at the beginning).
Right now a delete cell does not increase the version count and instead "inherits" the version
of the last put.
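
To illustrate the two counting rules, here is a toy sketch. It is a simplified model, not the
matcher code: the class VersionCountSketch, the CellType enum and assignVersions are
hypothetical names, and the cells of one column are assumed to arrive newest first.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of version counting for one column; not actual HBase code.
class VersionCountSketch {

    enum CellType { PUT, DELETE }

    /**
     * Returns the version number assigned to each cell, in input order.
     * With countDeletes == false a delete cell reuses the count of the last put seen
     * (0 if no put has been seen yet); with countDeletes == true it takes a version
     * of its own, like a put.
     */
    static List<Integer> assignVersions(List<CellType> cells, boolean countDeletes) {
        List<Integer> versions = new ArrayList<>();
        int count = 0;
        for (CellType cell : cells) {
            if (cell == CellType.PUT || countDeletes) {
                count++;
            }
            versions.add(count);
        }
        return versions;
    }
}
```

For the sequence PUT, DELETE, PUT this gives 1, 1, 2 under the current rule and 1, 2, 3 if
delete cells are counted, so with a maximum of two versions the older put would survive only
under the current rule.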

                
> Allow CF to retain deleted rows
> -------------------------------
>
>                 Key: HBASE-4536
>                 URL: https://issues.apache.org/jira/browse/HBASE-4536
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number of versions.
> However, if a client deletes a row, all versions older than the delete tombstone will
> be removed at the next major compaction (and even at memstore flush - see HBASE-4241).
> There should be a way to retain those versions to guard against software error.
> I see two options here:
> 1. Add a new flag to HColumnDescriptor, something like "RETAIN_DELETED".
> 2. Fold this into the parent change, i.e. keep the minimum number of versions
> even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent (from a
> user viewpoint).
> Comments? Any other options?


        
