hbase-issues mailing list archives

From "Prakash Khemani (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4721) Retain Delete Markers after Major Compaction
Date Mon, 07 Nov 2011 20:46:52 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145785#comment-13145785 ]

Prakash Khemani commented on HBASE-4721:

I started out thinking that I would assign TTLs to delete markers independently of the puts.
But I have now realized that it doesn't make sense to give the delete markers a different TTL.
(Giving the delete markers a smaller TTL than the puts would be incorrect, because a marker
could expire while the puts it covers are still live, and those puts would become visible
again. Giving them a larger TTL than the puts would be pointless, because the delete markers
would then only be covering puts that have already expired.)
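
The TTL argument above can be checked with a small toy model. This is only a sketch under
stated assumptions, not HBase code: the function name, the integer timestamps, and the rule
that a cell expires once its age reaches its TTL are all made up for illustration.

```python
def visible_puts(put_timestamps, marker_ts, put_ttl, marker_ttl, now):
    """Return the put timestamps a reader would see at time `now`.

    Toy rules (assumptions, not HBase semantics):
      * a cell (put or delete marker) is expired once now - ts >= ttl
      * a live delete marker hides every put with ts <= marker_ts
    """
    marker_alive = (now - marker_ts) < marker_ttl
    seen = []
    for ts in put_timestamps:
        if now - ts >= put_ttl:
            continue  # the put itself has expired
        if marker_alive and ts <= marker_ts:
            continue  # hidden by the delete marker
        seen.append(ts)
    return seen


# One put at ts=10, deleted by a marker at ts=20; puts live for 100 ticks.
# Marker TTL smaller than the put TTL: the marker dies first and the put comes back.
resurrected = visible_puts([10], marker_ts=20, put_ttl=100, marker_ttl=50, now=80)

# Marker TTL equal to or larger than the put TTL: the put never comes back,
# and the larger TTL buys nothing over the equal one.
same_ttl = [visible_puts([10], 20, 100, 100, now) for now in range(20, 700)]
larger_ttl = [visible_puts([10], 20, 100, 500, now) for now in range(20, 700)]
```

In this model the smaller marker TTL lets the deleted put reappear, while the equal and larger
TTLs give identical results at every point in time.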

HBASE-4536 will work, but only if the keep-deleted-kvs flag is set on the column family (or is
it the table?). Do you think it makes sense to make it the default behavior that, regardless
of whether point-in-time queries are supported, major compaction does not remove the
delete markers? A delete marker would then be removed only when it expires or when enough put
versions accumulate before it.

Concerns that people have raised about no longer removing all delete markers in a major compaction:
(1) Space wastage. I am not sure this is a big concern.
(2) The bigger issue is that the user will never be able to insert a Put beyond the delete
marker. Today, if the user makes a mistake, the admin can go in, delete the puts, run a
major compaction, and then the user can reinsert the correct Puts. This workflow will no longer
be possible if we keep delete markers even after major compaction.
(3) Today the user doesn't even know that there are delete markers. That will have to
change if we start keeping delete markers beyond major compactions.

I don't understand the reasoning for why we need to keep deleted puts when syncing logs from one
cluster to another. The problem that I am concerned about is the following:

(1) A delete marker arrives from the source cluster.
(2) A major compaction happens on the target cluster and gets rid of the delete marker.
(3) The deleted put arrives from the source cluster. Because the delete marker is no longer
there, this put becomes visible on the target cluster.
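
The three steps above can be replayed against a toy in-memory store. This is a minimal sketch,
assuming a hypothetical `ToyStore` class with a `retain_markers` switch; none of these names
come from HBase, and the model ignores versions, TTLs, and column families.

```python
class ToyStore:
    """Toy target cluster: puts plus row-level delete markers (not HBase code)."""

    def __init__(self, retain_markers):
        self.retain_markers = retain_markers  # hypothetical switch for this sketch
        self.puts = {}     # (row, ts) -> value
        self.markers = {}  # row -> newest delete-marker timestamp

    def apply_put(self, row, ts, value):
        self.puts[(row, ts)] = value

    def apply_delete(self, row, ts):
        self.markers[row] = max(ts, self.markers.get(row, ts))

    def _covered(self, row, ts):
        # A put is hidden if a marker for its row has an equal or newer timestamp.
        return row in self.markers and ts <= self.markers[row]

    def major_compact(self):
        # Physically drop the puts that markers cover ...
        self.puts = {
            (row, ts): value
            for (row, ts), value in self.puts.items()
            if not self._covered(row, ts)
        }
        # ... and, unless markers are retained, drop the markers as well.
        if not self.retain_markers:
            self.markers.clear()

    def get(self, row):
        live = [
            (ts, value)
            for (r, ts), value in self.puts.items()
            if r == row and not self._covered(r, ts)
        ]
        return max(live)[1] if live else None


def replay_out_of_order(retain_markers):
    target = ToyStore(retain_markers)
    target.apply_delete("row1", ts=20)             # (1) delete marker arrives first
    target.major_compact()                         # (2) major compaction on the target
    target.apply_put("row1", ts=10, value="stale") # (3) the deleted put arrives late
    return target.get("row1")
```

With `retain_markers=False` the late put is readable on the target; with `retain_markers=True`
it stays hidden, which is the behavior this issue asks for.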

> Retain Delete Markers after Major Compaction
> --------------------------------------------
>                 Key: HBASE-4721
>                 URL: https://issues.apache.org/jira/browse/HBASE-4721
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Prakash Khemani
>            Assignee: Prakash Khemani
> There is a need to provide long TTLs for delete markers. This is useful when replicating
> HBase logs from one cluster to another. The receiving cluster shouldn't compact away the delete
> markers because the affected key-values might still be on the way.


