hbase-issues mailing list archives

From "Liu Shaohui (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-10118) Major compact keeps deletes with future timestamps
Date Tue, 25 Mar 2014 12:27:19 GMT

     [ https://issues.apache.org/jira/browse/HBASE-10118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Liu Shaohui updated HBASE-10118:

    Attachment: HBASE-10118-trunk-v1.diff

Agree with [~lhofhansl] 
When the special delete TTL is not set (or is set to 0), it should not have any effect.

[~jmspaggi] [~sershe]
If we add an option hbase.hstore.time.to.purge.future.deletes, it will conflict when the user
sets both the delete TTL and hbase.hstore.time.to.purge.future.deletes = true, because we don't
know when the future delete kvs were inserted or when to purge them.

So I think that if the special delete TTL is not set, we keep the previous behavior and future
delete kvs are purged during major compaction.

And when the special delete TTL is set, the future delete kvs are kept until
kv.timestamp() + delete TTL.
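
The rule proposed above can be sketched roughly as follows. This is only an illustration of the proposal, not the code in the attached patch: the class name, the method name, and the `deleteTtlMs` parameter are hypothetical, and treating "kept until" as a strict `now < timestamp + TTL` comparison is an assumption.

```java
// Hypothetical sketch of the proposed rule; none of these names come from HBase itself.
final class FutureDeleteTtlSketch {

    /**
     * Decides whether a delete marker with a future timestamp survives a major compaction.
     *
     * @param nowMs             current time in milliseconds
     * @param markerTimestampMs timestamp carried by the delete marker
     * @param deleteTtlMs       the "special delete TTL"; 0 or negative is treated as "not set"
     */
    static boolean keepFutureDelete(long nowMs, long markerTimestampMs, long deleteTtlMs) {
        if (deleteTtlMs <= 0) {
            // TTL not set: behave as before and purge the marker during major compaction.
            return false;
        }
        // TTL set: keep the marker until markerTimestamp + TTL (boundary choice is an assumption).
        return nowMs < markerTimestampMs + deleteTtlMs;
    }

    public static void main(String[] args) {
        System.out.println(keepFutureDelete(1000L, 5000L, 0L));    // TTL unset
        System.out.println(keepFutureDelete(1000L, 5000L, 2000L)); // before timestamp + TTL
        System.out.println(keepFutureDelete(7000L, 5000L, 2000L)); // at timestamp + TTL
    }
}
```

With the TTL unset the marker is always purged, which matches the "no effect" case above; with a TTL the decision depends only on the marker's own timestamp, so the insertion time of the marker is never needed.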

> Major compact keeps deletes with future timestamps
> --------------------------------------------------
>                 Key: HBASE-10118
>                 URL: https://issues.apache.org/jira/browse/HBASE-10118
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction, Deletes, regionserver
>            Reporter: Max Lapan
>            Priority: Minor
>         Attachments: HBASE-10118-trunk-v1.diff
> Hello!
> During migration from HBase 0.90.6 to 0.94.6 we found that major compaction now handles
> delete markers with timestamps in the future differently. Before HBASE-4721, major compaction
> purged deletes regardless of their timestamp. Newer versions keep them in the HFile until
> their timestamp is reached.
> I guess this happened due to the new check in ScanQueryMatcher:
> {{(EnvironmentEdgeManager.currentTimeMillis() - timestamp) <= timeToPurgeDeletes}}.
> This could be worked around by specifying a large negative value for the
> {{hbase.hstore.time.to.purge.deletes}} option but, unfortunately, negative values are
> pulled up to zero by Math.max in HStore.java.
> Maybe we are trying to do something weird by specifying a delete timestamp in the future,
> but HBASE-4721 definitely breaks old behaviour we rely on.
> Steps to reproduce this:
> {code}
> put 'test', 'delmeRow', 'delme:something', 'hello'
> flush 'test'
> delete 'test', 'delmeRow', 'delme:something', 1394161431061
> flush 'test'
> major_compact 'test'
> {code}
> Before major_compact we have two hfiles with the following:
> {code}
> first:
> K: delmeRow/delme:something/1384161431061/Put/vlen=5/ts=0
> second:
> K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
> {code}
> After major compact we get the following:
> {code}
> K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0
> {code}
> In our installation, we resolved this by removing Math.max and setting
> hbase.hstore.time.to.purge.deletes to Integer.MIN_VALUE, which purges delete markers, and it
> looks like a solution. But maybe there is a better approach.
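
The check and the clamp described in the quoted report can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the HBase source: the class and method names are hypothetical, and only the comparison and the Math.max clamp are taken from the report.

```java
// Hypothetical model of the behaviour described in the report; not HBase code.
final class DeleteMarkerPurgeSketch {

    // Models the clamp the report attributes to HStore.java: negative settings become 0.
    static long clampTimeToPurge(long configuredMs) {
        return Math.max(configuredMs, 0L);
    }

    // Models the reported condition: the marker is kept while (now - timestamp) <= threshold.
    static boolean keepsDeleteMarker(long nowMs, long markerTimestampMs, long timeToPurgeMs) {
        return (nowMs - markerTimestampMs) <= timeToPurgeMs;
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        long now = 10 * hour;      // arbitrary "current time" for illustration
        long past = now - hour;    // marker one hour in the past
        long future = now + hour;  // marker one hour in the future

        // Threshold 0: a past marker has a positive age and is purged.
        System.out.println(keepsDeleteMarker(now, past, 0L));
        // Threshold 0: a future marker has a negative age, so the check always keeps it.
        System.out.println(keepsDeleteMarker(now, future, 0L));
        // A negative setting is clamped to 0, so the future marker is still kept.
        System.out.println(keepsDeleteMarker(now, future, clampTimeToPurge(-2 * hour)));
        // Without the clamp, the same negative threshold lets the marker be purged.
        System.out.println(keepsDeleteMarker(now, future, -2 * hour));
    }
}
```

In this model a negative threshold only purges markers whose timestamps lie no further in the future than the threshold's magnitude; a marker further ahead than that would still satisfy the comparison and be kept.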

This message was sent by Atlassian JIRA
