hbase-issues mailing list archives

From "Max Lapan (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-10118) Major compact keeps deletes with future timestamps
Date Tue, 10 Dec 2013 09:43:07 GMT
Max Lapan created HBASE-10118:

             Summary: Major compact keeps deletes with future timestamps
                 Key: HBASE-10118
                 URL: https://issues.apache.org/jira/browse/HBASE-10118
             Project: HBase
          Issue Type: Bug
          Components: Compaction, Deletes, regionserver
            Reporter: Max Lapan
            Priority: Minor


During migration from HBase 0.90.6 to 0.94.6 we found that major compaction now handles delete
markers with future timestamps differently. Before HBASE-4721, major compaction purged
deletes regardless of their timestamp. Newer versions keep them in the HFile until that timestamp has passed.

I guess this is due to the new check in ScanQueryMatcher: {{(EnvironmentEdgeManager.currentTimeMillis()
- timestamp) <= timeToPurgeDeletes}}.
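
To illustrate why a future-dated marker passes this check, here is a minimal sketch of the comparison in isolation. The class and method names are hypothetical and this is not the actual ScanQueryMatcher code; the "now" value is only an approximation of the date of this report.

```java
// Hypothetical stand-in for the retention check quoted above.
final class PurgeCheckSketch {

    // Returns true when the delete marker would be kept in the compacted file.
    static boolean keepDeleteMarker(long now, long deleteTimestamp, long timeToPurgeDeletes) {
        return (now - deleteTimestamp) <= timeToPurgeDeletes;
    }

    public static void main(String[] args) {
        long now = 1386668587000L;           // assumed current time, roughly 10 Dec 2013
        long futureDelete = 1394161431061L;  // delete timestamp from the steps to reproduce
        long pastDelete = now - 1000L;       // a delete issued one second ago

        // now - futureDelete is negative, so it is always <= 0: the marker is kept.
        System.out.println(keepDeleteMarker(now, futureDelete, 0L)); // true
        // now - pastDelete is 1000, which is > 0: the marker is purged.
        System.out.println(keepDeleteMarker(now, pastDelete, 0L));   // false
    }
}
```

With the default threshold of zero, any marker whose timestamp is ahead of the current time yields a negative difference and is therefore retained.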

This could be worked around by setting {{hbase.hstore.time.to.purge.deletes}} to a large negative
value, but, unfortunately, negative values are clamped to zero by Math.max in HStore.java.
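
A minimal sketch of the clamping effect, assuming the configured value is passed through Math.max(value, 0) before the check is applied. All names here are hypothetical, and the timestamps are illustrative only.

```java
// Hypothetical sketch: how clamping a negative threshold changes the outcome.
final class ClampSketch {

    // Assumed shape of the clamp described above: negative settings become zero.
    static long clampedTimeToPurge(long configured) {
        return Math.max(configured, 0L);
    }

    // Same comparison as the check quoted earlier in this report.
    static boolean keepDeleteMarker(long now, long deleteTimestamp, long timeToPurgeDeletes) {
        return (now - deleteTimestamp) <= timeToPurgeDeletes;
    }

    public static void main(String[] args) {
        long now = 1386668587000L;              // assumed current time
        long deleteTs = now + 3_600_000L;       // delete dated one hour in the future
        long configured = -86_400_000L;         // setting of minus one day

        // Clamped to 0: -3600000 <= 0, so the marker is kept.
        System.out.println(keepDeleteMarker(now, deleteTs, clampedTimeToPurge(configured))); // true
        // Unclamped: -3600000 <= -86400000 is false, so the marker is purged.
        System.out.println(keepDeleteMarker(now, deleteTs, configured));                     // false
    }
}
```

Under this form of the check, a negative threshold T only purges markers dated less than |T| milliseconds ahead of the current time.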

It is quite possible that we are doing something weird by specifying delete timestamps
in the future, but HBASE-4721 definitely breaks the old behaviour we rely on.

Steps to reproduce this:
put 'test', 'delmeRow', 'delme:something', 'hello'
flush 'test'
delete 'test', 'delmeRow', 'delme:something', 1394161431061
flush 'test'
major_compact 'test'

Before major_compact we have two hfiles with the following:
K: delmeRow/delme:something/1384161431061/Put/vlen=5/ts=0

K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0

After major compact we get the following:
K: delmeRow/delme:something/1394161431061/DeleteColumn/vlen=0/ts=0

In our installation, we resolved this by removing Math.max and setting hbase.hstore.time.to.purge.deletes
to Integer.MIN_VALUE, which purges delete markers.
