hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-16223) Drop duplicated delete markers in minor compaction
Date Wed, 13 Jul 2016 07:44:20 GMT
Duo Zhang created HBASE-16223:

             Summary: Drop duplicated delete markers in minor compaction
                 Key: HBASE-16223
                 URL: https://issues.apache.org/jira/browse/HBASE-16223
             Project: HBase
          Issue Type: Improvement
            Reporter: Duo Zhang

We have recently been hit by this problem. One of our customers deletes the same row many times
(the record is about 100,000 times), which causes scans to time out.

For now we trigger a major compaction every day to drop the duplicated delete markers. But this
is not a good solution, since the cost of major compaction grows as the data gets larger.

In fact, I think only the newest delete marker is useful (if maxversions = 1), so we could
retain only that delete marker when doing a minor compaction.
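
A minimal sketch of the idea, not HBase's actual compaction code: assuming maxversions = 1 and
cells already sorted by row and then newest timestamp first, keep the first row-level delete
marker seen for each row and skip the older ones. The names DeleteMarkerDedup, SimpleCell, Type
and dropDuplicateDeleteMarkers are hypothetical and exist only for this illustration; the sketch
ignores column-level deletes and settings such as KEEP_DELETED_CELLS.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model. This is NOT HBase's Cell or compaction API.
class DeleteMarkerDedup {
    enum Type { PUT, DELETE_FAMILY }

    static final class SimpleCell {
        final String row;
        final long timestamp;
        final Type type;

        SimpleCell(String row, long timestamp, Type type) {
            this.row = row;
            this.timestamp = timestamp;
            this.type = type;
        }
    }

    // Assumes the input is sorted by row, then by timestamp descending,
    // so the first delete marker seen for a row is the newest one.
    static List<SimpleCell> dropDuplicateDeleteMarkers(List<SimpleCell> sorted) {
        List<SimpleCell> kept = new ArrayList<>();
        String currentRow = null;
        boolean sawDelete = false;
        for (SimpleCell cell : sorted) {
            if (!cell.row.equals(currentRow)) {
                // New row: reset the per-row state.
                currentRow = cell.row;
                sawDelete = false;
            }
            if (cell.type == Type.DELETE_FAMILY) {
                if (sawDelete) {
                    continue; // older marker, already covered by the newest one
                }
                sawDelete = true;
            }
            // Puts are always kept here; a minor compaction does not see all
            // files, so this sketch only removes redundant markers.
            kept.add(cell);
        }
        return kept;
    }
}
```

With this approach, a row deleted 100,000 times would carry a single delete marker after the
files containing those markers are compacted together, instead of waiting for a major compaction.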
