hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Liyin Tang (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-3636) a bug about deciding whether this key is a new key for the ROWCOL bloomfilter
Date Mon, 14 Mar 2011 18:33:29 GMT
a bug about deciding whether this key is a new key for the ROWCOL bloomfilter
-----------------------------------------------------------------------------

                 Key: HBASE-3636
                 URL: https://issues.apache.org/jira/browse/HBASE-3636
             Project: HBase
          Issue Type: Bug
          Components: regionserver
            Reporter: Liyin Tang


When ROWCOL bloomfilter needs to decide whether this key is a new key or not,
it will call the matchingRowColumn function, which will compare the timestamp offset between
this kv and last kv.
But when checking the timestamp offset, it didn't deduct the original offset of the keyvalue
itself.

For example, when 2 keyvalue objects have the same row key and col key, but from different
storefiles. It is highly likely that these 2 keyvalue objects have different offset value.
So the timestamp offset of these 2 objects are totally different. They will be regard as new
keys to add into bloomfilters.
So after compaction, the key count of bloomfilter will increase immediately, which is almost
equal to the number of entries.

The solution is straightforward. Just compare the relevant timestamp offset, which is the
timestamp offset - key_value offset.

This also may explain this jira: https://issues.apache.org/jira/browse/HBASE-3007

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message