hbase-dev mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-12311) Version stats in HFiles?
Date Tue, 21 Oct 2014 17:57:34 GMT
Lars Hofhansl created HBASE-12311:
-------------------------------------

             Summary: Version stats in HFiles?
                 Key: HBASE-12311
                 URL: https://issues.apache.org/jira/browse/HBASE-12311
             Project: HBase
          Issue Type: Brainstorming
            Reporter: Lars Hofhansl


In HBASE-9778 I basically punted to the user the decision on whether to do repeated
scanner.next() calls instead of issuing (re)seeks.
I think we can do better.

One way to do that is to maintain simple stats on the maximum number of versions we've seen
for any row/col combination and store these in the HFile's metadata (just like the timerange,
oldest Put, etc).

Then we can estimate fairly accurately whether we have to expect lots of versions (i.e. seeking
between columns is better) or not (in which case we'd issue repeated next() calls).
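
As a rough illustration of the idea, here is a minimal sketch, not actual HBase code. The class name VersionStatsSketch, the method names, and the threshold parameter are all hypothetical. It assumes cells are handed to the tracker in sorted order (row, then column), so all versions of one row/col combination arrive adjacent to each other, and it assumes a simple "seek if the file may hold more than N versions per column" heuristic.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch, not HBase code: tracks the maximum number of versions
 * seen for any row/col combination while cells are written in sorted order.
 */
final class VersionStatsSketch {
    private byte[] lastRow;       // row of the column currently being counted
    private byte[] lastColumn;    // column currently being counted
    private long currentVersions; // versions seen so far for that row/col
    private long maxVersions;     // largest count seen for any row/col

    /** Call once per cell, in the sorted order in which cells are written. */
    void track(byte[] row, byte[] column) {
        boolean sameColumn = lastRow != null
                && Arrays.equals(row, lastRow)
                && Arrays.equals(column, lastColumn);
        if (sameColumn) {
            currentVersions++;
        } else {
            // New row/col combination: restart the count and remember the key.
            currentVersions = 1;
            lastRow = row.clone();
            lastColumn = column.clone();
        }
        maxVersions = Math.max(maxVersions, currentVersions);
    }

    /** The value that would be stored in the file's metadata when it is closed. */
    long maxVersions() {
        return maxVersions;
    }

    /**
     * Assumed heuristic: prefer seeking between columns when the file may hold
     * more versions per column than the given threshold, else repeated next().
     */
    static boolean preferSeek(long maxVersionsInFile, long threshold) {
        return maxVersionsInFile > threshold;
    }
}
```

A writer would call track() for every cell and persist maxVersions() alongside the other file metadata. A scanner would then read that value back and consult something like preferSeek() once per file. What a sensible threshold would be is left open here.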




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
