hbase-dev mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-12311) Version stats in HFiles?
Date Sat, 04 Jul 2015 00:06:05 GMT

     [ https://issues.apache.org/jira/browse/HBASE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl resolved HBASE-12311.
-----------------------------------
    Resolution: Invalid

This is no longer needed. I added much better heuristics now to decide when we should SEEK
and when we should SKIP.

> Version stats in HFiles?
> ------------------------
>
>                 Key: HBASE-12311
>                 URL: https://issues.apache.org/jira/browse/HBASE-12311
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>         Attachments: 12311-indexed-0.98-v2.txt, 12311-indexed-0.98.txt, 12311-v2.txt, 12311-v3.txt, 12311.txt, CellStatTracker.java
>
>
> In HBASE-9778 I basically punted to the user the decision on whether to do repeated scanner.next() calls instead of issuing (re)seeks.
> I think we can do better.
> One way to do that is to maintain simple stats on the maximum number of versions we've seen for any row/col combination and store these in the HFile's metadata (just like the timerange, oldest Put, etc).
> Then we can estimate fairly accurately whether we have to expect lots of versions (i.e. seeking between columns is better) or not (in which case we'd issue repeated next() calls).
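
The idea quoted above could be sketched roughly as follows. This is only an illustration of the brainstormed approach, not code from HBase or from the attached patches: the class VersionStatsSketch, the MaxVersionsTracker helper, the Hint enum, and the seekThreshold parameter are all hypothetical names, and the sketch assumes cells reach the writer in sorted order so that all versions of one row/col are adjacent.

```java
import java.util.Arrays;

/** Hypothetical sketch of per-file version stats; not part of HBase. */
public class VersionStatsSketch {

    /** Tracks the maximum number of versions seen for any row/col while a file is written. */
    static final class MaxVersionsTracker {
        private byte[] lastRow;
        private byte[] lastQualifier;
        private int currentRun;   // versions seen so far for the current row/col
        private int maxVersions;  // largest run seen in this file

        /** Call once per cell, in the sorted order in which cells are appended. */
        void append(byte[] row, byte[] qualifier) {
            boolean sameColumn = lastRow != null
                    && Arrays.equals(row, lastRow)
                    && Arrays.equals(qualifier, lastQualifier);
            currentRun = sameColumn ? currentRun + 1 : 1;
            maxVersions = Math.max(maxVersions, currentRun);
            // Copy the arrays in case the caller reuses its buffers.
            lastRow = row.clone();
            lastQualifier = qualifier.clone();
        }

        /** Value that would be stored in the file's metadata (assumption). */
        int getMaxVersions() {
            return maxVersions;
        }
    }

    /** The two scanner strategies discussed in the issue. */
    enum Hint { SEEK, SKIP }

    /**
     * Picks a strategy from the stored stat. The threshold is an assumed
     * tuning knob: many versions per column favor seeking to the next
     * column, few versions favor repeated next() calls.
     */
    static Hint chooseHint(int maxVersionsInFile, int seekThreshold) {
        return maxVersionsInFile > seekThreshold ? Hint.SEEK : Hint.SKIP;
    }
}
```

With such a stat in the file metadata, a scanner could call chooseHint once per file instead of asking the user to decide. Note that the issue was ultimately resolved as Invalid in favor of different heuristics, so this shows only the original proposal.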



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
