hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-792) Rewrite getClosestAtOrJustBefore; doesn't scale as currently written
Date Mon, 17 Aug 2009 20:24:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744183#action_12744183
] 

stack commented on HBASE-792:
-----------------------------

HBASE-1761 rewrote getclosestatorbefore.  The code is much cleaner and more focused on the target
key.  We can do this now because of axioms such as deletes only applying to the file
that follows.  It no longer carries around bulky Maps of candidates or of deletes (we now have
new-style deletes), so it should be more performant.

The one thing left to do is an early-out if we get an answer early in the processing -- in
memstore, say.  I tried to do this as part of HBASE-1761, but it only worked if the client asked for
the first row in a region.  We need to make getclosest, when run against a meta table, leverage
HRegionInfo: if the target row key falls between the start and end key of the region, the answer
is the one we want, so exit.
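
A minimal sketch of the range test such an early-out could rely on. This is not the HBase code: RegionRangeCheck, compare, containsRow and bytes are hypothetical names, and the sketch assumes the start key is inclusive, the end key is exclusive, and an empty end key means the region has no upper bound.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: decides whether a candidate
// region already covers the target row, so the search could stop early.
final class RegionRangeCheck {

    // Unsigned lexicographic comparison of two row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Assumption: start key inclusive, end key exclusive,
    // empty end key means "no upper bound".
    static boolean containsRow(byte[] startKey, byte[] endKey, byte[] row) {
        boolean atOrAfterStart = compare(row, startKey) >= 0;
        boolean beforeEnd = endKey.length == 0 || compare(row, endKey) < 0;
        return atOrAfterStart && beforeEnd;
    }

    // Convenience conversion for the examples below.
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Region covering [b, d): row "c" is inside, row "d" is not.
        System.out.println(containsRow(bytes("b"), bytes("d"), bytes("c")));
        System.out.println(containsRow(bytes("b"), bytes("d"), bytes("d")));
        // Last region: an empty end key accepts any row at or after the start.
        System.out.println(containsRow(bytes("b"), bytes(""), bytes("zzz")));
    }
}
```

If the candidate found in memstore passes this test for the target row, there is no need to consult the remaining store files.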

> Rewrite getClosestAtOrJustBefore; doesn't scale as currently written
> --------------------------------------------------------------------
>
>                 Key: HBASE-792
>                 URL: https://issues.apache.org/jira/browse/HBASE-792
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>         Attachments: 792.patch
>
>
> As currently written, as a table gets bigger, the number of rows .META. needs to keep
> count of grows.
> As written, our getClosestAtOrJustBefore goes through every storefile and in each picks
> up any row that could be a possible candidate for closest before.  It doesn't just get the
> closest from the storefile, but all keys that are closest before.  It's not selective because
> how can it tell at the store file level which of the candidates will survive deletes that
> are sitting in later store files or up in memcache?
> So, if a store file has keys 0-10 and we ask to get the row that is closest or just before
> 7, it returns rows 0-7, and so on per store file.
> Can get big and slow weeding out the key wanted.
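
To illustrate the cost described in the issue, here is a small sketch that uses a java.util.TreeSet of integers as a stand-in for one store file. ClosestBeforeSketch, allCandidates and closest are hypothetical names, not HBase code. It ignores deletes entirely, and deletes are the reason the original code could not simply take the single closest key per file.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only: a sorted set stands in for a store file.
final class ClosestBeforeSketch {

    // Old behaviour as described in the issue: every key at or before the
    // target is carried forward as a candidate.
    static NavigableSet<Integer> allCandidates(NavigableSet<Integer> storeFile, int target) {
        return storeFile.headSet(target, true);
    }

    // Selective lookup: only the single greatest key at or before the target.
    // Returns null when no such key exists.
    static Integer closest(NavigableSet<Integer> storeFile, int target) {
        return storeFile.floor(target);
    }

    public static void main(String[] args) {
        NavigableSet<Integer> storeFile = new TreeSet<>();
        for (int k = 0; k <= 10; k++) {
            storeFile.add(k);
        }
        System.out.println(allCandidates(storeFile, 7)); // [0, 1, 2, 3, 4, 5, 6, 7]
        System.out.println(closest(storeFile, 7));       // 7
    }
}
```

With keys 0-10 and a target of 7, the first method hands back eight candidates from this one file, while the second hands back one. The candidate count in the first case grows with the size of the file, and the work repeats for every store file.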

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

