hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2468) Improvements to prewarm META cache on clients
Date Wed, 09 Jun 2010 22:42:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877244#action_12877244 ]

HBase Review Board commented on HBASE-2468:

Message from: "Mingjie Lai" <mjlai09@gmail.com>

bq.  On 2010-06-07 14:23:42, stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java, line 96
bq.  > <http://review.hbase.org/r/98/diff/5/?file=944#file944line96>
bq.  >
bq.  >     getRowOrBefore is an expensive call.  Are we sure we are not calling this too

I agree it is an expensive call. 

However, I don't think it brings any performance penalty for existing or potential use cases:
Use case 1 -- existing MetaScanner users: since this method is newly added, existing users
won't be affected.
Use case 2 -- HBase clients locating a region:
1) If prefetch is on, the client calls this MetaScanner with the [table + row] combination, which
calls getRowOrBefore() to get the current region info and then scans a number of the following
regions from META. After that, the client can get the region info directly from its cache.
2) If prefetch is disabled (the current behavior), the client eventually calls the similar
method getClosestRowBefore() to get the desired region.

So whether prefetch is on or off, getRowOrBefore() (or getClosestRowBefore()) is eventually
called. The only difference is whether the following regions are also scanned from META.
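
To illustrate the argument, here is a minimal toy sketch in Java. It is not the patch and not the real HBase client code: the class MetaPrefetchSketch, the Region type, the prefetchLimit parameter and the counters are all hypothetical names made up for this example, and a sorted in-memory map with TreeMap.floorEntry() stands in for the getRowOrBefore() / getClosestRowBefore() call against META.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of client-side region lookup; NOT HBase's real client code. */
final class MetaPrefetchSketch {

    /** Hypothetical region descriptor covering [startKey, endKey); empty endKey = last region. */
    static final class Region {
        final String startKey;
        final String endKey;
        final String name;

        Region(String startKey, String endKey, String name) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.name = name;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    // Stand-in for META: regions sorted by start key.
    private final TreeMap<String, Region> meta = new TreeMap<>();
    // Client-side region cache, also sorted by start key.
    private final TreeMap<String, Region> cache = new TreeMap<>();
    // Number of following regions to cache on a miss; 0 models "prefetch disabled".
    private final int prefetchLimit;

    int expensiveLookups = 0; // counts getRowOrBefore-style calls
    int prefetchedRows = 0;   // counts extra rows read by the follow-up scan

    MetaPrefetchSketch(int prefetchLimit, Region... regions) {
        this.prefetchLimit = prefetchLimit;
        for (Region r : regions) {
            meta.put(r.startKey, r);
        }
    }

    /** Analogue of getRowOrBefore(): the greatest start key <= row. */
    private Region rowOrBefore(String row) {
        expensiveLookups++;
        Map.Entry<String, Region> e = meta.floorEntry(row);
        if (e == null) {
            throw new IllegalStateException("no region for row " + row);
        }
        return e.getValue();
    }

    Region locateRegion(String row) {
        // Cache hit: no trip to META at all.
        Map.Entry<String, Region> hit = cache.floorEntry(row);
        if (hit != null && hit.getValue().contains(row)) {
            return hit.getValue();
        }
        // Cache miss: one expensive lookup, with or without prefetch.
        Region found = rowOrBefore(row);
        cache.put(found.startKey, found);
        // With prefetch on, additionally scan the following regions into the cache.
        int fetched = 0;
        for (Region next : meta.tailMap(found.startKey, false).values()) {
            if (fetched >= prefetchLimit) {
                break;
            }
            cache.put(next.startKey, next);
            fetched++;
            prefetchedRows++;
        }
        return found;
    }

    public static void main(String[] args) {
        Region[] regions = {
            new Region("", "g", "r1"), new Region("g", "n", "r2"),
            new Region("n", "t", "r3"), new Region("t", "", "r4"),
        };
        MetaPrefetchSketch on = new MetaPrefetchSketch(2, regions);
        MetaPrefetchSketch off = new MetaPrefetchSketch(0, regions);
        for (String row : new String[] {"h", "p", "u"}) {
            on.locateRegion(row);
            off.locateRegion(row);
        }
        System.out.println("prefetch on:  expensive lookups = " + on.expensiveLookups);
        System.out.println("prefetch off: expensive lookups = " + off.expensiveLookups);
    }
}
```

In this toy model the first lookup costs exactly one expensive call in both modes. For the three rows looked up in main(), the prefetching client makes one expensive call in total, while the non-prefetching client makes three, one per cache miss.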

Future MetaScanner users that start a scan from the region containing a desired user-table row
have to pay this cost, since it is the expected behavior.

- Mingjie

> Improvements to prewarm META cache on clients
> ---------------------------------------------
>                 Key: HBASE-2468
>                 URL: https://issues.apache.org/jira/browse/HBASE-2468
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: Todd Lipcon
>            Assignee: Mingjie Lai
>             Fix For: 0.21.0
>         Attachments: HBASE-2468-trunk.patch
> A couple different use cases cause storms of reads to META during startup. For example,
> a large MR job will cause each map task to hit meta since it starts with an empty cache.
> A couple possible improvements have been proposed:
>  - MR jobs could ship a copy of META for the table in the DistributedCache
>  - Clients could prewarm cache by doing a large scan of all the meta for the table instead
>    of random reads for each miss
>  - Each miss could fetch ahead some number of rows in META

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
