hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2468) Improvements to prewarm META cache on clients
Date Fri, 11 Jun 2010 23:29:16 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878137#action_12878137 ]

HBase Review Board commented on HBASE-2468:
-------------------------------------------

Message from: "Jonathan Gray" <jgray@apache.org>


bq.  On 2010-06-10 08:53:41, Jonathan Gray wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1079
bq.  > <http://review.hbase.org/r/98/diff/5/?file=943#file943line1079>
bq.  >
bq.  >     I guess these are static because of how HTables all share a single HCM per conf.
 The setting of prefetching is set at the HCM level not the HTable level, however clients
are usually not exposed to HCM and only deal with HTable.
bq.  >     
bq.  >     We should probably make it clear in the javadoc for these methods that they
apply to all HTable instances, though that may be clear from being static.
bq.  >     
bq.  >     Maybe since these are more advanced calls, they shouldn't be in HTable?  If
we provide proper documentation, it should be easy enough for a user to grab the HCM and apply
the config at that level?
bq.  
bq.  Mingjie Lai wrote:
bq.      > I guess these are static because of how HTables all share a single HCM per conf...
bq.      Yes. 
bq.      
bq.      > Maybe since these are more advanced calls, they shouldn't be in HTable?
bq.      Three alternatives:
bq.      1) HCM: as jgray said, ``however clients are usually not exposed to HCM and only
deal with HTable.''
bq.      2) HBaseAdmin: it is a more reasonable design choice since these operations are at
the HCM level. 
bq.      3) or, make it a configuration. It would be one global configuration applied to all
tables, and could not be changed dynamically. 
bq.      
bq.      I like 2) better, but not really sure whether we want to expose it there or not.

bq.      
bq.      What do you think?
bq.  
bq.  Jonathan Gray wrote:
bq.      Adding it to HBaseAdmin could make sense.  This one is a bit of an odd one because
it's a client-side configuration parameter done at the per-client-jvm level.  Typically we
have per-query or per-htable-instance configs.  HBaseAdmin is generally made up of remote
administration commands not local client config.
bq.      
bq.      If we provide sufficient javadoc (including as a class comment on HTable) it doesn't
matter so much where we put it.  Since it's distinct from what's currently in HTable and HBaseAdmin,
maybe it does make sense as a static in HCM?
bq.  
bq.  Todd Lipcon wrote:
bq.      I think keeping HConnectionManager an internal interface is a good idea, so kind
of -0 there. -1 on HBaseAdmin, since we should keep that for administrative functions that
really change something on the cluster. So I'd prefer HTable, but wouldn't cry over HCM.

Since we can knock it down to just two methods, get/set, let's just put it in HTable.

But it will be static, right, so people understand it's not per-instance?  Let's also make
sure the javadoc explains that the setting applies to all HTable instances for the tables
you configure.


- Jonathan
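
To make the conclusion above concrete, here is a minimal sketch of static get/set accessors on a table handle that delegate to shared connection-level state. It is an illustration only, not the patch under review: SketchTable, SketchConnectionManager, and every method name are hypothetical stand-ins, the default of "prefetch enabled" is an assumption, and the shared state is simplified to one map per JVM rather than one HCM per conf.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the shared connection manager (HCM).
// Simplification: one shared state per JVM instead of one HCM per conf.
final class SketchConnectionManager {
    // Table name -> prefetch flag, shared by every table handle in this JVM.
    private static final Map<String, Boolean> PREFETCH = new ConcurrentHashMap<>();

    private SketchConnectionManager() {}

    static void setPrefetch(String tableName, boolean enabled) {
        PREFETCH.put(tableName, enabled);
    }

    static boolean getPrefetch(String tableName) {
        // Assumed default: prefetch is on unless explicitly disabled.
        return PREFETCH.getOrDefault(tableName, Boolean.TRUE);
    }
}

// Hypothetical stand-in for HTable; only the two static accessors matter here.
final class SketchTable {
    private final String tableName;

    SketchTable(String tableName) {
        this.tableName = tableName;
    }

    /**
     * Enables or disables region-cache prefetch for the given table.
     * This method is static on purpose: the setting is stored at the
     * connection level and applies to ALL SketchTable instances for that
     * table, not only to one handle.
     */
    static void setPrefetchEnabled(String tableName, boolean enabled) {
        SketchConnectionManager.setPrefetch(tableName, enabled);
    }

    /** Returns the shared prefetch setting for the given table. */
    static boolean isPrefetchEnabled(String tableName) {
        return SketchConnectionManager.getPrefetch(tableName);
    }

    /** Instance view of the shared setting, used only for the demo below. */
    boolean usesPrefetch() {
        return isPrefetchEnabled(tableName);
    }

    public static void main(String[] args) {
        SketchTable a = new SketchTable("usertable");
        SketchTable b = new SketchTable("usertable");
        SketchTable.setPrefetchEnabled("usertable", false);
        // Both handles observe the same value because the state is shared.
        System.out.println(a.usesPrefetch() + " " + b.usesPrefetch());
    }
}
```

The javadoc on the setter carries the point raised in the review: a caller reading only the table class can see that the setting is not per-instance.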


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/98/#review176
-----------------------------------------------------------





> Improvements to prewarm META cache on clients
> ---------------------------------------------
>
>                 Key: HBASE-2468
>                 URL: https://issues.apache.org/jira/browse/HBASE-2468
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>            Reporter: Todd Lipcon
>            Assignee: Mingjie Lai
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2468-trunk.patch
>
>
> A couple of different use cases cause storms of reads to META during startup. For example,
a large MR job will cause each map task to hit META since it starts with an empty cache.
> A few possible improvements have been proposed:
>  - MR jobs could ship a copy of META for the table in the DistributedCache
>  - Clients could prewarm the cache by doing a large scan of all the META rows for the table instead
of random reads for each miss
>  - Each miss could fetch ahead some number of rows in META
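
The third proposal in the list above (fetch ahead on each miss) can be sketched with an in-memory stand-in for META. This is a toy model under stated assumptions, not HBase code: PrefetchingRegionCache, its methods, and the representation of META as a sorted map from region start key to server name are all hypothetical, and a region's end key is taken to be the next region's start key.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy client-side region cache that fetches ahead on a miss.
final class PrefetchingRegionCache {
    // Cached region: exclusive end key (null for the last region) and server.
    private static final class Region {
        final String endKey;
        final String server;

        Region(String endKey, String server) {
            this.endKey = endKey;
            this.server = server;
        }

        // Caller guarantees startKey <= rowKey, so only the end is checked.
        boolean contains(String rowKey) {
            return endKey == null || rowKey.compareTo(endKey) < 0;
        }
    }

    private final NavigableMap<String, String> meta;  // simulated META: start key -> server
    private final NavigableMap<String, Region> cache = new TreeMap<>();
    private final int prefetchLimit;                  // rows read from META per miss
    private int metaReads = 0;

    PrefetchingRegionCache(NavigableMap<String, String> meta, int prefetchLimit) {
        if (prefetchLimit < 1) {
            throw new IllegalArgumentException("prefetchLimit must be >= 1");
        }
        this.meta = meta;
        this.prefetchLimit = prefetchLimit;
    }

    String locate(String rowKey) {
        Map.Entry<String, Region> hit = cache.floorEntry(rowKey);
        if (hit != null && hit.getValue().contains(rowKey)) {
            return hit.getValue().server;  // cache hit, no META read
        }
        // Cache miss: one simulated META read that also pulls the next regions.
        metaReads++;
        String start = meta.floorKey(rowKey);
        if (start == null) {
            throw new IllegalStateException("no region covers row " + rowKey);
        }
        int fetched = 0;
        for (Map.Entry<String, String> row : meta.tailMap(start, true).entrySet()) {
            if (fetched++ == prefetchLimit) {
                break;
            }
            cache.put(row.getKey(), new Region(meta.higherKey(row.getKey()), row.getValue()));
        }
        return cache.get(start).server;
    }

    int metaReads() {
        return metaReads;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> meta = new TreeMap<>();
        meta.put("", "s1");
        meta.put("b", "s2");
        meta.put("d", "s3");
        PrefetchingRegionCache cache = new PrefetchingRegionCache(meta, 3);
        System.out.println(cache.locate("a") + " " + cache.locate("c") + " " + cache.locate("e"));
        System.out.println("META reads: " + cache.metaReads());
    }
}
```

With a prefetch limit of 1 this model degenerates to one META read per region touched, which is the random-read behaviour the issue describes; a larger limit trades a bigger read per miss for fewer misses when a client walks neighbouring regions.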

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

