hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3767) Cache the number of RS in HTable
Date Tue, 12 Apr 2011 18:24:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018969#comment-13018969 ]

Jean-Daniel Cryans commented on HBASE-3767:

bq. And if the number of region servers changes, are there repercussions?

Currently, once an HTable is created, its ThreadPoolExecutor stays the same size regardless
of changes in the number of region servers. Caching the count has the same behavior. Where it
differs is when an HTable is created after the number of region servers has changed, but running
with fewer threads than the total number of region servers is only less efficient in bulk load
situations where you need to insert into all of them at the same time (which I believe isn't
frequent when uploading, since you usually create the HTables up front). That's the only
repercussion I see, and it's still less bad than the following:

bq. Thats better than doing getCurrentNrHRS. Maybe 2* number of processors

So the reason we use the number of RS is to be able to insert into all the region servers
at the same time in a bulk upload case. Using the number of CPUs by itself isn't particularly
useful since uploading isn't CPU intensive on the client (it's just threads waiting on region
servers) and the fact that you usually have many HTables per JVM kinda defeats the purpose
of limiting the number of executors.

I personally like the fact that we try to learn how many RS there are in order to tune the
TPE; the problem is just that querying it every time is rather expensive and mostly useless.
I still believe we should just cache it.
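
To make the proposal concrete, here is a minimal sketch of the caching idea in plain Java. It is
not the actual HBase patch: the class name RegionServerCountCache, its methods, and the
IntSupplier standing in for the ZK lookup are all hypothetical, and the pool settings (60 second
keep-alive, unbounded queue) are assumptions chosen only for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Hypothetical helper, not part of the HBase API.
final class RegionServerCountCache {
    // Stands in for the expensive ZK query; an assumption for this sketch.
    private final IntSupplier expensiveLookup;
    // -1 means "not looked up yet".
    private volatile int cached = -1;

    RegionServerCountCache(IntSupplier expensiveLookup) {
        this.expensiveLookup = expensiveLookup;
    }

    // Runs the lookup at most once; later callers reuse the cached value,
    // even if the real number of region servers has changed since.
    int get() {
        int n = cached;
        if (n < 0) {
            synchronized (this) {
                n = cached;
                if (n < 0) {
                    // Never size a pool below one thread.
                    n = Math.max(1, expensiveLookup.getAsInt());
                    cached = n;
                }
            }
        }
        return n;
    }

    // Builds a pool with one thread per (cached) region server.
    ThreadPoolExecutor newPool() {
        int n = get();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                n, n, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        // Let idle threads die so many pools per JVM stay cheap.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

With one shared instance per JVM, every pool created after the first reuses the cached count, so
the lookup cost is paid once. The trade-off is the one described above: a pool created after the
cluster grows is still sized from the old count.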

> Cache the number of RS in HTable
> --------------------------------
>                 Key: HBASE-3767
>                 URL: https://issues.apache.org/jira/browse/HBASE-3767
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.2
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.90.3
> When creating a new HTable we have to query ZK to learn the number of region servers
> in the cluster. That is done for every single HTable; I think we should instead do it
> once per JVM and then reuse that number for all the others.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
