hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-3767) Cache the number of RS in HTable
Date Wed, 13 Apr 2011 23:12:05 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-3767:
--------------------------------------

    Attachment: HBASE-3767.patch

The way we currently set up the ThreadPoolExecutor (TPE) is what its javadoc calls "unbounded queues":

{quote}
Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined
capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy.
Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize
therefore doesn't have any effect.) This may be appropriate when each task is completely independent
of others, so tasks cannot affect each others execution; for example, in a web page server.
While this style of queuing can be useful in smoothing out transient bursts of requests, it
admits the possibility of unbounded work queue growth when commands continue to arrive on
average faster than they can be processed.
{quote}

The important part is that no more than corePoolSize threads will ever be created:
maximumPoolSize is ignored and any additional tasks are simply queued. This is why, in that
setup, it's important to know the number of region servers: corePoolSize has to match it if
you want maximum parallelism.
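
To make the effect of an unbounded queue concrete, here is a minimal sketch. The class and method names are made up for illustration and this is not the code from HTable or from the attached patch. It submits more blocking tasks than corePoolSize and reports the largest number of threads the pool ever had:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of HBase.
public class UnboundedQueueDemo {

    /** Submits `tasks` blocking tasks and returns the peak number of pool threads. */
    static int largestPoolSize(int corePoolSize, int maximumPoolSize, int tasks) {
        // LinkedBlockingQueue without a capacity is the "unbounded queue" case.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maximumPoolSize, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return largest;
    }

    public static void main(String[] args) {
        // 8 tasks, corePoolSize 2, maximumPoolSize 10: only 2 threads are created.
        System.out.println(largestPoolSize(2, 10, 8)); // prints 2
    }
}
```

Even though maximumPoolSize is 10 here, the other six tasks sit in the queue instead of getting their own threads.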

With the "direct handoff" strategy instead, a new thread is created whenever a task is
submitted and no idle thread is available to take it, so the number of threads naturally
grows up to the number of region servers, even if that number changes. From the javadoc:

{quote}
Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off
tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail
if no threads are immediately available to run it, so a new thread will be constructed. This
policy avoids lockups when handling sets of requests that might have internal dependencies.
Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted
tasks. This in turn admits the possibility of unbounded thread growth when commands continue
to arrive on average faster than they can be processed.
{quote}

We will never suffer from the unbounded thread growth described in that last sentence, since
HCM (HConnectionManager) only creates as many Runnables as there are region servers hosting
the regions we need to talk to.
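
For comparison, a minimal sketch of the direct handoff setup. Again the names are hypothetical, and the corePoolSize of 1 and the 60 second keep-alive are assumptions chosen for the demo rather than values taken from the patch. Each blocking task stands in for one Runnable talking to one region server:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of HBase.
public class DirectHandoffDemo {

    /** Submits `tasks` blocking tasks and returns the peak number of pool threads. */
    static int largestPoolSize(int tasks) {
        // SynchronousQueue holds no tasks, so a busy pool must start a new thread.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, Integer.MAX_VALUE, 60, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return largest;
    }

    public static void main(String[] args) {
        // 5 concurrent tasks: the pool grows to 5 threads without knowing that number up front.
        System.out.println(largestPoolSize(5)); // prints 5
    }
}
```

The pool size follows the number of concurrently submitted tasks, so it stays bounded as long as the caller bounds how many tasks it submits at once.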

> Cache the number of RS in HTable
> --------------------------------
>
>                 Key: HBASE-3767
>                 URL: https://issues.apache.org/jira/browse/HBASE-3767
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.2
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.90.3
>
>         Attachments: HBASE-3767.patch
>
>
> When creating a new HTable we have to query ZK to learn about the number of region servers
> in the cluster. That is done for every single one of them. I think we should instead do it
> once per JVM and then reuse that number for all the others.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
