hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-16095) Add priority to TableDescriptor and priority region open thread pool
Date Sat, 09 Jul 2016 18:09:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369222#comment-15369222
] 

Andrew Purtell edited comment on HBASE-16095 at 7/9/16 6:08 PM:
----------------------------------------------------------------

bq.  I feel that the priority region opening might still be useful for other contexts as well
(like opening framework level tables sooner), hence we should still pursue this.

This makes sense IMHO.
This is not per-request priorities (HBASE-15816), and we may have an (orthogonal) use case
for those someday, but it's also possible this change alone is useful enough to Phoenix: assign
index tables a higher priority than the primary table, and that may solve their deadlock challenges.


Not thrilled about the new event type, but the alternatives seem worse IMHO:
{code}
M_RS_OPEN_PRIORITY_REGION          (26, ExecutorType.RS_OPEN_PRIORITY_REGION),
{code}

We handle priority opens of META with its own pool, and will have another priority pool after
this change. The two priority pools aren't aware of each other and may contend for the same
resources. Offhand I don't have a thought on how to do better.

In general we are approaching this in an ad hoc manner with static pools, predefined and limited
QoS levels, and magic constants. Why not use dynamic dispatch? Allocate a handler pool
for every distinct priority level supplied by htd.getPriority(). Such pools could have only
one core thread and a configurable upper bound on pool size, with the additional workers terminating
after, say, one minute of inactivity. Use the same approach for dispatch to META: META handlers
would become just a pool allocated at the highest priority level. The downside is more work
at dispatch than a simple test-and-branch with precompiled constants. Just a thought.
{code}
// Tables at or above ADMIN_QOS get the priority open handler; all others get the regular one.
if (htd.getPriority() >= HConstants.ADMIN_QOS) {
  regionServer.service.submit(new OpenPriorityRegionHandler(
    regionServer, regionServer, region, htd, masterSystemTime));
} else {
  regionServer.service.submit(new OpenRegionHandler(
    regionServer, regionServer, region, htd, masterSystemTime));
}
{code}
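
To make the dynamic dispatch idea above concrete, here is a minimal Java sketch. It is not code from the patch: PriorityDispatcher and all of its methods are hypothetical names, and it uses only java.util.concurrent. One caveat: a ThreadPoolExecutor backed by an unbounded queue never grows past its core size, so the sketch sets the core size equal to the upper bound and lets idle core threads time out after one minute. An idle pool therefore shrinks to zero threads instead of keeping one core thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one lazily created handler pool per distinct priority level. */
class PriorityDispatcher {
  private final ConcurrentMap<Integer, ThreadPoolExecutor> pools = new ConcurrentHashMap<>();
  private final int maxThreadsPerPool; // assumed to come from configuration

  PriorityDispatcher(int maxThreadsPerPool) {
    this.maxThreadsPerPool = maxThreadsPerPool;
  }

  /** Creates a bounded pool whose idle workers exit after one minute. */
  private ThreadPoolExecutor newPool(int priority) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        maxThreadsPerPool, maxThreadsPerPool, 60L, TimeUnit.SECONDS,
        new LinkedBlockingQueue<Runnable>());
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  /** Runs the handler on the pool for its priority, creating the pool on first use. */
  void submit(int priority, Runnable handler) {
    pools.computeIfAbsent(priority, this::newPool).execute(handler);
  }

  /** Number of distinct priority levels seen so far. */
  int poolCount() {
    return pools.size();
  }

  /** Stops accepting work and waits for queued handlers; false on timeout or interrupt. */
  boolean shutdownAndAwait(long seconds) {
    pools.values().forEach(ThreadPoolExecutor::shutdown);
    try {
      for (ThreadPoolExecutor pool : pools.values()) {
        if (!pool.awaitTermination(seconds, TimeUnit.SECONDS)) {
          return false;
        }
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Under this sketch a caller would do something like dispatcher.submit(htd.getPriority(), handler), and META opens would simply use the highest priority value. The map lookup on every submit is the extra dispatch cost mentioned above.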


was (Author: apurtell):
bq.  I feel that the priority region opening might still be useful for other contexts as well
(like opening framework level tables sooner), hence we should still pursue this.

This makes sense IMHO.
This is not per-request priorities (HBASE-15816), and we may have an (orthogonal) use case
for those someday, but it's also possible this change alone is useful enough to Phoenix: assign
index tables a higher priority than the primary table, and that may solve their deadlock challenges.


Not thrilled about the new event type, but the alternatives seem worse IMHO:
{code}
M_RS_OPEN_PRIORITY_REGION          (26, ExecutorType.RS_OPEN_PRIORITY_REGION),
{code}

We handle priority opens of META with its own pool, and will have another priority pool after
this change. The two priority pools aren't aware of each other and may execute concurrently.


In general we are approaching this in an ad hoc manner with static pools, predefined and limited
QoS levels, and magic constants. Why not use dynamic dispatch? Allocate a handler pool
for every distinct priority level supplied by htd.getPriority(). Such pools could have only
one core thread and a configurable upper bound on pool size, with the additional workers terminating
after, say, one minute of inactivity. Use the same approach for dispatch to META: META handlers
would become just a pool allocated at the highest priority level. The downside is more work
at dispatch than a simple test-and-branch with precompiled constants. Just a thought.
{code}
// Tables at or above ADMIN_QOS get the priority open handler; all others get the regular one.
if (htd.getPriority() >= HConstants.ADMIN_QOS) {
  regionServer.service.submit(new OpenPriorityRegionHandler(
    regionServer, regionServer, region, htd, masterSystemTime));
} else {
  regionServer.service.submit(new OpenRegionHandler(
    regionServer, regionServer, region, htd, masterSystemTime));
}
{code}

> Add priority to TableDescriptor and priority region open thread pool
> --------------------------------------------------------------------
>
>                 Key: HBASE-16095
>                 URL: https://issues.apache.org/jira/browse/HBASE-16095
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: hbase-16095_v0.patch, hbase-16095_v1.patch, hbase-16095_v2.patch
>
>
> This is in a similar area to HBASE-15816, and also required with the current secondary
indexing for Phoenix. 
> The problem with Phoenix secondary indexes is that data table regions depend on index regions
to be able to make progress. Possible distributed deadlocks can be prevented via custom RpcScheduler
+ RpcController configuration via HBASE-11048 and PHOENIX-938. However, region opening also
has the same deadlock situation, because data region open has to replay the WAL edits to the
index regions. There is only 1 thread pool to open regions, with 3 workers by default. So if
the cluster is recovering / restarting from scratch, the deadlock happens because some index
regions cannot be opened: they sit in the same queue behind data regions that are waiting to open
(and those are waiting on RPCs to index regions which are not open). This is reproduced in almost
all Phoenix secondary index clusters (mutable table w/o transactions) that we see. 
> The proposal is to have a "high priority" region opening thread pool, and have the HTD
carry the relative priority of a table. This may be useful for other "framework" level tables
from Phoenix, Tephra, Trafodion, etc. if they want some specific tables to become online faster.

> As a follow up patch, we can also take a look at how this priority information can be
used by the rpc scheduler on the server side or rpc controller on the client side, so that
we do not have to set priorities manually per-operation. 
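
The queueing deadlock described in the issue can be reproduced in miniature with plain java.util.concurrent. The following is only an illustrative sketch, not HBase code: RegionOpenDeadlockDemo and its methods are hypothetical, a "data region open" is modeled as a task that blocks until an "index region open" task has run, and the pool size of 3 mirrors the default mentioned above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of region opens: data opens block until the index region is open. */
class RegionOpenDeadlockDemo {

  /** Returns true if every simulated data-region open finished within the timeout. */
  static boolean run(ExecutorService dataPool, ExecutorService indexPool,
                     int dataRegions, long timeoutMs) {
    CountDownLatch indexOpen = new CountDownLatch(1);
    CountDownLatch dataOpened = new CountDownLatch(dataRegions);
    // Data-region opens are queued first; each waits for the index region to come online.
    for (int i = 0; i < dataRegions; i++) {
      dataPool.execute(() -> {
        try {
          indexOpen.await();
          dataOpened.countDown();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    // The index-region open is queued last.
    indexPool.execute(indexOpen::countDown);
    try {
      return dataOpened.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      dataPool.shutdownNow();
      indexPool.shutdownNow();
    }
  }

  /** One pool of 3 workers for everything: the index open is stuck behind the data opens. */
  static boolean sharedPool() {
    ExecutorService shared = Executors.newFixedThreadPool(3);
    return run(shared, shared, 3, 500);
  }

  /** A separate priority pool lets the index open run, so the data opens can finish. */
  static boolean separatePools() {
    return run(Executors.newFixedThreadPool(3), Executors.newFixedThreadPool(1), 3, 5000);
  }
}
```

In this model sharedPool() times out because all three workers are held by data opens while the index open waits in the queue behind them, whereas separatePools() completes. The timeouts are arbitrary values chosen for the demo.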



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
