hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16095) Add priority to TableDescriptor and priority region open thread pool
Date Tue, 19 Jul 2016 23:49:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15385076#comment-15385076 ]

Enis Soztutar commented on HBASE-16095:
---------------------------------------

bq. Then the "deadlock" here is entirely due to Phoenix's handling, and I don't think it's
something we should be trying to address with HBase. We've always said that doing blocking
operations in coprocessor hooks is bad practice. I don't think trying to paper over that for
a specific use-case here really helps HBase. Trying to impose ordering on operations in a
distributed system just adds complexity and problems.
Agreed. As I said above, we should fix that in Phoenix. However, that would be a big work item.
We will hopefully get to it sooner rather than later, but until we untangle that, we will have
to deal with the problem. Just this week, another user was writing a "secondary index"
coprocessor that issues updates in {{postPut()}}, causing the exact same deadlock condition as
in Phoenix. That approach has both the deadlock problem and the lack-of-failure-handling
problem. I feel that we have ignored the "secondary index" problem in HBase so far, but it is a
real issue; we should have at least a working guideline that works for all HBase users (with or
without Phoenix). 
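
To make the anti-pattern concrete, here is a minimal sketch against the 1.x coprocessor API of
the kind of hook that triggers this; the table and column family names are made up for the
example, and this is exactly the blocking-in-a-hook pattern to avoid:

{code:java}
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

/** Illustrative anti-pattern ONLY: a blocking cross-table write inside a coprocessor hook. */
public class NaiveIndexObserver extends BaseRegionObserver {

  @Override
  public void postPut(ObserverContext<RegionCoprocessorEnvironment> c, Put put,
      WALEdit edit, Durability durability) throws IOException {
    // Synchronous RPC to another table's region from a data-table handler (or WAL-replay)
    // thread. If the index region is not online yet -- e.g. during a cluster restart -- this
    // call blocks or retries while holding the very thread the cluster needs to finish
    // opening regions, which is the distributed deadlock described above.
    try (Table index = c.getEnvironment().getTable(TableName.valueOf("MY_INDEX"))) {
      Put indexPut = new Put(put.getRow());                        // toy index row key
      indexPut.addColumn(Bytes.toBytes("0"), Bytes.toBytes("q"), put.getRow());
      index.put(indexPut);                                         // blocking write
    }
  }
}
{code}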

bq. I think the better way for Phoenix to approach this is to fail the region open for the
data table if the required index region is not online yet. Yes, this is a problem with current
HBase, where regions that go into FAILED_OPEN are never retried by assignment manager.
Yep, we have to make it so that FAILED_OPEN is retried forever. However, one complication with
HBASE-16209 or similar solutions is that we do not know which index regions we will depend on
unless we start the recovery, compute all the index updates from the recovered WAL edits of the
data table, and find the regions that those updates belong to. Realistically, a single data
table region will depend on almost ALL of the index regions being available. That would mean we
would spend a lot of time starting recovery, failing, starting again, failing again, etc. 
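
For reference, the rough shape of the "fail the open" approach (a hypothetical observer with a
hard-coded index table name, not something from the attached patches) would be along these
lines; the caveat above about not knowing the dependent regions up front still applies:

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;

/** Hypothetical sketch: abort the data region open while its index regions are unassigned. */
public class FailFastIndexOpenObserver extends BaseRegionObserver {
  private static final TableName INDEX_TABLE = TableName.valueOf("MY_INDEX");

  @Override
  public void preOpen(ObserverContext<RegionCoprocessorEnvironment> c) throws IOException {
    Configuration conf = c.getEnvironment().getConfiguration();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         RegionLocator locator = conn.getRegionLocator(INDEX_TABLE)) {
      for (HRegionLocation loc : locator.getAllRegionLocations()) {
        if (loc == null || loc.getServerName() == null) {
          // Failing the open puts this region in FAILED_OPEN; that only helps once the
          // assignment manager retries FAILED_OPEN regions instead of leaving them there.
          throw new IOException("Index table " + INDEX_TABLE
              + " is not fully online; failing data region open so it can be retried");
        }
      }
    }
  }
}
{code}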


> Add priority to TableDescriptor and priority region open thread pool
> --------------------------------------------------------------------
>
>                 Key: HBASE-16095
>                 URL: https://issues.apache.org/jira/browse/HBASE-16095
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.21
>
>         Attachments: HBASE-16095-0.98.patch, HBASE-16095-0.98.patch, hbase-16095_v0.patch,
hbase-16095_v1.patch, hbase-16095_v2.patch, hbase-16095_v3.patch
>
>
> This is in a similar area to HBASE-15816, and is also required for the current secondary
> indexing in Phoenix. 
> The problem with Phoenix secondary indexes is that data table regions depend on index regions
> to be able to make progress. Possible distributed deadlocks on the RPC path can be prevented
> via custom RpcScheduler + RpcController configuration (HBASE-11048 and PHOENIX-938). However,
> region opening has the same deadlock situation, because a data region open has to replay its
> WAL edits to the index regions. There is only one thread pool to open regions, with 3 workers
> by default. So if the cluster is recovering or restarting from scratch, the deadlock happens
> because some index regions cannot be opened: they sit in the same queue behind data regions
> whose opens are blocked RPC'ing to index regions that are not yet open. We see this reproduced
> in almost all Phoenix secondary index clusters (mutable tables w/o transactions). 
> The proposal is to have a "high priority" region opening thread pool, and have the HTD carry
> the relative priority of a table. This may be useful for other "framework" level tables from
> Phoenix, Tephra, Trafodion, etc. if they want specific tables to come online faster (a rough
> usage sketch follows below the quoted description).

> As a follow-up patch, we can also take a look at how this priority information can be used by
> the RpcScheduler on the server side or the RpcController on the client side, so that we do not
> have to set priorities manually per-operation. 
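
For illustration, usage of the proposed table-level priority would look roughly like the sketch
below. The method and constant names ({{setPriority}}, {{HConstants.HIGH_QOS}}) follow the issue
summary and the attached patches and should be treated as assumptions rather than a confirmed
API; the table and family names are made up.

{code:java}
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateHighPriorityIndexTable {
  public static void main(String[] args) throws IOException {
    // Mark a "framework" table (here a made-up index table) as high priority so its
    // regions are opened by the proposed priority region-open pool.
    HTableDescriptor htd = new HTableDescriptor(TableName.valueOf("MY_INDEX"));
    htd.addFamily(new HColumnDescriptor("0"));
    htd.setPriority(HConstants.HIGH_QOS);   // relative priority carried in the descriptor

    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      admin.createTable(htd);
    }
  }
}
{code}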



