phoenix-dev mailing list archives

From "Samarth Jain (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3983) Index rebuild scans should not be using the ServerRpcControllerFactory
Date Thu, 29 Jun 2017 01:16:01 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067544#comment-16067544 ]

Samarth Jain commented on PHOENIX-3983:
---------------------------------------

The region server hosting SYSTEM.CATALOG issues index rebuild scans against the region servers
hosting the data table. If the ServerRpcControllerFactory is configured on the region servers,
these scan RPCs have their priority set to the INDEX priority, so they are handled on the
destination servers by the index handlers. Those index handlers then perform local writes to
the data table, which in turn trigger remote RPCs to the index tables. These RPCs are again
handled by the index handlers on the region servers hosting the index table regions. This can
result in a deadlock. Consider this simple scenario:

Three region server setup.
RS1 - SYSTEM.CATALOG
RS2 - DATA_TABLE, INDEX_TABLE
RS3 - DATA_TABLE, INDEX_TABLE
For simplicity, let's assume that the number of index RPC handlers is 1. Let's name the lone
handler T1 on RS2 and T1' on RS3.
Number of regular rpc handlers - 1

RS1 issues scans against the data table region servers. These scans are then handled on RS2
by T1 and on RS3 by T1'.
The index handler T1 on RS2 and T1' on RS3 then write locally to their data table regions
which results in remote RPCs to RS3 and RS2 respectively.

RPC from RS3 to RS2 is not able to proceed because the index handler T1 on RS2 that could
service this call is waiting on its RPC to RS3 to finish.
RPC from RS2 to RS3 is not able to proceed because the index handler T1' on RS3 that could
service this call is waiting on its RPC to RS2 to finish.

Deadlock. 
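
The scenario above can be reproduced in miniature with plain thread pools. The following is
only a sketch, not Phoenix or HBase code: each region server's handler pool is modeled as a
single-threaded executor, the class and method names are hypothetical, and a 500 ms wait
stands in for "this RPC never returns" (a real deadlock would simply hang).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class IndexHandlerDeadlockSketch {

    // Assumption: a short wait used only to detect the stuck state in this model.
    static final long WAIT_MS = 500;

    /**
     * Models the rebuild scans issued by RS1 against RS2 and RS3.
     * scansUseIndexHandlers = true models scans carrying the INDEX priority.
     * Returns true only if both scans finish their remote index writes.
     */
    static boolean rebuildCompletes(boolean scansUseIndexHandlers) {
        ExecutorService rs2Index = Executors.newSingleThreadExecutor();   // T1
        ExecutorService rs3Index = Executors.newSingleThreadExecutor();   // T1'
        ExecutorService rs2Default = Executors.newSingleThreadExecutor(); // regular handler on RS2
        ExecutorService rs3Default = Executors.newSingleThreadExecutor(); // regular handler on RS3
        CountDownLatch bothScansStarted = new CountDownLatch(2);
        try {
            ExecutorService rs2ScanPool = scansUseIndexHandlers ? rs2Index : rs2Default;
            ExecutorService rs3ScanPool = scansUseIndexHandlers ? rs3Index : rs3Default;

            // The scan on RS2 writes to an index region on RS3, and vice versa.
            Callable<Boolean> scanOnRs2 = () -> scanAndWriteIndex(rs3Index, bothScansStarted);
            Callable<Boolean> scanOnRs3 = () -> scanAndWriteIndex(rs2Index, bothScansStarted);

            Future<Boolean> rs2Result = rs2ScanPool.submit(scanOnRs2);
            Future<Boolean> rs3Result = rs3ScanPool.submit(scanOnRs3);
            boolean rs2Done = rs2Result.get();
            boolean rs3Done = rs3Result.get();
            return rs2Done && rs3Done;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            rs2Index.shutdownNow();
            rs3Index.shutdownNow();
            rs2Default.shutdownNow();
            rs3Default.shutdownNow();
        }
    }

    /** One scan: wait until both scans occupy their handlers, then issue the remote index write. */
    static boolean scanAndWriteIndex(ExecutorService remoteIndexHandlers, CountDownLatch bothScansStarted)
            throws InterruptedException, ExecutionException {
        bothScansStarted.countDown();
        bothScansStarted.await();
        Runnable indexWrite = () -> { };
        // The remote write can only be served by the remote server's index handler.
        Future<?> pending = remoteIndexHandlers.submit(indexWrite);
        try {
            pending.get(WAIT_MS, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the remote index handler was busy for the whole wait
        }
    }

    public static void main(String[] args) {
        System.out.println("scans on index handlers complete: " + rebuildCompletes(true));
        System.out.println("scans on default handlers complete: " + rebuildCompletes(false));
    }
}
```

In this model the first call reports false, because each lone index handler is busy waiting
on the other, and the second reports true, because the index handlers stay free to serve the
remote writes.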

The fix is to *unset* the server rpc controller factory so that the scans happening on data
table region servers are handled by DefaultRPCHandlers and *not* IndexRPCHandlers.
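
As a rough illustration of what "unsetting" means, the sketch below copies a configuration
and drops the controller factory entry from the copy used for rebuild scans. It is not the
actual patch: java.util.Properties stands in for the Hadoop Configuration object, the class
and method names are hypothetical, and the key and factory class names are assumptions based
on the usual HBase/Phoenix settings.

```java
import java.util.Properties;

class RebuildScanConfigSketch {

    // Assumption: the HBase key under which the RPC controller factory is configured.
    static final String CONTROLLER_FACTORY_KEY = "hbase.rpc.controllerfactory.class";

    /** Returns a copy of the server config with the controller factory unset. */
    static Properties forRebuildScans(Properties serverConfig) {
        Properties copy = new Properties();
        copy.putAll(serverConfig);
        // Without this entry the scans fall back to the default RPC priority.
        copy.remove(CONTROLLER_FACTORY_KEY);
        return copy;
    }

    public static void main(String[] args) {
        Properties serverConfig = new Properties();
        // Assumption: fully qualified name of Phoenix's ServerRpcControllerFactory.
        serverConfig.setProperty(CONTROLLER_FACTORY_KEY,
                "org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory");
        Properties scanConfig = forRebuildScans(serverConfig);
        System.out.println("factory set for rebuild scans: "
                + scanConfig.containsKey(CONTROLLER_FACTORY_KEY));
    }
}
```

Working on a copy leaves the region server's own configuration untouched, so index writes
elsewhere still use the index handlers.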

Many thanks to [~vincentpoon] for his help in debugging and identifying the issue. 

FYI, [~lhofhansl]. 


> Index rebuild scans should not be using the ServerRpcControllerFactory
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-3983
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3983
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
