phoenix-dev mailing list archives

From "William Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3360) Secondary index configuration is wrong
Date Tue, 14 Feb 2017 07:47:41 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865298#comment-15865298 ]

William Yang commented on PHOENIX-3360:
---------------------------------------

New patch attached. 

There is another reason we have to create a single connection for index updates.
{{CoprocessorHConnection#getConnectionForEnvironment()}} creates a new connection on every
call, so the ctor of {{HConnectionImplementation}} runs each time, and that ctor hits ZK to
read the cluster id by calling {{retrieveClusterId()}}. This is totally unacceptable. Apart
from the extra network operation, it also leaves many CLOSE-WAIT TCP connections in the ZK
cluster. ZK is a critical resource, and we should avoid accessing it unless we have to. If
the connection limit in zoo.cfg ({{maxClientCnxns}}) is not configured high enough, index
updates will fail while getting the HTableInterface, because ZK rejects new connection
requests once there are already too many.
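
To illustrate the idea of reusing one connection, here is a minimal, self-contained sketch. It does not use any HBase or Phoenix classes: {{FakeConnection}}, {{SharedConnectionHolder}} and {{SharedConnectionDemo}} are hypothetical names made up for this example, and the counter only stands in for the ZK round trip that the real ctor would make. It is not the code from the attached patch.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for an expensive connection; NOT an HBase class.
final class FakeConnection {
    // Counts constructor calls, standing in for ZooKeeper round trips.
    static final AtomicInteger CREATED = new AtomicInteger();

    FakeConnection() {
        // A real connection ctor would read the cluster id from ZK here.
        CREATED.incrementAndGet();
    }
}

// Lazily creates one connection and hands the same instance to every caller.
final class SharedConnectionHolder {
    private final Supplier<FakeConnection> factory;
    private volatile FakeConnection connection;

    SharedConnectionHolder(Supplier<FakeConnection> factory) {
        this.factory = factory;
    }

    FakeConnection get() {
        FakeConnection local = connection;
        if (local == null) {
            // Double-checked locking: only the first caller pays for creation.
            synchronized (this) {
                local = connection;
                if (local == null) {
                    local = factory.get();
                    connection = local;
                }
            }
        }
        return local;
    }
}

final class SharedConnectionDemo {
    public static void main(String[] args) {
        SharedConnectionHolder holder = new SharedConnectionHolder(FakeConnection::new);
        for (int i = 0; i < 1000; i++) {
            holder.get(); // simulated index update fetching its connection
        }
        System.out.println("connections created: " + FakeConnection.CREATED.get());
    }
}
```

With a holder like this, the number of constructor calls no longer grows with the number of index updates, which is the property we want for the ZK access described above.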

Has anyone ever encountered this problem?

> Secondary index configuration is wrong
> --------------------------------------
>
>                 Key: PHOENIX-3360
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3360
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Rajeshbabu Chintaguntla
>            Priority: Critical
>             Fix For: 4.10.0
>
>         Attachments: ConfCP.java, PHOENIX-3360.patch, PHOENIX-3360-v2.PATCH, PHOENIX-3360-v3.PATCH,
PHOENIX-3360-v4.PATCH
>
>
> IndexRpcScheduler allocates some handler threads and uses a higher priority for RPCs.
The corresponding IndexRpcController is not used directly by default, but through the
ServerRpcControllerFactory that we configure from Ambari by default, which sets the priority
of outgoing RPCs to either the metadata priority or the index priority.
> However, after reading the code of IndexRpcController / ServerRpcController, it seems that
the IndexRpcController DOES NOT look at whether the outgoing RPC is for an index table or
not. It just sets ALL RPC priorities to the index priority. The intention seems to be that
we configure ServerRpcControllerFactory ONLY on servers, and that on clients we NEVER
configure ServerRpcControllerFactory but use ClientRpcControllerFactory instead. We configure
ServerRpcControllerFactory from Ambari, which in effect means that ALL RPCs from Phoenix
are handled only by the index handlers by default. This means all the deadlock cases are still there.

> The documentation in https://phoenix.apache.org/secondary_indexing.html is also wrong
in this sense. It does not distinguish between server side and client side. In addition,
configuring different values this way is not how HBase configuration is deployed. We cannot
have the configuration show the ServerRpcControllerFactory even on server nodes only,
because the clients running on those nodes will also see the wrong values.
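
For readers following the description above, the behavior it says is missing (choosing the priority from the target table instead of applying the index priority to every RPC) can be sketched as follows. This is only an illustration under stated assumptions: {{PrioritySketch}}, {{priorityFor}} and {{setOf}} are hypothetical names, the priority values are placeholders, and nothing here is Phoenix's or HBase's actual API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not the IndexRpcController implementation.
final class PrioritySketch {
    // Placeholder values chosen for illustration only.
    static final int NORMAL_PRIORITY = 0;
    static final int INDEX_PRIORITY = 1000;

    // Small helper so callers can build the set of index table names.
    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Raise the priority only when the RPC targets a known index table.
    static int priorityFor(String tableName, Set<String> indexTables) {
        return indexTables.contains(tableName) ? INDEX_PRIORITY : NORMAL_PRIORITY;
    }

    public static void main(String[] args) {
        Set<String> indexTables = setOf("IDX_T");
        System.out.println("IDX_T -> " + priorityFor("IDX_T", indexTables));
        System.out.println("T     -> " + priorityFor("T", indexTables));
    }
}
```

The point of the sketch is only that the decision depends on the table being written to, so RPCs to data tables do not end up on the index handlers.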



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
