hive-issues mailing list archives

From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15893) Followup on HIVE-15671
Date Mon, 20 Feb 2017 15:12:44 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874676#comment-15874676 ]

Rui Li commented on HIVE-15893:
-------------------------------

Hi [~xuefuz], our RPC channel has a handler that monitors the channel inactive event: https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java#L240
When the channel is closed abnormally, this handler closes the RPC and prints a warning:
https://github.com/apache/hive/blob/master/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java#L131
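The pattern in the two links above can be sketched as follows. This is a stdlib-only illustration with hypothetical names, not the actual Hive code: when the channel-inactive callback fires, the handler records a warning and closes the RPC so that callers blocked on it fail fast instead of hanging.

```java
// Stdlib-only sketch of the channel-inactive handling pattern described
// above. All class and method names here are hypothetical stand-ins for
// the real Netty handler in Rpc.java / SparkClientImpl.java.
import java.util.concurrent.atomic.AtomicBoolean;

public class ChannelMonitorSketch {
    /** Minimal stand-in for the RPC object that owns the channel. */
    static class Rpc {
        final AtomicBoolean closed = new AtomicBoolean(false);
        volatile String lastWarning;

        void close() { closed.set(true); }
        boolean isClosed() { return closed.get(); }
    }

    /** Stand-in for Netty's channelInactive callback. */
    static void onChannelInactive(Rpc rpc) {
        if (!rpc.isClosed()) {
            // In the real code this is where the warning is logged.
            rpc.lastWarning = "Client RPC channel closed unexpectedly.";
            rpc.close();
        }
    }

    public static void main(String[] args) {
        Rpc rpc = new Rpc();
        onChannelInactive(rpc);
        System.out.println("closed=" + rpc.isClosed()
            + " warning=" + rpc.lastWarning);
    }
}
```

In the real client the close propagates to pending futures, which is why the warning in the log is a good signal that the disconnect was in fact detected.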

My understanding is that the time needed to detect the broken connection is up to Netty. When I
worked on HIVE-15860, the broken connection was detected immediately.
So I think you can check your log for the warning message. If the message is printed,
the error was detected and Hive is probably hanging somewhere else.
Besides, you may want to check whether you have increased the property {{hive.spark.client.future.timeout}}.
It's one possible reason that can make the client keep waiting.
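To inspect the current value of that property, you can run the following in the Hive CLI or Beeline (`set` with no value prints the setting):

```sql
set hive.spark.client.future.timeout;
```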

> Followup on HIVE-15671
> ----------------------
>
>                 Key: HIVE-15893
>                 URL: https://issues.apache.org/jira/browse/HIVE-15893
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>    Affects Versions: 2.2.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>
> In HIVE-15671, we fixed a typo where server.connect.timeout was used in place of client.connect.timeout.
> This might solve some potential problems, but the original problem reported in HIVE-15671
> might still exist. (Not sure if HIVE-15860 helps.) Here is the proposal suggested by Marcelo:
> {quote}
> bq: server detecting a driver problem after it has connected back to the server.
> Hmm. That is definitely not any of the "connect" timeouts, which probably means it isn't
> configured and is just using netty's default (which is probably no timeout?). Would probably
> need something using io.netty.handler.timeout.IdleStateHandler, and also some periodic "ping"
> so that the connection isn't torn down without reason.
> {quote}
> We will use this JIRA to track the issue.
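Marcelo's proposal above combines two pieces: idle detection (in Netty, io.netty.handler.timeout.IdleStateHandler firing an IdleStateEvent after a quiet period) and a periodic ping that keeps a healthy connection from looking idle. The following is a minimal stdlib-only sketch of that idea, with all names hypothetical; the real fix would wire IdleStateHandler into the channel pipeline instead.

```java
// Stdlib-only sketch of idle detection plus heartbeat, as proposed for
// this issue. Hypothetical names; the real implementation would use
// io.netty.handler.timeout.IdleStateHandler in the channel pipeline.
public class IdleDetectorSketch {
    private final long idleTimeoutMs;
    private volatile long lastActivityMs;

    public IdleDetectorSketch(long idleTimeoutMs, long nowMs) {
        this.idleTimeoutMs = idleTimeoutMs;
        this.lastActivityMs = nowMs;
    }

    /** Called whenever any message arrives, including a ping reply. */
    public void onActivity(long nowMs) {
        lastActivityMs = nowMs;
    }

    /** Periodic check: has the peer been silent past the timeout? */
    public boolean isPeerDead(long nowMs) {
        return nowMs - lastActivityMs > idleTimeoutMs;
    }

    public static void main(String[] args) {
        IdleDetectorSketch d = new IdleDetectorSketch(1000, 0);
        System.out.println("dead at t=1500? " + d.isPeerDead(1500));
        d.onActivity(1500);                 // a ping reply arrives
        System.out.println("dead at t=2000? " + d.isPeerDead(2000));
    }
}
```

In a real channel, a scheduled task (or the Netty event loop) would both send the ping and run the idle check, tearing the connection down only when the peer has been silent past the threshold, which is exactly why the heartbeat is needed: without it an idle-but-healthy connection would be indistinguishable from a dead one.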



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
