hive-issues mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15860) RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally
Date Fri, 10 Feb 2017 09:26:42 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860967#comment-15860967
] 

Xuefu Zhang commented on HIVE-15860:
------------------------------------

Hi [~lirui], thanks for working on this. Just to clarify, does the monitor loop forever in
this case? It seems that it does, even though the broken connection has already been detected
at the RPC layer. As a result, the user session hangs forever without making any progress.

> RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally
> -----------------------------------------------------------------
>
>                 Key: HIVE-15860
>                 URL: https://issues.apache.org/jira/browse/HIVE-15860
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-15860.1.patch
>
>
> It happens when RemoteDriver crashes between {{JobStarted}} and {{JobSubmitted}}, e.g.
> when killed by {{kill -9}}. RemoteSparkJobMonitor will consider the job started; however, it
> can't get the job info because it hasn't received the JobId. Then the monitor will loop forever.
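
To illustrate the failure mode described above, here is a minimal Java sketch of a polling loop that waits for a job ID but also checks whether the remote side is still alive, so it can stop instead of spinning forever. This is not Hive's actual RemoteSparkJobMonitor code and not the attached patch: the class MonitorLoopSketch, the method waitForJobId, the Outcome enum, and the liveness callback are all hypothetical names chosen for illustration, and the poll limit is an added assumption.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from Hive.
class MonitorLoopSketch {

    enum Outcome { JOB_ID_RECEIVED, DRIVER_LOST, TIMED_OUT, INTERRUPTED }

    /**
     * Polls until a job ID arrives, the driver is reported dead,
     * or maxPolls attempts have been made.
     *
     * jobId       returns null while no job ID has been received (assumption)
     * driverAlive returns false once the connection to the driver is broken (assumption)
     */
    static Outcome waitForJobId(Supplier<Integer> jobId,
                                BooleanSupplier driverAlive,
                                int maxPolls,
                                long sleepMillis) {
        for (int i = 0; i < maxPolls; i++) {
            if (jobId.get() != null) {
                return Outcome.JOB_ID_RECEIVED;
            }
            // Without a check like this, a driver that died before sending
            // the job ID leaves the loop with no way to finish.
            if (!driverAlive.getAsBoolean()) {
                return Outcome.DRIVER_LOST;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Outcome.INTERRUPTED;
            }
        }
        return Outcome.TIMED_OUT;
    }

    public static void main(String[] args) {
        // Driver died before any job ID was sent: the loop stops on the first poll.
        System.out.println(waitForJobId(() -> null, () -> false, 5, 1L));
        // Job ID already available: the loop returns immediately.
        System.out.println(waitForJobId(() -> 7, () -> true, 5, 1L));
    }
}
```

The liveness check is what lets a broken connection, once detected at a lower layer, end the wait; the poll limit is only a second safety net in this sketch.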



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
