hive-dev mailing list archives

From "Chao Sun (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-10433) Cancel connection when remote driver process exited with error code [Spark Branch]
Date Wed, 22 Apr 2015 00:03:18 GMT
Chao Sun created HIVE-10433:
-------------------------------

             Summary: Cancel connection when remote driver process exited with error code [Spark Branch]
                 Key: HIVE-10433
                 URL: https://issues.apache.org/jira/browse/HIVE-10433
             Project: Hive
          Issue Type: Bug
          Components: spark-branch
            Reporter: Chao Sun


Currently in Hive on Spark (HoS), after SparkClientImpl starts a remote driver process, it
waits for the process to connect back. However, the process may fail and exit with an error
code, in which case no connection is ever attempted. In this situation, the HS2 process keeps
waiting for the connection and eventually times out. Worse, the user may need to wait for two
timeout periods: one for SparkSetReducerParallelism and another for the actual Spark job.

We should cancel the timeout task and mark the promise as failed as soon as we know that the
process has failed.
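
A minimal sketch of the proposed behavior, not the actual SparkClientImpl code. It uses only
JDK classes (CompletableFuture in place of the promise, a ScheduledExecutorService for the
timeout task, and Process.onExit() from Java 9+). The class name DriverConnectionWatcher, the
method watch, and the String result type are hypothetical and chosen for illustration only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: illustrates the idea only, not Hive's real implementation.
public class DriverConnectionWatcher {

    /**
     * Returns a future standing in for the "driver connected back" promise.
     * The caller completes it normally when the connection arrives. It fails
     * on timeout, or immediately if the driver process exits with a non-zero code.
     */
    public static CompletableFuture<String> watch(Process driver,
                                                  ScheduledExecutorService scheduler,
                                                  long timeoutMs) {
        CompletableFuture<String> connected = new CompletableFuture<>();

        // Timeout task: fails the promise if nothing else completes it in time.
        ScheduledFuture<?> timeoutTask = scheduler.schedule(() -> {
            connected.completeExceptionally(
                new TimeoutException("Timed out waiting for remote driver to connect"));
        }, timeoutMs, TimeUnit.MILLISECONDS);

        // Exit watcher: on a non-zero exit code, cancel the timeout task and
        // mark the promise as failed right away instead of waiting it out.
        driver.onExit().thenAccept(p -> {
            int code = p.exitValue();
            if (code != 0) {
                timeoutTask.cancel(false);
                connected.completeExceptionally(new IllegalStateException(
                    "Remote driver exited with error code " + code));
            }
        });

        // Whatever the outcome (including a successful connection), stop the timer.
        connected.whenComplete((value, error) -> timeoutTask.cancel(false));
        return connected;
    }
}
```

With this shape, a caller waiting on the returned future sees the failure as soon as the
driver exits with an error, so neither of the two waits described above has to run to its
full timeout.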



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
