hive-dev mailing list archives

From "Marcelo Vanzin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8956) Hive hangs while some error/exception happens beyond job execution [Spark Branch]
Date Wed, 26 Nov 2014 17:33:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226504#comment-14226504 ]

Marcelo Vanzin commented on HIVE-8956:
--------------------------------------

I haven't looked at Akka in enough detail to know whether there is an API to catch those. You
can enable Akka logging (set {{spark.akka.logLifecycleEvents}} to true), which will print
these errors to the logs. Spark tries to serialize data before handing it to Akka, in order
to catch serialization issues early, but that adds overhead, and it also doesn't help on the
deserialization path...
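For reference, the property mentioned above can be set in {{spark-defaults.conf}}; this is just a config sketch using the property name as given in the comment:

```
# spark-defaults.conf (sketch): log Akka remoting lifecycle events
spark.akka.logLifecycleEvents  true
```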

> Hive hangs while some error/exception happens beyond job execution [Spark Branch]
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-8956
>                 URL: https://issues.apache.org/jira/browse/HIVE-8956
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Rui Li
>              Labels: Spark-M3
>             Fix For: spark-branch
>
>         Attachments: HIVE-8956.1-spark.patch
>
>
> The remote spark client communicates with the remote spark context asynchronously. If an
> error/exception is thrown during job execution in the remote spark context, it is wrapped
> and sent back to the remote spark client. But if an error/exception is thrown outside of job
> execution, for example when job serialization fails, the remote spark client never learns
> what happened in the remote spark context, and it hangs.
> Setting a timeout on the remote spark client side may not be a great idea, since we cannot
> know how long a query will run on the spark cluster. We need to find a way to check, in the
> remote spark context, whether the job has failed at any point in its whole life cycle.
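The hang described above can be illustrated with a minimal, self-contained sketch (hypothetical names, not Hive's actual client code): the client waits on a future that the remote side, having failed before job execution even started, will never complete. Without a timeout the wait blocks forever; with one, the client at least gets a {{TimeoutException}} to act on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HangSketch {
    public static void main(String[] args) throws Exception {
        // Stand-in for the job result the remote spark context is supposed
        // to deliver. If the failure happens outside job execution (e.g.
        // serialization), nothing ever completes this future.
        CompletableFuture<String> jobResult = new CompletableFuture<>();

        try {
            // A plain jobResult.get() here would block indefinitely --
            // the hang the issue describes. A bounded wait surfaces it.
            jobResult.get(1, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            System.out.println("timed out waiting for remote result");
        }
    }
}
```

As the comment thread notes, a fixed timeout is a blunt tool here, since query runtimes on the cluster are unbounded; the sketch only shows why the client cannot distinguish "still running" from "failed before execution" on its own.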



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
