hive-dev mailing list archives

From "Venki Korukanti (JIRA)" <>
Subject [jira] [Updated] (HIVE-7747) Submitting a query to Spark from HiveServer2 fails [Spark Branch]
Date Wed, 20 Aug 2014 00:31:18 GMT


Venki Korukanti updated HIVE-7747:

    Attachment: HIVE-7747.1.patch

The problem is that we ship the wrong jar to the Spark cluster: hive-common instead of hive-exec.
In SparkClient, we get the jar from HiveConf.getJar(), which returns the jar that contains
the initialization class. The initialization class given to HiveConf differs between HS2 and the
CLI. CliDriver (see its run() method) passes SessionState.class (contained in the hive-exec jar)
to HiveConf. In HS2 no initialization class is passed, so it defaults to HiveConf.class (contained
in hive-common).
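
The jar-lookup mechanism at play can be sketched with plain JDK APIs (a minimal illustration of the idea, not HiveConf's actual code; HiveConf.getJar() may resolve the jar differently internally):

```java
import java.net.URL;
import java.security.CodeSource;

public class JarLocator {
    // Resolve the classpath location (jar file or classes directory) that a
    // given class was loaded from; returns null for bootstrap-loaded classes.
    static String locationOf(Class<?> clazz) {
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        if (src == null) {
            return null;
        }
        URL url = src.getLocation();
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        // A lookup keyed on SessionState.class would resolve to the hive-exec
        // jar, while one keyed on HiveConf.class resolves to hive-common --
        // hence the wrong jar being shipped when no initialization class is set.
        System.out.println(locationOf(JarLocator.class));
    }
}
```

This is why the choice of initialization class determines which jar gets shipped to the cluster.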

The error thrown in the Spark task is strange; it is not clear whether this is the standard error
thrown when classes are missing from the classpath. Attaching a fix that passes SessionState.class
as the initialization class to HiveConf in HiveSessionImpl. It is a general fix, not specific to
the Spark branch.

> Submitting a query to Spark from HiveServer2 fails [Spark Branch]
> -----------------------------------------------------------------
>                 Key: HIVE-7747
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>         Attachments: HIVE-7747.1.patch
> {{spark.serializer}} is set to {{org.apache.spark.serializer.KryoSerializer}}. The same
> configuration works fine from the Hive CLI.
> Spark tasks fail with the following error:
> {code}
> Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure:
> Lost task 0.3 in stage 1.0 (TID 9, java.lang.IllegalStateException: unread block data
>         org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
>         org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:84)
>         org.apache.spark.executor.Executor$
>         java.util.concurrent.ThreadPoolExecutor.runWorker(
>         java.util.concurrent.ThreadPoolExecutor$
> {code}

This message was sent by Atlassian JIRA
