hive-dev mailing list archives

From "Venki Korukanti (JIRA)" <>
Subject [jira] [Commented] (HIVE-7747) Submitting a query to Spark from HiveServer2 fails
Date Wed, 20 Aug 2014 10:03:26 GMT


Venki Korukanti commented on HIVE-7747:

The test failure here is related to the change, and the cause is subtle. It turns out that the output of {{HiveConf(srcHiveConf, SessionState.class)}} is not the same as srcHiveConf in terms of (property, value) pairs. The {{HiveConf.initialize}} method, executed as part of the constructor, applies System properties on top of the properties copied from srcHiveConf. So if any System properties are set between the moment srcHiveConf is created and the moment it is cloned, the cloned HiveConf inherits those properties. In the test case ({{MiniHS2}}), the scratchdir property is modified in System properties (See [here|]), but the default scratchdir value is {{$\{test.tmp.dir\}/scratchdir}} from hive-site.xml. The scratchdir set in {{MiniHS2}} was never used before, but with this change HS2 started using it. The scratchdir created in {{MiniHS2}} (See [here|]) doesn't have 777 permissions, so whenever user impersonation is enabled there are failures (that's where the test is failing). Before this change, the scratchdir was always {{$\{test.tmp.dir\}/scratchdir}}, which is created in HS2 with 777 permissions (See [here|]), so there were no issues with impersonation.
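To illustrate the mechanism: the following is a minimal sketch, not the real HiveConf code. It uses plain {{java.util.Properties}}, and {{cloneWithSystemOverlay}} is a hypothetical stand-in for the System-property overlay that {{HiveConf.initialize}} performs in the constructor. It shows how a System property set after the source config is created leaks into the clone:

```java
import java.util.Properties;

public class ConfCloneDemo {
    // Hypothetical stand-in for HiveConf's constructor-time initialize():
    // copy the source properties, then overlay any hive.* System properties
    // on top of the copy. (Simplified for illustration.)
    static Properties cloneWithSystemOverlay(Properties src) {
        Properties copy = new Properties();
        copy.putAll(src);
        for (String name : System.getProperties().stringPropertyNames()) {
            if (name.startsWith("hive.")) {
                copy.setProperty(name, System.getProperty(name));
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Properties srcConf = new Properties();
        // Default value, as if loaded from hive-site.xml:
        srcConf.setProperty("hive.exec.scratchdir", "/tmp/scratchdir");

        // Later, something (like MiniHS2 in the failing test) sets the
        // property in System properties, after srcConf was created:
        System.setProperty("hive.exec.scratchdir", "/tmp/minihs2-scratchdir");

        Properties cloned = cloneWithSystemOverlay(srcConf);

        // The clone no longer matches its source:
        System.out.println(srcConf.getProperty("hive.exec.scratchdir"));  // /tmp/scratchdir
        System.out.println(cloned.getProperty("hive.exec.scratchdir"));   // /tmp/minihs2-scratchdir
    }
}
```

In other words, the "copy" is really "copy plus whatever System properties exist at clone time", which is why the clone can silently diverge from the source config.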

I think it is better to fix this in SparkClient by fetching the jar directly rather than through HiveConf, to avoid unexpected issues.

> Submitting a query to Spark from HiveServer2 fails
> --------------------------------------------------
>                 Key: HIVE-7747
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 0.13.1
>            Reporter: Venki Korukanti
>            Assignee: Venki Korukanti
>         Attachments: HIVE-7747.1.patch
> {{spark.serializer}} is set to {{org.apache.spark.serializer.KryoSerializer}}. Same configuration works fine from Hive CLI.
> Spark tasks fails with following error:
> {code}
> Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 9, java.lang.IllegalStateException: unread block data
>         org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
>         org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:84)
>         org.apache.spark.executor.Executor$
>         java.util.concurrent.ThreadPoolExecutor.runWorker(
>         java.util.concurrent.ThreadPoolExecutor$
> {code}

This message was sent by Atlassian JIRA
