hive-user mailing list archives

From "Mich Talebzadeh" <m...@peridale.co.uk>
Subject RE: Executor getting killed when running Hive on Spark
Date Thu, 24 Dec 2015 17:43:10 GMT
Hi Sofia.

 

I don’t think Spark 1.5.2 can be used as the Hive execution engine. I have tried it many times.


 

What works is to download Spark 1.3.1 and build it the same way you did.

 

You then take spark-assembly-1.3.1-hadoop2.4.0.jar (after gunzipping and untarring the resulting file)
and put it in $HIVE_HOME/lib.
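
As a rough sketch of those two steps (the build flags, file names and paths below are illustrative and may need adjusting for your own Hadoop version and directories):

# in the unpacked Spark 1.3.1 source tree -- build a binary distribution
./make-distribution.sh --name "hadoop2.4" --tgz -Pyarn -Phadoop-2.4

# unpack the resulting tarball and copy the assembly jar into Hive's lib directory
tar -xzf spark-1.3.1-bin-hadoop2.4.tgz
cp spark-1.3.1-bin-hadoop2.4/lib/spark-assembly-1.3.1-hadoop2.4.0.jar $HIVE_HOME/lib/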

 

Then download the pre-built version of Spark 1.3.1 and install it as usual. You do not need to
start the master, slaves, etc.

 

So far so good.

 

In the directory from which you want to start Spark, do:

 

unset SPARK_HOME
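
To open the session shown below, something like this should work (a sketch; adjust the JDBC URL to your own HiveServer2 host and port):

$HIVE_HOME/bin/beeline -u jdbc:hive2://rhes564:10010/default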

 

Once connected, set the following in the Hive session:

 

set spark.home=/usr/lib/spark-1.3.1-bin-hadoop2.6;   -- change this to your own Spark installation directory

set hive.execution.engine=spark;

set spark.master=yarn-client;

set spark.eventLog.enabled=true;

set spark.eventLog.dir=/usr/lib/spark-1.3.1-bin-hadoop2.6/logs;

set spark.executor.memory=512m;

set spark.serializer=org.apache.spark.serializer.KryoSerializer;

set hive.spark.client.server.connect.timeout=220000ms;

set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
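
If you do not want to type these every session, the same properties can also be passed on the beeline command line with --hiveconf (a sketch; hive.* and spark.* properties set this way are picked up by the Hive session, subject to any configuration whitelist on HiveServer2):

$HIVE_HOME/bin/beeline -u jdbc:hive2://rhes564:10010/default \
  --hiveconf hive.execution.engine=spark \
  --hiveconf spark.master=yarn-client \
  --hiveconf spark.executor.memory=512m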

 

 

0: jdbc:hive2://rhes564:10010/default> select count(1) from t;

INFO  :

Query Hive on Spark job[1] stages:

INFO  : 2

INFO  : 3

INFO  :

Status: Running (Hive on Spark job[1])

INFO  : Job Progress Format

CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount
[StageCost]

INFO  : 2015-12-24 17:47:15,781 Stage-2_0: 0(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:17,790 Stage-2_0: 1(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:18,794 Stage-2_0: 2(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:19,798 Stage-2_0: 4(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:20,802 Stage-2_0: 5(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:21,807 Stage-2_0: 6(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:22,823 Stage-2_0: 8(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:23,830 Stage-2_0: 9(+2)/256    Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:24,835 Stage-2_0: 10(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:25,838 Stage-2_0: 12(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:26,842 Stage-2_0: 13(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:27,847 Stage-2_0: 15(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:28,856 Stage-2_0: 26(+3)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:29,862 Stage-2_0: 66(+2)/256   Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:30,867 Stage-2_0: 107(+2)/256  Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:31,871 Stage-2_0: 154(+2)/256  Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:32,875 Stage-2_0: 206(+2)/256  Stage-3_0: 0/1

INFO  : 2015-12-24 17:47:33,879 Stage-2_0: 256/256 Finished     Stage-3_0: 0(+1)/1

INFO  : 2015-12-24 17:47:34,882 Stage-2_0: 256/256 Finished     Stage-3_0: 1/1 Finished

INFO  : Status: Finished successfully in 20.12 seconds

+----------+--+

|   _c0    |

+----------+--+

| 2074897  |

+----------+--+

1 row selected (20.247 seconds)

 

 

 

 

Mich Talebzadeh

 

Sybase ASE 15 Gold Medal Award 2008

A Winning Strategy: Running the most Critical Financial Data on ASE 15

http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7.


co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4

Publications due shortly:

Complex Event Processing in Heterogeneous Environments, ISBN: 978-0-9563693-3-8

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly

 

http://talebzadehmich.wordpress.com

 

NOTE: The information in this email is proprietary and confidential. This message is for the
designated recipient only; if you are not the intended recipient, you should destroy it immediately.
Any information in this message shall not be understood as given or endorsed by Peridale Technology
Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility
of the recipient to ensure that this email is virus free; therefore neither Peridale Ltd,
its subsidiaries nor their employees accept any responsibility.

 

From: Sofia [mailto:sofia.panagiotidi@taiger.com] 
Sent: 24 December 2015 16:25
To: user@hive.apache.org
Subject: Executor getting killed when running Hive on Spark

 

Hello and happy holiday to those who are already enjoying it!

 

 

I am still having trouble running Hive with Spark. I downloaded Spark 1.5.2 and built it like
this (my Hadoop is version 2.7.1):

 

./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"

 

When trying to run it with Hive 1.2.1 (a simple command that creates a Spark job, like 'select
count(*) from userstweetsdailystatistics;'), I get the following error:

 

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO log.PerfLogger:
<PERFLOG method=SparkBuildPlan from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO log.PerfLogger:
<PERFLOG method=SparkCreateTran.Map 1 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO Configuration.deprecation:
mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO exec.Utilities:
Processing alias userstweetsdailystatistics

15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54 INFO exec.Utilities:
Adding input file hdfs://hadoop-master:8020/user/ubuntu/hive/warehouse/userstweetsdailystatistics

15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55 INFO log.PerfLogger:
<PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55 INFO exec.Utilities:
Serializing MapWork via kryo

15/12/24 17:12:56 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:56 INFO log.PerfLogger:
</PERFLOG method=serializePlan start=1450973575887 end=1450973576279 duration=392 from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57 INFO storage.MemoryStore:
ensureFreeSpace(572800) called with curMem=0, maxMem=556038881

15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57 INFO storage.MemoryStore:
Block broadcast_0 stored as values in memory (estimated size 559.4 KB, free 529.7 MB)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.MemoryStore:
ensureFreeSpace(43075) called with curMem=572800, maxMem=556038881

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.MemoryStore:
Block broadcast_0_piece0 stored as bytes in memory (estimated size 42.1 KB, free 529.7 MB)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO storage.BlockManagerInfo:
Added broadcast_0_piece0 in memory on 192.168.1.64:49690 (size: 42.1 KB, free: 530.2 MB)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 ERROR util.Utils:
uncaught error in thread SparkListenerBus, stopping SparkContext

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.AbstractMethodError

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:62)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/metrics/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/api,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/static,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/executors/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/executors,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/environment/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/environment,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/storage/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/storage,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/stages/pool,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/stages/stage,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/stages/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/stages,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/jobs/job,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/jobs/json,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO handler.ContextHandler:
stopped o.s.j.s.ServletContextHandler{/jobs,null}

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO spark.SparkContext:
Created broadcast 0 from hadoopRDD at SparkPlanGenerator.java:188

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO ui.SparkUI:
Stopped Spark web UI at http://192.168.1.64:4040

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO scheduler.DAGScheduler:
Stopping DAGScheduler

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO cluster.SparkDeploySchedulerBackend:
Shutting down all executors

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO cluster.SparkDeploySchedulerBackend:
Asking each executor to shut down

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger:
</PERFLOG method=SparkCreateTran.Map 1 start=1450973574712 end=1450973578874 duration=4162
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger:
<PERFLOG method=SparkCreateTran.Reducer 2 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO log.PerfLogger:
<PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58 INFO exec.Utilities:
Serializing ReduceWork via kryo

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger:
</PERFLOG method=serializePlan start=1450973578926 end=1450973579000 duration=74 from=org.apache.hadoop.hive.ql.exec.Utilities>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger:
</PERFLOG method=SparkCreateTran.Reducer 2 start=1450973578874 end=1450973579073 duration=199
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger:
</PERFLOG method=SparkBuildPlan start=1450973574707 end=1450973579074 duration=4367 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger:
<PERFLOG method=SparkBuildRDDGraph from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 WARN remote.ReliableDeliverySupervisor:
Association with remote system [akka.tcp://sparkExecutor@192.168.1.64:35089] has failed, address
is now gated for [5000] ms. Reason: [Disassociated] 

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO log.PerfLogger:
</PERFLOG method=SparkBuildRDDGraph start=1450973579074 end=1450973579273 duration=199
from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59 INFO client.RemoteDriver:
Failed to run job d3746d11-eac8-4bf9-9897-bef27fd0423e

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.IllegalStateException:
Cannot call methods on a stopped SparkContext

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.SparkContext.submitJob(SparkContext.scala:1981)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:118)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:116)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.rdd.AsyncRDDActions.foreachAsync(AsyncRDDActions.scala:116)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.api.java.JavaRDDLike$class.foreachAsync(JavaRDDLike.scala:690)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.spark.api.java.AbstractJavaRDDLike.foreachAsync(JavaRDDLike.scala:47)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:257)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at java.util.concurrent.FutureTask.run(FutureTask.java:262)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl:         at java.lang.Thread.run(Thread.java:745)

15/12/24 17:12:59 [RPC-Handler-3]: INFO client.SparkClientImpl: Received result for d3746d11-eac8-4bf9-9897-bef27fd0423e

Status: Failed

15/12/24 17:12:59 [Thread-8]: ERROR status.SparkJobMonitor: Status: Failed

15/12/24 17:12:59 [Thread-8]: INFO log.PerfLogger: </PERFLOG method=SparkRunJob start=1450973569576
end=1450973579584 duration=10008 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>

FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

15/12/24 17:13:01 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1450973565261
end=1450973581307 duration=16046 from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581308
end=1450973581308 duration=0 from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 finished. closing... 

15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 Close done

15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>

15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581362
end=1450973581362 duration=0 from=org.apache.hadoop.hive.ql.Driver>

 

 

The only useful thing I can find on the Spark side is in the worker log:

 

15/12/24 17:12:53 INFO worker.Worker: Asked to launch executor app-20151224171253-0000/0 for
Hive on Spark

15/12/24 17:12:53 INFO spark.SecurityManager: Changing view acls to: ubuntu

15/12/24 17:12:53 INFO spark.SecurityManager: Changing modify acls to: ubuntu

15/12/24 17:12:53 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui
acls disabled; users with view permissions: Set(ubuntu); users with modify permissions: Set(ubuntu)

15/12/24 17:12:53 INFO worker.ExecutorRunner: Launch command: "/usr/lib/jvm/java-7-openjdk-amd64/bin/java"
"-cp" "/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/sbin/../conf/:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar"
"-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=44858" "-Dhive.spark.log.dir=/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/logs/"
"-XX:MaxPermSize=256m" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url"
"akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler" "--executor-id" "0"
"--hostname" "192.168.1.64" "--cores" "3" "--app-id" "app-20151224171253-0000" "--worker-url"
"akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker"

15/12/24 17:12:58 INFO worker.Worker: Asked to kill executor app-20151224171253-0000/0

15/12/24 17:12:58 INFO worker.ExecutorRunner: Runner thread for executor app-20151224171253-0000/0
interrupted

15/12/24 17:12:58 INFO worker.ExecutorRunner: Killing process!

15/12/24 17:12:58 ERROR logging.FileAppender: Error writing stream to file /home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/work/app-20151224171253-0000/0/stderr

java.io.IOException: Stream closed

            at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)

            at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)

            at java.io.BufferedInputStream.read(BufferedInputStream.java:334)

            at java.io.FilterInputStream.read(FilterInputStream.java:107)

            at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)

            at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)

            at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)

            at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)

            at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)

            at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)

15/12/24 17:12:59 INFO worker.Worker: Executor app-20151224171253-0000/0 finished with state
KILLED exitStatus 143

15/12/24 17:12:59 INFO worker.Worker: Cleaning up local directories for application app-20151224171253-0000

15/12/24 17:12:59 INFO shuffle.ExternalShuffleBlockResolver: Application app-20151224171253-0000
removed, cleanupLocalDirs = true

 

Here is my Spark configuration

 

export HADOOP_HOME=/usr/local/hadoop

export PATH=$PATH:$HADOOP_HOME/bin

export SPARK_DIST_CLASSPATH=`hadoop classpath`

 

 

Any hints as to what could be going wrong? Why is the executor getting killed? Have I built
Spark wrongly? I have tried building it in several different ways and I keep failing.

I must admit I am confused by the information I find online about how to build and use Spark with
Hive, and which version goes with what.

Can I download a pre-built version of Spark that works with my existing Hadoop 2.7.1 and Hive 1.2.1?

This error has been baffling me for weeks.

 

 

More than grateful for any help!

Sofia

 

 

