hive-user mailing list archives

From Sofia <sofia.panagiot...@taiger.com>
Subject Re: Executor getting killed when running Hive on Spark
Date Thu, 24 Dec 2015 17:59:30 GMT

I am not sure which other log file to look into.
I have one master and one worker, and in the previous mail I showed the Hive log and the Spark
worker's log. The master log contains something like the following (extracted from an execution
I just ran):

15/12/24 18:19:01 INFO master.Master: Launching executor app-20151224181901-0000/0 on worker
worker-20151224181835-192.168.1.64-35198
15/12/24 18:19:06 INFO master.Master: Received unregister request from application app-20151224181901-0000
15/12/24 18:19:06 INFO master.Master: Removing app app-20151224181901-0000
15/12/24 18:19:06 WARN master.Master: Got status update for unknown executor app-20151224181901-0000/0
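
In case it is useful to anyone reading along: for a standalone deployment, the per-executor
stdout/stderr normally end up under the worker's work directory (the same location the
FileAppender error further down refers to). A rough sketch of where I look, assuming the
install path shown in the worker log below:

# hypothetical commands; the work directory path is taken from the FileAppender error below
cd /home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/work
ls app-*/0/          # one sub-directory per executor of each application
cat app-*/0/stderr   # executor stderr (may be empty if the executor was killed before writing)
cat app-*/0/stdout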

I run Hive with the Spark execution engine and I have certainly set the correct master in
hive-site.xml.
A normal Spark job seems to run fine; I just ran the wordcount example and it terminated
without problems.
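
For reference, the relevant entries in my hive-site.xml look roughly like the following. This is
only a sketch: the master URL shown here is an assumed standalone address, not a copy of my exact
value.

<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <name>spark.master</name>
  <!-- assumed standalone master URL; substitute the real host and port -->
  <value>spark://hadoop-master:7077</value>
</property>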

Also, the table was created with something like 'CREATE TABLE userstweetsdailystatistics (foo
INT, bar STRING);'.
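
So the failing run is essentially just the following (session sketch; the explicit set is only
there for clarity, since the engine is already configured in hive-site.xml):

hive> set hive.execution.engine=spark;
hive> select count(*) from userstweetsdailystatistics;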

As for the error about the log file not being accessible, I think it is because the executor has
already been killed by the time the worker tries to write to it (hence the stream being closed).


> On 24 Dec 2015, at 17:52, Jörn Franke <jornfranke@gmail.com> wrote:
> 
> Have you checked what the issue is with the log file that is causing trouble? Is enough space available?
What are the access rights (which user does the Spark worker run as)? Does the directory exist?
> 
> Can you provide more details how the table is created?
> 
> Does the query work with mr or tez as an execution engine?
> 
> Does a normal Spark job without Hive work?
> 
> On 24 Dec 2015, at 17:25, Sofia <sofia.panagiotidi@taiger.com> wrote:
> 
>> Hello and happy holiday to those who are already enjoying it!
>> 
>> 
>> I am still having trouble running Hive with Spark. I downloaded Spark 1.5.2 and built
it like this (my Hadoop is version 2.7.1):
>> 
>> ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
>> 
>> When trying to run it with Hive 1.2.1 (a simple command that creates a Spark job,
like 'Select count(*) from userstweetsdailystatistics;'), I get the following error:
>> 
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54
INFO log.PerfLogger: <PERFLOG method=SparkBuildPlan from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54
INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Map 1 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54
INFO Configuration.deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54
INFO exec.Utilities: Processing alias userstweetsdailystatistics
>> 15/12/24 17:12:54 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:54
INFO exec.Utilities: Adding input file hdfs://hadoop-master:8020/user/ubuntu/hive/warehouse/userstweetsdailystatistics
>> 15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55
INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:55 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:55
INFO exec.Utilities: Serializing MapWork via kryo
>> 15/12/24 17:12:56 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:56
INFO log.PerfLogger: </PERFLOG method=serializePlan start=1450973575887 end=1450973576279
duration=392 from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57
INFO storage.MemoryStore: ensureFreeSpace(572800) called with curMem=0, maxMem=556038881
>> 15/12/24 17:12:57 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:57
INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 559.4
KB, free 529.7 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO storage.MemoryStore: ensureFreeSpace(43075) called with curMem=572800, maxMem=556038881
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size
42.1 KB, free 529.7 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.64:49690 (size:
42.1 KB, free: 530.2 MB)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
ERROR util.Utils: uncaught error in thread SparkListenerBus, stopping SparkContext
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.AbstractMethodError
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:62)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:56)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:79)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO spark.SparkContext: Created broadcast 0 from hadoopRDD at SparkPlanGenerator.java:188
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO ui.SparkUI: Stopped Spark web UI at http://192.168.1.64:4040
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO scheduler.DAGScheduler: Stopping DAGScheduler
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO cluster.SparkDeploySchedulerBackend: Shutting down all executors
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO cluster.SparkDeploySchedulerBackend: Asking each executor to shut down
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Map 1 start=1450973574712 end=1450973578874
duration=4162 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO log.PerfLogger: <PERFLOG method=SparkCreateTran.Reducer 2 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO log.PerfLogger: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:58 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:58
INFO exec.Utilities: Serializing ReduceWork via kryo
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59
INFO log.PerfLogger: </PERFLOG method=serializePlan start=1450973578926 end=1450973579000
duration=74 from=org.apache.hadoop.hive.ql.exec.Utilities>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59
INFO log.PerfLogger: </PERFLOG method=SparkCreateTran.Reducer 2 start=1450973578874 end=1450973579073
duration=199 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59
INFO log.PerfLogger: </PERFLOG method=SparkBuildPlan start=1450973574707 end=1450973579074
duration=4367 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59
INFO log.PerfLogger: <PERFLOG method=SparkBuildRDDGraph from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59
WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@192.168.1.64:35089] has failed, address is now gated for
[5000] ms. Reason: [Disassociated] 
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59
INFO log.PerfLogger: </PERFLOG method=SparkBuildRDDGraph start=1450973579074 end=1450973579273
duration=199 from=org.apache.hadoop.hive.ql.exec.spark.SparkPlan>
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 15/12/24 17:12:59
INFO client.RemoteDriver: Failed to run job d3746d11-eac8-4bf9-9897-bef27fd0423e
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: java.lang.IllegalStateException:
Cannot call methods on a stopped SparkContext
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:104)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.SparkContext.submitJob(SparkContext.scala:1981)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:118)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1.apply(AsyncRDDActions.scala:116)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.rdd.AsyncRDDActions.foreachAsync(AsyncRDDActions.scala:116)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.api.java.JavaRDDLike$class.foreachAsync(JavaRDDLike.scala:690)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.spark.api.java.AbstractJavaRDDLike.foreachAsync(JavaRDDLike.scala:47)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:257)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> 15/12/24 17:12:59 [stderr-redir-1]: INFO client.SparkClientImpl: 	at java.lang.Thread.run(Thread.java:745)
>> 15/12/24 17:12:59 [RPC-Handler-3]: INFO client.SparkClientImpl: Received result for
d3746d11-eac8-4bf9-9897-bef27fd0423e
>> Status: Failed
>> 15/12/24 17:12:59 [Thread-8]: ERROR status.SparkJobMonitor: Status: Failed
>> 15/12/24 17:12:59 [Thread-8]: INFO log.PerfLogger: </PERFLOG method=SparkRunJob
start=1450973569576 end=1450973579584 duration=10008 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
>> FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
>> 15/12/24 17:13:01 [main]: ERROR ql.Driver: FAILED: Execution Error, return code 3
from org.apache.hadoop.hive.ql.exec.spark.SparkTask
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=Driver.execute
start=1450973565261 end=1450973581307 duration=16046 from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581308
end=1450973581308 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 finished. closing... 
>> 15/12/24 17:13:01 [main]: INFO exec.ListSinkOperator: 7 Close done
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
>> 15/12/24 17:13:01 [main]: INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1450973581362
end=1450973581362 duration=0 from=org.apache.hadoop.hive.ql.Driver>
>> 
>> 
>> The only useful thing I can find on the Spark side is in the worker log:
>> 
>> 15/12/24 17:12:53 INFO worker.Worker: Asked to launch executor app-20151224171253-0000/0
for Hive on Spark
>> 15/12/24 17:12:53 INFO spark.SecurityManager: Changing view acls to: ubuntu
>> 15/12/24 17:12:53 INFO spark.SecurityManager: Changing modify acls to: ubuntu
>> 15/12/24 17:12:53 INFO spark.SecurityManager: SecurityManager: authentication disabled;
ui acls disabled; users with view permissions: Set(ubuntu); users with modify permissions:
Set(ubuntu)
>> 15/12/24 17:12:53 INFO worker.ExecutorRunner: Launch command: "/usr/lib/jvm/java-7-openjdk-amd64/bin/java"
"-cp" "/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/sbin/../conf/:/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/lib/spark-assembly-1.5.2-hadoop2.4.0.jar:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/etc/hadoop/:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar"
"-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=44858" "-Dhive.spark.log.dir=/home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/logs/"
"-XX:MaxPermSize=256m" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url"
"akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler <akka.tcp://sparkDriver@192.168.1.64:44858/user/CoarseGrainedScheduler>"
"--executor-id" "0" "--hostname" "192.168.1.64" "--cores" "3" "--app-id" "app-20151224171253-0000"
"--worker-url" "akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker <akka.tcp://sparkWorker@192.168.1.64:54209/user/Worker>"
>> 15/12/24 17:12:58 INFO worker.Worker: Asked to kill executor app-20151224171253-0000/0
>> 15/12/24 17:12:58 INFO worker.ExecutorRunner: Runner thread for executor app-20151224171253-0000/0
interrupted
>> 15/12/24 17:12:58 INFO worker.ExecutorRunner: Killing process!
>> 15/12/24 17:12:58 ERROR logging.FileAppender: Error writing stream to file /home/ubuntu/Downloads/spark-1.5.2-bin-hadoop2-without-hive/work/app-20151224171253-0000/0/stderr
>> java.io.IOException: Stream closed
>> 	at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
>> 	at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
>> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>> 	at java.io.FilterInputStream.read(FilterInputStream.java:107)
>> 	at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
>> 	at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
>> 	at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
>> 	at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
>> 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
>> 	at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
>> 15/12/24 17:12:59 INFO worker.Worker: Executor app-20151224171253-0000/0 finished
with state KILLED exitStatus 143
>> 15/12/24 17:12:59 INFO worker.Worker: Cleaning up local directories for application
app-20151224171253-0000
>> 15/12/24 17:12:59 INFO shuffle.ExternalShuffleBlockResolver: Application app-20151224171253-0000
removed, cleanupLocalDirs = true
>> 
>> Here is my Spark configuration
>> 
>> export HADOOP_HOME=/usr/local/hadoop
>> export PATH=$PATH:$HADOOP_HOME/bin
>> export SPARK_DIST_CLASSPATH=`hadoop classpath`
>> 
>> 
>> Any hints as to what could be going wrong? Why is the executor getting killed? Have
I built Spark incorrectly? I have tried building it in several different ways and I keep failing.
>> I must admit I am confused by the information I find online on how to build and use
Spark for Hive, and which version goes with which.
>> Can I download a pre-built version of Spark that would be compatible with my existing
Hadoop 2.7.1 and Hive 1.2.1?
>> This error has been baffling me for weeks...
>> 
>> 
>> More than grateful for any help!
>> Sofia
>> 
>> 

