hive-dev mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8951) Spark remote context doesn't work with local-cluster [Spark Branch]
Date Mon, 24 Nov 2014 22:26:13 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223646#comment-14223646 ]

Xuefu Zhang commented on HIVE-8951:
-----------------------------------

Okay, I saw this in the Hive log:
{code}
2014-11-24 14:16:05,403 ERROR [main]: exec.Task (SparkTask.java:execute(126)) - Failed to execute spark task.
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:122)
        at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:105)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1645)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1405)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1217)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:201)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:153)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:364)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:712)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:631)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:570)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.spark.SparkException: Timed out waiting for remote driver to connect.
        at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:85)
        at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:79)
        at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:75)
        at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:54)
        at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:53)
        ... 20 more
{code}
Any idea?
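One way to rule out the driver UI port conflict reported in the issue description below (a sketch, not a confirmed fix: {{spark.ui.enabled}} is a standard Spark property, but whether it is forwarded to the remote driver launched by the Spark client here is an assumption):
{code}
-- Disable the Spark web UI in the remote driver so it never tries to bind port 4040,
-- which the client-side process may already hold.
set spark.ui.enabled=false;
{code}
If the timeout goes away with the UI disabled, the BindException in the driver log is the likely culprit rather than a networking issue between the client and the remote driver.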

> Spark remote context doesn't work with local-cluster [Spark Branch]
> -------------------------------------------------------------------
>
>                 Key: HIVE-8951
>                 URL: https://issues.apache.org/jira/browse/HIVE-8951
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>
> What I did:
> {code}
> set spark.home=/home/xzhang/apache/spark;
> set spark.master=local-cluster[2,1,2048];
> set hive.execution.engine=spark; 
> set spark.executor.memory=2g;
> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
> set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
> select name, avg(value) as v from dec group by name order by v;
> {code}
> Exceptions seen:
> {code}
> 14/11/23 10:42:15 INFO Worker: Spark home: /home/xzhang/apache/spark
> 14/11/23 10:42:15 INFO AppClient$ClientActor: Connecting to master spark://xzdt.local:55151...
> 14/11/23 10:42:15 INFO Master: Registering app Hive on Spark
> 14/11/23 10:42:15 INFO Master: Registered app Hive on Spark with ID app-20141123104215-0000
> 14/11/23 10:42:15 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20141123104215-0000
> 14/11/23 10:42:15 INFO NettyBlockTransferService: Server created on 41676
> 14/11/23 10:42:15 INFO BlockManagerMaster: Trying to register BlockManager
> 14/11/23 10:42:15 INFO BlockManagerMasterActor: Registering block manager xzdt.local:41676 with 265.0 MB RAM, BlockManagerId(<driver>, xzdt.local, 41676)
> 14/11/23 10:42:15 INFO BlockManagerMaster: Registered BlockManager
> 14/11/23 10:42:15 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
> 14/11/23 10:42:20 WARN AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use
> java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind0(Native Method)
> 	at sun.nio.ch.Net.bind(Net.java:174)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
> 	at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.eclipse.jetty.server.Server.doStart(Server.java:293)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:194)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1676)
> 	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> 	at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667)
> 	at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:204)
> 	at org.apache.spark.ui.WebUI.bind(WebUI.scala:102)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at scala.Option.foreach(Option.scala:236)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:267)
> 	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
> 	at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:106)
> 	at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:362)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:616)
> 	at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 14/11/23 10:42:20 WARN AbstractLifeCycle: FAILED org.eclipse.jetty.server.Server@4c9fd062: java.net.BindException: Address already in use
> java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind0(Native Method)
> 	at sun.nio.ch.Net.bind(Net.java:174)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
> 	at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.eclipse.jetty.server.Server.doStart(Server.java:293)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:194)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1676)
> 	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> 	at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667)
> 	at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:204)
> 	at org.apache.spark.ui.WebUI.bind(WebUI.scala:102)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at scala.Option.foreach(Option.scala:236)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:267)
> 	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
> 	at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:106)
> 	at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:362)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:616)
> 	at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> I also saw that the SparkSubmit process was busy launching several other processes:
> {code}
> xzhang@xzdt:~/apache/spark$ jps
> 12731 CoarseGrainedExecutorBackend
> 11746 RunJar
> 25974 TaskTracker
> 12067 SparkSubmit
> 25524 SecondaryNameNode
> 25771 JobTracker
> 25280 DataNode
> 25108 NameNode
> 12885 Jps
> 12742 CoarseGrainedExecutorBackend
> 12408 CoarseGrainedExecutorBackend
> 12409 CoarseGrainedExecutorBackend
> 11879 SparkSubmit
> {code}
> If I change spark.master to point to a standalone cluster, it works fine.
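For comparison, the working standalone configuration would look roughly like the following (the host name is taken from the logs above and 7077 is Spark's default standalone master port; the exact URL is illustrative):
{code}
-- Point Hive at a running standalone master instead of local-cluster;
-- the remaining spark.* settings from the reproduction stay unchanged.
set spark.master=spark://xzdt.local:7077;
{code}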



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
