spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From <vivek.meghanat...@wipro.com>
Subject RE: spark 1.6.0 on ec2 doesn't work
Date Tue, 19 Jan 2016 07:05:45 GMT
Have you verified the spark master/slaves are started correctly? Please check using netstat
command and open ports mode. Are they listening? Binds to which address etc..

From: Oleg Ruchovets [mailto:oruchovets@gmail.com]
Sent: 19 January 2016 11:24
To: Peter Zhang <zhangjunfs@gmail.com>
Cc: Daniel Darabos <daniel.darabos@lynxanalytics.com>; user <user@spark.apache.org>
Subject: Re: spark 1.6.0 on ec2 doesn't work

I am running from  $SPARK_HOME.
    It looks like connection  problem to port 9000. It is on master machine.
What is this process is spark tries to connect?
Should I start any framework , processes before executing spark?

Thanks
OIeg.


16/01/19 03:17:56 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:17:57 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:17:58 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:17:59 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:00 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:01 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:02 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:03 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:04 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 9 time(s); retry

On Tue, Jan 19, 2016 at 1:13 PM, Peter Zhang <zhangjunfs@gmail.com<mailto:zhangjunfs@gmail.com>>
wrote:
Could you run spark-shell at $SPARK_HOME DIR?

You can try to change you command run at $SPARK_HOME or, point to README.md with full path.


Peter Zhang
--
Google
Sent with Airmail


On January 19, 2016 at 11:26:14, Oleg Ruchovets (oruchovets@gmail.com<mailto:oruchovets@gmail.com>)
wrote:
It looks spark is not working fine :

I followed this link ( http://spark.apache.org/docs/latest/ec2-scripts.html. ) and I see spot
instances installed on EC2.

from spark shell I am counting lines and got connection exception.
scala> val lines = sc.textFile("README.md")
scala> lines.count()



scala> val lines = sc.textFile("README.md")

16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated
size 26.5 KB, free 26.5 KB)
16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory
(estimated size 5.6 KB, free 32.1 KB)
16/01/19 03:17:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.31.28.196:44028<http://172.31.28.196:44028>
(size: 5.6 KB, free: 511.5 MB)
16/01/19 03:17:35 INFO spark.SparkContext: Created broadcast 0 from textFile at <console>:21
lines: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:21

scala> lines.count()

16/01/19 03:17:55 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:17:56 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:17:57 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:17:58 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:17:59 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:00 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:01 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:02 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:03 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
16/01/19 03:18:04 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>.
Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1 SECONDS)
java.lang.RuntimeException: java.net.ConnectException: Call to ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>
failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:567)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:291)
at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
at scala.Option.map(Option.scala:145)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
at $iwC$$iwC$$iwC.<init>(<console>:33)
at $iwC$$iwC.<init>(<console>:35)
at $iwC.<init>(<console>:37)
at <init>(<console>:39)
at .<init>(<console>:43)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org<http://org.apache.spark.repl.SparkILoop.org>$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org<http://org.apache.spark.repl.SparkILoop.org>$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Call to ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000<http://ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000>
failed on connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
at org.apache.hadoop.ipc.Client.call(Client.java:1118)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.hdfs.DFSClient.createNamenode(DFSClient.java:183)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:281)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:563)
... 64 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:457)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:583)
at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:205)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1249)
at org.apache.hadoop.ipc.Client.call(Client.java:1093)
... 84 more


scala>




On Tue, Jan 19, 2016 at 1:22 AM, Daniel Darabos <daniel.darabos@lynxanalytics.com<mailto:daniel.darabos@lynxanalytics.com>>
wrote:

On Mon, Jan 18, 2016 at 5:24 PM, Oleg Ruchovets <oruchovets@gmail.com<mailto:oruchovets@gmail.com>>
wrote:
I thought script tries to install hadoop / hdfs also. And it looks like it failed. Installation
is only standalone spark without hadoop. Is it correct behaviour?

Yes, it also sets up two HDFS clusters. Are they not working? Try to see if Spark is working
by running some simple jobs on it. (See http://spark.apache.org/docs/latest/ec2-scripts.html.)

There is no program called Hadoop. If you mean YARN, then indeed the script does not set up
YARN. It sets up standalone Spark.

Also errors in the log:
   ERROR: Unknown Tachyon version
   Error: Could not find or load main class crayondata.com.log

As long as Spark is working fine, you can ignore all output from the EC2 script :).


The information contained in this electronic message and any attachments to this message are
intended for the exclusive use of the addressee(s) and may contain proprietary, confidential
or privileged information. If you are not the intended recipient, you should not disseminate,
distribute or copy this e-mail. Please notify the sender immediately and destroy all copies
of this message and any attachments. WARNING: Computer viruses can be transmitted via email.
The recipient should check this email and any attachments for the presence of viruses. The
company accepts no liability for any damage caused by any virus transmitted by this email.
www.wipro.com
Mime
View raw message