hadoop-common-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Unable to start Hadoop mapred cluster on EC2 with Hadoop 0.20.0
Date Mon, 20 Jul 2009 21:30:59 GMT
Hi Jeyendran,

Is it possible that you've configured the jobtracker's RPC address
(mapred.job.tracker) to be the same as its HTTP address? The "Address
already in use" error indicates that something is already claiming port 50002.
That might be another daemon on the same machine, or the JT itself may
already have bound that port for another of its services.
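The failure mode is easy to reproduce in miniature, outside Hadoop: any second listener that binds an already-held port gets the same error the JT's RPC server logged. A minimal Python sketch (plain sockets, not Hadoop code):

```python
import errno
import socket

# First listener claims a port, the way the JT's HTTP server (or any
# other daemon) would.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))  # let the OS pick a free port
first.listen(1)
port = first.getsockname()[1]

# A second bind to the same port fails the way the JT's RPC server did.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
except OSError as e:
    assert e.errno == errno.EADDRINUSE
    print("Address already in use")
finally:
    second.close()
    first.close()
```

So it's worth diffing mapred.job.tracker against mapred.job.tracker.http.address in mapred-site.xml (and against the ports claimed by the other daemons) to make sure no two of them name the same port.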

-Todd

On Mon, Jul 20, 2009 at 2:20 PM, Jeyendran Balakrishnan <
jbalakrishnan@docomolabs-usa.com> wrote:

> Hello,
>
> I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to
> launch a Hadoop cluster on Amazon EC2. To do so, I modified the bundled
> scripts above for my EC2 account, and then created my own Hadoop 0.20.0
> AMI. The steps I followed for creating AMIs and launching EC2 Hadoop
> clusters are the same ones I had been using for over a year with Hadoop
> 0.18.* and 0.19.*.
>
> I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and
> ran the following to launch a new cluster:
> root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2
>
> After the usual EC2 wait, one master and two slave instances were
> launched on EC2, as expected. When I ssh'ed into the instances, here is
> what I found:
>
> Slaves: DataNode and NameNode are running
> Master: Only NameNode is running
>
> I could run HDFS commands (using the $HADOOP_HOME/bin/hadoop script)
> without any problems, from both master and slaves. However, since the
> JobTracker is not running, I cannot run map-reduce jobs.
>
> I checked the logs from /vol/hadoop-0.20.0/logs for the JobTracker,
> reproduced below:
> -----------------------------------------------
> <<<
> 2009-07-20 16:56:30,273 WARN org.apache.hadoop.conf.Configuration:
> DEPRECATED: hadoop-site.xml found in the classpath. Usage of
> hadoop-site.xml is deprecated. Instead use core-site.xml,
> mapred-site.xml and hdfs-site.xml to override properties of
> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> 2009-07-20 16:56:30,320 INFO org.apache.hadoop.mapred.JobTracker:
> STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting JobTracker
> STARTUP_MSG:   host = domU-12-31-39-04-30-16/10.240.55.228
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.0
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r
> 763504; compiled by 'ndaley' on Thu Apr  9 05:18:40 UTC 2009
> ************************************************************/
> 2009-07-20 16:56:31,332 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
> Initializing RPC Metrics with hostName=JobTracker, port=50002
> 2009-07-20 16:56:31,603 INFO org.mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> 2009-07-20 16:56:31,900 INFO org.apache.hadoop.http.HttpServer: Jetty
> bound to port 50030
> 2009-07-20 16:56:31,900 INFO org.mortbay.log: jetty-6.1.14
> 2009-07-20 16:56:33,461 INFO org.mortbay.log: Started
> SelectChannelConnector@0.0.0.0:50030
> 2009-07-20 16:56:33,462 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=JobTracker, sessionId=
> 2009-07-20 16:56:33,531 INFO org.apache.hadoop.mapred.JobTracker:
> JobTracker up at: 50002
> 2009-07-20 16:56:33,532 INFO org.apache.hadoop.mapred.JobTracker:
> JobTracker webserver: 50030
> 2009-07-20 16:56:51,554 INFO org.apache.hadoop.mapred.JobTracker:
> Cleaning up the system directory
> 2009-07-20 16:56:53,060 INFO org.apache.hadoop.hdfs.DFSClient:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0
> nodes, instead of 1
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:396)
>        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>
>        at org.apache.hadoop.ipc.Client.call(Client.java:739)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>        at $Proxy4.addBlock(Unknown Source)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>        at $Proxy4.addBlock(Unknown Source)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
> ...
> ...
> 2009-07-20 16:56:55,878 WARN org.apache.hadoop.hdfs.DFSClient:
> NotReplicatedYetException sleeping
> /mnt/hadoop/mapred/system/jobtracker.info retries left 1
> 2009-07-20 16:56:59,082 WARN org.apache.hadoop.hdfs.DFSClient:
> DataStreamer Exception: org.apache.hadoop.ipc.RemoteException:
> java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info
> could only be replicated to 0 nodes, instead of 1
>        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
> ...
> ...
>
> 2009-07-20 16:57:00,092 FATAL org.apache.hadoop.mapred.JobTracker:
> java.net.BindException: Problem binding to
> domU-12-31-39-04-30-16.compute-1.internal/10.240.55.228:50002 : Address
> already in use
>        at org.apache.hadoop.ipc.Server.bind(Server.java:190)
>        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:253)
>        at org.apache.hadoop.ipc.Server.<init>(Server.java:1026)
>        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:488)
>        at org.apache.hadoop.ipc.RPC.getServer(RPC.java:450)
>        at
> org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1537)
>        at
> org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:174)
>        at
> org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3528)
> Caused by: java.net.BindException: Address already in use
>        at sun.nio.ch.Net.bind(Native Method)
>        at
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119
> )
>        at
> sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>        at org.apache.hadoop.ipc.Server.bind(Server.java:188)
>        ... 7 more
>
>
> 2009-07-20 16:57:00,093 INFO org.apache.hadoop.mapred.JobTracker:
> SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down JobTracker at
> domU-12-31-39-04-30-16/10.240.55.228
> ************************************************************/
> >>>
> -----------------------------------------------
>
> So it looks like the JobTracker launched, but then died trying to
> replicate the jobtracker.info file to one or more slaves.
>
> Would appreciate any help with this...
>
> Thanks a lot,
> jp
>
>
