hadoop-common-user mailing list archives

From "Jeyendran Balakrishnan" <jbalakrish...@docomolabs-usa.com>
Subject Unable to start Hadoop mapred cluster on EC2 with Hadoop 0.20.0
Date Mon, 20 Jul 2009 21:20:00 GMT
Hello,

I downloaded Hadoop 0.20.0 and used the src/contrib/ec2/bin scripts to
launch a Hadoop cluster on Amazon EC2. To do so, I modified the bundled
scripts above for my EC2 account, and then created my own Hadoop 0.20.0
AMI. The steps I followed for creating AMIs and launching EC2 Hadoop
clusters are the same ones I have been using for over a year with Hadoop
0.18.* and 0.19.*.

I launched an instance with my new Hadoop 0.20.0 AMI, then logged in and
ran the following to launch a new cluster:
root(/vol/hadoop-0.20.0)> bin/launch-hadoop-cluster hadoop-test 2

After the usual EC2 wait, one master and two slave instances were
launched on EC2, as expected. When I ssh'ed into the instances, here is
what I found:

Slaves: DataNode and NameNode are running
Master: Only NameNode is running

I could use HDFS commands (using $HADOOP_HOME/bin/hadoop scripts)
without any problems, from both master and slaves. However, since
JobTracker is not running, I cannot run map-reduce jobs.

I checked the logs from /vol/hadoop-0.20.0/logs for the JobTracker,
reproduced below:
-----------------------------------------------
<<<
2009-07-20 16:56:30,273 WARN org.apache.hadoop.conf.Configuration:
DEPRECATED: hadoop-site.xml found in the classpath. Usage of
hadoop-site.xml is deprecated. Instead use core-site.xml,
mapred-site.xml and
hdfs-site.xml to override properties of core-default.xml,
mapred-default.xml and hdfs-default.xml respectively
2009-07-20 16:56:30,320 INFO org.apache.hadoop.mapred.JobTracker:
STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = domU-12-31-39-04-30-16/10.240.55.228
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r
763504; compiled by 'ndaley' on Thu Apr  9 05:18:40 UTC 2009
************************************************************/
2009-07-20 16:56:31,332 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
Initializing RPC Metrics with hostName=JobTracker, port=50002
2009-07-20 16:56:31,603 INFO org.mortbay.log: Logging to
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
org.mortbay.log.Slf4jLog
2009-07-20 16:56:31,900 INFO org.apache.hadoop.http.HttpServer: Jetty
bound to port 50030
2009-07-20 16:56:31,900 INFO org.mortbay.log: jetty-6.1.14
2009-07-20 16:56:33,461 INFO org.mortbay.log: Started
SelectChannelConnector@0.0.0.0:50030
2009-07-20 16:56:33,462 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-07-20 16:56:33,531 INFO org.apache.hadoop.mapred.JobTracker:
JobTracker up at: 50002
2009-07-20 16:56:33,532 INFO org.apache.hadoop.mapred.JobTracker:
JobTracker webserver: 50030
2009-07-20 16:56:51,554 INFO org.apache.hadoop.mapred.JobTracker:
Cleaning up the system directory
2009-07-20 16:56:53,060 INFO org.apache.hadoop.hdfs.DFSClient:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
/mnt/hadoop/mapred/system/jobtracker.info could only be replicated to 0
nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy4.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
...
...
2009-07-20 16:56:55,878 WARN org.apache.hadoop.hdfs.DFSClient:
NotReplicatedYetException sleeping
/mnt/hadoop/mapred/system/jobtracker.info retries left 1
2009-07-20 16:56:59,082 WARN org.apache.hadoop.hdfs.DFSClient:
DataStreamer Exception: org.apache.hadoop.ipc.RemoteException:
java.io.IOException: File /mnt/hadoop/mapred/system/jobtracker.info
could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
...
...

2009-07-20 16:57:00,092 FATAL org.apache.hadoop.mapred.JobTracker:
java.net.BindException: Problem binding to
domU-12-31-39-04-30-16.compute-1.internal/10.240.55.228:50002 : Address
already in use
        at org.apache.hadoop.ipc.Server.bind(Server.java:190)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:253)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:1026)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:488)
        at org.apache.hadoop.ipc.RPC.getServer(RPC.java:450)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1537)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:174)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3528)
Caused by: java.net.BindException: Address already in use
        at sun.nio.ch.Net.bind(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
        at org.apache.hadoop.ipc.Server.bind(Server.java:188)
        ... 7 more


2009-07-20 16:57:00,093 INFO org.apache.hadoop.mapred.JobTracker:
SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down JobTracker at
domU-12-31-39-04-30-16/10.240.55.228
************************************************************/
>>>
-----------------------------------------------
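
Most of the log above is INFO noise; the entries that matter are the WARN
and FATAL ones. A rough Python sketch for pulling those out, assuming the
log4j line layout seen above (timestamp, level, logger name, colon). The
names ENTRY and entries_at_level are made up for illustration and are not
part of Hadoop:

```python
import re

# Matches the first line of a log entry in the layout seen above, e.g.
#   2009-07-20 16:57:00,092 FATAL org.apache.hadoop.mapred.JobTracker:
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) (\S+?):"
)


def entries_at_level(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return (timestamp, level, logger) for each entry at one of `levels`.

    Continuation lines (wrapped messages, stack frames) do not start with
    a timestamp, so they are skipped.
    """
    found = []
    for line in lines:
        match = ENTRY.match(line)
        if match and match.group(2) in levels:
            found.append(match.groups())
    return found
```

Passing an open log file (or any iterable of lines) to entries_at_level
gives a short list of the timestamps and loggers worth reading in full.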

So it looks like the JobTracker launched, failed to write the
jobtracker.info file to HDFS (it could only be replicated to 0 nodes,
instead of 1), and then died with a BindException on port 50002.
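
Given the "Address already in use" error on port 50002 in the log, one
thing that might be worth checking on the master is whether something is
still listening on that port (it may only be a side effect of the earlier
HDFS failure). A minimal Python sketch, assuming Python is available on
the instance; port_in_use is a hypothetical helper, not a Hadoop tool:

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        # (for example, connection refused when nothing is listening).
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 50002 is the JobTracker RPC port taken from the log; the host here
    # is an assumption and may need to be the instance's internal name.
    print(port_in_use("localhost", 50002))
```

This only shows whether the port accepts connections, not which process
owns it.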

I would appreciate any help with this.

Thanks a lot,
jp

