hadoop-common-user mailing list archives

From Iman E <hadoop_...@yahoo.com>
Subject Re: starting jobtracker error- hadoop config question
Date Mon, 21 Dec 2009 17:00:31 GMT
Hi,
I tried moving to another set of machines in the cluster, mainly by choosing a different master.
HDFS will not start. I made sure that all nodes can connect to each other without
a password, but the datanodes fail to start with this error:

2009-12-21 11:11:32,852 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /XXX.XXX.XXX.103:54310.
Already tried 0 time(s).
2009-12-21 11:11:53,856 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /XXX.XXX.XXX.103:54310.
Already tried 1 time(s).
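
For reference, this is roughly what conf/core-site.xml looks like on the master and all the slaves (the "newmaster" hostname below is just a placeholder for my actual master); my understanding is that fs.default.name on every node has to point at the new master's namenode address, which is the XXX.XXX.XXX.103:54310 the datanodes are trying to reach:

<?xml version="1.0"?>
<!-- conf/core-site.xml (sketch; "newmaster" is a placeholder hostname) -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- the namenode address every datanode connects to (port 54310 here) -->
    <value>hdfs://newmaster:54310</value>
  </property>
</configuration>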


I tried changing the port number. I also removed the contents of /tmp, where I found some hadoop-*.pid
files, but no luck. Any suggestions to fix these problems?
Thanks

 

________________________________
From: Iman E <hadoop_ami@yahoo.com>
To: common-user@hadoop.apache.org
Sent: Thu, December 17, 2009 6:11:59 PM
Subject: starting jobtracker error- hadoop config question

Hi,
I have a basic question about Hadoop configuration. Whenever I try to start the jobtracker,
it remains in "initializing" mode forever, and when I check the log file, I find the
following errors:

Several lines like these appear for different slaves in my cluster:

2009-12-17 17:47:43,717 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketTimeoutException: 66000 millis timeout while waiting for channel to be ready
for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/XXX.XXX.XXX.XXX:50010]
2009-12-17 17:47:43,717 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7740448897934265604_1010
2009-12-17 17:47:43,720 INFO org.apache.hadoop.hdfs.DFSClient: Waiting to find target node:
XXX.XXX.XXX.XXX:50010

then 

2009-12-17 17:47:49,727 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException:
Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2812)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2009-12-17 17:47:49,728 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_7740448897934265604_1010
bad datanode[0] nodes == null
2009-12-17 17:47:49,728 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations.
Source file "${mapred.system.dir}/mapred/system/jobtracker.info" - Aborting...
2009-12-17 17:47:49,728 WARN org.apache.hadoop.mapred.JobTracker: Writing to file ${fs.default.name}/${mapred.system.dir}/mapred/system/jobtracker.info
failed!
2009-12-17 17:47:49,728 WARN org.apache.hadoop.mapred.JobTracker: FileSystem is not ready
yet!
2009-12-17 17:47:49,749 WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery
manager. 
java.net.SocketTimeoutException: 66000 millis timeout while waiting for channel to be ready
for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/XXX.XXX.XXX.XXX:50010]
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2837)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2793)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
2009-12-17 17:47:59,757 WARN org.apache.hadoop.mapred.JobTracker: Retrying...


Then it starts all over again.

I am not sure what the reason for this error is. I tried leaving mapred.system.dir at its default
value, and also overriding it in mapred-site.xml with both local and shared directories, but with
no luck. In all cases this error shows up in the log file: Writing to file ${fs.default.name}/${mapred.system.dir}/mapred/system/jobtracker.info
failed!
Is it true that Hadoop appends these values together? What should I do to avoid this? Does
anyone know what I am doing wrong or what could be causing these errors?
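
If it helps, this is roughly how I have tried setting mapred.system.dir in conf/mapred-site.xml (the /mapred/system path below is only an example, not my real value); as far as I understand, a plain path here gets resolved against the default filesystem from fs.default.name, which would explain why the log shows the two values joined together:

<?xml version="1.0"?>
<!-- conf/mapred-site.xml (sketch; /mapred/system is an example path) -->
<configuration>
  <property>
    <name>mapred.system.dir</name>
    <!-- resolved on the default filesystem (fs.default.name), so the
         jobtracker writes <fs.default.name>/mapred/system/jobtracker.info -->
    <value>/mapred/system</value>
  </property>
</configuration>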

Thanks


      