hadoop-common-user mailing list archives

From <praveen.pe...@nokia.com>
Subject RE: Unable to use hadoop cluster on the cloud
Date Wed, 09 Mar 2011 02:31:24 GMT
Thanks Adarsh. You are right: the slave nodes are not able to talk to each other.

It turns out that Hadoop was getting confused by the multiple NIC cards. Hadoop was
reporting each slave node's machine name instead of the hostname configured in the slaves
file, and the machine name is tied to the public IP address, which is not visible to the
other machines. So on each slave I set the "slave.host.name" property in mapred-site.xml
to the proper host name instead of letting Hadoop guess the wrong one. That fixed the issue.
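
For anyone hitting the same problem, the override looks roughly like this (a minimal
sketch; slave1.cluster.internal is a placeholder for whatever internal name the other
nodes can actually resolve):

    <!-- mapred-site.xml on each slave: pin the host name Hadoop reports,
         instead of letting it guess from the wrong NIC -->
    <property>
      <name>slave.host.name</name>
      <value>slave1.cluster.internal</value>
    </property>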

I still think Hadoop makes this whole thing more complicated than it needs to be when
there are multiple NIC cards; there must be a simpler way to do this. Since copy/delete
etc. were working fine, maybe the same resolution logic should be used by the tasktracker
to communicate between servers.
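
For what it's worth, a quick way to see which name and address a slave will report
(standard Linux commands):

    hostname        # the machine name Hadoop typically picks up by default
    hostname -i     # the IP that name resolves to; on a multi-NIC host this
                    # may be the public address, which is exactly the problem
    cat /etc/hosts  # shows which address each name is bound to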

Thanks all for your help.

Praveen
________________________________________
From: ext Adarsh Sharma [adarsh.sharma@orkash.com]
Sent: Monday, March 07, 2011 12:26 AM
To: common-user@hadoop.apache.org
Subject: Re: Unable to use hadoop cluster on the cloud

praveen.peddi@nokia.com wrote:
> Thanks Adarsh for the reply.
>
> Just to clarify the issue a bit: I am able to do all operations (-copyFromLocal, -get,
> -rmr, etc.) from the master node, so I am confident that the communication between all
> hadoop machines is fine. But when I do the same operations from another machine that
> also has the same hadoop config, I get the errors below. However, I can do -lsr and it
> lists the files correctly.
>

Praveen, your error is due to a communication problem between your
datanodes: in the HDFS commands you mention, datanode1 is unable to place
the replica of a block onto the corresponding datanode2. (-lsr succeeds
because it only talks to the namenode, while -copyFromLocal has to open a
direct connection to each datanode, and that connection is what is timing
out.)

Simply check from datanode1:

    ssh datanode2_ip
    ping datanode2_ip
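
It can also be worth checking the datanode port itself, not just plain reachability.
50010 is the default data transfer port (dfs.datanode.address), and it is the port the
timeouts in your log point at, so for example:

    telnet datanode2_ip 50010    # a timeout here usually means a firewall
                                 # or security-group rule is blocking it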


Best Rgds, Adarsh


> Praveen
>
> -----Original Message-----
> From: ext Adarsh Sharma [mailto:adarsh.sharma@orkash.com]
> Sent: Friday, March 04, 2011 12:12 AM
> To: common-user@hadoop.apache.org
> Subject: Re: Unable to use hadoop cluster on the cloud
>
> Hi Praveen, check via ssh and ping whether your datanodes can communicate with each
> other.
>
> Cheers, Adarsh
> praveen.peddi@nokia.com wrote:
>
>> Hello all,
>> I installed Hadoop 0.20.2 on physical machines and everything works like a charm. Now
>> I installed hadoop using the same hadoop-install gz file on the cloud. The installation
>> seems fine, and I can even copy files to HDFS from the master machine. But when I try
>> to do it from another "non hadoop" machine, I get the following error. I googled and a
>> lot of people got this error, but I could not find any solution.
>>
>> Also I didn't see any exceptions in the hadoop logs.
>>
>> Any thoughts?
>>
>> $ /usr/local/hadoop-0.20.2/bin/hadoop fs -copyFromLocal Merchandising-ear.tar.gz /tmp/hadoop-test/Merchandising-ear.tar.gz
>> 11/03/03 21:58:50 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
>> 11/03/03 21:58:50 INFO hdfs.DFSClient: Abandoning block blk_-8243207628973732008_1005
>> 11/03/03 21:58:50 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.12:50010
>> 11/03/03 21:59:17 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
>> 11/03/03 21:59:17 INFO hdfs.DFSClient: Abandoning block blk_2852127666568026830_1005
>> 11/03/03 21:59:17 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.16.12:50010
>> 11/03/03 21:59:44 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
>> 11/03/03 21:59:44 INFO hdfs.DFSClient: Abandoning block blk_2284836193463265901_1005
>> 11/03/03 21:59:44 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.16.12:50010
>> 11/03/03 22:00:11 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
>> 11/03/03 22:00:11 INFO hdfs.DFSClient: Abandoning block blk_-5600915414055250488_1005
>> 11/03/03 22:00:11 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.16.11:50010
>> 11/03/03 22:00:17 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>>
>> 11/03/03 22:00:17 WARN hdfs.DFSClient: Error Recovery for block blk_-5600915414055250488_1005 bad datanode[0] nodes == null
>> 11/03/03 22:00:17 WARN hdfs.DFSClient: Could not get block locations. Source file "/tmp/hadoop-test/Merchandising-ear.tar.gz" - Aborting...
>> copyFromLocal: Connection timed out
>> 11/03/03 22:00:17 ERROR hdfs.DFSClient: Exception closing file /tmp/hadoop-test/Merchandising-ear.tar.gz : java.net.ConnectException: Connection timed out
>> java.net.ConnectException: Connection timed out
>>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2870)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>> [C4554954_admin@c4554954vl03 relevancy]$

