hadoop-common-user mailing list archives

From: Harsh J <ha...@cloudera.com>
Subject: Re: Map phase hanging for wordcount example
Date: Tue, 06 Sep 2011 10:15:58 GMT
The wordcount example, by default, will run a single reducer. This is
controllable by passing -Dmapred.reduce.tasks=2 to your launcher. The
following will work:

hadoop jar hadoop-examples.jar wordcount -Dmapred.reduce.tasks=2 input output

Note that just because a cluster has N nodes does not mean N reducers need to
run. The reducer count does not depend on cluster size; it is simply a
user-configurable number with a default value of 1.
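
(For completeness, the same count can also be fixed in a job driver instead of
on the command line. The sketch below is not taken from the examples jar; it is
a minimal, hypothetical driver against the 0.20-era org.apache.hadoop.mapreduce
API, with the mapper/reducer classes left at their identity defaults since only
the reducer count matters for this point.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class (not part of the examples jar).
public class TwoReducerDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "two-reducer-example"); // 0.20-era constructor
    job.setJarByClass(TwoReducerDriver.class);
    job.setNumReduceTasks(2); // same effect as passing -Dmapred.reduce.tasks=2
    // Mapper/Reducer classes omitted; the defaults (identity) are used here.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}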

On Tue, Sep 6, 2011 at 3:23 PM, john smith <js1987.smith@gmail.com> wrote:
> Yep, it works. I just synced the /etc/hosts files, didn't change any other
> configs, and now it's working fine. Thanks for the help, Harsh. Sorry for
> spamming without checking my TT logs properly.
>
> Also, one more doubt: any idea why it's scheduling only a single reducer? I
> have 2 datanodes and I was expecting it to run 2 reducers (data size of 500 MB).
>
> Any hints?
>
>
> On Tue, Sep 6, 2011 at 3:17 PM, Harsh J <harsh@cloudera.com> wrote:
>
>> John,
>>
>> Yes, it looks like your slave nodes aren't able to properly resolve some
>> hostnames. Hadoop requires a sane network setup to work properly.
>> Also, yes, you should use hostnames in fs.default.name and your other
>> configs to the extent possible.
>>
>> The easiest way is to keep a properly synchronized /etc/hosts file.
>>
>> For example, it may look like so, on all machines:
>>
>> 127.0.0.1 localhost.localdomain localhost
>> 192.168.0.1 master.hadoop master
>> 192.168.0.2 slave3.hadoop slave3
>> (and so on…)
>>
>> (This way the master can resolve the slaves, and the slaves can resolve the
>> master. If you have the time, set up DNS; it's the best thing to do.)
>>
>> Then, in core-site.xml you'll need:
>>
>> fs.default.name = hdfs://master
>>
>> And in mapred-site.xml:
>>
>> mapred.job.tracker = master:8021
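>>
>> (For reference, a minimal sketch of how those two values sit in the standard
>> Hadoop XML property layout, using the example hostname above; anything beyond
>> these two properties is omitted:)
>>
>> <!-- core-site.xml -->
>> <configuration>
>>   <property>
>>     <name>fs.default.name</name>
>>     <value>hdfs://master</value>
>>   </property>
>> </configuration>
>>
>> <!-- mapred-site.xml -->
>> <configuration>
>>   <property>
>>     <name>mapred.job.tracker</name>
>>     <value>master:8021</value>
>>   </property>
>> </configuration>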
>>
>> That should do it, so long as the slave hosts can freely access the master
>> hosts (no ports blocked by a firewall and the like).
>>
>> On Tue, Sep 6, 2011 at 3:05 PM, john smith <js1987.smith@gmail.com> wrote:
>> > Hey, my TT logs show this:
>> >
>> > 2011-09-06 13:22:41,860 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.UnknownHostException: unknown host: rip-pc.local
>> > at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
>> > at org.apache.hadoop.ipc.Client.getConnection(Client.java:853)
>> > at org.apache.hadoop.ipc.Client.call(Client.java:723)
>> > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>> > at $Proxy5.getProtocolVersion(Unknown Source)
>> > at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>> > at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>> > at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
>> > at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
>> > at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
>> > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
>> > at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>> > at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
>> > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
>> >
>> > Maybe some error in the configs? I am using IPs in the conf files; should I
>> > put entries in the /etc/hosts files?
>> >
>> > On Tue, Sep 6, 2011 at 3:00 PM, john smith <js1987.smith@gmail.com> wrote:
>> >
>> >> Hi Harsh,
>> >>
>> >> My JT log: http://pastebin.com/rXAEeDkC
>> >>
>> >> I have some startup exceptions (which don't matter much, I guess), but the
>> >> tail indicates that it's locating the splits correctly and then it hangs!
>> >>
>> >> Any idea?
>> >>
>> >> Thanks
>> >>
>> >>
>> >> On Tue, Sep 6, 2011 at 1:30 PM, Harsh J <harsh@cloudera.com> wrote:
>> >>
>> >>> I'd check the tail of the JobTracker logs after a submit is done, to see if
>> >>> an error/warning there is causing this, and then dig further into
>> >>> why/what/how.
>> >>>
>> >>> Hard to tell what your problem specifically is without logs :)
>> >>>
>> >>> On Tue, Sep 6, 2011 at 1:18 PM, john smith <js1987.smith@gmail.com> wrote:
>> >>> > Hi Folks,
>> >>> >
>> >>> > I am working on a 3-node cluster (1 NN + 2 DNs). I loaded some test data
>> >>> > with replication factor 3 (around 400 MB). However, when I run the
>> >>> > wordcount example, it hangs at map 0%.
>> >>> >
>> >>> > bin/hadoop jar hadoop-examples-0.20.3-SNAPSHOT.jar wordcount /test_data /out2
>> >>> > 11/09/06 13:07:28 INFO input.FileInputFormat: Total input paths to process : 2
>> >>> > 11/09/06 13:07:28 INFO mapred.JobClient: Running job: job_201109061248_0002
>> >>> > 11/09/06 13:07:29 INFO mapred.JobClient:  map 0% reduce 0%
>> >>> >
>> >>> > TTs and DNs are running fine on my slaves; I see them when I run the jps
>> >>> > command.
>> >>> >
>> >>> > Can anyone help me out with this? Any idea why this would happen? I am
>> >>> > totally clueless, as nothing shows up in the logs either!
>> >>> >
>> >>> > Thanks,
>> >>> > jS
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Harsh J
>> >>>
>> >>
>> >>
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>



-- 
Harsh J
