hadoop-common-user mailing list archives

From Foss User <foss...@gmail.com>
Subject All keys went to single reducer in WordCount program
Date Thu, 07 May 2009 10:42:26 GMT
I have two reducers running on two different machines. I ran the
example WordCount program with some of my own System.out.println()
statements added so I could see what is going on.

There were two slaves, each running a datanode as well as a tasktracker,
plus one namenode and one jobtracker. I know this is a very elaborate
setup for such a small cluster, but I did it only to learn.

I gave it two input files, a.txt and b.txt, each with a few lines of
English text. Now, here are my questions.

(1) I found that three mapper tasks ran, all on the first slave. The
first task processed the first file, and the second task processed the
second file. The third task didn't process anything. Why is it that
the third task did not process anything? Why was it created in the
first place?
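
For reference, the kind of logging I added looks roughly like the
sketch below (old org.apache.hadoop.mapred API; the class name is mine,
and I am assuming the framework sets map.input.file for each map task,
so a task with no input should stand out):

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class LoggingWordCountMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  // configure() runs once per task attempt; map.input.file should name
  // the file backing this task's split, so the task logs its input.
  public void configure(JobConf job) {
    System.out.println("map task input: " + job.get("map.input.file"));
  }

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output,
                  Reporter reporter) throws IOException {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      output.collect(word, ONE);
    }
  }
}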

(2) I found only one reducer task, on the second slave. It processed
the values for all the keys; the keys were words (of type Text) in
this case. I tried printing out key.hashCode() for each key, and some
of them were even while some were odd. I was expecting the keys with
even hashcodes to go to one slave and the others to go to the other
slave. Why didn't this happen?
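
My even/odd expectation comes from the default HashPartitioner, which
as far as I can tell boils down to something like this sketch (old
org.apache.hadoop.mapred API; SketchHashPartitioner is just my name for
it):

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

public class SketchHashPartitioner<K, V> implements Partitioner<K, V> {

  public void configure(JobConf job) {
  }

  public int getPartition(K key, V value, int numReduceTasks) {
    // Mask the sign bit so a negative hashCode() still yields a
    // non-negative partition. With numReduceTasks == 2 this gives the
    // even/odd split; with numReduceTasks == 1 every key is partition 0.
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}

If the reduce count defaulted to 1, that would send every key to
partition 0. Should I be calling conf.setNumReduceTasks(2) in the
driver, rather than expecting two tasktrackers to imply two reduce
tasks?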
