hadoop-common-user mailing list archives

From Foss User <foss...@gmail.com>
Subject Question on distribution of classes and jobs
Date Sat, 04 Apr 2009 06:39:34 GMT
Suppose I have written a WordCount.java job in this manner:

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Combine.class);
        conf.setReducerClass(Reduce.class);

As you can see, three classes are being used here.  I have
packaged these classes into a jar file called wc.jar, and I run it like
this:

$ bin/hadoop jar wc.jar WordCountJob
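For context, the fragment above presumably sits inside a driver class along these lines. This is only a sketch using the old org.apache.hadoop.mapred API that was current at the time; the class names Map, Combine, Reduce, and WordCountJob come from the question (and are assumed to be defined in the same jar), while the input/output paths are illustrative assumptions:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCountJob {
    public static void main(String[] args) throws Exception {
        // Passing the driver class tells Hadoop which jar (wc.jar) to ship
        // to the cluster with the job.
        JobConf conf = new JobConf(WordCountJob.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // The three classes from the question; Map, Combine, and Reduce are
        // assumed to be defined elsewhere in the same jar.
        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Combine.class);
        conf.setReducerClass(Reduce.class);

        // Illustrative paths taken from the command-line arguments.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Submits the job (including the jar) and blocks until completion.
        JobClient.runJob(conf);
    }
}
```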

1) I want to know: when the job runs on a 5-machine cluster, is the
whole JAR file distributed across the 5 machines, or are the individual
class files distributed separately?

2) Also, let us say the number of reducers is 2 while the number of
mappers is 5. What happens in this case? How are the class files or
jar files distributed?

3) Are they distributed via RPC or HTTP?
