hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-25) a new map/reduce example and moving the examples from src/java to src/examples
Date Tue, 07 Feb 2006 07:59:58 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-25?page=comments#action_12365402 ] 

Owen O'Malley commented on HADOOP-25:

OK, how about I add an "examples" target that builds the examples tarball? That said,
compiling the examples is really fast, and if they are compiled by default, there is one less
thing for the new-user tutorial to explain.

The new package name makes sense.

I guess we can use the hadoop script to set up the classpath, but the users are going to need
to figure out the classpath when they write their own applications. I guess we could go one
step further and modify the script to take a jar and run the "main" class of the jar. So it
would look like:
% bin/hadoop run my-app.jar [args...]
For the examples jar, we could use a driver that checks the next argument to decide which
class to run, so it would look like:
% bin/hadoop run build/hadoop-examples.jar (wordcount|grep|...) [args...]
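
The driver idea above could be sketched roughly as follows. This is only an illustration under assumptions: the class name ExampleDriver, the register/run methods, and the placeholder program bodies are hypothetical and are not taken from examples.patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical driver: names and structure are illustrative only.
class ExampleDriver {
    // Maps a short program name to the code that runs it; each entry returns an exit code.
    private final Map<String, Function<String[], Integer>> programs = new LinkedHashMap<>();

    void register(String name, Function<String[], Integer> program) {
        programs.put(name, program);
    }

    // Uses args[0] to choose the program and passes the remaining arguments through.
    int run(String[] args) {
        if (args.length == 0 || !programs.containsKey(args[0])) {
            System.err.println("Usage: (" + String.join("|", programs.keySet()) + ") [args...]");
            return 2;
        }
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        return programs.get(args[0]).apply(rest);
    }

    public static void main(String[] args) {
        ExampleDriver driver = new ExampleDriver();
        // Placeholder bodies; a real driver would invoke each example's own entry point.
        driver.register("wordcount", rest -> {
            System.out.println("wordcount with " + rest.length + " argument(s)");
            return 0;
        });
        driver.register("grep", rest -> {
            System.out.println("grep with " + rest.length + " argument(s)");
            return 0;
        });
        System.exit(driver.run(args));
    }
}
```

With a layout like this, adding another example only means one more register call, and the usage message stays in sync with the registered names.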

I think that having the application set the number of maps and reduces is better than having
it defined by the cluster. Certainly the number of reduces should be set by the user and/or
application rather than the cluster since it controls the output fragmentation. But even the
optimum number of map instances depends a lot on how resource hungry the map function is.
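
To make the fragmentation point concrete, here is a small sketch of how the number of reduces determines how many pieces the output is split into. It assumes keys are assigned to reduces by a non-negative hash modulo the reduce count and that each reduce writes one output piece; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not code from the patch.
class ReduceFragments {
    // Assumed partitioning rule: non-negative hash modulo the number of reduces.
    static int partitionFor(String key, int numReduces) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduces;
    }

    // Groups keys by the reduce that would receive them; one group stands for one output piece.
    static Map<Integer, List<String>> fragments(List<String> keys, int numReduces) {
        Map<Integer, List<String>> out = new TreeMap<>();
        for (int r = 0; r < numReduces; r++) {
            out.put(r, new ArrayList<>());
        }
        for (String k : keys) {
            out.get(partitionFor(k, numReduces)).add(k);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("the", "quick", "brown", "fox", "jumps");
        // The same keys end up in 1 piece or 3 pieces depending only on the reduce count.
        for (int n : new int[] {1, 3}) {
            System.out.println(n + " reduce(s): " + fragments(keys, n));
        }
    }
}
```

Since the application knows how its output will be consumed, it is the natural place to pick this number.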

And finally, you are right that letting main throw out the exception is equivalent, but I
bet the Java language spec doesn't require that the JVM print the exception that was thrown
out of main. I'll change it. (Too much C++ coding where exceptions don't have stacks associated
with them by the runtime system.)
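
For comparison, catching the exception and printing it explicitly, rather than letting it propagate out of main, might look like the sketch below. The class name MainWrapper, the runAndReport helper, and the failing job body are hypothetical placeholders.

```java
import java.io.PrintStream;
import java.util.concurrent.Callable;

// Hypothetical wrapper; not code from the patch.
class MainWrapper {
    // Runs the body; on failure prints the stack trace to err and returns a non-zero status.
    static int runAndReport(Callable<Void> body, PrintStream err) {
        try {
            body.call();
            return 0;
        } catch (Exception e) {
            e.printStackTrace(err);
            return 1;
        }
    }

    public static void main(String[] args) {
        int status = runAndReport(() -> {
            // Placeholder for configuring and submitting the job; fails for illustration.
            throw new IllegalStateException("job failed");
        }, System.err);
        System.exit(status);
    }
}
```

This version does not rely on what the JVM does with an uncaught exception, and it also lets the program choose its own exit status.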

> a new map/reduce example and moving the examples from src/java to src/examples
> ------------------------------------------------------------------------------
>          Key: HADOOP-25
>          URL: http://issues.apache.org/jira/browse/HADOOP-25
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Reporter: Owen O'Malley
>     Priority: Minor
>  Attachments: examples.patch
> The new example is the word count example from Google's paper. I moved the examples into
> a separate jar file to demonstrate how to run stand-alone application code.

This message is automatically generated by JIRA.