From: "Doug Cutting (JIRA)"
To: hadoop-dev@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Date: Tue, 7 Feb 2006 18:21:00 +0100 (CET)
Subject: [jira] Commented: (HADOOP-25) a new map/reduce example and moving the examples from src/java to src/examples
Message-ID: <1857935748.1139332860018.JavaMail.jira@ajax.apache.org>
In-Reply-To: <187621972.1139267817333.JavaMail.jira@ajax.apache.org>

    [ http://issues.apache.org/jira/browse/HADOOP-25?page=comments#action_12365449 ]

Doug Cutting commented on HADOOP-25:
------------------------------------

I don't feel that strongly in particular about "ant compile" compiling the examples.
It's more the principle of keeping that command, the default, minimal. When the next person comes along and adds something optional to build.xml that compiles, I don't want them to also add it to "ant compile". On the other hand, "ant test" should be maximal, compiling and testing as much as possible. My rule is that "ant clean test" should be run before every commit.

I like the idea of making bin/hadoop easily extensible with something like 'bin/hadoop run build/hadoop-examples.jar'. We could by convention create executable jars whose default main() lists the commands that the jar supports. We could even change hadoop.jar to be like this, moving the command selection logic out of the shell script and into a Java class.

+1

I really think it would be nice if, for common MapReduce operations (e.g., sorting, inverting, etc.) on a well-configured cluster, one did not have to specify the number of map tasks or reduce tasks. That way one can run something on one cluster with 20 single-processor machines, and then turn around and run it on another with 200 dual-processor machines, with the system doing a reasonable job. One should also be able to fine-tune things for a particular cluster if one likes, but that should be optional. There are cases where the precise number of outputs is critical, but there are (in my experience) many more where the precise number of outputs does not matter.

One way to sidestep this might be to add a standard '-D' option to bin/hadoop that permits one to specify any configuration option. That way one could, e.g., always easily set the number of map or reduce tasks for each job, but also not be forced to.

And you're right: a JVM probably isn't required to print that stack, but I'm strongly in favor of things that make code smaller (easier to read, easier to maintain), especially example code.
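
The two conventions proposed in the comment above (a jar whose default main() lists the commands it supports, and a generic '-D' option for setting any configuration value) could be sketched roughly as follows. This is only an illustration, not code from the patch or from Hadoop: the class name ExampleDriver, its methods, the two-token "-D name=value" syntax, and the "wordcount" registration are all assumptions, and the sketch uses Java 8 lambdas for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical driver class; illustrative only, not Hadoop's actual code. */
public class ExampleDriver {
    // A command receives the parsed -D options and its remaining arguments,
    // and returns an exit code. Insertion order is kept for a stable listing.
    private final Map<String, BiFunction<Map<String, String>, String[], Integer>> commands =
        new LinkedHashMap<>();
    private final Map<String, String> descriptions = new LinkedHashMap<>();

    public void register(String name, String description,
                         BiFunction<Map<String, String>, String[], Integer> body) {
        commands.put(name, body);
        descriptions.put(name, description);
    }

    /** Lists the supported commands, in registration order. */
    public String usage() {
        StringBuilder sb = new StringBuilder("Valid commands:\n");
        for (Map.Entry<String, String> e : descriptions.entrySet()) {
            sb.append("  ").append(e.getKey()).append(": ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /**
     * Moves every "-D name=value" pair from args into conf and returns the
     * arguments that remain. The two-token form is an assumption of this sketch.
     */
    public static String[] parseDefines(String[] args, Map<String, String> conf) {
        List<String> rest = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length && args[i + 1].indexOf('=') > 0) {
                String pair = args[++i];
                int eq = pair.indexOf('=');
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else {
                rest.add(args[i]);
            }
        }
        return rest.toArray(new String[0]);
    }

    /** Dispatches to the named command; prints the listing and returns 1 otherwise. */
    public int run(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        String[] rest = parseDefines(args, conf);
        if (rest.length == 0 || !commands.containsKey(rest[0])) {
            System.err.print(usage());
            return 1;
        }
        String[] commandArgs = Arrays.copyOfRange(rest, 1, rest.length);
        return commands.get(rest[0]).apply(conf, commandArgs);
    }

    public static void main(String[] args) {
        ExampleDriver driver = new ExampleDriver();
        driver.register("wordcount", "counts the words in the input files", (conf, a) -> {
            // A real job would be configured and submitted here; this only echoes its inputs.
            System.out.println("wordcount conf=" + conf + " args=" + Arrays.toString(a));
            return 0;
        });
        System.exit(driver.run(args));
    }
}
```

With a class like this named in the jar manifest's Main-Class attribute, running the jar with no arguments would print the command listing, while arguments such as "-D mapred.reduce.tasks=4 wordcount in out" would hand the option and the two paths to the wordcount command. Omitting the -D option leaves the configuration map empty, so the task counts remain optional, as the comment suggests.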
> a new map/reduce example and moving the examples from src/java to src/examples
> ------------------------------------------------------------------------------
>
>          Key: HADOOP-25
>          URL: http://issues.apache.org/jira/browse/HADOOP-25
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Reporter: Owen O'Malley
>     Priority: Minor
>  Attachments: examples.patch
>
> The new example is the word count example from Google's paper. I moved the examples into a separate jar file to demonstrate how to run stand-alone application code.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira