hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-113) Allow multiple Output Dirs to be specified for a job
Date Thu, 30 Mar 2006 19:56:26 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-113?page=comments#action_12372563 ] 

Doug Cutting commented on HADOOP-113:
-------------------------------------

We should probably instead add a Configuration.getFiles() method, used by this and by
getInputDirs(), and implemented in terms of Configuration.getStrings().  We should also add a
Configuration.addFile() method, used by this and by addInputDir(), and implemented in terms of
a Configuration.addString() method.  Otherwise we end up copying the same code around in too
many places.
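
A minimal sketch of that layering, for illustration only.  SketchConfiguration is a
hypothetical stand-in, not the real Configuration class; storing lists as comma-separated
values is an assumption; and getFiles(), addFile() and addString() are the methods proposed
above, not existing API.  The property names in the usage note are only examples.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a configuration class; not the real Hadoop Configuration. */
class SketchConfiguration {
    private final Map<String, String> props = new HashMap<>();

    public String get(String name) { return props.get(name); }

    public void set(String name, String value) { props.put(name, value); }

    /** Splits a comma-separated property into its values; empty array if unset. */
    public String[] getStrings(String name) {
        String value = get(name);
        if (value == null || value.isEmpty()) {
            return new String[0];
        }
        return value.split(",");
    }

    /** Proposed helper: appends one value to a comma-separated property. */
    public void addString(String name, String value) {
        String old = get(name);
        set(name, (old == null || old.isEmpty()) ? value : old + "," + value);
    }

    /** Proposed helper: a thin layer over getStrings(). */
    public File[] getFiles(String name) {
        String[] paths = getStrings(name);
        File[] files = new File[paths.length];
        for (int i = 0; i < paths.length; i++) {
            files[i] = new File(paths[i]);
        }
        return files;
    }

    /** Proposed helper: a thin layer over addString(). */
    public void addFile(String name, File file) {
        addString(name, file.getPath());
    }
}
```

With these in place, getInputDirs() and an output-directory equivalent would each reduce to
a one-line call such as conf.getFiles("mapred.input.dir"), and addInputDir(dir) to
conf.addFile("mapred.input.dir", dir).  One limitation of this sketch: a path that itself
contains a comma would be split incorrectly.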

However, I'm not yet convinced that this feature is the best way to achieve your goal.  I've
commented on NUTCH-171 that an alternate mechanism might better achieve it: specifying, for a
job, the maximum number of its maps & reduces that should be run on a single node at once, so
that a job can use less than the entire cluster, permitting other jobs to pass it.  If that or
something like it makes sense, then we should start a new Hadoop bug for it.

> Allow multiple Output Dirs to be specified for a job
> ----------------------------------------------------
>
>          Key: HADOOP-113
>          URL: http://issues.apache.org/jira/browse/HADOOP-113
>      Project: Hadoop
>         Type: New Feature
>   Components: mapred
>     Versions: 0.1
>     Reporter: Rod Taylor
>  Attachments: hadoop_multisegment.patch
>
> Allow a single job to create multiple outputs. Only 2 additional simple functions are needed.
> This allows more complex branching of the process, either with multiple steps of the same
> type or with different actions taking place on each output directory, depending on the
> required actions.
> For my specific use, it allows me to run multiple Generate Outputs instead of a single
> Generate Output, as submitted in NUTCH-171 (http://issues.apache.org/jira/browse/NUTCH-171).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

