hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-313) running a reduce task standalone
Date Fri, 30 Jun 2006 20:58:31 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-313?page=all ]

Owen O'Malley updated HADOOP-313:
---------------------------------

    Attachment: stand-alone-task.patch

This patch does a lot more:
   1. It moves the jobId and partition fields from ReduceTask up into Task.
   2. It replaces Michel's stand-alone reduce runner with a more general map or reduce runner.
   3. It adds new task localizations into the job.xml.
       For all tasks:
          a. mapred.task.id = the task id
          b. mapred.task.is.map = is this a map
          c. mapred.task.partition = numeric task id
          d. mapred.job.id = job id
       For maps:
          a. map.input.file = the file that we are reading
          b. map.input.start = the offset in the file to start at
          c. map.input.length = the number of bytes in the split
       For reduces:
          a. mapred.map.tasks = correct number of maps
       These new attributes allow me to reconstruct the MapTask or ReduceTask with just the job.xml.
   4. It adds a new configuration variable, keep.failed.task.files, that tells the system to keep the
files and directories of tasks that fail. This attribute can be set on a particular JobConf.
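
To make the localization idea concrete, here is a minimal sketch of how a tool could decide what kind of task a localized job.xml stands for, using only the property names listed above. It is not code from the patch: the class LocalizedJobXml, its methods, and every sample value are hypothetical, and it reads the standard <configuration>/<property> layout with the JDK's XML parser rather than Hadoop's own JobConf.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the patch: reads the localized
// properties out of a job.xml-style document using only the JDK.
public class LocalizedJobXml {

    /** Parses <configuration><property><name/><value/></property> XML into a map. */
    public static Map<String, String> parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element p = (Element) nodes.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            String value = p.getElementsByTagName("value").item(0).getTextContent().trim();
            props.put(name, value);
        }
        return props;
    }

    /** Summarizes the task described by the localized properties. */
    public static String describe(Map<String, String> props) {
        boolean isMap = Boolean.parseBoolean(props.get("mapred.task.is.map"));
        String common = "job " + props.get("mapred.job.id")
                + ", partition " + props.get("mapred.task.partition");
        if (isMap) {
            // A map task is pinned down by its input split: file, start offset, length.
            long start = Long.parseLong(props.get("map.input.start"));
            long length = Long.parseLong(props.get("map.input.length"));
            return "map: " + common + ", input " + props.get("map.input.file")
                    + " bytes [" + start + ", " + (start + length) + ")";
        }
        // A reduce task only needs to know how many map outputs to expect.
        return "reduce: " + common + ", maps " + Integer.parseInt(props.get("mapred.map.tasks"));
    }

    /** Builds one <property> element; used here only to assemble sample input. */
    public static String property(String name, String value) {
        return "<property><name>" + name + "</name><value>" + value + "</value></property>";
    }

    public static void main(String[] args) throws Exception {
        // All values below are made up for illustration.
        String xml = "<configuration>"
                + property("mapred.task.is.map", "true")
                + property("mapred.task.partition", "3")
                + property("mapred.job.id", "0001")
                + property("map.input.file", "/data/part-00000")
                + property("map.input.start", "0")
                + property("map.input.length", "1024")
                + property("keep.failed.task.files", "true")
                + "</configuration>";
        System.out.println(describe(parse(xml)));
    }
}
```

The sketch branches on mapred.task.is.map first because that single flag determines which of the remaining properties are expected to be present.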
     

> running a reduce task standalone
> --------------------------------
>
>          Key: HADOOP-313
>          URL: http://issues.apache.org/jira/browse/HADOOP-313
>      Project: Hadoop
>         Type: Bug

>     Reporter: Michel Tourn
>     Assignee: Michel Tourn
>  Attachments: sareduce.patch, stand-alone-task.patch
>
> This is a tool to reproduce problems and to run unit tests involving a reduce task.
> You just give it a reduce directory on the command line.
> Usage: java org.apache.hadoop.mapred.StandaloneReduceTask <taskdir> [<limitmaps>]
> taskdir name encodes: task_<jobid>_r_<partition>_<attempt>
> taskdir contains job.xml and one or more input files named: map_<dddd>.out
> You should run with the same -Xmx option as the TaskTracker child JVM
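
The naming scheme in the description above, task_<jobid>_r_<partition>_<attempt>, can be decoded with a regular expression. The following is a minimal sketch rather than the code in sareduce.patch: the class TaskDirName is hypothetical, and it assumes that the partition and attempt fields are decimal digits and that the job id is whatever sits between "task_" and "_r_". The sample directory name is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the patch: decodes a reduce task
// directory name of the form task_<jobid>_r_<partition>_<attempt>.
public class TaskDirName {
    // Assumption: partition and attempt are decimal digits; the job id is
    // everything between "task_" and "_r_".
    private static final Pattern REDUCE_DIR =
            Pattern.compile("task_(.+)_r_(\\d+)_(\\d+)");

    public final String jobId;
    public final int partition;
    public final int attempt;

    private TaskDirName(String jobId, int partition, int attempt) {
        this.jobId = jobId;
        this.partition = partition;
        this.attempt = attempt;
    }

    /** Parses a directory name, rejecting anything that does not fit the scheme. */
    public static TaskDirName parse(String dirName) {
        Matcher m = REDUCE_DIR.matcher(dirName);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a reduce task directory: " + dirName);
        }
        return new TaskDirName(m.group(1),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        TaskDirName t = parse("task_0001_r_000003_0"); // made-up example name
        System.out.println(t.jobId + " " + t.partition + " " + t.attempt);
    }
}
```

Rejecting names that do not match, instead of guessing, keeps a mistyped <taskdir> argument from silently running the wrong partition.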


