hadoop-common-dev mailing list archives

From "Sanjay Dahiya (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-307) Many small jobs benchmark for MapReduce
Date Sun, 06 Aug 2006 09:20:14 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-307?page=all ]

Sanjay Dahiya updated HADOOP-307:

    Attachment: patch.txt

Patch for classpath issues. The benchmark can now be run with the hadoop script, without
setting any extra classpath:

    $HADOOP_HOME/bin/hadoop jar <path to MRBenchmark.jar> smallJobsBenchmark <options .. >

See Readme.txt for an example of the options.
The bin/run.sh script is an optional helper for running the benchmark multiple times
with different input configurations.
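
As a rough illustration of running the benchmark several times with different configurations, the sketch below only assembles the command lines; it does not execute them and is not the bin/run.sh script from the patch. The helper name `build_command`, the install and jar paths, and the option placeholders are all assumptions made for the example; the real options are the ones described in Readme.txt.

```python
import posixpath
import shlex


def build_command(hadoop_home, jar_path, options):
    """Return the argv list for one benchmark run (hypothetical helper)."""
    hadoop = posixpath.join(hadoop_home, "bin", "hadoop")
    # Same shape as the command in the message:
    #   $HADOOP_HOME/bin/hadoop jar <jar> smallJobsBenchmark <options>
    return [hadoop, "jar", jar_path, "smallJobsBenchmark", *options]


if __name__ == "__main__":
    # Placeholder option sets; replace with real options from Readme.txt.
    configurations = [
        ["<options for run 1>"],
        ["<options for run 2>"],
    ]
    for options in configurations:
        # Assumed paths, for illustration only.
        argv = build_command("/opt/hadoop", "/tmp/MRBenchmark.jar", options)
        print(shlex.join(argv))
```

Each resulting list could be passed to `subprocess.run` once the placeholders are replaced with real options.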

Thanks, Uros, for pointing this out.

> Many small jobs benchmark for MapReduce
> ---------------------------------------
>                 Key: HADOOP-307
>                 URL: http://issues.apache.org/jira/browse/HADOOP-307
>             Project: Hadoop
>          Issue Type: Task
>          Components: mapred
>            Reporter: Sanjay Dahiya
>         Assigned To: Sanjay Dahiya
>            Priority: Minor
>             Fix For: 0.5.0
>         Attachments: patch.txt, patch.txt, patch.txt
> A benchmark that runs many small MapReduce tasks in sequence. A single MapReduce implementation
> is used; it is invoked multiple times, with the output of each run as the input to the next. The input
> to the first Map is read with TextInputFormat (a text file of a few hundred KB). Input records are passed
> to the output without much processing. The idea is to benchmark the time taken to initialize
> the Mapper and Reducer. An initial prototype on a single machine, with 20 MR tasks in sequence,
> took ~47 seconds per task. Suggestions on what else to include in the benchmark are welcome.
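
The chaining described above can be sketched in plain Python, assuming an in-memory stand-in for MapReduce. This is not the code in patch.txt and does not use Hadoop, so it says nothing about real Mapper and Reducer initialization cost; it only shows the structure of the loop, where each run's output feeds the next run and each run is timed separately. All function names here are hypothetical.

```python
import time
from collections import defaultdict


def identity_map(records):
    """Pass every (key, value) record through unchanged."""
    for key, value in records:
        yield key, value


def identity_reduce(key, values):
    """Emit every value for a key unchanged."""
    for value in values:
        yield key, value


def run_job(records):
    """One in-memory map -> group-by-key -> reduce pass."""
    groups = defaultdict(list)
    for key, value in identity_map(records):
        groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(identity_reduce(key, groups[key]))
    return output


def run_chain(records, num_jobs):
    """Run num_jobs passes in sequence, timing each one."""
    timings = []
    current = list(records)
    for _ in range(num_jobs):
        start = time.perf_counter()
        current = run_job(current)  # output becomes the next input
        timings.append(time.perf_counter() - start)
    return current, timings


if __name__ == "__main__":
    # Integer keys with text-line values, loosely mimicking text input.
    lines = [(i, "line %d" % i) for i in range(1000)]
    result, timings = run_chain(lines, 20)
    print("jobs run:", len(timings))
    print("mean seconds per job:", sum(timings) / len(timings))
```

Because the records pass through unchanged, any growth in per-run time in a real setup would come from job setup rather than from the data, which is the quantity the benchmark is after.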

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

