hadoop-mapreduce-dev mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1465) archive partSize should be configurable
Date Mon, 08 Feb 2010 18:02:28 GMT
archive partSize should be configurable

                 Key: MAPREDUCE-1465
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1465
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: harchive
            Reporter: Tsz Wo (Nicholas), SZE
            Assignee: Mahadev konar

The archive part size is currently fixed at 2GB.  Archiving 10^5 small files took 52 minutes
because the job ran with only 1 mapper.

-bash-3.1$ time $H archive ${Q} -archiveName ${DIR}.3.har -p ${PARENT} ${DIR} ${PARENT}
10/02/06 01:55:14 INFO mapred.JobClient: Running job: job_201002042035_5737
10/02/06 02:47:18 INFO mapred.JobClient:  map 100% reduce 100%
10/02/06 02:47:19 INFO mapred.JobClient: Job complete: job_201002042035_5737
10/02/06 02:47:19 INFO mapred.JobClient:     Reduce input records=100002

real    52m27.188s
user    0m29.314s
sys     0m1.276s
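
To illustrate why a fixed 2GB part size can leave the job with a single mapper, here is a rough sketch. It assumes the number of map tasks is about the total input size divided by the part size, with a minimum of one; that model, the helper name estimate_num_maps, and the 10 KB average file size are assumptions for illustration and are not taken from this report.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def estimate_num_maps(total_bytes: int, part_size_bytes: int) -> int:
    """Hypothetical model: one map task per full part, never fewer than one."""
    return max(1, total_bytes // part_size_bytes)


num_files = 10 ** 5            # file count from the report
avg_file_bytes = 10 * 1024     # assumed average size of a "small" file
total = num_files * avg_file_bytes

# Compare the current 2GB part size with two smaller, hypothetical settings.
for part_size in (2 * GB, 128 * MB, 16 * MB):
    maps = estimate_num_maps(total, part_size)
    print(f"part size {part_size // MB} MB -> {maps} map task(s)")
```

Under these assumptions the total input stays below 2GB, so the whole archive is built by one map task. A configurable, smaller part size would let the same input be spread over more mappers.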

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
