hadoop-common-dev mailing list archives

From "Andrew McNabb (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-960) Incorrect number of map tasks when there are multiple input files
Date Tue, 30 Jan 2007 19:05:34 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12468744 ]

Andrew McNabb commented on HADOOP-960:

From an end-user perspective, it seems like the need to split input evenly would be pretty
common.  Especially for a user like myself who uses hadoop-streaming, it would be nice to
set an "even splits" configuration option rather than subclassing InputFormat.  Since the
rest of the custom code is non-Java, it is a little awkward to write a Java class for this.

Of course, all of this is based on my belief that this would be commonly desirable.  It seems
like something that would be a nice standard feature of the Hadoop toolbox.  I hope the tone
of my request is more clear now.  I really think that this would be a great thing to be available
for everyone.

Thanks again for everything you do.

> Incorrect number of map tasks when there are multiple input files
> -----------------------------------------------------------------
>                 Key: HADOOP-960
>                 URL: https://issues.apache.org/jira/browse/HADOOP-960
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 0.10.1
>            Reporter: Andrew McNabb
>            Priority: Minor
> This problem happens with hadoop-streaming and possibly elsewhere.  If there are 5 input
> files, it will create 130 map tasks, even if mapred.map.tasks=128.  The number of map tasks
> is incorrectly set to a multiple of the number of files.  (I wrote a much more complete bug
> report, but Jira lost it when it had an error, so I'm not in the mood to write it all again.)
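
The 130-versus-128 figure in the description above is what one would expect if each
input file were split on its own against a shared target split size. The sketch below
is only a simplified model of that idea, not Hadoop's actual code: the function name
splits_per_file and the file sizes are made up for illustration, and block size and
minimum split size are ignored.

```python
import math

def splits_per_file(file_sizes, requested_maps):
    """Hypothetical model: cut each file independently against one shared target size."""
    total = sum(file_sizes)
    target = total / requested_maps  # ideal number of bytes per split
    # A file cannot share a split with another file, so each count rounds up.
    return [math.ceil(size / target) for size in file_sizes]

# Five equal-size input files (sizes are invented), with mapred.map.tasks=128.
sizes = [1_000_000] * 5
counts = splits_per_file(sizes, 128)
print(counts, sum(counts))  # [26, 26, 26, 26, 26] 130
```

Under this assumption each file needs 25.6 splits, which rounds up to 26, so five files
give 130 map tasks: a multiple of the number of files, matching the numbers in the report.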

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
