hadoop-common-dev mailing list archives

From "Ari Rabkin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3616) TextInputFormat taking max of two minima as the minimum
Date Mon, 23 Jun 2008 15:30:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607258#action_12607258 ]

Ari Rabkin commented on HADOOP-3616:
------------------------------------

This code looks correct. The programmer's goal was presumably to ensure that the split size
is at least mapred.min.split.size and also at least minSplitSize. A size that respects both
lower bounds must be at least the larger of the two, so you need to take the maximum.
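
A minimal sketch of that reasoning follows. The class name, method names, and the value 2000
are hypothetical and chosen only for illustration; none of this is taken from the Hadoop source.
It only shows that combining two lower bounds with Math.max yields a value that respects both,
while Math.min can violate the larger one.

```java
// Hypothetical sketch; SplitSizeSketch and its methods are illustrative names, not Hadoop APIs.
class SplitSizeSketch {

    /** Combines two lower bounds on the split size into a single effective minimum. */
    static long effectiveMinSize(long configuredMin, long formatMin) {
        // A size must satisfy size >= configuredMin AND size >= formatMin,
        // which holds exactly when size >= max(configuredMin, formatMin).
        return Math.max(configuredMin, formatMin);
    }

    /** Returns true if a candidate size respects both lower bounds. */
    static boolean satisfiesBoth(long size, long configuredMin, long formatMin) {
        return size >= configuredMin && size >= formatMin;
    }

    public static void main(String[] args) {
        long configuredMin = 1L;  // the default passed to job.getLong in the quoted line
        long formatMin = 2000L;   // assumed format-imposed minimum, for illustration only

        long viaMax = effectiveMinSize(configuredMin, formatMin);
        long viaMin = Math.min(configuredMin, formatMin);

        System.out.println("Math.max gives " + viaMax
                + ", respects both bounds: " + satisfiesBoth(viaMax, configuredMin, formatMin));
        System.out.println("Math.min gives " + viaMin
                + ", respects both bounds: " + satisfiesBoth(viaMin, configuredMin, formatMin));
    }
}
```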

> TextInputFormat taking max of two minima as the minimum
> -------------------------------------------------------
>
>                 Key: HADOOP-3616
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3616
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.17.0
>            Reporter: Josh Myer
>            Priority: Minor
>
> When choosing its minimum split size, FileInputFormat is using the larger of the two
> minimum split values, instead of the smaller.  I can't find any good explanation for why this
> would be, so it would be helpful to add a comment there (or change to Math.min if that's the
> intent).
> Line 237:
>     long minSize = Math.max(job.getLong("mapred.min.split.size", 1), minSplitSize);

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

