pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PIG-3903) Configure mapred.min.split.size to be same as pig.maxCombinedSplitSize
Date Thu, 17 Apr 2014 19:20:15 GMT
Rohini Palaniswamy created PIG-3903:
---------------------------------------

             Summary: Configure mapred.min.split.size to be same as pig.maxCombinedSplitSize
                 Key: PIG-3903
                 URL: https://issues.apache.org/jira/browse/PIG-3903
             Project: Pig
          Issue Type: Bug
            Reporter: Rohini Palaniswamy


FileInputFormat calculates the split size as
Math.max(minSize, Math.min(maxSize, blockSize));

By default pig.maxCombinedSplitSize is 128MB, unless split combination is explicitly
turned off with pig.noSplitCombination. We should set mapred.min.split.size (if not already
set by the user) to the same value as pig.maxCombinedSplitSize, so that the underlying
FileInputFormat itself gives us bigger splits when possible instead of Pig combining smaller splits.
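
To make the effect concrete, here is a minimal standalone sketch of the formula quoted above. It does not use the Hadoop or Pig APIs. The class name SplitSizeSketch, the helper effectiveMinSplitSize, the plain Map standing in for the job configuration, the 64MB block size, and the "unset" defaults for minSize and maxSize are all assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Same expression as quoted in the description above.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: keep the user's mapred.min.split.size if present,
    // otherwise fall back to the pig.maxCombinedSplitSize value.
    static long effectiveMinSplitSize(Map<String, String> conf, long maxCombinedSplitSize) {
        String userValue = conf.get("mapred.min.split.size");
        return userValue != null ? Long.parseLong(userValue) : maxCombinedSplitSize;
    }

    public static void main(String[] args) {
        long blockSize = 64 * MB;           // assumed HDFS block size
        long maxSize = Long.MAX_VALUE;      // assumed: no max split size configured
        long maxCombined = 128 * MB;        // default stated in the description

        // Today: min split size left at an assumed default of 1 byte.
        long current = computeSplitSize(blockSize, 1L, maxSize);

        // Proposed: min split size defaults to pig.maxCombinedSplitSize.
        Map<String, String> conf = new HashMap<>();  // stand-in for the job configuration
        long proposed = computeSplitSize(blockSize, effectiveMinSplitSize(conf, maxCombined), maxSize);

        System.out.println("current split size:  " + current / MB + " MB");
        System.out.println("proposed split size: " + proposed / MB + " MB");
    }
}
```

Under these assumptions the split size goes from 64 MB (one split per block, later combined by Pig) to 128 MB produced directly by FileInputFormat. When the block size is already 128MB or larger, the formula returns the same value either way.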



