pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (PIG-3903) Configure mapred.min.split.size to be same as pig.maxCombinedSplitSize
Date Fri, 18 Apr 2014 04:25:16 GMT

     [ https://issues.apache.org/jira/browse/PIG-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy reassigned PIG-3903:

    Assignee: Rohini Palaniswamy

> Configure mapred.min.split.size to be same as pig.maxCombinedSplitSize
> ----------------------------------------------------------------------
>                 Key: PIG-3903
>                 URL: https://issues.apache.org/jira/browse/PIG-3903
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
> FileInputFormat calculates the split size as
> Math.max(minSize, Math.min(maxSize, blockSize));
> By default, pig.maxCombinedSplitSize is 128MB unless split combination is explicitly
> turned off via pig.noSplitCombination. We should set mapred.min.split.size (if not
> already set by the user) to the same value as pig.maxCombinedSplitSize, so that the
> underlying FileInputFormat itself gives us bigger splits when possible, instead of Pig
> combining smaller splits.
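
To illustrate the effect described in the issue, here is a minimal standalone sketch of the quoted formula. It does not use Hadoop or Pig classes. The class name SplitSizeSketch, the helper effectiveMinSplitSize, the plain Map standing in for the job configuration, and the 64MB block size are all assumptions made for illustration, not part of the actual patch.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // The formula quoted in the issue description.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: use the user's mapred.min.split.size if present,
    // otherwise fall back to the value of pig.maxCombinedSplitSize.
    static long effectiveMinSplitSize(Map<String, String> conf, long maxCombinedSplitSize) {
        String userValue = conf.get("mapred.min.split.size");
        return userValue != null ? Long.parseLong(userValue) : maxCombinedSplitSize;
    }

    public static void main(String[] args) {
        long blockSize = 64 * MB;          // assumed block size, for illustration only
        long maxSize = Long.MAX_VALUE;     // no upper bound on the split size
        long maxCombined = 128 * MB;       // default stated in the issue

        Map<String, String> conf = new HashMap<>(); // user has not set the property

        long before = computeSplitSize(blockSize, 1L, maxSize);
        long after = computeSplitSize(blockSize, effectiveMinSplitSize(conf, maxCombined), maxSize);

        System.out.println("min split size of 1 byte: " + before / MB + " MB splits");
        System.out.println("min split size raised:    " + after / MB + " MB splits");
    }
}
```

Under these assumptions the split size grows from 64MB to 128MB once the minimum is raised, while a user-supplied mapred.min.split.size would still take precedence.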

