hadoop-pig-dev mailing list archives

From "Laukik Chitnis (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-657) splitsize is ignored in PigInputFormat
Date Thu, 05 Feb 2009 23:35:59 GMT
splitsize is ignored in PigInputFormat
--------------------------------------

                 Key: PIG-657
                 URL: https://issues.apache.org/jira/browse/PIG-657
             Project: Pig
          Issue Type: Bug
            Reporter: Laukik Chitnis


The usual way to control the number of mappers in Hadoop is to set the mapred.min.split.size
parameter in the job conf. For example: mapred.min.split.size=1073741824,mapred.map.tasks=10

However, even if this parameter is specified, Pig creates the number of mappers depending
only on the number of blocks in the file. This is because the parameter is not used in the
PigInputFormat.
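
To illustrate the difference, here is a minimal sketch of how the split size rule in stock
Hadoop's old-API FileInputFormat (max(minSize, min(goalSize, blockSize))) would change the
mapper count compared with counting blocks only. The 10 GiB file size and 64 MiB block size
are assumptions chosen for illustration, the class and method names are hypothetical, and
the split count uses plain ceiling division rather than Hadoop's exact handling of the last
split. This is not Pig code.

```java
public class SplitSizeSketch {
    // Same rule as Hadoop's old-API FileInputFormat.computeSplitSize.
    public static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Simplification: ceiling division, ignoring Hadoop's slop factor for the last split.
    public static long countSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileSize = 10L * 1024 * 1024 * 1024; // assumed 10 GiB input file
        long blockSize = 64L * 1024 * 1024;       // assumed 64 MiB block size
        long minSplitSize = 1073741824L;          // mapred.min.split.size from the report
        int requestedMaps = 10;                   // mapred.map.tasks from the report

        long goalSize = fileSize / requestedMaps;
        long splitSize = computeSplitSize(goalSize, minSplitSize, blockSize);

        // Honoring the minimum split size: 10 splits.
        System.out.println("with min split size: " + countSplits(fileSize, splitSize));
        // Counting blocks only, as described above: 160 splits.
        System.out.println("by block count only: " + countSplits(fileSize, blockSize));
    }
}
```

Under these assumptions the parameter would reduce the job from 160 mappers to 10.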

The parameter can actually be extracted from the job conf object. So, one way of doing this
would be to pass a handle to the job conf object to the PigInputFormat or the custom slicer.
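
A rough sketch of that idea follows. It is not the actual Pig slicer interface:
ConfAwareSlicerSketch and its slice method are hypothetical, and java.util.Properties stands
in for Hadoop's JobConf so the example runs without Hadoop on the classpath. With a real
JobConf the lookup would presumably be a getLong call on the same key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class ConfAwareSlicerSketch {
    private final long minSplitSize;

    // The slicer is handed the job configuration (Properties is a stand-in for JobConf).
    public ConfAwareSlicerSketch(Properties jobConf) {
        // Default of 1 byte means "no minimum", so slices fall back to the block size.
        this.minSplitSize = Long.parseLong(jobConf.getProperty("mapred.min.split.size", "1"));
    }

    // Returns {offset, length} pairs that cover the whole file.
    public List<long[]> slice(long fileSize, long blockSize) {
        long splitSize = Math.max(minSplitSize, blockSize);
        List<long[]> slices = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += splitSize) {
            slices.add(new long[] {offset, Math.min(splitSize, fileSize - offset)});
        }
        return slices;
    }

    public static void main(String[] args) {
        Properties jobConf = new Properties();
        jobConf.setProperty("mapred.min.split.size", "1073741824");

        // Assumed 10 GiB file with 64 MiB blocks.
        List<long[]> slices = new ConfAwareSlicerSketch(jobConf).slice(10L << 30, 64L << 20);
        System.out.println("slices: " + slices.size());
    }
}
```

Reading the value once in the constructor keeps the slicing logic independent of where the
configuration comes from.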




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

