hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1199) configure total number of mappers
Date Thu, 25 Feb 2010 19:16:27 GMT
configure total number of mappers
---------------------------------

                 Key: HIVE-1199
                 URL: https://issues.apache.org/jira/browse/HIVE-1199
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain
             Fix For: 0.6.0


For users, it can be very difficult to control the number of mappers. There are many parameters,
which confuses users.
For CombineHiveInputFormat, a different set of parameters is required to control the number
of mappers.

In general, users should have a way to specify the total number of mappers, and that number should be
obeyed. This will be very difficult
to guarantee, since the query might be reading from a large number of partitions, and a
mapper can only span one partition.
What if the number of mappers that the user wants is less than the total number of partitions?
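
The constraint above can be sketched in a few lines. This is only an illustration, not Hive code: the function name effective_mapper_count is hypothetical, and it assumes every partition being read is non-empty and therefore needs at least one mapper of its own.

```python
def effective_mapper_count(requested_mappers, num_partitions):
    """Return the smallest mapper count that can honor the request.

    Assumption (from the issue text): a mapper can only span one
    partition, so each partition needs at least one mapper.
    """
    if requested_mappers < 1:
        raise ValueError("requested_mappers must be at least 1")
    if num_partitions < 0:
        raise ValueError("num_partitions must not be negative")
    # The request is a target; the partition count is a hard floor.
    return max(requested_mappers, num_partitions)


if __name__ == "__main__":
    # Hypothetical numbers: the user asks for 10 mappers over 50 partitions.
    print(effective_mapper_count(10, 50))
```

Under this assumption a request smaller than the partition count cannot be met exactly, so the requested total is at best a target rather than a guarantee.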

It would be a very useful heuristic to have. A simple use case that Joy had is as follows:

A query needs to be run on one table that has a lot of small files. It would be easier for
him to specify the total number of mappers
than to tune the various rack-local/node-local CombineFileInputFormat parameters.
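
One way such a heuristic could work for this use case is to derive a maximum combined split size from the total input size and the requested mapper count, then pack small files into splits of that size. The sketch below is an assumption about how this might be done, not the implementation proposed in this issue. The helper names are hypothetical, and it ignores rack and node locality entirely.

```python
import math


def split_size_for_target(file_sizes, target_mappers):
    """Pick a max split size (bytes) so the input needs about target_mappers splits."""
    if target_mappers < 1:
        raise ValueError("target_mappers must be at least 1")
    total = sum(file_sizes)
    # Never return 0, even for empty input.
    return max(1, math.ceil(total / target_mappers))


def pack_files(file_sizes, max_split_size):
    """Greedily combine files, in order, into splits of at most max_split_size.

    A file larger than max_split_size gets a split of its own; this
    sketch does not break a single file across splits.
    """
    splits = []
    current = []
    current_size = 0
    for size in file_sizes:
        # Close the current split if adding this file would overflow it.
        if current and current_size + size > max_split_size:
            splits.append(current)
            current = []
            current_size = 0
        current.append(size)
        current_size += size
    if current:
        splits.append(current)
    return splits


if __name__ == "__main__":
    # Hypothetical table: 1000 files of 1 MB each, user asks for 10 mappers.
    mb = 1024 * 1024
    files = [mb] * 1000
    max_split = split_size_for_target(files, 10)
    print(max_split // mb, len(pack_files(files, max_split)))
```

With uniformly sized files the split count matches the request exactly. With uneven file sizes the greedy packing can produce somewhat more splits than requested, which is one reason the requested total is hard to guarantee.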

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

