hbase-dev mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1172) Modify TableInputFormat splitting algorithm to allow any number of mappers
Date Wed, 25 Feb 2009 20:47:01 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-1172:

    Fix Version/s:     (was: 0.19.1)

Fix for 0.20.0

> Modify TableInputFormat splitting algorithm to allow any number of mappers
> --------------------------------------------------------------------------
>                 Key: HBASE-1172
>                 URL: https://issues.apache.org/jira/browse/HBASE-1172
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>             Fix For: 0.20.0
> Currently, the number of mappers specified when using TableInputFormat is honored
> only if it is less than the total number of regions in the input table.  If it is
> greater, the number of regions is used instead.
> This will modify the splitting algorithm to do the following:
> - If you specify 0 mappers, # mappers = # regions.
> - If you specify fewer mappers than regions, exactly the number you specify will be
>   used, based on the current algorithm.
> - If you specify more mappers than regions, each region will be divided into
>   sub-ranges by determining split points, e.g. [start,X) and [X,end).  The number of
>   mappers will always be a multiple of the number of regions, so that no scanner
>   spans multiple regions.
> There is an additional issue: the default number of mappers in JobConf is 1.  That
> means that if a user does not explicitly set the number of map tasks, a single mapper
> will be used.  I'm going to deal with that in a separate JIRA, since the problem
> already exists today, there are a number of ways to implement a fix, and it's not
> required to complete this issue.
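
The three mapper-count rules in the description can be sketched roughly as follows. This is only an illustration, not the patch attached to the issue: `SplitPlanner`, `effectiveMappers`, and `splitRange` are hypothetical names, row keys are modeled as `long` values rather than HBase byte arrays, and rounding a large request *down* to a multiple of the region count is an assumption (the description only says the result is a multiple).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the proposed rules; not part of HBase. */
final class SplitPlanner {

    /** Returns the number of map tasks used for a requested count and a region count. */
    static int effectiveMappers(int requested, int regions) {
        if (regions <= 0 || requested < 0) {
            throw new IllegalArgumentException("regions must be > 0 and requested >= 0");
        }
        if (requested == 0) {
            return regions;            // 0 means one mapper per region
        }
        if (requested <= regions) {
            return requested;          // fewer mappers than regions: honor the request
        }
        // Assumption: round down to a multiple of the region count so that
        // every region is cut into the same number of pieces.
        return (requested / regions) * regions;
    }

    /** Cuts one region's range [start, end) into contiguous half-open sub-ranges. */
    static List<long[]> splitRange(long start, long end, int parts) {
        if (parts <= 0 || end - start < parts) {
            throw new IllegalArgumentException("range too small for requested parts");
        }
        List<long[]> ranges = new ArrayList<>();
        long width = end - start;
        for (int i = 0; i < parts; i++) {
            long lo = start + width * i / parts;
            long hi = start + width * (i + 1) / parts;
            ranges.add(new long[] {lo, hi});   // represents [lo, hi)
        }
        return ranges;
    }

    public static void main(String[] args) {
        int regions = 4;
        int mappers = effectiveMappers(10, regions);
        System.out.println("mappers = " + mappers + ", pieces per region = " + mappers / regions);
        for (long[] r : splitRange(0, 100, mappers / regions)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Because each sub-range stays inside a single region's [start,end), a scanner opened for one split never crosses a region boundary, which is the property the description asks for.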

