hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: InputSplit size and no of map tasks issue
Date Thu, 17 Jun 2010 05:41:57 GMT
If you are using TableInputFormat, the number of map tasks will equal
the number of regions in your table.

A region is the natural unit of work for a map task when HBase is the
source of a MapReduce job.  If there is little data in your table,
splitting this way makes little sense (you have only one region in your
table, is that right?).
 You could force splits of your region to make more via the UI or

Otherwise, you need to make your own Splitter, one that has some
knowledge of the key space and is able to partition on other than
Region boundaries.
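
To make the idea concrete, here is a rough sketch of the key-space
arithmetic such a splitter might do.  It is plain Java with no HBase
calls, and everything in it (the KeyRangeSplitter class, the Range type,
the split method) is a hypothetical name made up for illustration.  It
assumes fixed-width row keys of equal length that are spread roughly
evenly over the range; if your keys are skewed, the boundaries would have
to come from what you know about the key space.  In a real job, each
range would be wrapped in a split returned from a custom InputFormat's
getSplits.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: cuts one row-key range into several sub-ranges. */
final class KeyRangeSplitter {

    /** One [start, end) slice of the key space. */
    static final class Range {
        final byte[] start;
        final byte[] end;

        Range(byte[] start, byte[] end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Divides [start, end) into at most n contiguous ranges.
     * Assumption: both keys have the same length and compare as
     * unsigned big-endian integers.
     */
    static List<Range> split(byte[] start, byte[] end, int n) {
        if (start.length != end.length) {
            throw new IllegalArgumentException("keys must have equal length");
        }
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        BigInteger lo = new BigInteger(1, start);
        BigInteger hi = new BigInteger(1, end);
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        BigInteger width = hi.subtract(lo);
        List<Range> ranges = new ArrayList<>();
        byte[] prev = start;
        for (int i = 1; i <= n; i++) {
            // Boundary i is lo + width * i / n; the last one is exactly end.
            BigInteger boundary = lo.add(
                width.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(n)));
            byte[] next = toFixedWidth(boundary, start.length);
            // Skip empty ranges, which occur when the range has fewer than n keys.
            if (!Arrays.equals(prev, next)) {
                ranges.add(new Range(prev, next));
            }
            prev = next;
        }
        return ranges;
    }

    /** Renders a non-negative value as exactly len big-endian bytes. */
    static byte[] toFixedWidth(BigInteger value, int len) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }
}
```

Calling KeyRangeSplitter.split(startKey, endKey, 10) on a single region's
start and end keys would give up to ten ranges, and so up to ten map
tasks for that region instead of one.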

See below...

On Wed, Jun 16, 2010 at 10:36 PM, Raghava Mutharaju
<m.vijayaraghava@gmail.com> wrote:
> Hi all,
>      I checked the size of the InputSplit in Map and it gave out 0. I was
> expecting some number indicating the size of split in bytes, that this Map
> has received. Is this normal behavior?

Where are you seeing this?  (I want to be sure I'm following along properly.)


> Another issue I am having is even though I set the mapred.map.tasks to a
> specific number (no of nodes*10), during execution, the no of map tasks is
> always 1. I think this is related to the above issue.
> I am using HBase as the data source and sink. Previously, when I used HDFS
> as data source, the no of map tasks were same as the one I used to set. I am
> using HBase 0.20.4
> Thank you.
> Regards,
> Raghava.
