hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Parallel Scan with TableMapReduceUtil
Date Fri, 16 May 2014 14:09:58 GMT
Hi Guillermo,

You should see as many MR tasks as you have regions in your input table.
There will be one scan per task. They will all run in parallel if you have
enough MR slots. Otherwise, some of them will run in parallel and the others
will wait for an available slot. HBase will try to run each task on the
region server that hosts its region. Doing this on the client side with
multiple threads would have a bigger impact on resource usage, since you
would have a lot of calls between the client and all the region servers.
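
To make the slot arithmetic concrete, here is a rough sketch of how many
"waves" of map tasks you would get for a given number of regions and slots.
It is only a simplified model: it assumes one task per region, tasks of
roughly equal duration, and a fixed number of free slots. A real scheduler
starts a waiting task as soon as any slot frees up. The class and method
names are made up for the illustration and are not part of HBase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an HBase or Hadoop API.
final class MapTaskWaves {

    // Number of waves needed to run one task per region: ceil(regions / slots).
    static int waves(int regions, int slots) {
        if (regions < 0 || slots <= 0) {
            throw new IllegalArgumentException("regions must be >= 0 and slots > 0");
        }
        return (regions + slots - 1) / slots;
    }

    // How many tasks (and therefore scans) run concurrently in each wave.
    static List<Integer> tasksPerWave(int regions, int slots) {
        if (regions < 0 || slots <= 0) {
            throw new IllegalArgumentException("regions must be >= 0 and slots > 0");
        }
        List<Integer> result = new ArrayList<>();
        int remaining = regions;
        while (remaining > 0) {
            int running = Math.min(remaining, slots);
            result.add(running);
            remaining -= running;
        }
        return result;
    }

    public static void main(String[] args) {
        // Enough slots: all 10 scans run at the same time.
        System.out.println(waves(10, 10));        // prints 1
        // Only 4 slots: 4 scans run, then 4 more, then the last 2.
        System.out.println(tasksPerWave(10, 4));  // prints [4, 4, 2]
    }
}
```

So with 10 regions and 4 slots you would still get 10 scans in total, but
at most 4 of them at any one time.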


2014-05-07 8:34 GMT-04:00 Guillermo Ortiz <konstt2000@gmail.com>:

> I am processing data from HBase with a MapReduce job. The input of my
> MapReduce job is a "full" scan of a table.
> When I execute a full scan with TableMapReduceUtil, is this scan executed
> in parallel, so that all mappers get the data in parallel? Is it the same
> as if I executed many range scans with threads?
