hbase-user mailing list archives

From Pavan Sudheendra <pavan0...@gmail.com>
Subject Can I make use of TableSplit across Regions to make my MR job faster?
Date Mon, 26 Aug 2013 07:16:54 GMT
Hi all,

How do I make use of a TableSplit or a region split? How is it used in
getSplits()?

I have 6 region servers in the cluster that runs my map-reduce job. How can
I leverage them so that the table is split across the cluster and the job
finishes faster? Right now it is very slow. The job aggregates values from 3
tables: one has 100,000 rows, and for the other two tables I only use Get
operations to fetch a value by passing the key. With this setup the job
takes 40-50 minutes, which is far too slow, and the first table will
eventually grow to around 20-25 million rows. Please point me in the right
direction. I will paste the code if anybody is interested.
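
To make the getSplits() question concrete, here is a minimal sketch of how I
understand the default behaviour: one split per region that overlaps the
scan's row range, so each region gets its own map task. This is plain Java
with no HBase dependency. RegionSplitSketch, Range, regionsFrom and
planSplits are hypothetical names made up for illustration, not HBase
classes or methods, and the split points are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of "one split per region". Hypothetical names throughout;
// this is not the HBase API, only an illustration of the idea.
class RegionSplitSketch {

    // A row-key range [start, end). An empty string means "unbounded" on
    // that side, which is how the first and last region of a table are marked.
    static final class Range {
        final String start;
        final String end;
        Range(String start, String end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + ", " + end + ")"; }
    }

    // Builds contiguous regions from sorted split points: n points give n + 1 regions.
    static List<Range> regionsFrom(String... splitPoints) {
        List<Range> regions = new ArrayList<>();
        String previous = "";
        for (String point : splitPoints) {
            regions.add(new Range(previous, point));
            previous = point;
        }
        regions.add(new Range(previous, ""));
        return regions;
    }

    // Returns one split per region that overlaps the scan range, clipped to the scan.
    static List<Range> planSplits(List<Range> regions, Range scan) {
        List<Range> splits = new ArrayList<>();
        for (Range region : regions) {
            // Skip regions that lie entirely before or after the scan range.
            boolean endsBeforeScan = !region.end.isEmpty()
                    && region.end.compareTo(scan.start) <= 0;
            boolean startsAfterScan = !scan.end.isEmpty()
                    && region.start.compareTo(scan.end) >= 0;
            if (endsBeforeScan || startsAfterScan) {
                continue;
            }
            // Clip the split to the later start key and the earlier end key.
            String start = region.start.compareTo(scan.start) >= 0 ? region.start : scan.start;
            String end;
            if (region.end.isEmpty()) {
                end = scan.end;
            } else if (scan.end.isEmpty()) {
                end = region.end;
            } else {
                end = region.end.compareTo(scan.end) <= 0 ? region.end : scan.end;
            }
            splits.add(new Range(start, end));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Five made-up split points give six regions, e.g. one per region server.
        List<Range> regions = regionsFrom("b", "d", "f", "h", "j");
        System.out.println("full scan:   " + planSplits(regions, new Range("", "")));
        System.out.println("scan [c, g): " + planSplits(regions, new Range("c", "g")));
    }
}
```

In this model a full-table scan over six regions yields six splits, while a
scan restricted to [c, g) touches only three regions and so yields three. If
that is also how the real job behaves, a table that sits in a single region
would be processed by a single mapper no matter how many region servers
there are.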

