hbase-user mailing list archives

From Dhaval Makawana <dhaval.makaw...@gmail.com>
Subject Number of map jobs per region
Date Sun, 28 Aug 2011 09:05:56 GMT
Hi,

We have 31 regions for a table in our HBase system, so scanning the table
via TableMapper creates 31 map tasks. The following line from the
documentation explains why:

"Reading from HBase, the TableInputFormat asks HBase for the list of regions
and makes a map-per-region or mapred.map.tasks maps, whichever is smaller"
(http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html)
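
To make the quoted rule concrete, here is a minimal sketch in Java. It only
restates the sentence from the documentation and is not taken from the HBase
source; the class name MapCountSketch and the method expectedMaps are
hypothetical names chosen for illustration.

```java
class MapCountSketch {
    // Hypothetical helper: the number of map tasks under the quoted rule,
    // i.e. the smaller of the region count and the mapred.map.tasks value.
    static int expectedMaps(int regionCount, int mapredMapTasks) {
        return Math.min(regionCount, mapredMapTasks);
    }

    public static void main(String[] args) {
        int regions = 31; // region count of the table described in this message

        // A mapred.map.tasks value above the region count does not add maps.
        System.out.println(expectedMaps(regions, 100)); // prints 31

        // A smaller value would cap the number of maps below the region count.
        System.out.println(expectedMaps(regions, 10)); // prints 10
    }
}
```

Under this reading, raising mapred.map.tasks past 31 cannot produce more than
one map per region, which is the limit we are running into.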

Each region's file size is almost 7 GB (LZO-compressed data), and the map
tasks take a very long time to process the data. Is there any way to increase
parallelism (allocate more maps per region)?

Regards,
Dhaval
