hbase-dev mailing list archives

From Li Li <fancye...@gmail.com>
Subject Fwd: how to do parallel scanning in map reduce using hbase as input?
Date Tue, 22 Jul 2014 02:30:02 GMT
Could anyone help? I now have about 1.1 billion nodes, and a MapReduce
job takes 2 hours to finish.

---------- Forwarded message ----------
From: Li Li <fancyerii@gmail.com>
Date: Thu, Jun 26, 2014 at 3:34 PM
Subject: how to do parallel scanning in map reduce using hbase as input?
To: user@hbase.apache.org


My table has about 700 million rows and about 80 regions. Each task
tracker is configured to run 4 mappers and 4 reducers at a time.
The Hadoop/HBase cluster has 5 nodes, so 20 mappers run at the same
time. The mapper stage takes more than an hour to finish, yet the
load on the HBase cluster is very low, about 2,000 requests per second.
I think one mapper per region is too few. How can I run more than
one mapper per region so that the job takes full advantage of the
computing resources?
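
One possible direction, sketched here only as an illustration: cut each
region's row-key range into several contiguous sub-ranges and give each
sub-range to its own mapper. The class below is hypothetical
(KeyRangeSplitter is not an HBase class) and uses only the JDK. It assumes
row keys are spread roughly evenly between the two bounds and that both
bounds are non-empty, so the first and last regions of a table, whose outer
boundary is the empty key, would need special handling. In a real job the
sub-ranges would still have to be turned into input splits, for example by
subclassing TableInputFormat and overriding getSplits; that part is not
shown.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: cuts a row-key range [start, end) into sub-ranges. */
public class KeyRangeSplitter {

    /**
     * Returns up to n contiguous sub-ranges covering [start, end).
     * Each element is a two-element array {subStart, subEnd}.
     * Assumes non-empty bounds and roughly uniform key distribution.
     */
    public static List<byte[][]> split(byte[] start, byte[] end, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // Right-pad both keys with zero bytes to a common width so they
        // can be compared and interpolated as unsigned integers.
        int width = Math.max(start.length, end.length);
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, width));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(end, width));
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        BigInteger span = hi.subtract(lo);
        BigInteger parts = BigInteger.valueOf(n);

        List<byte[][]> ranges = new ArrayList<>();
        byte[] prev = start;
        for (int i = 1; i <= n; i++) {
            // The last sub-range always ends exactly at the original end key.
            byte[] next = (i == n)
                    ? end
                    : toKey(lo.add(span.multiply(BigInteger.valueOf(i)).divide(parts)), width);
            // Skip boundaries that repeat the previous one (range narrower than n).
            if (!Arrays.equals(prev, next)) {
                ranges.add(new byte[][] {prev, next});
                prev = next;
            }
        }
        return ranges;
    }

    /** Converts a non-negative value back to a fixed-width big-endian key. */
    private static byte[] toKey(BigInteger value, int width) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }
}
```

With 80 regions and 4 sub-ranges per region this would give up to 320 map
tasks instead of 80. Whether that shortens the job depends on where the
bottleneck is: only 20 mappers run at once on this cluster either way, so
finer splits mainly help when some regions are much larger than others.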
