hadoop-mapreduce-user mailing list archives

From Stanley Xu <wenhao...@gmail.com>
Subject How to have multiple mapper and reducer for a MapReduce job on a hbase table with hbase 0.20.6?
Date Mon, 28 Feb 2011 02:51:15 GMT
Dear all,

I am writing a MapReduce job that scans an HBase table and recalculates the
entries stored in it daily. The table will hold hundreds of millions of
entries. Following the example in the HBase code, I use TableMapper as the
mapper and IdentityTableReducer as the reducer. On my test table, which has
about 3 million entries, the job runs with only 1 mapper and 1 reducer.

How can I get multiple mappers or reducers in this case? I need the job to
finish within a couple of minutes, but it currently takes 6 minutes to
process 3 million entries, which would mean about 300 minutes for 150
million entries.
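
The estimate above can be checked with a minimal sketch. It assumes that run
time grows linearly with the number of entries and that adding mappers
divides the time evenly, which a real cluster will only approximate. The
class and method names are hypothetical and are not part of HBase or Hadoop.

```java
// Rough run-time estimate under an assumed linear-scaling model.
// ScanTimeEstimate and estimateMinutes are illustrative names only.
final class ScanTimeEstimate {

    // sampleMinutes: measured time for sampleEntries with a single mapper.
    // totalEntries:  number of entries in the full table.
    // mappers:       number of mappers assumed to share the work evenly.
    static double estimateMinutes(double sampleMinutes, long sampleEntries,
                                  long totalEntries, int mappers) {
        if (sampleEntries <= 0 || mappers <= 0) {
            throw new IllegalArgumentException("sampleEntries and mappers must be positive");
        }
        double scale = (double) totalEntries / sampleEntries;
        return sampleMinutes * scale / mappers;
    }

    public static void main(String[] args) {
        // Figures from the test run: 6 minutes for 3 million entries, 1 mapper.
        double single = estimateMinutes(6.0, 3_000_000L, 150_000_000L, 1);
        // Hypothetical case: the same work spread evenly over 50 mappers.
        double parallel = estimateMinutes(6.0, 3_000_000L, 150_000_000L, 50);
        System.out.println("1 mapper:   " + single + " minutes");
        System.out.println("50 mappers: " + parallel + " minutes");
    }
}
```

Under this model, a single mapper needs about 300 minutes for 150 million
entries, so the job has to be spread over many mappers to get anywhere near
a few minutes.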

I found a SimpleTotalOrderPartitioner in the 0.90.0 API, but it does not
exist in 0.20.6. Is there anything I could use in 0.20.6?

Thanks.

Best wishes,
Stanley Xu
