hadoop-hdfs-user mailing list archives

From peterm_second <regest...@gmail.com>
Subject Best number of mappers and reducers when processing data to and from HBase?
Date Mon, 20 Oct 2014 14:08:30 GMT
Hi guys,
I have a somewhat abstract question. I am reading data from HBase and
I would like to know how to determine the best mapper and reducer
counts, i.e. which criteria need to be taken into consideration when
choosing them. My MR job reads data from an HBase table; the data is
processed in the mapper, and the reducer takes the mapper output and
writes results to another HBase table. I want to be able to dynamically
work out the right number of mappers to initially process the data
(actually, to map it to a specific criterion) and the number of reducers
that later do further processing on it and output a new dataset, which
is then saved to a new HBase table. I've read that when reading data
from files I should have something like 10 mappers per DFS block, but I
have no clue how to translate that to my case, where the input is an
HBase table. Any ideas would be appreciated, even if it's just a book
or an article I should read.

