hadoop-mapreduce-user mailing list archives

From nqkoi nqkoev <regest...@gmail.com>
Subject Re: Best number of mappers and reducers when processing data to and from HBase?
Date Mon, 20 Oct 2014 16:10:17 GMT
Yes, it's effectively reading in the mapper and writing in the reducer. The
mapper does more than just read the data, but per my initial tests the
average map function time is around 1 ms to 3 ms, so it's not a big
problem. The reducer is a bit slower, but it's still pretty fast. I
am trying to optimize the memory consumption and the speed of the MR job. I
don't want to just randomly change settings; if you guys can give me a
hint on what I should read, that would be great.

Thanks,
Peter

On Mon, Oct 20, 2014 at 5:22 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> For number of mappers, take a look at the following
> in TableInputFormatBase:
>
>   public List<InputSplit> getSplits(JobContext context) throws IOException {
>
> Is a reducer required in your model?
>
> Can you write to the second HBase table from the mappers?
>
>
> Cheers
>
> On Mon, Oct 20, 2014 at 7:08 AM, peterm_second <regestrer@gmail.com>
> wrote:
>
>> Hi guys,
>> I have a somewhat abstract question. I am reading data from HBase and I
>> was wondering how to determine the best mapper and reducer counts, that
>> is, which criteria need to be taken into consideration. My MR job reads
>> data from an HBase table; the data is processed in the mapper, and the
>> reducer takes it and writes output to another HBase table. I want to be
>> able to dynamically deduce the correct number of mappers to initially
>> process the data (actually, map it to a specific criterion) and the
>> number of reducers to later do some other magic on it and output a new
>> dataset, which is then saved to a new HBase table. I've read that when
>> reading data from files I should have something like 10 mappers per DFS
>> block, but I have no clue how to translate that to my case, where the
>> input is an HBase table. Any ideas would be appreciated, even if it's a
>> book or an article I should read.
>>
>> Regards,
>> Peter
>>
>
>
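
For readers following the getSplits pointer above: as far as I know, the default behaviour of TableInputFormatBase is to produce one input split (and so one map task) per table region that overlaps the scan's row range. The sketch below is a simplified, HBase-free model of that rule, not the actual HBase code. The class name RegionSplitSketch, the countSplits helper, and the example region boundaries are all hypothetical, and row keys are modelled as plain strings rather than byte arrays.

```java
final class RegionSplitSketch {

    /**
     * Counts the regions that overlap the scan range [scanStart, scanStop).
     * Each region is a {startKey, endKey} pair covering [startKey, endKey).
     * An empty string means "unbounded" on that side, mirroring the HBase
     * convention of an empty row key for the first/last region or an open scan.
     * This is an illustrative model only (assumption: one split per region).
     */
    static int countSplits(String[][] regions, String scanStart, String scanStop) {
        int splits = 0;
        for (String[] region : regions) {
            String regionStart = region[0];
            String regionEnd = region[1];
            // The region begins before the scan stops (or the scan has no stop row).
            boolean startsBeforeScanStops =
                    scanStop.isEmpty() || regionStart.compareTo(scanStop) < 0;
            // The region ends after the scan starts (or the region has no end key).
            boolean endsAfterScanStarts =
                    regionEnd.isEmpty() || regionEnd.compareTo(scanStart) > 0;
            if (startsBeforeScanStops && endsAfterScanStarts) {
                splits++;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Hypothetical table with four regions split at "g", "n" and "t".
        String[][] regions = {
            {"", "g"}, {"g", "n"}, {"n", "t"}, {"t", ""}
        };
        System.out.println("full scan:    " + countSplits(regions, "", ""));
        System.out.println("scan [h, p):  " + countSplits(regions, "h", "p"));
    }
}
```

Under this model the mapper count is not something the job sets directly: it follows from how the table is split into regions and from the start and stop rows of the Scan, so narrowing the scan or changing the region layout is what changes it.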
