hbase-user mailing list archives

From "Biedermann,S.,Fa. Post Direkt" <S.Biederm...@postdirekt.de>
Subject Re: data locality for reducer writes?
Date Thu, 14 Apr 2011 06:58:26 GMT
Many thanks for giving me this explanation of <data flow>.

I'll have a closer look at the incremental bulk load.



-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Wednesday, 13 April 2011 19:10
To: user@hbase.apache.org
Subject: Re: data locality for reducer writes?

It's not just a matter of transferring the data from the reducer to
the region server; you also have to take into account that the data is
replicated to other nodes.

So in a suboptimal setup you have:

Reducer  -> Network -> RegionServer -> Local Datanode -> Network ->
Remote Datanode1 -> Network -> Remote Datanode2

What you are trying to get is:

Reducer  -> Local RegionServer -> Local Datanode -> Network -> Remote
Datanode1 -> Network -> Remote Datanode2

Subsequent flushes of the inserted data will also follow the latter
pattern. That's what I meant earlier when I said the gain would be
marginal: you're only saving one network trip among many others. Also,
I took a look at the JobTracker code, and modifying it doesn't look
easy.
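
To make the hop counting in the two flows above concrete, here is a
minimal sketch in Java. The class and method names are hypothetical
(nothing here is HBase or Hadoop API), and it assumes the region server
has a DataNode on its own machine that takes the first replica, so only
the remaining replicas cross the network.

```java
// Hypothetical helper for illustration only; not part of HBase or Hadoop.
final class WritePathHops {
    private WritePathHops() {}

    /**
     * Counts network transfers for one write that ends up replicated in HDFS.
     * Assumption: the first replica goes to a DataNode local to the region
     * server, so the replication pipeline costs (replicationFactor - 1) hops.
     */
    static int networkHops(boolean reducerOnRegionServerNode, int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replicationFactor must be >= 1");
        }
        // One hop from reducer to region server unless they share a node.
        int reducerToRegionServer = reducerOnRegionServerNode ? 0 : 1;
        // Hops from the local DataNode to each remote DataNode in the pipeline.
        int replicationPipeline = replicationFactor - 1;
        return reducerToRegionServer + replicationPipeline;
    }

    public static void main(String[] args) {
        int replication = 3; // matches the three DataNodes in the flows above
        System.out.println("remote reducer: " + networkHops(false, replication)); // prints 3
        System.out.println("local reducer:  " + networkHops(true, replication));  // prints 2
    }
}
```

With three replicas, co-locating the reducer with the region server
removes one of three network transfers under these assumptions; the two
replication transfers remain either way.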

Instead, since you already use the HRegionPartitioner, why don't you
do an incremental bulk load? http://hbase.apache.org/bulk-loads.html

J-D

On Wed, Apr 13, 2011 at 7:49 AM, Biedermann,S.,Fa. Post Direkt
<S.Biedermann@postdirekt.de> wrote:
> Hi Jean-Daniel,
>
> thx for your reply.
>
> What I assume is that the total network load during reduce is O(n) with n the number
> of nodes in the cluster. We saw a major performance loss in the reduce step when our network
> degraded to 100Mbit by accident (1h vs. 13 minutes).
>
> With more nodes I see 2 options:
>
> 1) using switches with a higher switching capacity
> 2) improve hbase/hadoop's assignment of reduce tasks to those nodes which serve the corresponding
> hbase regions.
>
> What do you think?
>
> Sven
>
> -----Original Message-----
> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
> Sent: Friday, 8 April 2011 18:04
> To: user@hbase.apache.org
> Subject: Re: data locality for reducer writes?
>
> Unfortunately it seems that there's nothing in the OutputFormat
> interface that we could implement (like getSplits in the InputFormat)
> to inform the JobTracker of the location of the regions. It kinda makes
> sense, since when you're writing to HDFS in a "normal" MR job you
> always write to the local DataNode (well if there's one), but even
> then it is replicated to two other nodes. IMO even if we had that the
> gain would be marginal.
>
> J-D
>
> On Fri, Apr 8, 2011 at 4:18 AM, Biedermann,S.,Fa. Post Direkt
> <S.Biedermann@postdirekt.de> wrote:
>> Hi,
>>
>>
>>
>> we have a number of Reducer tasks, each writing a bunch of rows into the
>> latest HBase via Puts.
>>
>> What is working is that each Reducer only creates Puts for one single
>> Region by using HRegionPartitioner.
>>
>>
>>
>> However, we are seeing that the Region flush itself is not local, but
>> going to some other node in the cluster. This puts load on the network.
>>
>> We'd like to see that instead the Reducer would be run on the same node
>> where the region is served.
>>
>>
>>
>> Is that possible?
>>
>> Any ideas or suggestions?
>>
>>
>>
>> Sven
>>
>>
>
