hadoop-common-user mailing list archives

From <dar...@ontrenet.com>
Subject Re: Why inter-rack communication in mapreduce slow?
Date Mon, 06 Jun 2011 13:26:11 GMT

I'm not a Hadoop jedi, but in that case, wouldn't one of the Hadoop
"trackers" become a bottleneck while resolving those dependencies?

Again, this exposes the oddity of Hadoop, IMO: it tries NOT to
be I/O bound, but it seems to be very I/O bound...

Sorry, not trying to shift the thread topic.

On Mon, 06 Jun 2011 09:21:51 -0400, John Armstrong
<john.armstrong@ccri.com> wrote:
> On Mon, 06 Jun 2011 09:18:45 -0400, <darren@ontrenet.com> wrote:
>> I never understood how Hadoop can throttle an inter-rack fiber switch.
>> It's supposed to operate on the principle of moving the code to the data
>> because of the I/O cost of moving the data, right?
> 
> But what happens when a reducer on rack A gets most of its input from
> mappers on rack A, but needs a serious chunk of data from mappers on
> racks B, C, D...
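
To put a rough number on John's scenario, here is a minimal sketch that
computes what share of one reducer's shuffle input has to cross the
inter-rack switch. The function name and the per-rack byte counts are made
up for illustration; they are not measurements from any cluster, and this
is not how Hadoop itself accounts for shuffle traffic.

```python
def cross_rack_fraction(input_by_rack, reducer_rack):
    """Fraction of a reducer's shuffle input that comes from other racks.

    input_by_rack: dict mapping rack name -> amount of map output (any
                   consistent unit) that this reducer must fetch from
                   mappers on that rack. Hypothetical numbers only.
    reducer_rack:  name of the rack the reducer runs on.
    """
    total = sum(input_by_rack.values())
    if total == 0:
        return 0.0  # nothing to fetch, so nothing crosses the switch
    local = input_by_rack.get(reducer_rack, 0)
    return (total - local) / total


# Reducer on rack A: most input is rack-local, the rest comes from B, C, D.
example_input = {"A": 60, "B": 20, "C": 10, "D": 10}  # made-up units
print(cross_rack_fraction(example_input, "A"))
```

Even with most of the input local to rack A, a sizable share of the bytes
in this made-up example still has to travel between racks, and every
reducer in the job is doing the same thing at once.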
