hadoop-user mailing list archives

From: "Bejoy KS" <bejoy.had...@gmail.com>
Subject: Re: What is the difference between Rack-local map tasks and Data-local map tasks?
Date: Sun, 07 Oct 2012 18:29:55 GMT

Definitely. The more data-local map tasks you have, the better the performance will be,
since fewer map inputs have to be read over the network.

Ideally, if the data is uniformly distributed across the DataNodes (DNs) and you have enough
map task slots on the co-located TaskTrackers (TTs), most of your map tasks should be
data-local. You may still see a few non-data-local map tasks when the number of input
splits (and hence map tasks) is large, which is quite common.
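
To make the distinction concrete, here is a minimal Python sketch of how a map task's
locality could be classified and how the data-local share can be read off the two
counters quoted below. This is only an illustration, not Hadoop's scheduler code: the
function names, host names and rack map are hypothetical, and hosts missing from the
rack map are assumed to sit on a single default rack.

```python
DEFAULT_RACK = "/default-rack"  # assumed fallback when no rack mapping is configured


def classify_locality(task_host, replica_hosts, rack_of):
    """Classify one map task as 'data-local', 'rack-local' or 'off-rack'.

    task_host     -- host the map task runs on (hypothetical name)
    replica_hosts -- hosts holding a replica of the task's input block
    rack_of       -- dict mapping host -> rack path; unmapped hosts fall back
                     to DEFAULT_RACK
    """
    # Data-local: the task runs on a host that stores the input block.
    if task_host in replica_hosts:
        return "data-local"
    # Rack-local: no replica on this host, but one on the same rack.
    task_rack = rack_of.get(task_host, DEFAULT_RACK)
    if any(rack_of.get(h, DEFAULT_RACK) == task_rack for h in replica_hosts):
        return "rack-local"
    # Otherwise the input has to come from another rack.
    return "off-rack"


def data_local_fraction(data_local, rack_local, other=0):
    """Share of map tasks that were data-local, given the job counters."""
    total = data_local + rack_local + other
    return data_local / total if total else 0.0


# Counters from the original question: 124 rack-local, 6 data-local.
fraction = data_local_fraction(data_local=6, rack_local=124)
print(f"{fraction:.1%} of map tasks were data-local")
```

With an empty rack map every host lands on the default rack, so any task that is not
data-local is reported as rack-local, which is why the rack-local number says little
until rack awareness is configured.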

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: centerqi hu <centerqi@gmail.com>
Date: Sun, 7 Oct 2012 23:28:55 
To: <user@hadoop.apache.org>
Reply-To: user@hadoop.apache.org
Subject: Re: What is the difference between Rack-local map tasks and
 Data-local map tasks?

Very good explanation.
If there is a way to reduce the number of rack-local map tasks
and increase the number of data-local map tasks,
would that improve performance?

2012/10/7 Michael Segel <michael_segel@hotmail.com>

> Rack local means that while the data isn't local to the node running the
> task, it is still on the same rack.
> (It's meaningless unless you've set up rack awareness, because otherwise all
> of the machines are on the default rack.)
>
> Data local means that the task is running on the machine that
> contains the actual data.
>
> HTH
>
> -Mike
>
> On Oct 7, 2012, at 8:56 AM, centerqi hu <centerqi@gmail.com> wrote:
>
>
> hi all
>
> When I run "hadoop job -status xxx", the output includes the following counters:
>
> Rack-local map tasks=124
> Data-local map tasks=6
>
> What is the difference between Rack-local map tasks and Data-local map
> tasks?
> --
> centerqi@gmail.com|Sam
>
>
>


-- 
centerqi@gmail.com|齐忠
