hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Lilley <john.lil...@redpoint.net>
Subject RE: Non data-local scheduling
Date Sat, 05 Oct 2013 22:19:55 GMT
Is this option set on a per-application-instance basis or is it a cluster-wide setting (or
Is this a MapReduce-specific issue, or a YARN issue?
I don't understand how the problem arises in the first place.  For example, if I have an idle
cluster with 10 nodes and each node has four containers available at the requested capacity,
and my YARN application requests 20 containers with node affinity such that I'm desiring two
containers per node , why wouldn't YARN give me exactly the task mapping that I requested?

From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
Sent: Thursday, October 03, 2013 12:32 PM
To: user@hadoop.apache.org
Subject: Re: Non data-local scheduling

Ah, I was going off the Fair Scheduler equivalent, didn't realize they were different.  In
that case you might try setting it to something like half the nodes in the cluster.

Nodes are constantly heartbeating to the Resource Manager.  When a node heartbeats, the scheduler
checks to see whether the node has any free space, and, if it does, offers it to an application.
 From the application's perspective, this offer is a "scheduling opportunity". Each application
will pass up yarn.scheduler.capacity.node-locality-delay before placing a container on a non-local


On Thu, Oct 3, 2013 at 10:36 AM, André Hacker <andrephacker@gmail.com<mailto:andrephacker@gmail.com>>
Thanks, but I can't set this to a fraction, it wants to see an integer.
My documentation is slightly different:
"Number of missed scheduling opportunities after which the CapacityScheduler attempts to schedule
rack-local containers. Typically this should be set to number of racks in the cluster, this
feature is disabled by default, set to -1."
I set it to 1 and now I had 33 data local and 11 rack local tasks, which is a better, but
still not optimal.
Couldn't find a good description of what this feature means (what is a scheduling opportunity,
how many are there?). It does not seem to be in the current documentation http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

2013/10/3 Sandy Ryza <sandy.ryza@cloudera.com<mailto:sandy.ryza@cloudera.com>>
Hi Andre,

Try setting yarn.scheduler.capacity.node-locality-delay to a number between 0 and 1.  This
will turn on delay scheduling - here's the doc on how this works:

For applications that request containers on particular nodes, the number of scheduling opportunities
since the last container assignment to wait before accepting a placement on another node.
Expressed as a float between 0 and 1, which, as a fraction of the cluster size, is the number
of scheduling opportunities to pass up. The default value of -1.0 means don't pass up any
scheduling opportunities.


On Thu, Oct 3, 2013 at 9:57 AM, André Hacker <andrephacker@gmail.com<mailto:andrephacker@gmail.com>>

I have a 25 node cluster, running hadoop 2.1.0-beta, with capacity scheduler (default settings
for scheduler) and replication factor 3.
I have exclusive access to the cluster to run a benchmark job and I wonder why there are so
few data-local and so many rack-local maps.
The input format calculates 44 input splits and 44 map tasks, however, it seems to be random
how many of them are processed data locally. Here the counters of my last tries:
data-local / rack-local:
Test 1: data-local:15 rack-local: 29
Test 2: data-local:18 rack-local: 26

I don't understand why there is not always 100% data local. This should not be a problem since
the blocks of my input file are distributed over all nodes.

Maybe someone can give me a hint.

André Hacker, TU Berlin

View raw message