hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7612) Add Processor Framework for Rich Placement Constraints
Date Mon, 25 Dec 2017 05:47:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16303072#comment-16303072 ]

Arun Suresh commented on YARN-7612:

[~cheersyang], Thanks for diving deep..

So, let's assume an anti-affinity constraint for 3 containers, where the associated allocation
tag is "spark". We can enumerate three ways the requests for these containers can come in from
a single app, say *app1*. When *app1* starts up, it registers placement constraints with the RM
stating that it requires anti-affinity for all scheduling requests with tag *spark*. Then:
# In a _single_ allocate call, it includes 1 SchedulingRequest object with numAllocations=3
and allocation tags=spark.
# In a _single_ allocate call, it includes 3 SchedulingRequest objects each with numAllocations=1
and allocation tags=spark, and each has different resource sizing.
# In the _first_ allocate call, it includes 2 SchedulingRequest objects each with numAllocations=1
and allocation tags=spark - AND then in the _second_ allocate call, it includes 1 SchedulingRequest
object with numAllocations=1 and allocation tags=spark.
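The three submission patterns above can be sketched as batches of requests per allocate call. This is a toy model, not the real YARN API: the {{Req}} record is a minimal stand-in for SchedulingRequest, keeping only the fields relevant to this discussion.

```java
import java.util.List;

/** Toy model of the three submission patterns; not the real YARN API. */
public class SubmissionPatterns {
    // Minimal stand-in for SchedulingRequest: only the fields relevant here.
    record Req(int numAllocations, String tag) {}

    // Case 1: one allocate call with a single request for 3 containers.
    static final List<List<Req>> CASE1 = List.of(List.of(new Req(3, "spark")));

    // Case 2: one allocate call with three 1-container requests
    // (each could carry different resource sizing).
    static final List<List<Req>> CASE2 = List.of(List.of(
            new Req(1, "spark"), new Req(1, "spark"), new Req(1, "spark")));

    // Case 3: the same requests split across two allocate calls,
    // so they land in two separate BatchedRequests.
    static final List<List<Req>> CASE3 = List.of(
            List.of(new Req(1, "spark"), new Req(1, "spark")),
            List.of(new Req(1, "spark")));

    public static void main(String[] args) {
        // Cases 1 and 2 each form a single batch; case 3 forms two,
        // which is what opens the window for the race discussed below.
        System.out.println(CASE1.size() + " " + CASE2.size() + " " + CASE3.size());
    }
}
```

The outer list models the sequence of allocate calls (each inner list is one BatchedRequests-style batch), which is why only case 3 can see two different TagsManager states.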

Now, for cases 1 and 2, since all the requests exist in the same AlgorithmInput (we batch the
requests received in a single allocate call), all three requests will be considered at the same
time by the algorithm, and the state of the TagsManager seen by the Algorithm will be the same
for all three.
For case 3, I agree we can end up with the situation you stated, depending on the timing of the
second allocate call and the size of the placement and scheduling thread pool - by default its
size is 1, in which case this is less likely to happen.

I think we mentioned in the BatchedRequests javadoc (and we should make that explicit in the
final docs as well) that for optimal placements, applications should send all related scheduling
requests - those associated with the same allocation tags - in the same allocate call. In any
case, we are targeting _SoftConstraints_ in the first cut.

For _HardConstraints_, yes, I agree we need an extra check in the {{attemptAllocationOnNode}}
phase - maybe by exposing a canAssign method that takes the PlacementConstraint, the TagsManager
and the container's tag. Feel free to raise a JIRA to add that and I can help review.
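A rough sketch of what such a check might look like for the anti-affinity case. The method name canAssign comes from the suggestion above, but this signature is hypothetical and simplified: a plain set of tags stands in for what the AllocationTagsManager tracks per node, and the PlacementConstraint is reduced to "the tag must not already be on the node".

```java
import java.util.Map;
import java.util.Set;

public class CanAssignSketch {
    /**
     * Hypothetical check for the attemptAllocationOnNode phase:
     * for an anti-affinity constraint, the container's allocation tag
     * must not already be present on the target node. The real check
     * would consult the TagsManager and the full PlacementConstraint.
     */
    static boolean canAssign(String containerTag, Set<String> nodeTags) {
        return !nodeTags.contains(containerTag);
    }

    public static void main(String[] args) {
        // Tags currently placed on two nodes, as a TagsManager would track them.
        Map<String, Set<String>> tagsByNode = Map.of(
                "node1", Set.of("spark"),
                "node2", Set.of());

        System.out.println(canAssign("spark", tagsByNode.get("node1"))); // false
        System.out.println(canAssign("spark", tagsByNode.get("node2"))); // true
    }
}
```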

But, to be honest, even then we cannot guarantee the constraint is perfectly honored - unless
{{yarn.resourcemanager.placement-constraints.scheduler.pool-size}} == 1.
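For completeness, if strict enforcement matters more than placement throughput, pinning the pool to a single thread would be a yarn-site.xml setting like the following (property name taken from the comment above; the single-property fragment is just illustrative):

```xml
<!-- yarn-site.xml: pin the placement-constraint processor pool to one thread -->
<property>
  <name>yarn.resourcemanager.placement-constraints.scheduler.pool-size</name>
  <value>1</value>
</property>
```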

> Add Processor Framework for Rich Placement Constraints
> ------------------------------------------------------
>                 Key: YARN-7612
>                 URL: https://issues.apache.org/jira/browse/YARN-7612
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>             Fix For: 3.1.0
>         Attachments: YARN-7612-YARN-6592.001.patch, YARN-7612-YARN-6592.002.patch, YARN-7612-YARN-6592.003.patch,
YARN-7612-YARN-6592.004.patch, YARN-7612-YARN-6592.005.patch, YARN-7612-YARN-6592.006.patch,
YARN-7612-YARN-6592.007.patch, YARN-7612-YARN-6592.008.patch, YARN-7612-YARN-6592.009.patch,
YARN-7612-YARN-6592.010.patch, YARN-7612-YARN-6592.011.patch, YARN-7612-YARN-6592.012.patch,
YARN-7612-v2.wip.patch, YARN-7612.wip.patch
> This introduces a Placement Processor and a planning algorithm framework to handle placement
constraints and scheduling requests from an app and place them on nodes.
> The actual planning algorithm(s) will be handled in YARN-7613.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
