hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-5819) Verify fairshare and minshare preemption
Date Thu, 10 Nov 2016 02:31:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15652786#comment-15652786
] 

Karthik Kambatla edited comment on YARN-5819 at 11/10/16 2:31 AM:
------------------------------------------------------------------

Patch v3 incorporates most recent review feedback except the following bits.

bq. In FSPreemptionThread.run(), the call to containerNode.removeContainerForPreemption(container)
is pretty important to keep the FSSchedulerNode state sane. Do you need to do anything to
make sure it doesn't get skipped because of an exception? In general, I'm a little nervous
about the bookkeeping on the containers to preempt; it seems like something that could quietly
get out of sync and cause issues.

Very valid point. The root of this is splitting the logic of adding and removing containers
on a node to be preempted across two threads - {{FSPreemptionThread}} marks them and {{PreemptContainersTask}}
attempts preemption and removes them from the collection. The latter is triggered via {{FSPreemptionThread.run()}}
-> {{preemptContainers()}} -> {{PreemptContainersTask.run()}}. In the current path,
there is no exception-throwing code. However, one might add it in the future. To future-proof
this, I could add a try-finally block to the task run method. That still leaves preemptContainers
and I don't think a try-finally block is enough there.

Ideally, I would like to call out that these two methods cannot throw Exceptions. Would javadoc-ing
that be enough? 

bq. {{FSAppAttempt.preemptedResources}} should be final, as should its neighbors.
The neighbors cannot be final the way they are today - we update the reference to point to
different objects at different times.

bq. The import of org.junit.After in TestFairSchedulerPreemption is grouped wrong.
The spacing alone was wrong. My IDE retains alphabetic order irrespective of whether the import
is static. I thought that was normal. Should I change anything there?

bq. {{TestFairSchedulerPreemption.writeAllocFile()}} never throws an IOException.
It actually does when we call {{new FileWriter()}}.


was (Author: kasha):
Patch v3 incorporates most recent review feedback except the following bits.

bq. In FSPreemptionThread.run(), the call to containerNode.removeContainerForPreemption(container)
is pretty important to keep the FSSchedulerNode state sane. Do you need to do anything to
make sure it doesn't get skipped because of an exception? In general, I'm a little nervous
about the bookkeeping on the containers to preempt; it seems like something that could quietly
get out of sync and cause issues.

Very valid point. The root of this is splitting the logic of adding and removing containers
on a node to be preempted across two threads - {{FSPreemptionThread}} marks them and {{PreemptContainersTask}}
attempts preemption and removes them from the collection. The latter is triggered via {{FSPreemptionThread.run()}}
-> {{preemptContainers()}} -> {{PreemptContainersTask.run()}}. In the current path,
there is no exception-throwing code. However, one might add it in the future. To future-proof
this, I could add a try-finally block to the task run method. That still leaves preemptContainers
and I don't think a try-finally block is enough there.

Ideally, I would like to call out that these two methods cannot throw Exceptions. Would javadoc-ing
that be enough? 

bq. {{FSAppAttempt.preemptedResources}} should be final, as should its neighbors.
The neighbors cannot be the way they are today - we update the reference to point to different
objects at different times.

bq. The import of org.junit.After in TestFairSchedulerPreemption is grouped wrong.
The spacing alone was wrong. My IDE retains alphabetic order irrespective of whether the import
is static. I thought that was normal. Should I change anything there?

bq. {{TestFairSchedulerPreemption.writeAllocFile()}} never throws an IOException.
It actually does when we call {{new FileWriter()}}.

> Verify fairshare and minshare preemption
> ----------------------------------------
>
>                 Key: YARN-5819
>                 URL: https://issues.apache.org/jira/browse/YARN-5819
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>    Affects Versions: 2.9.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-5819.YARN-4752.1.patch, yarn-5819.YARN-4752.2.patch, yarn-5819.YARN-4752.3.patch
>
>
> JIRA to track the unit test(s) verifying both fairshare and minshare preemption. The
tests should verify:
> # preemption within a single leaf queue
> # preemption between sibling leaf queues
> # preemption between non-sibling leaf queues
> # {{allowPreemption = false}} should disallow preemption from a queue



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message