hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-4597) Add SCHEDULE to NM container lifecycle
Date Sun, 13 Nov 2016 14:31:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15661566#comment-15661566
] 

Arun Suresh edited comment on YARN-4597 at 11/13/16 2:31 PM:
-------------------------------------------------------------

Updating the patch to address [~kkaranasos]'s and [~kasha]'s comments.

Thanks to both of you for the detailed reviews.

I've responded to Karthik's comments directly on GitHub.
[~kkaranasos], I've addressed your comments, except for the following.

1.
bq. queuedOpportunisticContainers will have concurrency issues. We are updating it when containers
arrive but also in the shedQueuedOpportunisticContainers.
Good catch. I've fixed it in the latest patch by sending a 'SHED_QUEUED_CONTAINERS' event to
the ContainerScheduler when the Node HB response from the RM asks to shed queued containers.
Besides keeping ContainerScheduler operations serialized, this also ensures that the Node HB
thread is not blocked.
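Roughly, the flow looks like the sketch below (stand-in names and types, not the actual patch classes): the HB thread only enqueues the request, and the ContainerScheduler's single event thread does the shedding.

{code:java}
// Illustration only: the Node HB thread never mutates
// queuedOpportunisticContainers directly; it only hands a
// SHED_QUEUED_CONTAINERS-style task to the scheduler's single event thread,
// so scheduler state stays serialized and the HB thread returns immediately.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ShedEventDispatchSketch {
  private final ExecutorService schedulerEventThread =
      Executors.newSingleThreadExecutor();

  // Called from the Node HB response path when the RM asks to shed.
  void onShedQueuedContainersRequest(int maxAllowed) {
    schedulerEventThread.execute(() -> shedQueuedOpportunisticContainers(maxAllowed));
  }

  // Executes on the scheduler's event thread, like every other scheduler op.
  private void shedQueuedOpportunisticContainers(int maxAllowed) {
    // trim queuedOpportunisticContainers down to maxAllowed entries here
  }
}
{code}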

2.
bq. shedQueuedOpportunisticContainers: numAllowed is the number of allowed containers in the
queue. Instead, we are killing numAllowed containers. In other words, we should not kill numAllowed,
but queuedOpportunisticContainers.size() - numAllowed.
Even though I agreed with you offline, I took another look, and the logic is actually correct.
The {{numAllowed}} counter is initialized to maxAllowed and then decremented in the loop. Containers
are killed only AFTER its value goes <= 0. In any case, I've added a test case in 'TestContainerSchedulerQueuing'
to verify that this actually works.
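For reference, the shape of the check is roughly the following (placeholder String types instead of ContainerId/Container, not the exact patch code); with maxAllowed = 2 and five queued containers, exactly the last three are removed:

{code:java}
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class ShedLoopSketch {
  static void shed(Map<String, String> queued, int maxAllowed) {
    int numAllowed = maxAllowed;
    Iterator<String> it = queued.values().iterator();
    while (it.hasNext()) {
      it.next();
      if (numAllowed <= 0) {   // only entries past the first maxAllowed are killed
        it.remove();
      }
      numAllowed--;
    }
  }

  public static void main(String[] args) {
    Map<String, String> queued = new LinkedHashMap<>();
    for (int i = 1; i <= 5; i++) {
      queued.put("container_" + i, "opp-" + i);
    }
    shed(queued, 2);
    System.out.println(queued.keySet()); // [container_1, container_2]
  }
}
{code}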

3.
bq. line 252, indeed we can now do extraOpportContainersToKill -> opportContainersToKill,
as Karthik mentioned at a comment.
I think 'extra' is still apt, since (as I mentioned to Karthik) these are 'extra' opportunistic containers
over and above what is already present in 'oppContainersToKill'.

4.
bq. queuedGuaranteedContainers and queuedOpportunisticContainers: I think we should use queues.
I don't think we retrieve the container by the key anywhere either ways.
Karthik mentioned this in his comments too. A LinkedHashMap is essentially an indexed queue.
Additionally, there is actually one case where we need to retrieve by key: when the AM asks
to kill a container that is queued. Furthermore, queue re-ordering, etc. might be easier with
a map... Let's keep it as a LinkedHashMap unless we find it is detrimental in some way.
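A quick illustration of the 'indexed queue' point, with placeholder String keys/values standing in for ContainerId/Container:

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

class QueuedMapSketch {
  public static void main(String[] args) {
    Map<String, String> queued = new LinkedHashMap<>();
    queued.put("container_1", "opp-1");
    queued.put("container_2", "opp-2");
    queued.put("container_3", "opp-3");

    // Queue-like: iteration follows arrival order, so the head is "opp-1".
    String head = queued.values().iterator().next();
    System.out.println(head);

    // Map-like: the AM kills container_2 while it is still queued; no scan needed.
    queued.remove("container_2");
    System.out.println(queued.keySet()); // [container_1, container_3]
  }
}
{code}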

5.
bq. oppContainersMarkedForKill: could be a Set, right?
Technically yes, but I would have to modify Container to add equals and hashCode too, which
I felt was too much of a hassle. I prefer to keep it as it is.
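Put differently, keying the existing map by ContainerId already gives set semantics without touching Container (placeholder types in this sketch):

{code:java}
import java.util.HashMap;
import java.util.Map;

class MarkedForKillSketch {
  public static void main(String[] args) {
    // Keying by the container id de-duplicates on its own, without adding
    // equals()/hashCode() to Container itself.
    Map<String, String> oppContainersMarkedForKill = new HashMap<>();
    oppContainersMarkedForKill.put("container_7", "opp-7");
    oppContainersMarkedForKill.put("container_7", "opp-7"); // same id, still one entry
    System.out.println(oppContainersMarkedForKill.size()); // 1
  }
}
{code}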

6.
bq. fields of the opportunisticContainersStatus are set in different places. Due to that,
when we call getOpportunisticContainersStatus() we may see an inconsistent object. Let's set
the fields only in the getOpportunisticContainersStatus().
Agreed... I've also added the oppMem, oppCores, and numOpp values to the NodeManagerMetrics.
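What this amounts to is something along these lines (field and method names here are placeholders, not the exact API): every field is populated in one place, right before the status object is handed out, so callers never see a half-updated mix.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

class OpportunisticStatusSketch {
  private final Map<String, String> queuedOpportunisticContainers = new LinkedHashMap<>();
  private volatile long oppMemoryUsedMB;
  private volatile int oppCoresUsed;
  private volatile int runningOppContainers;

  // All fields of the returned status are populated here, and only here.
  synchronized Status getOpportunisticContainersStatus() {
    return new Status(queuedOpportunisticContainers.size(),
        oppMemoryUsedMB, oppCoresUsed, runningOppContainers);
  }

  // Immutable snapshot standing in for OpportunisticContainersStatus.
  record Status(int queueLength, long memUsedMB, int coresUsed, int running) { }
}
{code}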

7.
bq. There seem to be two redundant parameters at YarnConfiguration at the moment: NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH
and NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH. If I am not missing something, we should
keep one of the two.
Actually, they are a bit different. NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH and the corresponding
max value are used by the RM to calculate a limit value for the queue, and it is possible for
the queue to momentarily go above that. NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH, on the
other hand, is used in the NM to prevent queuing beyond that value; it is a hard configuration
limit ([~kasha] had requested it).
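To make the distinction concrete, a sketch (everything except the two YarnConfiguration names referenced above is hypothetical):

{code:java}
class QueueLimitSketch {
  // RM side: NM_CONTAINER_QUEUING_MIN_QUEUE_LENGTH and its max counterpart
  // only bound the advisory target the RM sends back on the heartbeat;
  // the actual queue can momentarily exceed it.
  static int rmTargetQueueLength(int estimatedFromClusterLoad, int min, int max) {
    return Math.min(max, Math.max(min, estimatedFromClusterLoad));
  }

  // NM side: NM_OPPORTUNISTIC_CONTAINERS_MAX_QUEUE_LENGTH is a hard cap the
  // ContainerScheduler checks before ever queuing an opportunistic container.
  static boolean nmMayQueue(int currentQueueLength, int maxOppQueueLength) {
    return currentQueueLength < maxOppQueueLength;
  }
}
{code}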
 




> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>              Labels: oct16-hard
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch, YARN-4597.003.patch, YARN-4597.004.patch,
YARN-4597.005.patch, YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, YARN-4597.009.patch,
YARN-4597.010.patch
>
>
> Currently, the NM immediately launches containers after resource localization. Several
features could be more cleanly implemented if the NM included a separate stage for reserving
resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

