hadoop-yarn-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: About YARN-1336 in hadoop 2.6.0
Date Wed, 17 Dec 2014 11:00:22 GMT
Container re-use is tracked in a separate JIRA, and there is no code behind
it yet.

All that happens on a YARN-1336 NM restart is that the containers stay up
and the NM reconnects to them. This actually forces the Slider code to add
some more logic to handle the following situation: the NM goes down and
stays down, and the container failure report triggers a new container
allocation, but the existing container stays up and keeps heartbeating to
our AM. We handle this by recognising an unknown container checking in, and
sending a message to its Python agent saying "you are no longer live, kill
yourself and your processes".
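
A minimal sketch of that check, to make the sequence concrete. This is not
Slider's actual code: the class, method names, container IDs and reply
strings are all hypothetical, and the real AM and agent exchange richer
messages. It only shows the idea of answering a heartbeat from a container
the AM no longer tracks with a terminate instruction.

```python
from dataclasses import dataclass, field

# Hypothetical reply values; the real agent protocol is not shown here.
OK = "OK"
TERMINATE = "TERMINATE"


@dataclass
class AppMasterSketch:
    """Tracks the containers the AM currently considers live."""
    live: set = field(default_factory=set)

    def container_allocated(self, container_id: str) -> None:
        # A container was granted to us, so start tracking it.
        self.live.add(container_id)

    def container_failed(self, container_id: str) -> None:
        # A failure report arrived (for example because the NM stayed down),
        # so stop tracking it even though its process may still be running.
        self.live.discard(container_id)

    def agent_heartbeat(self, container_id: str) -> str:
        # Known containers carry on; unknown ones are told to shut down.
        if container_id in self.live:
            return OK
        return TERMINATE


am = AppMasterSketch()
am.container_allocated("container_01")
am.container_failed("container_01")      # reported failed, process still up
am.container_allocated("container_02")   # the replacement
replies = {cid: am.agent_heartbeat(cid)
           for cid in ("container_01", "container_02")}
```

Here the old container's agent would get the terminate reply and the
replacement would carry on as normal.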

On 17 December 2014 at 09:57, Li Shengmei <lishengmei@ict.ac.cn> wrote:

> Hi,
>          I want to ask some questions about YARN-1336. As we know, we can
> recover containers after an NM restart, as YARN-1336 describes.
> I want to persist a container after it finishes one iteration, rather
> than after an NM restart.
>    I want to persist the container and its intermediate values after the
> container finishes, and reuse both in the future, perhaps in the next
> iteration. Can I use the implementation of YARN-1336 for this? Can
> anyone give some hints?
>          My understanding is that the intermediate values are stored in
> proto. Is that right? And maybe I need to add another container status?
> Thanks a lot.
> May
