hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-5576) Core change to localize resource while container is running
Date Tue, 30 Aug 2016 15:06:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449258#comment-15449258
] 

Jian He edited comment on YARN-5576 at 8/30/16 3:05 PM:
--------------------------------------------------------

Thanks for the review, Varun

bq. Can we move this logic into the ContainerImpl class?
bq.  It looks like similar logic is required in the ResourceLocalization class?
The logic is different for these two places. The first one should only allow localizing when
the container is running, while the second should allow if the container is either localizing
or running. This logic is tailored to the localize API only, I feel adding canLocalizeResources
method to the interface of Container makes it a bit confusing, as container can also localize
while at localizing state in the normal scenario too. I prefer keep it outside as it should
be clear enough to readers

bq. I couldn't figure out where this logic was moved to - can you please explain?
It's moved to ResourceSet#resourceLocalized method 
bq. We need to check the diagnostics string size - if a AM sends us too many failed requests,
the diagnostics string will just balloon in size.
Not quite sure how to check though. doesn't seem appropriate to add a new config for the diagnostics
size. So just skip appending the diagnostics if it is larger than say 2000 in length?
bq. We should throw an error here or at least flag this as a failed localization?
I don't think we should throw error. It's mentioned in the doc that we'll skip updating the
symlink for now. This is a valid use case if we want to replacing existing resource while
container is running. The changing symlink part, instead of being done here, will be done
later when container re-launches.
bq. Can you explain the logic for this? I couldn't figure out why we need this.
It's required because, in the normal scenario, when the localization completes before launching
the container, the private localizer thread will be interrupted. And when we re-localize,
we need to create a new Thread object as the old thread is stopped. Added code comments for
this
bq. Can you also add a check in testLocalingResourceWhileContainerRunning to make sure we
can’t localize for non-running containers?
will do
bq. Do we need these lines in the test code? Maybe move them to LOG.debug?
It's useful for debugging test. Anyway, I just removed it.


was (Author: jianhe):
Thanks for the review, Varun

bq. Can we move this logic into the ContainerImpl class?
bq.  It looks like similar logic is required in the ResourceLocalization class?
The logic is different for these two places. The first one should only allow localizing when
the container is running, while the second should allow if the container is either localizing
or running. This logic is tailored to the localize API only, I feel adding canLocalizeResources
method to the interface of Container makes it a bit confusing, as container can also localize
while at localizing state in the normal scenario too. I prefer keep it outside as it should
be clear enough to readers

bq. I couldn't figure out where this logic was moved to - can you please explain?
It's moved to ResourceSet#resourceLocalized method 
bq. We need to check the diagnostics string size - if a AM sends us too many failed requests,
the diagnostics string will just balloon in size.
Not quite sure how to check though. doesn't seem appropriate to add a new config for the diagnostics
size. So just skip appending the diagnostics if it is larger than say 2000 in length?
bq. We should throw an error here or at least flag this as a failed localization?
I don't think we should throw error. It's mentioned in the doc that we'll skip updating the
symlink for now. This is a valid use case if we want to replacing existing resource while
container is running. The changing symlink part, instead of being done here, will be done
later when container re-launches.
bq. Can you explain the logic for this? I couldn't figure out why we need this.
It's required because, in the normal scenario, when the localization completes before launching
the container, the private localizer thread will be interrupted. And when we re-localize,
we need to create a new Thread object as the old thread is stopped. Added code comments for
this
bq. Do we need these lines in the test code? Maybe move them to LOG.debug?
I just removed it.
bq. Can you also add a check in testLocalingResourceWhileContainerRunning to make sure we
can’t localize for non-running containers?
will do
bq. Do we need these lines in the test code? Maybe move them to LOG.debug?
It's useful for debugging test. Anyway, I just removed it.

> Core change to localize resource while container is running
> -----------------------------------------------------------
>
>                 Key: YARN-5576
>                 URL: https://issues.apache.org/jira/browse/YARN-5576
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Jian He
>            Assignee: Jian He
>         Attachments: YARN-5576.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message