hadoop-yarn-issues mailing list archives

From "Omkar Vinit Joshi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-539) LocalizedResources are leaked in memory in case resource localization fails
Date Thu, 11 Apr 2013 00:31:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628492#comment-13628492 ]

Omkar Vinit Joshi commented on YARN-539:

bq. Resource doesn't have life, so it can't 'fail'. In that sense, shall we rename ResourceFailedEvent
to ResourceFailedLocalizationEvent? Similarly ResourceEventType.FAILED to ResourceEventType.LOCALIZATION_FAILED?

bq. Dismantle localizationCompleted altogether? Makes code much more readable IMO.
Yes, this is no longer required and can be simplified :) Updating handle accordingly.

bq. The log message for release doesn't need to specifically talk about failed resources.
A release on a resource that is long gone for whatever reason will run into this code-path.
Yes, you are right. For example, if the resource's local file is deleted (or becomes inaccessible
for some reason), we will also end up getting these messages.

bq. Not related to your patch, but the code for REQUEST can be simplified by doing the null check
No, I think the flow is correct:
* If the resource is not null, check that it is present on disk; if it is not, remove it from the cache.
* If the resource is now null, there are two possibilities, and in both cases we need
to recreate the resource:
** either the resource's local copy is inaccessible,
** or the request for this resource is coming for the first time.
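
The REQUEST flow described above can be sketched roughly as follows. This is only an illustration, not the actual tracker code: CachedResource, RequestFlowSketch and handleRequest are hypothetical names, and "present on disk" is approximated with File.exists(). An entry with no local path yet is assumed to be still downloading and is left alone.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a cached resource entry (not the YARN class). */
final class CachedResource {
    final String key;
    final File localPath; // null until a local copy exists (assumption)

    CachedResource(String key, File localPath) {
        this.key = key;
        this.localPath = localPath;
    }
}

/** Hypothetical tracker illustrating the two-step REQUEST flow. */
final class RequestFlowSketch {
    private final Map<String, CachedResource> cache = new HashMap<>();

    void put(CachedResource rsrc) {
        cache.put(rsrc.key, rsrc);
    }

    boolean contains(String key) {
        return cache.containsKey(key);
    }

    CachedResource handleRequest(String key) {
        CachedResource rsrc = cache.get(key);

        // Step 1: a cached entry whose local copy has disappeared is stale,
        // so drop it from the cache.
        if (rsrc != null && rsrc.localPath != null && !rsrc.localPath.exists()) {
            cache.remove(key);
            rsrc = null;
        }

        // Step 2: null now means either "local copy became inaccessible" or
        // "first request for this resource". Both cases recreate the entry.
        if (rsrc == null) {
            rsrc = new CachedResource(key, null);
            cache.put(key, rsrc);
        }
        return rsrc;
    }
}
```

Keeping the two steps separate means the recreation logic exists only once, which is why a single up-front null check would not cover the stale-entry case.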

bq. Null checks needed for rsrc on LOCALIZED and FAILED events?

That can never occur. When these events arrive, the resource will be in the DOWNLOADING state, and
it will never be removed from the cache because its ref count is > 0 (see ResourceRetentionSet.addResources).
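
To illustrate why the null check is unnecessary, here is a minimal sketch of the invariant, combined with the removal-on-failure behaviour proposed in the issue description. All names (RefCountedEntry, RetentionSketch, SketchState and their methods) are hypothetical and greatly simplified; the real cleanup logic lives elsewhere and is only assumed here to skip entries whose ref count is > 0.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, reduced set of states for the sketch. */
enum SketchState { DOWNLOADING, LOCALIZED }

/** Hypothetical cache entry carrying a state and a reference count. */
final class RefCountedEntry {
    SketchState state = SketchState.DOWNLOADING;
    int refCount = 0;
}

final class RetentionSketch {
    private final Map<String, RefCountedEntry> cache = new HashMap<>();

    /** A request creates the entry if needed and takes a reference. */
    RefCountedEntry request(String key) {
        RefCountedEntry e = cache.computeIfAbsent(key, k -> new RefCountedEntry());
        e.refCount++;
        return e;
    }

    /** A release drops one reference, tolerating entries that are long gone. */
    void release(String key) {
        RefCountedEntry e = cache.get(key);
        if (e != null && e.refCount > 0) {
            e.refCount--;
        }
    }

    /** Cache cleanup only evicts unreferenced entries; returns how many. */
    int cleanup() {
        int before = cache.size();
        cache.values().removeIf(e -> e.refCount == 0);
        return before - cache.size();
    }

    /** No null check: the requester's reference keeps the entry cached. */
    void onLocalized(String key) {
        cache.get(key).state = SketchState.LOCALIZED;
    }

    /** On failure, drop the entry immediately so it does not linger in memory. */
    void onLocalizationFailed(String key) {
        cache.remove(key);
    }

    boolean contains(String key) {
        return cache.containsKey(key);
    }

    SketchState stateOf(String key) {
        return cache.get(key).state;
    }
}
```

In this sketch an entry can only reach onLocalized or onLocalizationFailed after request has taken a reference, so cleanup cannot have evicted it in between.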
> LocalizedResources are leaked in memory in case resource localization fails
> ---------------------------------------------------------------------------
>                 Key: YARN-539
>                 URL: https://issues.apache.org/jira/browse/YARN-539
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>         Attachments: yarn-539-20130410.1.patch, yarn-539-20130410.patch
> If resource localization fails, the resource remains in memory and is
> 1) either cleaned up the next time cache cleanup runs and there is a space crunch (if
> sufficient space is available in the cache, it will remain in memory), or
> 2) reused if a LocalizationRequest comes again for the same resource.
> I think that when resource localization fails, that event should be sent to LocalResourceTracker,
> which will then remove the resource from its cache.
