hadoop-mapreduce-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2691) Finish up the cleanup of distributed cache file resources and related tests.
Date Fri, 09 Sep 2011 03:27:09 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100916#comment-13100916 ]

Siddharth Seth commented on MAPREDUCE-2691:
-------------------------------------------

Thanks for the detailed review Vinod.

ContainerImpl.handle and ContainerManager.stopContainer() diagnostic - will revert the changes.

Vaguely remember seeing diagnostic messages stacking up - where a single container would end
up with multiple copies of messages. Will create a JIRA for that when I see it again.

bq. RELEASE_CONTAINER_RESOURCES is always sent along with CLEANUP_CONTAINER_RESOURCES event.
I think we should just merge these into CLEANUP_CONTAINER_RESOURCES event itself. This will
also be in line with the fact that creation of container-dirs and the localization of files
both happen as part of a single event INIT_CONTAINER_RESOURCES, so cleanup should also be
a single event. We can send a Map<LocalResourceVisibility, Collection<LocalResourceRequest>>
as the event payload. To be symmetric, we should probably also merge the multiple INIT_CONTAINER_RESOURCES
calls, one for each LocalResourceVisibility, into a single event. Thoughts?
Sounds good. Will make the changes. Had added separate events for RELEASE_CONTAINER_RESOURCES
to be consistent with the way resources were requested - 1 event for each type. Don't really
see a reason for the requests to be sent separately though.
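
A rough sketch of what a merged cleanup event could look like, in plain Java. Every type name here (Visibility, ResourceRequest, CleanupEvent, CleanupEventBuilder) is a hypothetical stand-in and not the actual NodeManager class; only the payload shape, a map from visibility to a collection of requests, comes from the suggestion quoted above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for LocalResourceVisibility.
enum Visibility { PUBLIC, PRIVATE, APPLICATION }

// Hypothetical stand-in for LocalResourceRequest; only a path is kept here.
final class ResourceRequest {
    final String path;
    ResourceRequest(String path) { this.path = path; }
}

// A single cleanup event for a container, carrying all resources to release,
// grouped by visibility, instead of one release event per visibility.
final class CleanupEvent {
    final String containerId;
    final Map<Visibility, Collection<ResourceRequest>> resources;

    CleanupEvent(String containerId,
                 Map<Visibility, Collection<ResourceRequest>> resources) {
        this.containerId = containerId;
        this.resources = resources;
    }

    // Total number of resources across all visibilities.
    int totalResources() {
        int n = 0;
        for (Collection<ResourceRequest> c : resources.values()) {
            n += c.size();
        }
        return n;
    }
}

// Collects requests per visibility and emits one event at the end.
final class CleanupEventBuilder {
    private final String containerId;
    private final Map<Visibility, Collection<ResourceRequest>> byVisibility =
        new EnumMap<>(Visibility.class);

    CleanupEventBuilder(String containerId) { this.containerId = containerId; }

    CleanupEventBuilder add(Visibility v, String path) {
        byVisibility.computeIfAbsent(v, k -> new ArrayList<>())
                    .add(new ResourceRequest(path));
        return this;
    }

    CleanupEvent build() { return new CleanupEvent(containerId, byVisibility); }
}
```

The same payload shape would work for a merged init event, so a handler on either side iterates over the map once rather than receiving one event per visibility.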

Will make the changes in the test cases. 
Draining events immediately - don't quite remember why I added the option for a delayed drain
- possibly to be able to drain events 1 at a time sometime later. Anyway, it can be added
back if required.
Completely agree about the mocks - there are way too many, which makes some of the tests hard to
understand. Will try getting rid of some of them.

bq. There is existing code for purging of cache under disk pressure - See ResourceLocalization.CacheCleanup
and ResourceRetentionSet. (We need tests for this though, will file a ticket) This only deletes
files that aren't in use at all. By LRU, do you mean selective deletion of these files based
on their usage? Can you please point me to the relevant MRV1 JIRA? Thanks!
Yep, deletion exists in MRv2 and works. For LRU, see MAPREDUCE-2494 and MAPREDUCE-2572.
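
To make the distinction concrete, here is a minimal sketch of LRU-ordered deletion limited to resources that are not in use. It is not the code from ResourceLocalization.CacheCleanup, ResourceRetentionSet, or the MRv1 JIRAs; the class LruResourceCache, its methods, and the logical clock are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cache bookkeeping: tracks size, reference count and last use.
final class LruResourceCache {
    private static final class Entry {
        final long sizeBytes;
        int refCount = 1;
        long lastUsed;
        Entry(long sizeBytes, long now) {
            this.sizeBytes = sizeBytes;
            this.lastUsed = now;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private long clock = 0;      // logical clock, bumped on every acquire/release
    private long usedBytes = 0;

    // A container starts using the resource (localizing it if it is new).
    void acquire(String path, long sizeBytes) {
        Entry e = entries.get(path);
        if (e == null) {
            entries.put(path, new Entry(sizeBytes, ++clock));
            usedBytes += sizeBytes;
        } else {
            e.refCount++;
            e.lastUsed = ++clock;
        }
    }

    // A container stops using the resource; the file stays cached.
    void release(String path) {
        Entry e = entries.get(path);
        if (e != null && e.refCount > 0) {
            e.refCount--;
            e.lastUsed = ++clock;
        }
    }

    // Under disk pressure: delete unreferenced entries, least recently used
    // first, until usage drops to targetBytes. In-use entries are never touched.
    List<String> evictDownTo(long targetBytes) {
        List<Map.Entry<String, Entry>> idle = new ArrayList<>();
        for (Map.Entry<String, Entry> me : entries.entrySet()) {
            if (me.getValue().refCount == 0) {
                idle.add(me);
            }
        }
        idle.sort(Comparator.comparingLong(
            (Map.Entry<String, Entry> me) -> me.getValue().lastUsed));

        List<String> evicted = new ArrayList<>();
        for (Map.Entry<String, Entry> me : idle) {
            if (usedBytes <= targetBytes) {
                break;
            }
            usedBytes -= me.getValue().sizeBytes;
            evicted.add(me.getKey());
        }
        entries.keySet().removeAll(evicted);
        return evicted;
    }

    long usedBytes() { return usedBytes; }
}
```

Dropping the sort gives the simpler behaviour of deleting any file that is not in use; the ordering by last use is the only LRU-specific part.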

> Finish up the cleanup of distributed cache file resources and related tests.
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2691
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2691
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>            Reporter: Amol Kekre
>            Assignee: Siddharth Seth
>             Fix For: 0.23.0, 0.24.0
>
>         Attachments: MR2691_1.patch, MR2691_2.patch, MR2691_3.patch
>
>
> Implement cleanup of distributed cache file resources

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
