hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1098) Incorrect synchronization in DistributedCache causes TaskTrackers to freeze up during localization of Cache for tasks.
Date Mon, 19 Oct 2009 09:02:31 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Amareshwari Sriramadasu updated MAPREDUCE-1098:
-----------------------------------------------

    Attachment: patch-1098-2.txt

bq.     *  in getLocalCache(), refcount should be incremented only after localizeCache().
This was the earlier behavior and we should retain it.
This can not be moved to the other lock, because if delete happens inbetween the "marking
for use" and "localizing", and if reference count is zero, delete will go ahead and delete
it. So, reference count should be increment before.

bq.    * Similarly, decrement of baseDirSize should be done only after the filesystem delete.
Incorporated in the patch


> Incorrect synchronization in DistributedCache causes TaskTrackers to freeze up during
localization of Cache for tasks.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1098
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1098
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>            Reporter: Sreekanth Ramakrishnan
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-1098-0.20.txt, patch-1098-1.txt, patch-1098-2.txt, patch-1098.txt
>
>
> Currently {{org.apache.hadoop.filecache.DistributedCache.getLocalCache(URI, Configuration,
Path, FileStatus, boolean, long, Path, boolean)}} allows only one {{TaskRunner}} thread in
TT to localize {{DistributedCache}} across jobs. Current way of synchronization is across
baseDir this has to be changed to lock on the same baseDir.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message