hadoop-mapreduce-issues mailing list archives

From "Iyappan Srinivasan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1098) Incorrect synchronization in DistributedCache causes TaskTrackers to freeze up during localization of Cache for tasks.
Date Wed, 21 Oct 2009 13:05:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768221#action_12768221 ]

Iyappan Srinivasan commented on MAPREDUCE-1098:
-----------------------------------------------

+1 from QA for patch-1098-0.20.txt

1) Brought up a cluster and made sure that the file uploaded (using the -files option)
is around 2 GB. Submitted two jobs which access these files.
Before the patch, the second job's file upload started only after the first job had
finished uploading its file, as clearly seen from the logs. After the patch, both uploads
start independently, as seen from the logs.

2) Ran sleep jobs and also streaming jobs to test this behaviour.

3) Ran with a one-slave cluster and made sure that two jobs access the same file /
different files using -files and -cacheFile. In all cases it went fine.
After the patch, when different files are given with the -files option, the uploads
happen independently. When the same files are provided with the -files option, the
uploads still happen independently, because the JT places them in different directories
for each job, as seen from the conf file of the job.
With -cacheFile and only one TT, the first job localizes the file and the second job
just accesses this localized file as soon as the lock on that file is released.


> Incorrect synchronization in DistributedCache causes TaskTrackers to freeze up during localization of Cache for tasks.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1098
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1098
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>            Reporter: Sreekanth Ramakrishnan
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>
>         Attachments: patch-1098-0.20.txt, patch-1098-1.txt, patch-1098-2.txt, patch-1098.txt
>
>
> Currently {{org.apache.hadoop.filecache.DistributedCache.getLocalCache(URI, Configuration, Path, FileStatus, boolean, long, Path, boolean)}} allows only one {{TaskRunner}} thread in the TT to localize {{DistributedCache}} across jobs. The current synchronization is across baseDir; this has to be changed to lock on the same baseDir.
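
For readers unfamiliar with the locking problem described above, here is a minimal
sketch of the general idea of per-entry locking. It is not the code from any of the
attached patches. The class PerEntryCacheLock, its CacheEntry type, the localize()
helper and the local path layout are all hypothetical, and it uses
ConcurrentHashMap.computeIfAbsent, which needs a newer Java than Hadoop 0.20 targeted.
Threads asking for different URIs lock different objects and proceed independently;
threads asking for the same URI wait on one lock and reuse the first localized copy.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not Hadoop's DistributedCache implementation.
class PerEntryCacheLock {

    // Hypothetical per-URI state. The entry object doubles as the lock.
    static final class CacheEntry {
        boolean localized = false;
        String localPath;
    }

    private final ConcurrentHashMap<String, CacheEntry> entries =
            new ConcurrentHashMap<>();

    // Counts how many times a file was actually copied (for demonstration).
    final AtomicInteger copies = new AtomicInteger();

    String getLocalCache(String uri) {
        // Finding or creating the entry is short and non-blocking for other keys.
        CacheEntry entry = entries.computeIfAbsent(uri, k -> new CacheEntry());

        // Lock only this entry, so a slow copy of one file does not block
        // localization of a different file.
        synchronized (entry) {
            if (!entry.localized) {
                entry.localPath = localize(uri);
                entry.localized = true;
            }
            return entry.localPath;
        }
    }

    // Stand-in for the expensive copy from the shared file system to local disk.
    private String localize(String uri) {
        copies.incrementAndGet();
        return "/local/cache/" + Integer.toHexString(uri.hashCode());
    }
}
```

Calling getLocalCache twice with the same URI performs one copy and returns the same
local path both times, which matches the -cacheFile behaviour reported in the QA
comment. Distinct URIs each get their own copy and never contend for the same lock.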

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

