hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3517) The last InMemory merge may be missed
Date Mon, 09 Jun 2008 05:51:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603476#action_12603476

Arun C Murthy commented on HADOOP-3517:

Fair point.

We probably need to track _opened but not yet closed_ files along with ramManager.close and
then get waitForDataToMerge to return a boolean indicating that the InMemMergeThread should
gracefully exit. Thoughts?

> The last InMemory merge may be missed
> -------------------------------------
>                 Key: HADOOP-3517
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3517
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.18.0
>            Reporter: Devaraj Das
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.18.0
> This is post HADOOP-3366. The inmem merge thread has the loop:
> {code}
>         while (!exitInMemMerge) {
>             ramManager.waitForDataToMerge();
>             doInMemMerge();
>           }
> {code}
> The fetchOutputs, at the end of copying everything, does the following:
> {code}
>         exitInMemMerge = true; 
>         ramManager.close();
> {code}
> Now if the merge thread is doing a merge (inside the doInMemMerge method) when the exitInMemMerge
is set to true, the loop will break and the last merge of the files that got shuffled recently
will be skipped. ramManager.close(), that internally does a final notify to the merge thread
also won't have any effect in this case.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message