hadoop-mapreduce-issues mailing list archives

From "Andy Isaacson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2374) "Text File Busy" errors launching MR tasks
Date Wed, 22 Aug 2012 23:42:42 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439949#comment-13439949 ]

Andy Isaacson commented on MAPREDUCE-2374:
------------------------------------------

bq. I looked at trunk and it does have this bug

Thanks for double checking me.  Yes, {{launchContainer()}} uses {{lfs.util().copy()}} to write
the {{CONTAINER_SCRIPT}}, and {{FileContext.Util#copy}} does open the output file directly,
and both {{launchContainer}} and {{writeLocalWrapperScript}} use {{bash -c}} to run scripts.

So, yes, this bug is also present on trunk.
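
To make the {{bash -c}} point concrete, here is a minimal sketch of the two invocation styles. The class and method names are hypothetical and not part of Hadoop, and it is an assumption on my part (not something stated in this comment) that switching styles is how a patch would address the problem. The general background is that {{execve}} reports ETXTBSY ("Text file busy") when the target file is still open for writing by some process, and {{bash -c <path>}} makes bash exec the script, whereas {{bash <path>}} only opens and reads it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: builds the argv used to run a
// generated launch script in the two styles discussed above.
final class ScriptLaunchCommand {
    private ScriptLaunchCommand() {}

    // "bash -c <path>" treats the path as a command string, so bash exec()s
    // the script file. That exec is the call that can fail with ETXTBSY
    // while any process still holds the file open for writing.
    static List<String> viaExec(String scriptPath) {
        return Arrays.asList("bash", "-c", scriptPath);
    }

    // "bash <path>" makes bash open and read the script as input, so the
    // script file itself is never exec()ed.
    static List<String> viaRead(String scriptPath) {
        return Arrays.asList("bash", scriptPath);
    }

    public static void main(String[] args) {
        // Placeholder path, used only for illustration.
        String script = args.length > 0 ? args[0] : "/tmp/taskjvm.sh";
        System.out.println(viaExec(script));
        System.out.println(viaRead(script));
    }
}
```

Either list can be handed to {{java.lang.ProcessBuilder}}; only the first form depends on the script file being exec-able at launch time.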

I'll prepare a separate patch for trunk.

Is there anything stopping us from checking in the branch-1 patch?
                
> "Text File Busy" errors launching MR tasks
> ------------------------------------------
>
>                 Key: MAPREDUCE-2374
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.1
>
>         Attachments: failed_taskjvmsh.strace, mapreduce-2374-2.txt, mapreduce-2374-branch-1.patch, mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace
>
>
> Some very small percentage of tasks fail with a "Text file busy" error.
> The following was the original diagnosis:
> {quote}
> Our use of PrintWriter in TaskController.writeCommand is unsafe, since that class swallows all IO exceptions. We're not currently checking for errors, which I'm seeing result in occasional task failures with the message "Text file busy" - assumedly because the close() call is failing silently for some reason.
> {quote}
> .. but turned out to be another issue as well (see below)
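
The original diagnosis quoted above rests on a documented property of {{java.io.PrintWriter}}: its methods do not throw {{IOException}}, and failures are only visible through {{checkError()}}. The following is a minimal sketch of writing a command with that check in place. The class and method names are hypothetical; this is not the actual {{TaskController.writeCommand}} code, and per the issue description it would not address the second cause on its own.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

// Hypothetical helper, not the Hadoop implementation: writes one command
// line and turns PrintWriter's silent error flag into an IOException.
final class CheckedScriptWriter {
    private CheckedScriptWriter() {}

    static void writeCommand(OutputStream out, String command) throws IOException {
        PrintWriter pw = new PrintWriter(out);
        pw.println(command);
        // close() flushes the buffered data; PrintWriter records any
        // IOException from the write or close internally instead of throwing.
        pw.close();
        // checkError() returns true if any earlier write or close failed.
        if (pw.checkError()) {
            throw new IOException("failed to write command script");
        }
    }

    public static void main(String[] args) throws IOException {
        // Writes to standard output purely for demonstration.
        writeCommand(System.out, "echo hello");
    }
}
```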


        
