pig-dev mailing list archives

From "Haitao Yao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2812) Spill InternalCachedBag into only 1 file
Date Sat, 21 Jul 2012 08:43:34 GMT

    [ https://issues.apache.org/jira/browse/PIG-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419768#comment-13419768 ]

Haitao Yao commented on PIG-2812:

Well, under normal conditions the spilled files should have been cleaned up. But if your job
fails and the TaskTracker reuses the child Java process, the OOM can still happen.
I really think spilling to one file is better.

> Spill InternalCachedBag into only 1 file
> ----------------------------------------
>                 Key: PIG-2812
>                 URL: https://issues.apache.org/jira/browse/PIG-2812
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>            Reporter: Haitao Yao
>             Fix For: 0.11
>         Attachments: aa.jpg
> I encountered a reducer OOM caused by java.io.DeleteOnExitHook. I found that
> InternalCachedBag creates a separate tmp file for each spill, and each tmp file is
> registered for deletion on exit. So the file delete hook caused the OOM.
> Why not just hold the tmp file handle and spill to only one tmp file?
> Too many tmp files may also block the tasktracker's startup, if the tmp files are not
> cleaned up in time and the tasktracker restarts at that moment.
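
For illustration, the "hold the tmp file handle and spill only one tmp file" idea might look
roughly like the sketch below. This is not Pig's InternalCachedBag code: the class
SingleFileSpillBuffer, its methods, and the use of long values in place of tuples are all
hypothetical. The point is that File.deleteOnExit() adds each path to a JVM-wide set that is
only processed at shutdown, so a reused child JVM keeps accumulating entries. Here the file is
created once, every spill appends to it, and it is deleted explicitly on close().

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a buffer that spills to a single temp file. Not Pig code. */
class SingleFileSpillBuffer implements AutoCloseable {
    private final int memoryLimit;                 // values kept in memory before a spill
    private final List<Long> inMemory = new ArrayList<>();
    private File spillFile;                        // created lazily, at most once
    private DataOutputStream out;                  // held open across spills

    SingleFileSpillBuffer(int memoryLimit) {
        this.memoryLimit = memoryLimit;
    }

    void add(long value) {
        inMemory.add(value);
        if (inMemory.size() >= memoryLimit) {
            spill();
        }
    }

    /** Appends the in-memory values to the one spill file, creating it on first use. */
    private void spill() {
        try {
            if (out == null) {
                spillFile = File.createTempFile("spill", ".bin");
                // Deliberately no spillFile.deleteOnExit(): close() deletes the file instead.
                out = new DataOutputStream(
                        new BufferedOutputStream(new FileOutputStream(spillFile)));
            }
            for (long v : inMemory) {
                out.writeLong(v);
            }
            out.flush();
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns spilled values (in spill order) followed by the values still in memory. */
    List<Long> readAll() {
        List<Long> result = new ArrayList<>();
        if (spillFile != null) {
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(spillFile)))) {
                while (true) {
                    result.add(in.readLong());
                }
            } catch (EOFException endOfSpill) {
                // Reached the end of the spilled data.
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        result.addAll(inMemory);
        return result;
    }

    /** The single spill file, or null if nothing has been spilled yet. */
    File spillFile() {
        return spillFile;
    }

    /** Closes the handle and deletes the file right away instead of at JVM exit. */
    @Override
    public void close() {
        try {
            if (out != null) {
                out.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (spillFile != null) {
                spillFile.delete();
            }
        }
    }
}
```

With a limit of 2 and five added values, this sketch spills twice into the same file and
leaves one value in memory. Deleting in close() means cleanup does not depend on the JVM
exiting, which matters when the tasktracker reuses the child process.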


