hadoop-pig-dev mailing list archives

From "Pi Song (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-166) Disk Full
Date Sun, 06 Apr 2008 00:09:24 GMT

    [ https://issues.apache.org/jira/browse/PIG-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586073#action_12586073 ]

Pi Song commented on PIG-166:

I have implemented a minimalist temp file manager (perhaps not as minimal as I expected) just to get past the current problem quickly:
- "temp.limit" can be set via a system property or .pigrc. By default, it is Long.MAX_VALUE.
- We keep track of all temp space usage by hooking in when temp streams are closed (that is where the actual file size is known).
- If the execution engine has exceeded the disk space limit and a temp file is about to be created, the temp file manager first tries to free up space. If there is still not enough space, a RuntimeException is thrown, stopping the execution.
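The three bullets above can be sketched roughly as follows. This is only an illustrative outline, not Pig's actual code: the class name, method names, and the use of `Long.getLong` to read the "temp.limit" system property are all my assumptions; reclaiming space is left as a placeholder.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the temp-space accounting described above.
public class TempFileManager {

    // "temp.limit" read from a system property; defaults to Long.MAX_VALUE.
    private final long limit = Long.getLong("temp.limit", Long.MAX_VALUE);

    // Running total of bytes held in temp files.
    private final AtomicLong used = new AtomicLong();

    // Called when a temp stream is closed, i.e. when the final size is known.
    public void recordClosedTempFile(long bytes) {
        used.addAndGet(bytes);
    }

    // Called before a new temp file is created.
    public void reserve(long expectedBytes) {
        if (used.get() + expectedBytes > limit) {
            // Try to free space first (e.g. delete temp files no longer needed).
            long freed = tryFreeSpace();
            used.addAndGet(-freed);
            if (used.get() + expectedBytes > limit) {
                throw new RuntimeException(
                    "temp.limit exceeded: would use "
                    + (used.get() + expectedBytes) + " of " + limit + " bytes");
            }
        }
    }

    // Placeholder: reclaim space and return the number of bytes freed.
    protected long tryFreeSpace() {
        return 0L;
    }
}
```

The key design point is that accounting happens at stream close, so the manager only ever sees real file sizes, and the limit check plus free-up attempt happens at creation time, before any new disk space is consumed.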

> Disk Full
> ---------
>                 Key: PIG-166
>                 URL: https://issues.apache.org/jira/browse/PIG-166
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Amir Youssefi
> Occasionally spilling fills up all the hard drives on a Data Node and crashes the Task Tracker
(and other processes) on that node. We need a safety net to fail the task before the
crash (and worse) happens.
> In the Pig + Hadoop setting, Task Trackers get blacklisted, and the Pig console gets stuck at
a percentage without the nodes being returned to the cluster. I talked to the Hadoop team to explore
a Max Percentage idea. Nodes that run into this problem end up in a permanently broken state, and
manual cleanup by an administrator is necessary.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
