hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2815) Allowing processes to cleanup dfs on shutdown
Date Fri, 29 Feb 2008 08:28:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573634#action_12573634 ]

dhruba borthakur commented on HADOOP-2815:
------------------------------------------

Extending the Trash API might be ok in the short term but does not sound too appealing from
a long-term perspective.

One suggestion to solve this problem is to have a FileSystem API that allows applications
to register shutdown hooks. PIG can then register its shutdown hooks with the FileSystem.
The FileSystem shutdown hook would first invoke all application-registered shutdown hooks
before shutting down the filesystem. This will allow applications like PIG to ensure that
their shutdown hook gets invoked before the FileSystem object is destroyed.
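
The ordering described above can be sketched roughly as follows. This is only an
illustration, not the Hadoop API: the class HookableFileSystem, its
registerShutdownHook method, and the path /tmp/pig-temp are all hypothetical names
made up for this sketch. The only real API used is the JDK's
Runtime.addShutdownHook.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a filesystem client; NOT the Hadoop FileSystem class. */
class HookableFileSystem {
    private final List<Runnable> appHooks = new ArrayList<>();
    private boolean open = true;

    /** Applications register cleanup work that needs the filesystem to still be open. */
    synchronized void registerShutdownHook(Runnable hook) {
        if (!open) {
            throw new IllegalStateException("Filesystem closed");
        }
        appHooks.add(hook);
    }

    synchronized boolean isOpen() {
        return open;
    }

    /** Stand-in for a delete call that fails once the client has been closed. */
    synchronized boolean delete(String path) {
        if (!open) {
            throw new IllegalStateException("Filesystem closed");
        }
        return true; // a real client would contact the namenode here
    }

    /** Runs application hooks in registration order, then closes. Safe to call twice. */
    synchronized void close() {
        if (!open) {
            return;
        }
        for (Runnable hook : appHooks) {
            try {
                hook.run(); // the filesystem is still open at this point
            } catch (RuntimeException e) {
                // One failing hook should not prevent the others from running.
                System.err.println("Application shutdown hook failed: " + e);
            }
        }
        appHooks.clear();
        open = false;
    }

    public static void main(String[] args) {
        HookableFileSystem fs = new HookableFileSystem();
        // The application (e.g. Pig) registers its temp-file cleanup with the filesystem.
        fs.registerShutdownHook(() ->
            System.out.println("temp file deleted: " + fs.delete("/tmp/pig-temp")));
        // Only the filesystem registers a JVM-level hook, so the order is under its control.
        Runtime.getRuntime().addShutdownHook(new Thread(fs::close));
    }
}
```

The point of the design is that the JVM gives no ordering guarantee between
independent shutdown hooks, so routing the application's cleanup through the
filesystem's single hook is what makes the order deterministic.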


> Allowing processes to cleanup dfs on shutdown
> ---------------------------------------------
>
>                 Key: HADOOP-2815
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2815
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Olga Natkovich
>
> Pig creates temp files that it wants removed at the end of processing. The
> code that removes the temp files is in a shutdown hook so that they get removed both under
> normal shutdown and when the process gets killed.
> The problem we are seeing is that by the time the code is called, the DFS might already
> be closed and the delete fails, leaving temp files behind. Since we have no control over the
> shutdown order, we have no way to make sure that the files get removed.
> One way to solve this issue is to be able to mark the files as temp files so that Hadoop
> can remove them during its shutdown.
> The stack trace I am seeing is:
>         at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:158)
>         at org.apache.hadoop.dfs.DFSClient.delete(DFSClient.java:417)
>         at org.apache.hadoop.dfs.DistributedFileSystem.delete(DistributedFileSystem.java:144)
>         at org.apache.pig.backend.hadoop.datastorage.HPath.delete(HPath.java:96)
>         at org.apache.pig.impl.io.FileLocalizer$1.run(FileLocalizer.java:275)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

