hadoop-common-dev mailing list archives

From "Robert Chansler (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-2815) support for DeleteOnExit
Date Wed, 20 Feb 2008 01:42:43 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chansler updated HADOOP-2815:
------------------------------------

    Component/s: dfs

While the intention is understandable, it appears difficult to define semantics that are
generally applicable. In a model where the file system runs forever, there may be no opportunity
(start/stop) for cleanup. If the file system is abruptly terminated, does the application
really want the files to go away (even if the loss of the file system was never observed)? If
the application closes its access to the file system before cleaning house, perhaps the
solution belongs within the application. If the application ever restarts, that restart may
itself be the opportunity to clean house.

The idea of building a sandbox where applications can execute and be confident that all resources
are released when the application terminates is attractive, but not a short-term effort.

> support for DeleteOnExit
> ------------------------
>
>                 Key: HADOOP-2815
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2815
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Olga Natkovich
>
> Pig creates temp files that it wants removed at the end of processing. The code that
> removes the temp files is in a shutdown hook so that they are removed both on normal
> shutdown and when the process gets killed.
> The problem we are seeing is that by the time the hook runs, the DFS client might already
> be closed, so the delete fails and the temp files are left behind. Since we have no control
> over the shutdown order, we have no way to ensure the files get removed.
> One way to solve this is to be able to mark files as temp files so that Hadoop can remove
> them during its own shutdown.
> The stack trace I am seeing is:
>         at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:158)
>         at org.apache.hadoop.dfs.DFSClient.delete(DFSClient.java:417)
>         at org.apache.hadoop.dfs.DistributedFileSystem.delete(DistributedFileSystem.java:144)
>         at org.apache.pig.backend.hadoop.datastorage.HPath.delete(HPath.java:96)
>         at org.apache.pig.impl.io.FileLocalizer$1.run(FileLocalizer.java:275)
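The semantics proposed in the description can be sketched in plain Java. This is a hypothetical illustration, not Hadoop's actual API: the class name `DeleteOnExitRegistry` and its methods are invented here, and `java.io.File` stands in for DFS paths. The key point is that the registry owns its own shutdown hook, so cleanup runs as part of the client's own teardown rather than racing against the file system being closed first.

```java
// Hypothetical sketch (not Hadoop's real API): a client-side registry that
// mimics the proposed deleteOnExit semantics using plain java.io.File.
import java.io.File;
import java.io.IOException;
import java.util.LinkedHashSet;
import java.util.Set;

public class DeleteOnExitRegistry {
    private final Set<File> tempFiles = new LinkedHashSet<>();

    public DeleteOnExitRegistry() {
        // Delete registered files when the JVM shuts down, whether the
        // process exits normally or receives a termination signal.
        Runtime.getRuntime().addShutdownHook(new Thread(this::deleteAll));
    }

    // Mark a file as temporary; it will be removed at shutdown.
    public synchronized void deleteOnExit(File f) {
        tempFiles.add(f);
    }

    // Performs the cleanup itself, before any file-system client is closed,
    // so deletes cannot fail with a "client already closed" error as in the
    // stack trace above.
    public synchronized void deleteAll() {
        for (File f : tempFiles) {
            f.delete(); // best effort; failures at shutdown are ignored
        }
        tempFiles.clear();
    }

    public static void main(String[] args) throws IOException {
        DeleteOnExitRegistry registry = new DeleteOnExitRegistry();
        File tmp = File.createTempFile("pig-temp", ".tmp");
        registry.deleteOnExit(tmp);
        // Calling deleteAll() eagerly here just to demonstrate; normally the
        // shutdown hook performs this cleanup.
        registry.deleteAll();
        System.out.println(tmp.exists() ? "file remains" : "file removed");
    }
}
```

The design choice this illustrates is ordering: because the registry's hook belongs to the same component that owns the files, the file system can run the cleanup during its own shutdown, instead of leaving it to an application hook whose position in the JVM's unordered shutdown-hook set is unpredictable.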

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

