From: "Konstantin Shvachko (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Wed, 20 Feb 2008 11:35:43 -0800 (PST)
Subject: [jira] Commented: (HADOOP-2815) support for DeleteOnExit
Message-ID: <1078974358.1203536143566.JavaMail.jira@brutus>
In-Reply-To: <19116229.1202953628163.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570806#action_12570806 ]

Konstantin Shvachko commented on HADOOP-2815:
---------------------------------------------

Another simple solution would be to keep the temporary files open and never close them. With the new semantics, files under construction are readable by all clients. When the client that created (but did not close) a file dies, the lease expires in due time and HDFS automatically removes the file. My only concern is whether too many open files would slow down the name-node. How many temp files do you need, and how long do they need to remain open?

> support for DeleteOnExit
> ------------------------
>
>                 Key: HADOOP-2815
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2815
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Olga Natkovich
>
> Pig creates temp files that it wants removed at the end of processing. The code that removes the temp files runs in a shutdown hook so that they get removed both under normal shutdown and when the process gets killed.
> The problem we are seeing is that by the time this code is called, the DFS might already be closed, so the delete fails and leaves the temp files behind. Since we have no control over the shutdown order, we have no way to make sure the files get removed.
> One way to solve this would be to allow files to be marked as temporary so that Hadoop itself can remove them during its shutdown.
> The stack trace I am seeing is:
>         at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:158)
>         at org.apache.hadoop.dfs.DFSClient.delete(DFSClient.java:417)
>         at org.apache.hadoop.dfs.DistributedFileSystem.delete(DistributedFileSystem.java:144)
>         at org.apache.pig.backend.hadoop.datastorage.HPath.delete(HPath.java:96)
>         at org.apache.pig.impl.io.FileLocalizer$1.run(FileLocalizer.java:275)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
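
For illustration, a minimal sketch of the shutdown-hook pattern the issue describes, and of why the delete can fail with the stack trace above, might look like the following. The class name, the scratch path, and the overall structure are assumptions for illustration only, not Pig's actual code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TempFileCleanupSketch {
        public static void main(String[] args) throws Exception {
            final Configuration conf = new Configuration();
            final FileSystem fs = FileSystem.get(conf);
            final Path scratch = new Path("/tmp/pig-scratch");  // illustrative path

            // Register a cleanup hook, as the issue describes Pig doing.
            Runtime.getRuntime().addShutdownHook(new Thread() {
                public void run() {
                    try {
                        // By the time this hook runs, the DFS client may already have
                        // been closed by its own shutdown hook; checkOpen() then throws
                        // and the temp files are left behind (see the stack trace above).
                        fs.delete(scratch, true);
                    } catch (Exception e) {
                        // too late in shutdown to recover
                    }
                }
            });

            // ... the job would create and use files under 'scratch' here ...
        }
    }

Because the JVM gives no ordering guarantee between the two shutdown hooks, the cleanup hook can run after the filesystem's own hook has already closed the client, which is exactly the failure the reporter describes.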
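
A sketch of the requested feature from the caller's side, assuming a hypothetical FileSystem.deleteOnExit(Path) method modeled on java.io.File.deleteOnExit (no such method exists in the API discussed in this thread; it is what the issue asks for):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DeleteOnExitSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path scratch = new Path("/tmp/pig-scratch-0001");  // illustrative path

            // Hypothetical call: mark the path so the FileSystem removes it when it
            // closes, i.e. before the DFS client itself shuts down.
            fs.deleteOnExit(scratch);

            // ... create and use the scratch file as part of the job ...
        }
    }

Because the deletion would be performed by the filesystem itself during its own shutdown, it runs while the DFS client is still open, sidestepping the shutdown-ordering problem the issue describes.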