Return-Path:
Delivered-To: apmail-hadoop-core-dev-archive@www.apache.org
Received: (qmail 74327 invoked from network); 27 Feb 2008 11:58:48 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 27 Feb 2008 11:58:48 -0000
Received: (qmail 20039 invoked by uid 500); 27 Feb 2008 11:58:43 -0000
Delivered-To: apmail-hadoop-core-dev-archive@hadoop.apache.org
Received: (qmail 19682 invoked by uid 500); 27 Feb 2008 11:58:41 -0000
Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: core-dev@hadoop.apache.org
Delivered-To: mailing list core-dev@hadoop.apache.org
Received: (qmail 19667 invoked by uid 99); 27 Feb 2008 11:58:41 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 27 Feb 2008 03:58:41 -0800
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 27 Feb 2008 11:58:03 +0000
Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 25E3829A0017 for ; Wed, 27 Feb 2008 03:57:51 -0800 (PST)
Message-ID: <1397018937.1204113471154.JavaMail.jira@brutus>
Date: Wed, 27 Feb 2008 03:57:51 -0800 (PST)
From: "Hemanth Yamijala (JIRA)"
To: core-dev@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-2899) hdfs:///mapredsystem directory not cleaned up after deallocation
In-Reply-To: <477584133.1204047651525.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Virus-Checked: Checked by ClamAV on apache.org

    [ https://issues.apache.org/jira/browse/HADOOP-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572885#action_12572885 ]

Hemanth Yamijala commented on HADOOP-2899:
------------------------------------------

Possible solutions:

1. The directory is world-readable. In HADOOP-1873, Doug commented that this should not be world-readable, so I guess this is not an option, for some reason.
2. JobTracker can delete the directory when it shuts down. This seems, IMO, the best solution. Whoever creates, deletes.
3. HOD can create a directory that's specific to the user, something like mapredsystem/user-name.clusterid, where clusterid could be something like a torque jobid. This may create a huge number of directories in HDFS; I don't know if that's an issue. Otherwise, this is probably the *easiest* solution to implement.
4. HOD can delete the mapredsystem directory at deallocation. This seems wrong, because HOD doesn't create it, and further, given our current design, this is very hard to implement.
5. JobTracker can avoid cleaning up the directory at startup. This may not be safe, in case there's a crash or something.

Comments?

> hdfs:///mapredsystem directory not cleaned up after deallocation
> -----------------------------------------------------------------
>
>                 Key: HADOOP-2899
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2899
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hod
>    Affects Versions: 0.16.0
>            Reporter: Luca Telloli
>
> Each submitted job creates an hdfs:///mapredsystem directory, created by (I guess) the hodring process. The problem is that it's not cleaned up at the end of the process; a use case would be:
> - user A allocates a cluster, the hodring is svrX, so a /mapredsystem/svrX directory is created
> - user A deallocates the cluster, but that directory is not cleaned up
> - user B allocates a cluster, and the first node chosen as hodring is svrX, so hodring tries to write hdfs:///mapredsystem but it fails
> - allocation succeeds, but there's no hodring running; looking at
> 0-jobtracker/logdir/hadoop.log under the temporary directory I can read:
> 2008-02-26 17:28:42,567 WARN org.apache.hadoop.mapred.JobTracker: Error starting tracker: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.fs.permission.AccessControlException: Permission denied: user=B, access=WRITE, inode="mapredsystem":hadoop:supergroup:rwxr-xr-x
> I guess a possible solution would be to clean up those directories during the deallocation process.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
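
For illustration only, option 3 in the comment above (a system directory specific to the user and the allocation) could be sketched roughly as below. This is not HOD's actual code: the function name `per_allocation_system_dir`, its parameters, and the example values are all hypothetical, and the sketch assumes the resulting path would then be handed to the JobTracker as its system directory.

```python
import posixpath


def per_allocation_system_dir(user, cluster_id, base="/mapredsystem"):
    """Build an HDFS-style path unique to one user and one allocation.

    Hypothetical helper: it follows the "mapredsystem/user-name.clusterid"
    naming suggested in the comment, where cluster_id might be a torque
    job id.
    """
    if not user or not cluster_id:
        raise ValueError("user and cluster_id must be non-empty")
    # Reject path separators so neither value can escape the base directory.
    for part in (user, cluster_id):
        if "/" in part:
            raise ValueError("path separator not allowed in %r" % part)
    return posixpath.join(base, "%s.%s" % (user, cluster_id))


# Two users allocated on the same node get different directories, so
# user B never has to write into a directory owned by user A.
dir_a = per_allocation_system_dir("userA", "1001.torque")
dir_b = per_allocation_system_dir("userB", "1002.torque")
print(dir_a)
print(dir_b)
```

Because each allocation gets its own directory, the permission error in the quoted log would not occur, at the cost of leaving one directory per allocation behind in HDFS unless something else removes it.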