From: Nathan Marz
To: core-user@hadoop.apache.org
Subject: _temporary directories not deleted
Date: Tue, 4 Nov 2008 11:46:26 -0800

Hello all,

Occasionally when running jobs, Hadoop fails to clean up the "_temporary" directories it leaves behind. This appears to happen only when a task attempt is killed (for example, a speculative execution attempt), in which case the data that attempt has written so far is not cleaned up.

Is this a known issue in Hadoop? Is the data from the killed task guaranteed to duplicate what another task has already written? Is it safe to delete this directory without risking data loss?

Thanks,
Nathan Marz
Rapleaf