Date: Tue, 7 Mar 2006 04:52:39 +0000 (GMT)
From: "eric baldeschwieler (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-66) dfs client writes all data for a chunk to /tmp
Message-ID: <1501471532.1141707159970.JavaMail.jira@ajax>
In-Reply-To: <549957597.1141689449195.JavaMail.jira@ajax>

    [ http://issues.apache.org/jira/browse/HADOOP-66?page=comments#action_12369159 ]

eric baldeschwieler commented on HADOOP-66:
-------------------------------------------

The problem with /tmp is that it can fill up and cause failures. This is very config / install specific. We almost never use /tmp because it gets blown out by something sometime, always when you least expect it.

Maybe we should throw by default and provide a config option to do something else, such as supplying a file path for temp files? This could be in /tmp if you choose, or map reduce could default to the temp directory where it stores everything else.

Performance is clearly not an issue if this is truly an exceptional case.

> dfs client writes all data for a chunk to /tmp
> ----------------------------------------------
>
>          Key: HADOOP-66
>          URL: http://issues.apache.org/jira/browse/HADOOP-66
>      Project: Hadoop
>         Type: Bug
>   Components: dfs
>     Versions: 0.1
>     Reporter: Sameer Paranjpye
>      Fix For: 0.1
>
> The dfs client writes all the data for the current chunk to a file in /tmp; when the chunk is complete, it is shipped out to the Datanodes. This can cause /tmp to fill up fast when a lot of files are being written.
> A potentially better scheme is to buffer the written data in RAM (application code can set the buffer size) and flush it to the Datanodes when the buffer fills up.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
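
For illustration only, the RAM-buffering scheme suggested in the quoted issue description might look roughly like the sketch below. It is not the Hadoop dfs client code: the class name RamBufferedOutputStream and the Sink interface (a stand-in for "ship the bytes to the Datanodes") are hypothetical names made up for this example, and the buffer size is simply a constructor argument.

```java
import java.io.IOException;
import java.io.OutputStream;

/**
 * Hypothetical sketch: keeps written bytes in a fixed-size RAM buffer
 * and hands the buffer to a sink whenever it fills up.
 */
class RamBufferedOutputStream extends OutputStream {

    /** Hypothetical stand-in for shipping bytes to the Datanodes. */
    interface Sink {
        void send(byte[] data, int length) throws IOException;
    }

    private final byte[] buffer;
    private final Sink sink;
    private int count = 0; // number of valid bytes currently in the buffer

    RamBufferedOutputStream(int bufferSize, Sink sink) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.buffer = new byte[bufferSize];
        this.sink = sink;
    }

    @Override
    public void write(int b) throws IOException {
        buffer[count++] = (byte) b;
        if (count == buffer.length) {
            flush(); // buffer is full: ship it out instead of spilling to disk
        }
    }

    @Override
    public void flush() throws IOException {
        if (count > 0) {
            sink.send(buffer, count);
            count = 0; // the same array is reused for the next batch
        }
    }

    @Override
    public void close() throws IOException {
        flush(); // ship whatever is left in a partially filled buffer
    }
}
```

Because the buffer array is reused after each flush, a real sink would have to consume or copy the bytes before returning. Nothing here touches /tmp; the cost is bufferSize bytes of heap per open stream.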