Subject: Re: Will blocks of an unclosed file get lost when HDFS client (or the HDFS cluster) crashes?
From: Allen Wittenauer
Date: Mon, 14 Mar 2011 09:21:55 -0700
To: hdfs-user@hadoop.apache.org
Message-Id: <31847860-7699-4AA3-9975-8DAD66B76A44@apache.org>

No.
If a close hasn't been committed to the file, the associated blocks/files disappear in both client crash and namenode crash scenarios.

On Mar 13, 2011, at 10:09 PM, Sean Bigdatafun wrote:

> I meant an HDFS chunk (the size of 64 MB), and I meant version 0.20.2
> without the append patch.
>
> I think that even without the append patch, the previous 64 MB blocks (in my
> example, the first 5 blocks) should be safe. Isn't that so?
>
>
> On 3/13/11, Ted Dunning wrote:
>> What do you mean by block? An HDFS chunk? Or a flushed write?
>>
>> The answer depends a bit on which version of HDFS / Hadoop you are using.
>> With the append branches, things happen much more like what you expect.
>> Without that version, it is difficult to say what will happen.
>>
>> Also, there are very few guarantees about what happens if the namenode
>> crashes. There are some provisions for recovery, but none of them really
>> have any sort of transactional guarantees. This means that there may be
>> some unspecified time before the writes you have done are actually
>> persisted in a recoverable way.
>>
>> On Sun, Mar 13, 2011 at 9:52 AM, Sean Bigdatafun wrote:
>>
>>> Let's say an HDFS client starts writing a file A (which is 10 blocks
>>> long) and 5 blocks have been written to datanodes.
>>>
>>> At this time, if the HDFS client crashes (apparently without a close
>>> op), will we see 5 valid blocks for file A?
>>>
>>> Similarly, at this time if the HDFS cluster crashes, will we see 5
>>> valid blocks for file A?
>>>
>>> (I guess both answers are yes, but I'd like some confirmation :-)
>>> --
>>> --Sean
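[Editor's note: the close-commits semantics described in this thread can be sketched with a toy model. This is illustrative Python only, not HDFS internals or the real Hadoop API; the `ToyNameNode` class and all of its method names are invented for the example. The point it shows is the one Allen makes above: in pre-append 0.20.2, blocks written under an open lease become part of the namespace only when close() commits them, so a client crash before close drops them.]

```python
# Toy model (NOT real HDFS code) of "blocks become visible only at close".

class ToyNameNode:
    """Tracks committed files; in-flight writes live only under a lease."""

    def __init__(self):
        self.committed = {}   # filename -> list of committed block ids
        self.leases = {}      # filename -> blocks written but not yet closed

    def create(self, name):
        self.leases[name] = []

    def add_block(self, name, block_id):
        # Datanodes hold the bytes, but the namespace entry is still pending.
        self.leases[name].append(block_id)

    def close(self, name):
        # Only at close does the block list become part of the namespace.
        self.committed[name] = self.leases.pop(name)

    def crash_client(self, name):
        # Abandoned lease: the not-yet-committed blocks are dropped.
        self.leases.pop(name, None)

    def visible_blocks(self, name):
        return self.committed.get(name, [])


nn = ToyNameNode()
nn.create("/fileA")
for i in range(5):                  # 5 of the 10 blocks get written
    nn.add_block("/fileA", i)
nn.crash_client("/fileA")           # client dies before close
print(nn.visible_blocks("/fileA"))  # prints [] : the 5 blocks are not visible
```

In the append branches, an hflush/sync-style call would roughly correspond to committing the lease's blocks early, which is why the answer there is closer to what Sean expects.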