Subject: Re: Will blocks of an unclosed file get lost when HDFS client (or the HDFS cluster) crashes?
From: Sean Bigdatafun <sean.bigdatafun@gmail.com>
To: Ted Dunning
Cc: hdfs-user@hadoop.apache.org
Date: Sun, 13 Mar 2011 22:09:20 -0700

I meant an HDFS chunk (the 64 MB block size), and I meant version 0.20.2 without the append patch.

I think even without the append patch, the previous 64 MB blocks (in my example, the first 5 blocks) should be safe. Shouldn't they?

On 3/13/11, Ted Dunning wrote:
> What do you mean by block? An HDFS chunk? Or a flushed write?
>
> The answer depends a bit on which version of HDFS / Hadoop you are using.
> With the append branches, things happen a lot more like what you expect.
> Without that version, it is difficult to say what will happen.
>
> Also, there are very few guarantees about what happens if the namenode
> crashes. There are some provisions for recovery, but none of them really
> have any sort of transactional guarantees. This means that there may be
> some unspecified time before the writes that you have done are actually
> persisted in a recoverable way.
>
> On Sun, Mar 13, 2011 at 9:52 AM, Sean Bigdatafun wrote:
>
>> Let's say an HDFS client starts writing a file A (which is 10 blocks
>> long) and 5 blocks have been written to datanodes.
>>
>> At this time, if the HDFS client crashes (apparently without a close
>> op), will we see 5 valid blocks for file A?
>>
>> Similarly, at this time if the HDFS cluster crashes, will we see 5
>> valid blocks for file A?
>>
>> (I guess both answers are yes, but I'd like some confirmation :-)
>> --
>> --Sean
>

--
--Sean
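[Archive note: the scenario in the thread can be sketched as a toy model. This is not the HDFS API; it is a hypothetical illustration of the pessimistic, pre-append (0.20.2 without the append patch) assumption that fully committed blocks survive a writer crash while the in-flight partial block may be lost.]

```python
# Toy model of block visibility for an unclosed HDFS file.
# Assumption (hypothetical, matching the pre-append discussion above):
# whole blocks already written to datanodes remain readable after a
# client crash; the partial last block is treated as lost because the
# file was never closed.

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the default HDFS block size in 0.20.x


def visible_blocks_after_crash(bytes_written, block_size=BLOCK_SIZE):
    """Return how many whole blocks a reader could expect to see if the
    writer crashes before calling close(), under the pessimistic
    assumption that only completed blocks are recoverable."""
    return bytes_written // block_size


# The thread's example: 5 full blocks written, crash during the 6th.
print(visible_blocks_after_crash(5 * BLOCK_SIZE + 1000))  # -> 5
```

Under this model the first 5 blocks of the 10-block file are visible, which matches the intuition in the question; whether real 0.20.2 actually guarantees this is exactly what the thread leaves open.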