Subject: Re: Will blocks of an unclosed file get lost when HDFS client (or the HDFS cluster) crashes?
From: Allen Wittenauer
Date: Mon, 14 Mar 2011 09:21:55 -0700
To: hdfs-user@hadoop.apache.org
Message-Id: <31847860-7699-4AA3-9975-8DAD66B76A44@apache.org>

No.
If a close hasn't been committed to the file, the associated blocks/files disappear in both client crash and namenode crash scenarios.

On Mar 13, 2011, at 10:09 PM, Sean Bigdatafun wrote:

> I meant an HDFS chunk (the size of 64 MB), and I meant version 0.20.2
> without the append patch.
>
> I think that even without the append patch, the previous 64 MB blocks (in my
> example, the first 5 blocks) should be safe. Isn't that so?
>
>
> On 3/13/11, Ted Dunning wrote:
>> What do you mean by block? An HDFS chunk? Or a flushed write?
>>
>> The answer depends a bit on which version of HDFS / Hadoop you are using.
>> With the append branches, things happen much more like what you expect.
>> Without that version, it is difficult to say what will happen.
>>
>> Also, there are very few guarantees about what happens if the namenode
>> crashes. There are some provisions for recovery, but none of them really
>> have any sort of transactional guarantees. This means that there may be
>> some unspecified time before the writes you have done are actually
>> persisted in a recoverable way.
>>
>> On Sun, Mar 13, 2011 at 9:52 AM, Sean Bigdatafun wrote:
>>
>>> Let's say an HDFS client starts writing a file A (which is 10 blocks
>>> long) and 5 blocks have been written to datanodes.
>>>
>>> At this time, if the HDFS client crashes (apparently without a close
>>> op), will we see 5 valid blocks for file A?
>>>
>>> Similarly, at this time if the HDFS cluster crashes, will we see 5
>>> valid blocks for file A?
>>>
>>> (I guess both answers are yes, but I'd like some confirmation :-)
>>> --
>>> --Sean
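[Editor's note: the close-commits semantics described in this thread can be sketched with a toy model. This is illustrative Python only, not HDFS internals or the real Hadoop API; the `ToyNameNode` class and all of its method names are invented for the example. The point it shows is the one Allen makes above: in pre-append 0.20.2, blocks written under an open lease become part of the namespace only when close() commits them, so a client crash before close drops them.]

```python
# Toy model (NOT real HDFS code) of "blocks become visible only at close".

class ToyNameNode:
    """Tracks committed files; in-flight writes live only under a lease."""

    def __init__(self):
        self.committed = {}   # filename -> list of committed block ids
        self.leases = {}      # filename -> blocks written but not yet closed

    def create(self, name):
        self.leases[name] = []

    def add_block(self, name, block_id):
        # Datanodes hold the bytes, but the namespace entry is still pending.
        self.leases[name].append(block_id)

    def close(self, name):
        # Only at close does the block list become part of the namespace.
        self.committed[name] = self.leases.pop(name)

    def crash_client(self, name):
        # Abandoned lease: the not-yet-committed blocks are dropped.
        self.leases.pop(name, None)

    def visible_blocks(self, name):
        return self.committed.get(name, [])


nn = ToyNameNode()
nn.create("/fileA")
for i in range(5):                  # 5 of the 10 blocks get written
    nn.add_block("/fileA", i)
nn.crash_client("/fileA")           # client dies before close
print(nn.visible_blocks("/fileA"))  # prints [] : the 5 blocks are not visible
```

In the append branches, an hflush/sync-style call would roughly correspond to committing the lease's blocks early, which is why the answer there is closer to what Sean expects.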