Mailing-List: contact core-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-user@hadoop.apache.org
Received-SPF: neutral (nike.apache.org: local policy)
MIME-Version: 1.0
In-Reply-To: <8373996d0903260521x738d174amd918609453c0bebe@mail.gmail.com>
References: <8373996d0903260521x738d174amd918609453c0bebe@mail.gmail.com>
Date: Thu, 26 Mar 2009 17:41:02 +0100
Message-ID: <d6d7c4410903260941i4e3589e9u2187c4b2b0262f79@mail.gmail.com>
Subject: Re: corrupt unreplicated block in dfs (0.18.3)
From: Aaron Kimball <aaron@cloudera.com>
To: core-user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=0016368e1d2212d90f046608498c

--0016368e1d2212d90f046608498c
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Just because a block is corrupt doesn't mean the entire file is corrupt.
Furthermore, the presence/absence of a file in the namespace is a completely
separate issue from the data in the file. I think it would be a surprising
interface change if files suddenly disappeared just because one out of
potentially many blocks was corrupt.
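
To make the distinction concrete, here is a toy model in Python. It is not
Hadoop code: the Block and ToyNamespace classes, the 4-byte block size, and
the /data/log path are all made up for illustration. It only sketches the
idea that a checksum failure on one block leaves the namespace entry and
the other blocks untouched.

```python
import zlib


class Block:
    """A chunk of file data stored with the CRC32 computed at write time."""

    def __init__(self, data: bytes):
        self.data = data
        self.checksum = zlib.crc32(data)

    def read(self) -> bytes:
        # Verify the stored checksum on every read; fail only for this block.
        if zlib.crc32(self.data) != self.checksum:
            raise IOError("checksum mismatch")
        return self.data


class ToyNamespace:
    """Maps file names to block lists; it never inspects block contents."""

    def __init__(self):
        self.files = {}

    def create(self, name: str, payload: bytes, block_size: int = 4):
        # Split the payload into fixed-size blocks (hypothetical tiny size).
        self.files[name] = [
            Block(payload[i:i + block_size])
            for i in range(0, len(payload), block_size)
        ]

    def exists(self, name: str) -> bool:
        return name in self.files

    def readable_blocks(self, name: str):
        """Return (good_indices, bad_indices) for the blocks of one file."""
        good, bad = [], []
        for i, block in enumerate(self.files[name]):
            try:
                block.read()
                good.append(i)
            except IOError:
                bad.append(i)
        return good, bad


ns = ToyNamespace()
ns.create("/data/log", b"abcdefghijkl")   # three 4-byte blocks
ns.files["/data/log"][1].data = b"eXgh"   # corrupt the middle block by hand
good, bad = ns.readable_blocks("/data/log")
```

In this sketch the file is still listed by ns.exists() after the corruption,
and only the damaged block fails to read; the other two remain recoverable.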

- Aaron

On Thu, Mar 26, 2009 at 1:21 PM, Mike Andrews <mra@xoba.com> wrote:

> i noticed that when a file with no replication (i.e., replication=1)
> develops a corrupt block, hadoop takes no action aside from the
> datanode throwing an exception to the client trying to read the file.
> i manually corrupted a block in order to observe this.
>
> obviously, with replication=1 it's impossible to fix the block, but i
> thought perhaps hadoop would take some other action, such as deleting
> the file outright, or moving it to a "corrupt" directory, or marking
> it or keeping track of it somehow to note that there's un-fixable
> corruption in the filesystem? thus, the current behaviour seems to
> sweep the corruption under the rug and allows its continued existence,
> aside from notifying the specific client doing the read with an
> exception.
>
> if anyone has any information about this issue or how to work around
> it, please let me know.
>
> on the other hand, i tested that corrupting a block in a replication=3
> file causes hadoop to re-replicate the block from another existing
> copy, which is good and is what i expected.
>
> best,
> mike
>
>
> --
> permanent contact information at http://mikerandrews.com
>

--0016368e1d2212d90f046608498c--