hadoop-common-dev mailing list archives

From Doug Judd <d...@zvents.com>
Subject Re: Hadoop 0.19.1
Date Tue, 03 Feb 2009 02:18:29 GMT
Comments inline ...

On Mon, Feb 2, 2009 at 4:23 PM, Konstantin Shvachko <shv@yahoo-inc.com> wrote:

> > What do you recommend?
>
> In general, there may be people or organizations that will not compromise
> on reduced functionality in favor of stability, which is understandable.
> I would propose creating a separate (unofficial, experimental) branch that
> would track changes like HADOOP-4379. The branch may later either die when
> the main stream is fixed, or be merged into trunk if the changes prove to
> be stable.

Sure, that sounds reasonable.  One thing I would caution against is spending
a lot of time doing incremental patchwork on something that needs a
ground-up overhaul.  I would much rather wait a couple of months longer and
get software that is based on a well thought out design that is
fundamentally sound.  Ultimately that will be the fastest path to stability.

> > 1. the file length (as returned by getFileStatus) is incorrect
>
> Maybe the following workaround will be useful. If you read from a file,
> always try to read more data than the length reported by the name-node.
> How much more? The size of one block would be enough, or even up to the
> next (ceiling) block boundary.
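The arithmetic behind the suggested workaround can be sketched as follows. This is only an illustration of the read limit, not Hadoop code; `reported_len` and `block_size` are hypothetical names standing in for the NameNode-reported file length and the HDFS block size:

```python
def read_limit(reported_len: int, block_size: int) -> int:
    """Upper bound of bytes to attempt reading: go past the
    possibly stale, NameNode-reported length, up to the next
    block boundary."""
    # Ceiling the reported length to the next block boundary.
    boundary = -(-reported_len // block_size) * block_size
    # If the reported length already sits exactly on a boundary,
    # attempt one extra block so we still read past it.
    return boundary if boundary > reported_len else reported_len + block_size
```

For example, with a 64 MB block size and a reported length of 100 MB, you would attempt reads up to the 128 MB boundary; actual reads past the true end of file simply return fewer bytes (or EOF).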

I could certainly implement a workaround, however, from an API standpoint,
the filesystem (IMHO) should always give you a way to obtain the real length
of the file.  The semantics of the current getFileStatus() make it difficult
to reason about the state of your filesystem.  It basically returns a
"possibly stale" version of the length.  I would prefer to wait for an
implementation that gives an accurate answer and spend my time and energy
helping to test that one, rather than spending a bunch of time implementing
a workaround for the current version.

> > 2. When an application comes up after a crash, it seems to hang for about 60
>
> Don't have enough context on that, sorry.

I spoke too soon on this.  The reason HDFS was hanging on lease recovery was
that I was opening the file in append mode to force lease recovery (at
Dhruba's suggestion) so that it would update the NameNode with the proper
length.  If I had a way to obtain the accurate length of the file, I
wouldn't need to do this, so I didn't bother filing an issue.

- Doug

> Thanks,
> --Konstantin
>
> Doug Judd wrote:
>> Sounds good.  I would much rather wait and have fsync() done correctly in
>> 0.20 than get some sort of hacked version in 0.19.  I'll create a couple
>> of issues and mark them for 0.20.  Thanks.
>>
>> - Doug
>>
>> On Mon, Feb 2, 2009 at 1:51 PM, Owen O'Malley <omalley@apache.org> wrote:
>>
>>> On Feb 2, 2009, at 12:51 PM, Doug Judd wrote:
>>>
>>>> What do you recommend?  Is there any way we could get these two issues
>>>> fixed for 0.19.1, or should I file issues for them and get them on the
>>>> schedule for 0.19.2?
>>>
>>> Given the outstanding problems and general level of uncertainty, I'd
>>> favor releasing a 0.19.1 with the equivalent of the 0.18.3 disable on
>>> fsync and append. Let's get them fixed in 0.20 first and then we can
>>> debate whether the rewards of pushing them back into an 0.19.2 would
>>> make sense. I'm pretty uncomfortable at the moment with how the entire
>>> functional complex seems to cause a continuous stream of problems.
>>>
>>> -- Owen
