hadoop-common-dev mailing list archives

From Konstantin Shvachko <...@yahoo-inc.com>
Subject Re: Hadoop 0.19.1
Date Mon, 02 Feb 2009 19:59:22 GMT
Raghu, thanks for providing the link.

Jim> Are you proposing disabling both append and sync?

Jim, this statement is probably too strong.
sync is not disabled per se; you will be able to use it, although
its full semantics are not guaranteed in some failure scenarios.
See more here
https://issues.apache.org/jira/browse/HADOOP-4663#action_12661802

We will have to really disable append (throw UnsupportedOperationException)
because otherwise the current solution may lead to a loss of previously
existing data.
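
To illustrate what disabling append at the API level could look like, here is
a minimal sketch. It is not the actual HDFS code: the class
AppendDisabledSketch, its in-memory map, and the appendEnabled flag are all
hypothetical names chosen for the example. The only point it shows is that a
rejected append fails fast and leaves previously written data untouched.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical in-memory stand-in for a file system; not a Hadoop class.
public class AppendDisabledSketch {
    private final Map<String, ByteArrayOutputStream> files = new HashMap<>();
    private final boolean appendEnabled; // assumed switch, for illustration only

    public AppendDisabledSketch(boolean appendEnabled) {
        this.appendEnabled = appendEnabled;
    }

    // Creates (or overwrites) a file and returns a stream to write into it.
    public ByteArrayOutputStream create(String path) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        files.put(path, out);
        return out;
    }

    // Rejects the call before touching any existing data when append is off.
    public ByteArrayOutputStream append(String path) {
        if (!appendEnabled) {
            throw new UnsupportedOperationException("append is disabled");
        }
        ByteArrayOutputStream out = files.get(path);
        if (out == null) {
            throw new NoSuchElementException("no such file: " + path);
        }
        return out;
    }

    // Returns a copy of the bytes written to the file so far.
    public byte[] read(String path) {
        ByteArrayOutputStream out = files.get(path);
        if (out == null) {
            throw new NoSuchElementException("no such file: " + path);
        }
        return out.toByteArray();
    }
}
```

A caller that relied on append would see the exception at the call site
instead of a silent failure later, which is the incompatible change being
discussed.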

I agree with Nigel that there is a need for an urgent 0.19.1 release
because a lot of bugs have been fixed since 0.18.2 and 0.19.0.
The system is now stable on our clusters with 0.18.3; the same fixes went
into 0.19.1.

If we try to rush fixing the bugs for append (listed in my comment) we
risk destabilizing the system again, and this is my main concern.

Formally we should not release until a feature is fixed, but I think it
is better to let people use a stable release with limited functionality
rather than having full functionality with a risk of data loss.

Hope this will work for everybody.
--Konstantin


Raghu Angadi wrote:
> Raghu Angadi wrote:
>>> Is it there also that we would find what is involved making append
>>> work in 0.19.1?
>>
>> If one knew what is enough to fix properly, it would be easy. But over
>> the last couple of months, there have been many fixes (some of these
>> jiras are listed in one of Konstantin's comments in HADOOP-4663).
> 
> Konstantin's comment I referred to (it was also linked from HADOOP-4663, 
> but harder to find).
> 
> https://issues.apache.org/jira/browse/HADOOP-5027#action_12668136
> 
> Raghu.
> 
>> The discussions are still bringing up more cases where the
>> implementation or algorithm should change. These are improvements
>> for sure, but I doubt I would be ready to call it 'completely
>> fixed'. It needs time and a lot of testing in large clusters.
>>
>> Personally I am +1 for getting these into the 0.19 branch. Most
>> importantly, even clusters and applications not using append or sync
>> were affected; that's why the extra caution.
>>
>> my 2 cents. hope this does not digress too much from the main topic.
>>
>> Raghu.
>>
>>> Thanks,
>>> St.Ack
>>>
>>>
>>>
>>> On Fri, Jan 30, 2009 at 2:36 AM, Steve Loughran <stevel@apache.org> wrote:
>>>
>>>> Nigel Daley wrote:
>>>>
>>>>> Folks,
>>>>>
>>>>> Some Hadoop deployments have upgraded to 0.19.0.  Clearly, the 0.19
>>>>> branch has issues and a 0.19.1 release is needed.
>>>>>
>>>>> Quality issues in the changes made for the file append feature have
>>>>> prevented some from deploying Hadoop 0.19.  One of these changes
>>>>> (sync) has now been "fixed" by reducing its semantics in Hadoop
>>>>> 0.18.3 (HADOOP-4997).  This was necessary to stabilize the 0.18
>>>>> branch.
>>>>>
>>>>> I would like to propose that we apply this same "fix" to sync in
>>>>> 0.19.1 and 0.20.0.  Since append requires the full semantics of
>>>>> sync, I propose we also disable append (perhaps throw
>>>>> UnsupportedOperationException from API?).  Yes, this would
>>>>> unfortunately be an incompatible change between 0.19.0 and 0.19.1.
>>>>> We can then take the time needed to fix append properly in 0.21.0.
>>>>>
>>>> I can see some people being unhappy about this, but giving them a
>>>> choice between having the filesystem work or not, hopefully they
>>>> will see the merits of the change. And I am +1 to taking time to fix
>>>> things; fast fixes often create new problems
>>>>
>>>
>>
> 
> 
