hadoop-general mailing list archives

From Arun C Murthy <...@yahoo-inc.com>
Subject Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset
Date Tue, 18 Jan 2011 09:59:27 GMT
Thanks for the clarifications Roy.

I considered both b) and c).

As I mentioned, the reason I think b) wasn't useful in this context is  
that we have, in several cases, 5-6 patches per jira (bug-fix on
top of a bug-fix) and several jiras didn't make sense for trunk since  
the bug didn't exist in trunk etc. etc. Also, I was considering a  
scenario where I would squash relevant patches together to produce a  
minimal set of coherent patches. Then there is work to remove Yahoo!  
specific commits.

IAC, I agree - we've spent too much time talking and too little doing
actual work. Let me get the job done, and folks can then weigh in on
the release at a later point. Folks might be willing to consider this
more positively once they see the branch, the change-log, etc.

Of course we need to get the small number of remaining patches into  
trunk asap for 0.22 and beyond.

Arun

On Jan 18, 2011, at 12:20 AM, Roy T. Fielding wrote:

> I thought that this discussion would have reached some sensible
> understanding of how Apache projects work by now, but it seems not.
>
> On Jan 13, 2011, at 6:12 PM, Arun C Murthy wrote:
>> The version I'm offering to push to the community has fixed all of  
>> them, *plus* the added benefit of several stability and performance  
>> fixes we have done since 20.104.3, almost 10 internal releases.  
>> This is a battle tested and hardened version which we have deployed  
>> on 40,000+ nodes. It is a significant upgrade on 0.20.104.3 which  
>> we never deployed. I'm pretty sure *some* users will find that  
>> valuable. ;)
>>
>> Also, I've offered to push individual patches as a background  
>> activity on a branch - that should suffice, no? Or, do you consider  
>> this a blocker?
>>
>> Again, my goal in this exercise is to get a stable, improved  
>> version of Hadoop into the hands of our users asap, and focus on  
>> 0.22 and beyond.
>
> So, you have a bunch of changes that you want to contribute.
> Please do so.  There are several ways:
>
> a) break the changes down into a sequence of patches, create jira
>    issues for each one (or append to the existing issue), and then
>    provide the group with a list of the issue links so that people
>    can quickly +1 each one.  When it seems worthwhile to you, create
>    a branch off of some prior Apache release point in svn and commit
>    each patch to it until the branch is identical to (or, in your own
>    opinion, better than) the source code that you have tested locally.
>    Then RM a tarball and start a release vote.  Since all of this is
>    being done in jira and svn, others can help you do all but the
>    first part (breaking down the big patch).
>
> or
>
> b) create a branch off of some prior Apache release point in svn
>    and replay the internal Y! commits on that branch until the branch
>    source code is identical to what you have tested locally.  Then
>    RM a tarball based on that branch and start a release vote.
>    Since the history is now in svn, others could do the RM bit if
>    you don't have time.
>
> or
>
> c) create a branch off of some prior Apache release point in svn
>    and apply one big ugly patch to it.  Then RM a tarball based
>    on that branch and ask for a release vote.
>
> You will note that none of the above requires a discussion on this
> list prior to the release vote, though (a) would likely result in
> more +1s than (b), and (b) would likely receive more +1s than (c).
> Regardless, the release vote is a lazy majority decision.
>
> I do not believe that there is any rational reason to apply a
> single big patch.  "It takes too long" is nonsense -- you have
> already spent far more time discussing it than would be required
> to do it.  "It is too hard" is also nonsense -- use your version
> control system to extract the set of changes and just replay them
> (with appropriate changelog editing).  "It has already been tested
> at Y!" is simply irrelevant -- the source code has been tested, not
> the order in which the patches have been applied, so all you should
> care about is that the final branch code is comparable to the tested
> source code (i.e., use diff).
>
> Nevertheless, all contributions at Apache are voluntary.  Do what
> you have time for, now, with the understanding that others may or
> may not complete the task, and may or may not vote for the release.
>
> You can make a branch, apply the big patch, and stand by
> while the rest of the group chooses whether to just accept it
> as a big change.  Someone else might create a parallel branch to
> apply the specific changes in an orderly manner, or perhaps you'll
> discover an easy way to do that a few days from now.  Or it
> might just sit there and never be released.
>
> There is no need for the group to agree to a plan up front, just
> as there is no need for the group to approve a release just because
> someone did the work of RMing a tarball.  Sure, it might save
> a lot of time if potential disagreements can be resolved before
> work is done, but the fact is that people tend to disagree less
> with actual work products than with abstract plans.  After all,
> everyone has a plan.  It is also far easier to convince people
> to fix their own problems if the problem is right in front of them.
>
> When the release vote happens, encourage folks to test and +1
> the release.  If it passes, woohoo!  If not, then listen to the
> reasons given by the other PMC members and see if you can make
> enough changes to the release to get those extra +1s.
>
> In other words, collaborate.
>
> ....Roy

