hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: [DISCUSS] Hadoop Security Release off Yahoo! patchset
Date Fri, 14 Jan 2011 01:35:13 GMT
On Thursday, January 13, 2011, Arun C Murthy <acm@yahoo-inc.com> wrote:
>
> On Jan 13, 2011, at 3:34 PM, Todd Lipcon wrote:
>
>
> On Thu, Jan 13, 2011 at 3:05 PM, Arun C Murthy <acm@yahoo-inc.com> wrote:
>
>
> Since this could be applied as a linear set of patches instead of a big
> lump, would there be interest in using this as the 0.20.100 Apache
> release?
> I can take the time to remove any patches that are Cloudera-specific or not
> yet applied upstream.
>
>
>
> Interesting discussion, thanks.
>
> I'm sure it took you a fair amount of work to squash patches (which I tried
> too, btw).
>
>
>
> Yep, I had a great summer ;-)
>
>
>
> That, plus the fact that we would need to do a similar amount of work for
> the 10 or so releases we have done after 0.20.100.3 scares me.
>
>
>
> Sorry, I actually meant 0.20.104.3. Have there been many releases since
> then? That's the last version available on the Yahoo github, and that's the
> version we incorporated/linearized.
>
>
> Yep. I had a great summer! ;-)
>
>
>
> As Nigel and I discussed here, the jumbo patch and an up-to-date
> CHANGES.txt provide almost all of the benefits we seek and allow all of us
> to get this done very quickly to focus on hadoop-0.22 and beyond.
>
>
>
> In my opinion here are the downsides to this plan:
>
>
>
> I agree there are downsides, I think I did point them out at the outset! :)
>
>
> - a mondo "merge" patch is a big pain when trying to do debugging. It may be
> sufficient for a user to look at CHANGES.txt, but I find myself using
> blame/log/etc on individual files to understand code lineage on a daily
> basis. If all of the merge shows up as a big patch it will be very difficult
> (at least the way I work with code) to help users debug issues or understand
> which JIRA a certain regression may have come from.
>
>
>
> Right, no question. Which is why I offered to do this as a background
> activity right after... this ensures that the source of truth is *always*
> a branch in Apache subversion.
>
> I feel that we could get a usable release out the door quickly for our
> users. Also, please remember that almost every patch we have committed is
> available on the relevant jiras. I understand the devs have a problem and I
> feel we can bear with it for a little while. Again, I agree this isn't an
> ideal solution, I'm just trying to expedite the release for the users.
>
>
>
> To clarify my position a bit here - I definitely appreciate your
> volunteering to do the work, and wouldn't *block* the proposal as you've put
> it forth. I just think it will have limited utility for the community by
> being opaque (if contributed as a giant patch) and by not including the sync
> feature which is critical for a large segment of users. Given those
> downsides I'd rather see the effort diverted towards making a killer 0.22
> release that we can all jump on.
>
>
>
> Thanks for understanding.
>
> Again, I completely agree this isn't an ideal situation, but I do hope it
> has a bit more than *limited utility* for our end-users. Who knows, I may be
> hopelessly deluded! *smile*
>
> Also, I'm trying to do exactly what you suggested - spend very little time
> on this so that everyone, including me, can focus on the future.
>
> thanks,
> Arun
>

Given that Todd has already done the work to rebase the 0.20.104.3
patch set on 0.20.2, in a way that doesn't require one big change,
and that his patch set includes branch20-append, which the HBase guys want
an Apache release of, wouldn't it make sense to go this route? What do
others think? It seems better to have one 0.20.100 release than multiple
ones for security and append.

Thanks,
Eli
