Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hive.apache.org
Message-ID: <555F6DEB.2070907@gmail.com>
Date: Fri, 22 May 2015 10:56:59 -0700
From: Alan Gates <alanfgates@gmail.com>
User-Agent: Postbox 3.0.11 (Macintosh/20140602)
MIME-Version: 1.0
To: dev@hive.apache.org
Subject: Re: [DISCUSS] Supporting Hadoop-1 and experimental features
References: <55512F4A.2080907@gmail.com>
 <D176B009.160CC%sergey@hortonworks.com>
 <D17A2B3F.2B17E%gopal@hortonworks.com> <55567D30.1020202@gmail.com>
 <D17BD286.19C0A%vgumashta@hortonworks.com>
 <CADx-ob01dLkTpLY54aT1QkYr6tGXVddCAp=75aVH9dadzc1Xdg@mail.gmail.com>
 <5556B74E.8010908@gmail.com>
 <CADx-ob3dOhiejjfxDcmgMuxRR0dBRUuNraKtLsUPR_XMTTeXwQ@mail.gmail.com>
 <CAENxBwz2fnAawBUBnbP43FAJiy_MwMwu-1FC5RnMgAzV90iVfA@mail.gmail.com>
 <D17F7F0F.1756B%sergey@hortonworks.com>
 <D17F81AE.1759B%sergey@hortonworks.com>
 <687310404.202578.1432280952027.JavaMail.yahoo@mail.yahoo.com>
 <D184A173.18590%sergey@hortonworks.com>
 <CAKKt98QY+=rLU8UkDmQP-1srAfcF7Am=WhbiJ7BN0Qn0hChZ8Q@mail.gmail.com>
In-Reply-To: 
 <CAKKt98QY+=rLU8UkDmQP-1srAfcF7Am=WhbiJ7BN0Qn0hChZ8Q@mail.gmail.com>
Content-Type: multipart/alternative;
 boundary="------------040703020601060406030702"

--------------040703020601060406030702
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

I don't think anyone is advocating for option 2, as that would be 
disastrous.  Option 3 is closest to what I'm proposing, though again 
dropping support for Hadoop 1 is only a part of it.

Alan.

> Alexander Pivovarov <mailto:apivovarov@gmail.com>
> May 22, 2015 at 10:03
> Looks like we are discussing 3 options:
>
> 1. Support hadoop 1, 2 and 3 in master branch.
>
> 2. Support hadoop 1 in branch-1, hadoop 2 in branch-2, hadoop 3 in 
> branch-3
>
> 3. Support hadoop 2 and 3 in master
>
> I do not think option 2 is a good solution because it is much more difficult
> to manage 3 active prod branches than one master branch.
>
> I think we should go with options 1 or 3.
>
> +1 on Xuefu's and Edward's opinions
>
> Sergey Shelukhin <mailto:sergey@hortonworks.com>
> May 22, 2015 at 9:08
> I think branch-2 doesn’t need to be framed as particularly adventurous
> (other than due to the general increase in the amount of work done in Hive
> by the community).
> All the new features that normally go on trunk/master will go to branch-2.
> branch-2 is just trunk as it is now; in fact there will be no branch-2,
> just master :) The difference is dropped functionality, not added functionality.
> So you shouldn’t lose stability if you retain the same process as now by
> just staying on versions off master.
>
> Perhaps, as is usually the case in Apache projects, developing features on
> older branches would be discouraged. Right now, all features usually go on
> trunk/master, and are then back ported as needed and practical; so you
> wouldn’t (in Apache) make a feature on Hive 0.14 to be released in 0.14.N,
> and not back port to master.
>
>
> Chris Drome <mailto:cdrome@yahoo-inc.com.INVALID>
> May 22, 2015 at 0:49
> I understand the motivation and benefits of creating a branch-2 where 
> more disruptive work can go on without affecting branch-1. While not 
> necessarily against this approach, from Yahoo's standpoint, I do have 
> some questions (concerns).
> Upgrading to a new version of Hive requires a significant commitment 
> of time and resources to stabilize and certify a build for deployment 
> to our clusters. Given the size of our clusters and scale of datasets, 
> we have to be particularly careful about adopting new functionality. 
> However, at the same time we are interested in testing and making 
> available new features and functionality. That said, we would have to 
> rely on branch-1 for the immediate future.
> One concern is that branch-1 would be left to stagnate, at which point 
> there would be no option but for users to move to branch-2 as branch-1 
> would be effectively end-of-lifed. I'm not sure how long this would 
> take, but it would eventually happen as a direct result of the very 
> reason for creating branch-2.
> A related concern is how disruptive the code changes will be in 
> branch-2. I imagine that changes early in branch-2 will be easy to 
> backport to branch-1, while this effort will become more difficult, if 
> not impractical, as time goes on. If the code bases diverge too much then 
> this could lead to more pressure for users of branch-1 to add features 
> just to branch-1, which has been mentioned as undesirable. By the same 
> token, backporting any code in branch-2 will require an increasing 
> amount of effort, which contributors to branch-2 may not be interested 
> in committing to.
> These questions affect us directly because, while we require a certain 
> amount of stability, we also like to pull in new functionality that 
> will be of value to our users. For example, our current 0.13 release 
> is probably closer to 0.14 at this point. Given the lifespan of a 
> release, it is often more palatable to backport features and bugfixes 
> than to jump to a new version.
>
> The good thing about this proposal is the opportunity to evaluate and 
> clean up a lot of the old code.
> Thanks,
> chris
>
>
>
> On Monday, May 18, 2015 11:48 AM, Sergey Shelukhin 
> <sergey@hortonworks.com> wrote:
>
>
> Note: by “cannot” I mean “are unwilling to”; upgrade paths exist, but some
> people are set in their ways or have practical considerations and don’t
> care for new shiny stuff.
>
>
> Sergey Shelukhin <mailto:sergey@hortonworks.com>
> May 18, 2015 at 11:46
> I think we need some path for deprecating old Hadoop versions, the same
> way we deprecate old Java version support or old RDBMS version support.
> At some point the cost of supporting Hadoop 1 exceeds the benefit. Same
> goes for stuff like MR; supporting it, esp. for perf work, becomes a
> burden, and it’s outdated with 2 alternatives, one of which has been
> around for 2 releases.
> The branches are a graceful way to get rid of the legacy burden.
>
> Alternatively, when sweeping changes are made, we can do what HBase did
> (which is not pretty imho), where the 0.94 version had ~30 dot releases
> because people cannot upgrade to the 0.96 “singularity” release.
>
>
> I posit that people who run Hadoop 1 and MR in this day and age (and more
> so as time passes) are people who don’t care about perf and new
> features, only stability; so a stability-focused branch would be perfect to
> support them.
>
>
>

--------------040703020601060406030702--