hive-dev mailing list archives

From Joydeep Sen Sarma
Subject RE: branching Hive and getting to first release
Date Tue, 10 Mar 2009 16:36:49 GMT
I am also a little worried about having a lot of releases and managing them. Perhaps what's
clouding my judgement is that there are a lot of critical bugs yet to be fixed - so I don't see
how we can stabilize the first release in a couple of weeks - or even a month (which is what
killed 0.2 I think to some extent).

I would say that the first release is somewhat special. We are fixing a boatload of issues
from a very large push of code (all of it!). In subsequent releases - there wouldn't be as
many bugs - and a faster release cycle would be feasible.

So my vote would be to branch now (before predicate push down), get the release stable as
fast as possible (but potentially wait as long as it takes) - and only then start cutting
more branches. Over time - we can converge to a faster release cycle - but right now this
seems dubious to me.

Can't put a newborn into kindergarten directly man .. :-)

-----Original Message-----
From: Johan Oskarsson
Sent: Tuesday, March 10, 2009 3:43 AM
Subject: Re: branching Hive and getting to first release

I'm worried that trying to create a new release every other week will be
too often. Isn't there a risk that we're still fixing bugs in 0.3 when
the 0.5 branch is cut if we run into something unexpected?
It seems Hadoop is suffering from this issue a bit lately even though
they branch quarterly: 0.19 still has lots of issues open while people
are committing patches to 0.21 (trunk). Granted, Hadoop is a much larger
codebase with more patches applied.

That said, I won't oppose trying the period suggested and seeing how it
goes; it's quite easy to change after all.


Ashish Thusoo wrote:
> For 0.2 we had set a feature freeze date on the 28th of Jan and, as I had
> mentioned in the previous email, the plan was to cut a branch on the last
> Wednesday of every month and then issue a vote for making it a release once
> it ran satisfactorily (no blocker bugs) for at least 2 weeks @ Facebook.
> Accordingly, I was hoping that we would limit the changes going into the
> branch (0.2 in this case) to blocker bugs only, but it seems that we had
> some feature creep and as a result we switched to using trunk at Facebook
> without giving 0.2 sufficient time to stabilize. It also means that perhaps
> waiting a month for each release is too long at this stage, at least for FB.
> If others are in agreement, how about we do the following going forward..
>
> Cut a branch every other Wednesday, check only the most critical blocker
> bug fixes into the branch, reserve features for trunk (to be picked up in
> the next branch), and religiously deploy only versions of the branch at FB.
> We can start off a vote to make a branch an official release once we have
> at least 2 weeks of running on the branch without any blocker bugs (i.e. we
> did not have a need to upgrade the production machines at FB).
>
> We can start off by creating a 0.3 branch this Wednesday accordingly...
>
> Once we have an agreement on this we can document this procedure on the
> wiki and religiously follow it. Without controlling the tendency toward
> feature creep it would be difficult to get a stable version out...
>
> Thoughts?
>
> Ashish
> -----Original Message-----
> From: Johan Oskarsson
> Sent: Tuesday, March 03, 2009 2:54 AM
> To:
> Subject: Re: branching Hive and getting to first release
>
> To be honest I must've missed that 0.2 was branched (I found the email now
> though). Was there a feature freeze date set?
>
> After branching, shouldn't we have moved the non-critical issues to 0.3 and
> pushed to fix the remaining bugs in order to release?
>
> That aside, I don't have a strong opinion on whether the next release is
> 0.2 or 0.3, since there hasn't been an Apache release yet. How about setting
> a feature freeze date now and taking it from there?
>
> /Johan
> Joydeep Sen Sarma wrote:
>> Hey folks,
>>
>> A few of us were chatting earlier today (some Facebook and Cloudera folks)
>> on the best approach to get to a first Hive release.
>>
>> While 0.2 has been branched - it seems awkward to base the first release
>> on it. The reason is twofold:
>>
>> - new changes to trunk since 0.2 have been relatively contained AFAIK (so
>>   no added instability). As evidence - Facebook has reverted to running
>>   trunk in production for the last week or so.
>> - the changes that have gone into trunk since 0.2 are extremely important
>>   from a performance perspective. This includes the LazySerDe that Zheng
>>   added and the upcoming hive-232.
>>
>> So one proposal is to branch 0.3 at this point and try to make that the
>> first official release for Hive.
>>
>> This does look a little haphazard - and the natural question is whether we
>> can stick to this (or we end up repeating this once we throw in some more
>> goodies). The feeling is that this may be a good time - hive-279 has major
>> changes to the hive compiler and branching 0.3 before those changes are
>> checked in gives us a good chance of producing a stable release with good
>> performance (and the major changes will probably prevent us from repeating
>> this trick going forward :)).
>>
>> What do people think?
>>
>> Joydeep
