hbase-dev mailing list archives

From Enis Söztutar <e...@apache.org>
Subject Re: HEADSUP: Working on new 0.96.0RC
Date Wed, 09 Oct 2013 23:51:29 GMT
HBASE-9563 is already committed to 0.96. That leaves only HBASE-9696 and
HBASE-9724 under discussion. I am holding on committing 9724 for the time
being. Are there any more issues that might be a blocker against this

After 1.5 years without a major release, and the RC process nearing 40
days, I think we should only accept absolute blockers at this point. As far
as I am concerned, neither 9724 nor 9696 is a blocker against 0.96. Merge
is a new feature, and nothing critical depends on it. We can release saying
that merge is experimental (which was how it was originally introduced, AFAIK)
and disable merge in CM for now if it makes tests flaky. We did not
identify a root cause that would point to 9696 although we have been running
tests with CM for some time. We can still fix the merge and do a quick
0.96.1, in the release train model that proved to be so successful for
0.94. We do not have to delay 0.96 another month just to fix a
corner case for a new feature.

As for our testing, we have been testing the 95 and 96 branches for a
couple of months. We still see some sporadic failures for CM tests, but no
blockers at this point. Most of the issues have been fixed so far. Our
nightlies run ITTBLL, ITLAV, both with and without CM running for ~3 hours,
ITMTTR, and many other ITs. My manual runs for longer intervals also
succeed for now. Remember that none of these ITs would run even once for
earlier versions of 0.94 or before.

Elliott, what are the root causes for the failures you are seeing? There
are no blockers raised as far as I can see. Let's decide on HBASE-9696
whether it is a blocker, and do the new candidate based on that unless
there are more blockers.


On Wed, Oct 9, 2013 at 2:52 PM, Elliott Clark <eclark@apache.org> wrote:

> On Wed, Oct 9, 2013 at 2:33 PM, Devaraj Das <ddas@hortonworks.com> wrote:
> >
> > For the 0.96.0 version, can we not say that "merge" should be used
> > with caution.
> I would feel very uncomfortable with that.  Telling people to just
> hope that the servers don't crash while a merge is going on seems like
> an unwise strategy.  Crashing or power failures are completely beyond
> users' control. Since we have a proposed fix it seems better to me,
> that we hold off on this.  Get the tests done.  Then get the patch in,
> and start another round of testing.
> Also the master not coming back up, while not a known data loss issue
> like 9696 is very concerning.  We should get to the bottom of this.
> It's making TestMTTR fail, along with others sporadically.
> We've taken > 1.5 years on this release and we're on the home stretch.
>  We should make sure this is a really stable and quality release and
> not try and rush it.  Right now we're failing IT tests left and right.
>  We can't even pass an ingest test that lasts 4 hours.  That's
> something I can't see myself recommending to anyone in its current
> state.  So that seems to me something that we shouldn't release.  And
> if we put up an RC now then we just know that it's going to fail IT
> tests and so will probably be a failed RC.
> I want this release out as badly as anyone else but I'd rather we have
> something that people can really and truly trust and not just
> something we have rushed.
