zookeeper-dev mailing list archives

From Michael Han <h...@cloudera.com>
Subject Re: RC1 issues (was: Re: [VOTE] Apache ZooKeeper release 3.5.2-alpha candidate 1)
Date Mon, 04 Jul 2016 21:01:12 GMT
Both the Java and C unit tests shipped with 3.5.2-alpha passed for me in 5 runs.
Are the failed tests deterministically reproducible? If not, it seems we
have more flaky tests related to threading / timing that need to be taken
care of, and they don't sound like blockers for the release to me.

On Sun, Jul 3, 2016 at 9:48 PM, Rakesh Radhakrishnan <rakeshr@apache.org>
wrote:

> >> I'm suggesting it as a blocker for 3.5.3; I think we should proceed with
> 3.5.2 as is and give some love to the C client in the next release.
>
> Since the current release is alpha I also feel it's OK to go ahead with RC1
> and address the C client issue in 3.5.3. That way we'll get more folks
> trying it out and stabilize the 3.5 version eventually. I will probably listen to
> others' opinions as well.
>
> -Rakesh
>
> On Mon, Jul 4, 2016 at 12:32 AM, Flavio Junqueira <fpj@apache.org> wrote:
>
> >
> > > On 03 Jul 2016, at 17:53, Chris Nauroth <cnauroth@hortonworks.com> wrote:
> > >
> > > For my part, I got a successful full test run from RC1 before starting
> > > the [VOTE].  The problem with the silent failure of multi tests could have
> > > snuck past me easily though.  (Flavio, thank you for filing
> > > ZOOKEEPER-2463.)  I'm curious to hear test results from others who are
> > > trying RC1.
> >
> > The test failures seem to be related to test timing, not bugs, but I
> > haven't been able to confirm for the last two I mentioned. Granted that
> > timing is in some sense a bug, all I'm saying is that it doesn't seem to
> > indicate a regression or anything.
> >
> > >
> > > It looks like we also need an issue to track updating the copyright
> > > notice in the docs.  I don't believe this is an ASF compliance problem in the
> > > same way that an erroneous NOTICE file would be, so I propose that we
> > > address it in 3.5.3.
> >
> > Agreed, we need an issue for that.
> >
> > >
> > > Flavio, you suggested filing a blocker for the ZooKeeperQuorumServer.cc
> > > failure.  Did you want that targeted to 3.5.2 or 3.5.3?
> > >
> >
> > I'm suggesting it as a blocker for 3.5.3; I think we should proceed with
> > 3.5.2 as is and give some love to the C client in the next release.
> >
> > > Overall, how are people feeling about the RC1 [VOTE] at this point?  Is
> > > anyone considering a -1, or shall we proceed (keeping in mind it's an
> > > alpha) with the intent of fixing things in a more rapid 3.5.3 release
> > > cycle?
> >
> > I'd say we proceed.
> >
> > -Flavio
> >
> > >
> > >
> > >
> > > On 7/3/16, 8:43 AM, "Flavio Junqueira" <fpj@apache.org> wrote:
> > >
> > >> The issue with the TestReconfigServer test is that the client port is
> > >> still in use and we get a bind exception, which prevents the server from
> > >> starting. To verify this locally, I simply added some code to retry and
> > >> it works fine with that fix. Going forward we need a better fix.
> > >>
> > >> I haven't been able to figure out yet the issue with the
> > >> Zookeeper_simpleSystem tests.
> > >>
> > >> I have also found something strange with the multi tests. I have created
> > >> ZK-2463 for this problem and made it a blocker for 3.5.3.
> > >>
> > >> -Flavio
> > >>
> > >>> On 03 Jul 2016, at 15:25, Flavio Junqueira <fpj@apache.org> wrote:
> > >>>
> > >>> I have spun up a new Ubuntu VM to check the C failures. I get three
> > >>> failures with the new installation:
> > >>>
> > >>> Zookeeper_simpleSystem::testFirstServerDown : assertion : elapsed 10911
> > >>> tests/TestClient.cc:411: Assertion: equality assertion failed
> > >>> [Expected: -101, Actual  : -4]
> > >>> tests/TestClient.cc:322: Assertion: assertion failed [Expression:
> > >>> ctx.waitForConnected(zk)]
> > >>> Failures !!!
> > >>> Run: 43   Failure total: 2   Failures: 2   Errors: 0
> > >>>
> > >>>
> > >>>
> > >>> TestReconfigServer::testRemoveFollower/usr/bin/java
> > >>> ZooKeeper JMX enabled by default
> > >>> Using config: ./../../build/test/test-cppunit/conf/0.conf
> > >>> Starting zookeeper ... FAILED TO START
> > >>> zktest-mt: tests/ZooKeeperQuorumServer.cc:61: void
> > >>> ZooKeeperQuorumServer::start(): Assertion `system(command.c_str()) == 0'
> > >>> failed.
> > >>> /bin/bash: line 5: 47059 Aborted                 (core dumped)
> > >>> ZKROOT=./../.. CLASSPATH=$CLASSPATH:$CLOVER_HOME/lib/clover.jar
> > >>> ${dir}$tst
> > >>>
> > >>> -Flavio
> > >>>
> > >>>
> > >>>> On 03 Jul 2016, at 15:19, Edward Ribeiro <edward.ribeiro@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>> Hi Flavio,
> > >>>>
> > >>>>
> > >>>> On Sun, Jul 3, 2016 at 5:54 AM, Flavio Junqueira <fpj@apache.org> wrote:
> > >>>> Hey Eddie,
> > >>>>
> > >>>> A few comments on your points:
> > >>>>
> > >>>>>
> > >>>>> - the copyright notice is still dated "2008-2013". Is it worth
> > >>>>> updating to the current year?
> > >>>>
> > >>>> Where are you seeing this? The NOTICE file is correct from what I can
> > >>>> see.
> > >>>>
> > >>>> Oops, sorry. I was referring to the PDFs and HTMLs in the docs/
> > >>>> folder. Even after running "ant docs" the footnote has "2008-2013"
> > >>>> copyright. Images attached.
> > >>>>
> > >>>>
> > >>>>
> > >>>>> - I consistently ran into a test error equal to the one at
> > >>>>> https://builds.apache.org/job/ZooKeeper-trunk/2982/console
> > >>>>
> > >>>> I think this is ZK-2152, which Chris has moved to 3.5.3, so even
> > >>>> though it isn't ideal, it is expected.
> > >>>>
> > >>>> Got it. :)
> > >>>>
> > >>>>> - Also this one:
> > >>>>>
> > >>>>> https://mail-archives.apache.org/mod_mbox/zookeeper-dev/201601.mbox/%3C1279938263.1283.1453526737790.JavaMail.jenkins@crius%3E
> > >>>>>
> > >>>>
> > >>>> I don't know if there is a jira for this one. If not, better create
> > >>>> one and make it a blocker.
> > >>>>
> > >>>> Okay, gonna look for one and do this.
> > >>>>
> > >>>>
> > >>>>> - In fact, there were 14 failing tests total (I suspect all of them
> > >>>>> related to the C tests). Any ideas? A couple of flaky tests?
> > >>>>>
> > >>>>>
> > >>>>
> > >>>> In general, having a release with so many tests failing is bad. I
> > >>>> didn't get these test failures, so it would be great to report them or
> > >>>> make sure that there are jiras for them.
> > >>>>
> > >>>> Right. I was only skeptical of my own tests because I ran the unit
> > >>>> tests on a relatively old Ubuntu version, even though it was Java 1.7.
> > >>>> So, I will run the tests on a newer Linux soon just to make sure it
> > >>>> was not a false negative.
> > >>>>
> > >>>>
> > >>>>
> > >>>> Test failures are possibly an indication that something is bad with
> > >>>> the RC, so I wouldn't have +1'ed it if I had observed all those. It might
> > >>>> be ok given that this is still labeled alpha.
> > >>>>
> > >>>> Excuse me. I only +1'ed because I suspect the errors are restricted
> > >>>> to the C binding and my Ubuntu version, etc. But I should have
> > >>>> researched further before giving +1, nevertheless. Point taken. :)
> > >>>>
> > >>>> Edward
> > >>>
> > >>
> > >>
> > >
> >
> >
>



-- 
Cheers
Michael.
