curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: CURATOR-3.0 tests
Date Tue, 31 May 2016 22:11:19 GMT
Ok, running it again now.

Is the problem that the watcher clean up for the recipes is done
asynchronously after they are closed?

On Wed, Jun 1, 2016 at 1:35 AM, Jordan Zimmerman <jordan@jordanzimmerman.com
> wrote:

> OK - please try now. I added a loop in the “no watchers” checker. If there
> are remaining watchers, it will sleep a bit and try again.
>
> -Jordan
>
> > On May 31, 2016, at 1:33 AM, Cameron McKenzie <mckenzie.cam@gmail.com>
> wrote:
> >
> > Looks like these failures are intermittent. Running them directly in
> > Eclipse they seem to be passing. I will run the whole thing again in the
> > morning and see how it goes.
> >
> > On Tue, May 31, 2016 at 2:29 PM, Cameron McKenzie <
> mckenzie.cam@gmail.com>
> > wrote:
> >
> >> There are still 2 tests failing for me:
> >>
> >> FAILURE! - in
> >> org.apache.curator.framework.recipes.cache.TestPathChildrenCache
> >>
> testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >> Time elapsed: 17.488 sec  <<< FAILURE!
> >> java.lang.AssertionError: One or more child watchers are still
> registered:
> >> [/test]
> >> at
> >>
> org.apache.curator.framework.imps.TestCleanState.closeAndTestClean(TestCleanState.java:53)
> >> at
> >>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(TestPathChildrenCache.java:707)
> >>
> >> FAILURE! - in
> >>
> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster
> >>
> testCluster(org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster)
> >> Time elapsed: 96.641 sec  <<< FAILURE!
> >> java.lang.AssertionError: expected [true] but found [false]
> >> at org.testng.Assert.fail(Assert.java:94)
> >> at org.testng.Assert.failNotEquals(Assert.java:494)
> >> at org.testng.Assert.assertTrue(Assert.java:42)
> >> at org.testng.Assert.assertTrue(Assert.java:52)
> >> at
> >>
> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster.testCluster(TestInterProcessSemaphoreCluster.java:294)
> >>
> >> Failed tests:
> >>
> >>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>  Run 1: TestPathChildrenCache.testKilledSession:707 One or more child
> >> watchers are still registered: [/test]
> >>  Run 2: PASS
> >>
> >>  TestInterProcessSemaphoreCluster.testCluster:294 expected [true] but
> >> found [false]
> >>
> >> Tests run: 495, Failures: 2, Errors: 0, Skipped: 0
> >>
> >>
> >>
> >>
> >>
> >> On Tue, May 31, 2016 at 12:52 PM, Cameron McKenzie <
> mckenzie.cam@gmail.com
> >>> wrote:
> >>
> >>> Thanks, CURATOR-332 wasn't pushed. I will run the tests against that,
> and
> >>> if it's all good will merge into CURATOR-3.0
> >>> cheers
> >>>
> >>> On Tue, May 31, 2016 at 12:32 PM, Jordan Zimmerman <
> >>> jordan@jordanzimmerman.com> wrote:
> >>>
> >>>> Actually - I don’t remember if branch CURATOR-332 is merged yet. I
> >>>> made/pushed my changes in CURATOR-332
> >>>>
> >>>> -jordan
> >>>>
> >>>>> On May 26, 2016, at 10:04 PM, Cameron McKenzie <
> mckenzie.cam@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> I'm still seeing 6 failed tests that seem related to the same stuff
> >>>> after
> >>>>> merging your fix:
> >>>>>
> >>>>> Failed tests:
> >>>>>
> >>>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasics(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>> Run 1: TestPathChildrenCache.testBasics:863 One or more child
> watchers
> >>>>> are still registered: [/test]
> >>>>> Run 2: TestPathChildrenCache.testBasics:863 One or more child
> watchers
> >>>>> are still registered: [/test]
> >>>>>
> >>>>>
> >>>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>> Run 1:
> TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> >>>>> One or more child watchers are still registered: [/test]
> >>>>> Run 2:
> TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> >>>>> One or more child watchers are still registered: [/test]
> >>>>>
> >>>>>
> >>>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testEnsurePath(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>> Run 1: TestPathChildrenCache.testEnsurePath:363 One or more child
> >>>>> watchers are still registered: [/one/two/three]
> >>>>> Run 2: TestPathChildrenCache.testEnsurePath:363 One or more child
> >>>>> watchers are still registered: [/one/two/three]
> >>>>>
> >>>>> TestInterProcessSemaphoreCluster.testCluster:294 expected [true]
but
> >>>>> found [false]
> >>>>>
> >>>>
> org.apache.curator.framework.recipes.shared.TestSharedCount.testMultiClientVersioned(org.apache.curator.framework.recipes.shared.TestSharedCount)
> >>>>> Run 1: PASS
> >>>>> Run 2: TestSharedCount.testMultiClientVersioned:256 One or more
data
> >>>>> watchers are still registered: [/count]
> >>>>>
> >>>>>
> >>>>
> org.apache.curator.framework.recipes.shared.TestSharedCount.testSimple(org.apache.curator.framework.recipes.shared.TestSharedCount)
> >>>>> Run 1: TestSharedCount.testSimple:174 One or more data watchers
are
> >>>> still
> >>>>> registered: [/count]
> >>>>> Run 2: TestSharedCount.testSimple:174 One or more data watchers
are
> >>>> still
> >>>>> registered: [/count]
> >>>>>
> >>>>>
> >>>>> Tests run: 491, Failures: 6, Errors: 0, Skipped: 0
> >>>>>
> >>>>>
> >>>>> On Fri, May 27, 2016 at 3:30 AM, Jordan Zimmerman <
> >>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>
> >>>>>> I see the problem. The fix is not simple though so I’ll spend
some
> >>>> time on
> >>>>>> it. The TL;DR is that exists watchers are still supposed to
get set
> >>>> when
> >>>>>> there is a KeeperException.NoNode and the code isn’t handling
it.
> But,
> >>>>>> while I was looking at the code I realized there are some
> significant
> >>>>>> additional problems. Curator, here, is trying to mirror what
> >>>> ZooKeeper does
> >>>>>> internally which is insanely complicated. In hindsight, the
whole ZK
> >>>>>> watcher mechanism should’ve been decoupled from the mutator
APIs.
> >>>> But, of
> >>>>>> course, that’s easy for me to say now.
> >>>>>>
> >>>>>> -Jordan
> >>>>>>
> >>>>>>> On May 26, 2016, at 1:10 AM, Cameron McKenzie <
> >>>> mckenzie.cam@gmail.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Thanks Scott,
> >>>>>>> Those tests are now passing for me.
> >>>>>>>
> >>>>>>> Jordan, testNodeCache:testBasics() is failing consistently
on the
> 3.0
> >>>>>>> branch. It appears that this is actually potentially a bug
in the
> >>>>>>> NodeCache. It ends up leaking a Watcher reference. I've
had a quick
> >>>> look
> >>>>>>> through, but I haven't dived in in any detail. It's the
> >>>>>>> WatcherRemovalManager stuff I think. If you've got time,
can you
> >>>> have a
> >>>>>>> look? If not, let me know and I'll do some more digging.
> >>>>>>>
> >>>>>>> cheers
> >>>>>>>
> >>>>>>> On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie <
> >>>>>> mckenzie.cam@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks Scott.
> >>>>>>>>
> >>>>>>>> Push the fix to master and merge it into 3.0.
> >>>>>>>>
> >>>>>>>> Then I guess, I'll push new versions of 2.11 and 3.2
onto Nexus.
> >>>>>>>> cheers
> >>>>>>>>
> >>>>>>>> On Thu, May 26, 2016 at 11:44 AM, Scott Blum <
> dragonsinth@gmail.com
> >>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Alright, I have a fix, but it wants to be applied
to both master
> >>>> and
> >>>>>> 3.0.
> >>>>>>>>> Where should I push the fix?
> >>>>>>>>>
> >>>>>>>>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie
<
> >>>>>> mckenzie.cam@gmail.com
> >>>>>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks Scott,
> >>>>>>>>>> If you just checkout the CURATOR-3.0 branch,
they are failing
> >>>> there.
> >>>>>>>>>> cheers
> >>>>>>>>>>
> >>>>>>>>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum
<
> >>>> dragonsinth@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Sure, what SHA are they failing at Cam?
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan
Zimmerman <
> >>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Scott can you take a look?
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On May 25, 2016, at 4:35 AM, Cameron
McKenzie <
> >>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Tree cache tests are still failing.
I've tried a few times
> but
> >>>> no
> >>>>>>>>>> love:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151
> >>>>>>>>>>>> actual 6
> >>>>>>>>>>>>> expected -31:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I will have a look into what's going
on in the morning. Given
> >>>> that
> >>>>>>>>>>> these
> >>>>>>>>>>>>> may take some messing about to fix
up, do we just want to
> vote
> >>>> on
> >>>>>>>>>>> 2.11.0
> >>>>>>>>>>>>> separately, as that is all ready
to go?
> >>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, May 25, 2016 at 5:34 PM,
Jordan Zimmerman <
> >>>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Great news. Thanks.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ====================
> >>>>>>>>>>>>>> Jordan Zimmerman
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On May 25, 2016, at 12:37
AM, Cameron McKenzie <
> >>>>>>>>>>> mckenzie.cam@gmail.com
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I have fixed up the test
case, all good now.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, May 25, 2016 at
1:45 PM, Cameron McKenzie <
> >>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Looks like it was introduced
with the schema validation
> >>>> stuff.
> >>>>>>>>> It
> >>>>>>>>>>> now
> >>>>>>>>>>>>>> does
> >>>>>>>>>>>>>>>> an ACL check prior to
the backgrounding call. Because the
> >>>> unit
> >>>>>>>>>> test
> >>>>>>>>>>>>>> uses a
> >>>>>>>>>>>>>>>> bogus ACL provider it
just throws an exception
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>    final String adjustedPath
=
> >>>>>>>>>>>>>>>> adjustPath(client.fixForNamespace(givenPath,
> >>>>>>>>>>>>>> createMode.isSequential()));
> >>>>>>>>>>>>>>>>    List<ACL> aclList
= acling.getAclList(adjustedPath);
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode,
> >>>>>>>>>>>>>> data,
> >>>>>>>>>>>>>>>> aclList);
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>    String returnPath
= null;
> >>>>>>>>>>>>>>>>    if ( backgrounding.inBackground()
)
> >>>>>>>>>>>>>>>>    {
> >>>>>>>>>>>>>>>>        pathInBackground(adjustedPath,
data, givenPath);
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> So, I guess the answer
is to get the test to force a
> failure
> >>>>>>>>> in a
> >>>>>>>>>>>>>>>> different way. With
the UnhandledErrorListener, the
> >>>>>>>>> expectation is
> >>>>>>>>>>>> that
> >>>>>>>>>>>>>>>> this only gets called
on backgrounding operations?
> >>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, May 25, 2016
at 1:39 PM, Cameron McKenzie <
> >>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Same on the master
branch, but it passes there, so maybe
> >>>>>>>>>> something
> >>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>> legitimately broken
the test. Will let you know if I get
> >>>>>>>>> stuck.
> >>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Wed, May 25,
2016 at 1:36 PM, Jordan Zimmerman <
> >>>>>>>>>>>>>>>>> jordan@jordanzimmerman.com>
wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Let me know
if you need help.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> It might be
a bad merge. Have you compared it to the
> >>>> master
> >>>>>>>>>>> branch?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> -JZ
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On May
24, 2016, at 10:31 PM, Cameron McKenzie <
> >>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Guys,
> >>>>>>>>>>>>>>>>>>> There's
a test
> TestFrameworkBackground:testErrorListener
> >>>>>>>>> that
> >>>>>>>>>> is
> >>>>>>>>>>>>>>>>>> failing
> >>>>>>>>>>>>>>>>>>> consistently
on the CURATOR-3.0 branch.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I can't
see how it has ever worked. It seems to try and
> >>>>>>>>> provoke
> >>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>> error
> >>>>>>>>>>>>>>>>>>> via a bad
ACL provider.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> But this
ACL provider is called by the
> CreateBuilderImpl
> >>>>>>>>> prior
> >>>>>>>>>> to
> >>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>> backgrounding
call, which means that the exception that
> >>>> it
> >>>>>>>>>> throws
> >>>>>>>>>>>>>>>>>> happens
> >>>>>>>>>>>>>>>>>>> in the main
Thread of the unit test. So, it just throws
> >>>> an
> >>>>>>>>>>>>>>>>>>> UnsupportedOperationException
which is propogated up
> the
> >>>>>>>>> stack
> >>>>>>>>>> at
> >>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>> point the
unit test fails.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> So, I will
look at fixing this, but I just don't
> >>>> understand
> >>>>>>>>> how
> >>>>>>>>>>> it
> >>>>>>>>>>>>>> ever
> >>>>>>>>>>>>>>>>>>> worked?
> >>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message