curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: CURATOR-3.0 tests
Date Fri, 27 May 2016 03:04:25 GMT
I'm still seeing 6 failed tests that seem related to the same stuff after
merging your fix:

Failed tests:
org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasics(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
  Run 1: TestPathChildrenCache.testBasics:863 One or more child watchers
are still registered: [/test]
  Run 2: TestPathChildrenCache.testBasics:863 One or more child watchers
are still registered: [/test]

org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
  Run 1: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
One or more child watchers are still registered: [/test]
  Run 2: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
One or more child watchers are still registered: [/test]

org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testEnsurePath(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
  Run 1: TestPathChildrenCache.testEnsurePath:363 One or more child
watchers are still registered: [/one/two/three]
  Run 2: TestPathChildrenCache.testEnsurePath:363 One or more child
watchers are still registered: [/one/two/three]

  TestInterProcessSemaphoreCluster.testCluster:294 expected [true] but
found [false]
org.apache.curator.framework.recipes.shared.TestSharedCount.testMultiClientVersioned(org.apache.curator.framework.recipes.shared.TestSharedCount)
  Run 1: PASS
  Run 2: TestSharedCount.testMultiClientVersioned:256 One or more data
watchers are still registered: [/count]

org.apache.curator.framework.recipes.shared.TestSharedCount.testSimple(org.apache.curator.framework.recipes.shared.TestSharedCount)
  Run 1: TestSharedCount.testSimple:174 One or more data watchers are still
registered: [/count]
  Run 2: TestSharedCount.testSimple:174 One or more data watchers are still
registered: [/count]


Tests run: 491, Failures: 6, Errors: 0, Skipped: 0


On Fri, May 27, 2016 at 3:30 AM, Jordan Zimmerman <
jordan@jordanzimmerman.com> wrote:

> I see the problem. The fix is not simple though so I’ll spend some time on
> it. The TL;DR is that exists watchers are still supposed to get set when
> there is a KeeperException.NoNode and the code isn’t handling it. But,
> while I was looking at the code I realized there are some significant
> additional problems. Curator, here, is trying to mirror what ZooKeeper does
> internally which is insanely complicated. In hindsight, the whole ZK
> watcher mechanism should’ve been decoupled from the mutator APIs. But, of
> course, that’s easy for me to say now.
>
> -Jordan
>
> > On May 26, 2016, at 1:10 AM, Cameron McKenzie <mckenzie.cam@gmail.com>
> wrote:
> >
> > Thanks Scott,
> > Those tests are now passing for me.
> >
> > Jordan, testNodeCache:testBasics() is failing consistently on the 3.0
> > branch. It appears that this is actually potentially a bug in the
> > NodeCache. It ends up leaking a Watcher reference. I've had a quick look
> > through, but I haven't dived in in any detail. It's the
> > WatcherRemovalManager stuff I think. If you've got time, can you have a
> > look? If not, let me know and I'll do some more digging.
> >
> > cheers
> >
> > On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie <
> mckenzie.cam@gmail.com>
> > wrote:
> >
> >> Thanks Scott.
> >>
> >> Push the fix to master and merge it into 3.0.
> >>
> >> Then I guess, I'll push new versions of 2.11 and 3.2 onto Nexus.
> >> cheers
> >>
> >> On Thu, May 26, 2016 at 11:44 AM, Scott Blum <dragonsinth@gmail.com>
> >> wrote:
> >>
> >>> Alright, I have a fix, but it wants to be applied to both master and
> 3.0.
> >>> Where should I push the fix?
> >>>
> >>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie <
> mckenzie.cam@gmail.com
> >>>>
> >>> wrote:
> >>>
> >>>> Thanks Scott,
> >>>> If you just checkout the CURATOR-3.0 branch, they are failing there.
> >>>> cheers
> >>>>
> >>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum <dragonsinth@gmail.com>
> >>> wrote:
> >>>>
> >>>>> Sure, what SHA are they failing at Cam?
> >>>>>
> >>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan Zimmerman <
> >>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>
> >>>>>> Scott can you take a look?
> >>>>>>
> >>>>>> -Jordan
> >>>>>>
> >>>>>>> On May 25, 2016, at 4:35 AM, Cameron McKenzie <
> >>>> mckenzie.cam@gmail.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Tree cache tests are still failing. I've tried a few times
but no
> >>>> love:
> >>>>>>>
> >>>>>>> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151
> >>>>>> actual 6
> >>>>>>> expected -31:
> >>>>>>>
> >>>>>>> I will have a look into what's going on in the morning.
Given that
> >>>>> these
> >>>>>>> may take some messing about to fix up, do we just want to
vote on
> >>>>> 2.11.0
> >>>>>>> separately, as that is all ready to go?
> >>>>>>> cheers
> >>>>>>>
> >>>>>>> On Wed, May 25, 2016 at 5:34 PM, Jordan Zimmerman <
> >>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>
> >>>>>>>> Great news. Thanks.
> >>>>>>>>
> >>>>>>>> ====================
> >>>>>>>> Jordan Zimmerman
> >>>>>>>>
> >>>>>>>>> On May 25, 2016, at 12:37 AM, Cameron McKenzie <
> >>>>> mckenzie.cam@gmail.com
> >>>>>>>
> >>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> I have fixed up the test case, all good now.
> >>>>>>>>>
> >>>>>>>>> On Wed, May 25, 2016 at 1:45 PM, Cameron McKenzie
<
> >>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Looks like it was introduced with the schema
validation stuff.
> >>> It
> >>>>> now
> >>>>>>>> does
> >>>>>>>>>> an ACL check prior to the backgrounding call.
Because the unit
> >>>> test
> >>>>>>>> uses a
> >>>>>>>>>> bogus ACL provider it just throws an exception
> >>>>>>>>>>
> >>>>>>>>>>      final String adjustedPath =
> >>>>>>>>>> adjustPath(client.fixForNamespace(givenPath,
> >>>>>>>> createMode.isSequential()));
> >>>>>>>>>>      List<ACL> aclList = acling.getAclList(adjustedPath);
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode,
> >>>>>>>> data,
> >>>>>>>>>> aclList);
> >>>>>>>>>>
> >>>>>>>>>>      String returnPath = null;
> >>>>>>>>>>      if ( backgrounding.inBackground() )
> >>>>>>>>>>      {
> >>>>>>>>>>          pathInBackground(adjustedPath, data,
givenPath);
> >>>>>>>>>>
> >>>>>>>>>> So, I guess the answer is to get the test to
force a failure
> >>> in a
> >>>>>>>>>> different way. With the UnhandledErrorListener,
the
> >>> expectation is
> >>>>>> that
> >>>>>>>>>> this only gets called on backgrounding operations?
> >>>>>>>>>> cheers
> >>>>>>>>>>
> >>>>>>>>>> On Wed, May 25, 2016 at 1:39 PM, Cameron McKenzie
<
> >>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Same on the master branch, but it passes
there, so maybe
> >>>> something
> >>>>>> has
> >>>>>>>>>>> legitimately broken the test. Will let you
know if I get
> >>> stuck.
> >>>>>>>>>>> cheers
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, May 25, 2016 at 1:36 PM, Jordan
Zimmerman <
> >>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Let me know if you need help.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It might be a bad merge. Have you compared
it to the master
> >>>>> branch?
> >>>>>>>>>>>>
> >>>>>>>>>>>> -JZ
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> On May 24, 2016, at 10:31 PM,
Cameron McKenzie <
> >>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Guys,
> >>>>>>>>>>>>> There's a test TestFrameworkBackground:testErrorListener
> >>> that
> >>>> is
> >>>>>>>>>>>> failing
> >>>>>>>>>>>>> consistently on the CURATOR-3.0
branch.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I can't see how it has ever worked.
It seems to try and
> >>> provoke
> >>>>> an
> >>>>>>>>>>>> error
> >>>>>>>>>>>>> via a bad ACL provider.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> But this ACL provider is called
by the CreateBuilderImpl
> >>> prior
> >>>> to
> >>>>>> the
> >>>>>>>>>>>>> backgrounding call, which means
that the exception that it
> >>>> throws
> >>>>>>>>>>>> happens
> >>>>>>>>>>>>> in the main Thread of the unit test.
So, it just throws an
> >>>>>>>>>>>>> UnsupportedOperationException which
is propogated up the
> >>> stack
> >>>> at
> >>>>>>>> which
> >>>>>>>>>>>>> point the unit test fails.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So, I will look at fixing this,
but I just don't understand
> >>> how
> >>>>> it
> >>>>>>>> ever
> >>>>>>>>>>>>> worked?
> >>>>>>>>>>>>> cheers
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message