curator-dev mailing list archives

From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: CURATOR-3.0 tests
Date Thu, 02 Jun 2016 03:59:18 GMT
The counter is just being used to check if semaphores are still being
acquired. Essentially it just runs in a loop acquiring semaphores (and
incrementing the counter when they are acquired).

Then it shuts down the server, waits until the session is lost, then
restarts the server and then checks that semaphores are being acquired
correctly again (by checking that the counter is being incremented).

This is just a simplified version of the test that is failing.

When the test fails, all of the threads are attempting to get a lease on
the semaphore, but none of them get it, then the test times out while
waiting.
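
For anyone following along, here is a rough sketch of the shape of that check. It is not the actual test: a local java.util.concurrent.Semaphore stands in for InterProcessSemaphoreV2, holding the only permit for a moment stands in for the server outage, and the class and method names (CounterLivenessSketch, counterAdvances, runScenario) are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a local Semaphore stands in for the ZooKeeper-backed one.
public class CounterLivenessSketch {

    // True if the counter moves past its current value within timeoutMs.
    static boolean counterAdvances(AtomicInteger counter, long timeoutMs) {
        int start = counter.get();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (System.nanoTime() < deadline) {
            if (counter.get() > start) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return counter.get() > start;
    }

    // Runs the acquire loop, simulates an outage, and checks that acquisition resumes.
    static boolean runScenario(int clients) {
        Semaphore semaphore = new Semaphore(1, true); // fair, single lease (assumption)
        AtomicInteger counter = new AtomicInteger();
        AtomicBoolean running = new AtomicBoolean(true);
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        for (int i = 0; i < clients; i++) {
            pool.execute(() -> {
                while (running.get()) {
                    try {
                        if (semaphore.tryAcquire(50, TimeUnit.MILLISECONDS)) {
                            try {
                                counter.incrementAndGet(); // a lease was obtained
                            } finally {
                                semaphore.release();
                            }
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
        }
        try {
            boolean before = counterAdvances(counter, 2000);
            semaphore.acquire();   // simulated outage: no client can get a lease
            Thread.sleep(200);
            semaphore.release();   // simulated restart
            boolean after = counterAdvances(counter, 2000);
            return before && after;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            running.set(false);
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runScenario(4) ? "leases acquired again" : "stuck");
    }
}
```

In the failing runs described above, the second counterAdvances() call is the one that would time out: every client is still blocked trying to get a lease, so the counter never moves.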



On Thu, Jun 2, 2016 at 1:29 PM, Jordan Zimmerman <jordan@jordanzimmerman.com> wrote:

> I also had to add:
>
> while(!lost.get() && (counter.get() > 0))
> {
>     Thread.sleep(1000);
> }
> Which seems more correct to me.
>
> > On Jun 1, 2016, at 9:07 PM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
> >
> > I have just pushed an interprocess_mutex_issue branch. The test case is in
> > TestInterprocessMutexNotReconnecting.
> >
> > For me it's failing around 20% of the time.
> > cheers
> >
> > On Thu, Jun 2, 2016 at 11:17 AM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
> >
> >> Yep, just let me confirm that it's actually getting the same problem. I'm
> >> sure it was before, but I've just run it a bunch of times and everything's
> >> been fine.
> >>
> >> On Thu, Jun 2, 2016 at 11:15 AM, Jordan Zimmerman <
> >> jordan@jordanzimmerman.com> wrote:
> >>
> >>> Can you push your unit test somewhere?
> >>>
> >>>> On Jun 1, 2016, at 7:37 PM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
> >>>>
> >>>> Indeed. There seems to be a problem with InterProcessSemaphoreV2 though.
> >>>> I've written a simplified unit test that just has a bunch of clients
> >>>> attempting to grab a lease on the semaphore. When I shut down and restart ZK,
> >>>> about 25% of the time none of the clients can reacquire the semaphore.
> >>>>
> >>>> Still trying to work out what's going on, but I'm probably not going to
> >>>> have a lot of time today to look at it.
> >>>> cheers
> >>>>
> >>>> On Thu, Jun 2, 2016 at 10:30 AM, Jordan Zimmerman <
> >>>> jordan@jordanzimmerman.com> wrote:
> >>>>
> >>>>> Odd - SemaphoreClient does seem wrong.
> >>>>>
> >>>>>> On Jun 1, 2016, at 1:43 AM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
> >>>>>>
> >>>>>> It looks like, under some circumstances (which I haven't worked out yet),
> >>>>>> the InterprocessMutex acquire() is not working correctly when
> >>>>>> reconnecting to ZK. Still digging into why this is.
> >>>>>>
> >>>>>> There also seems to be a bug in the SemaphoreClient, unless I'm missing
> >>>>>> something. At lines 126 and 140 it does compareAndSet() calls but throws an
> >>>>>> exception if they return true. As far as I can work out, this means that
> >>>>>> whenever the lock is acquired, an exception gets thrown indicating that
> >>>>>> there are Multiple acquirers.
> >>>>>>
> >>>>>> This test is failing fairly consistently. It seems to be the remaining test
> >>>>>> that keeps failing in the Jenkins build also
> >>>>>> cheers
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Jun 1, 2016 at 3:10 PM, Cameron McKenzie <
> >>> mckenzie.cam@gmail.com
> >>>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Looks like I was incorrect. The NoWatcherException is being thrown
> on
> >>>>>>> success as well, and the problem is not in the cluster restart.
> Will
> >>>>> keep
> >>>>>>> digging.
> >>>>>>>
> >>>>>>> On Wed, Jun 1, 2016 at 2:52 PM, Cameron McKenzie <
> >>>>> mckenzie.cam@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> TestInterProcessSemaphoreCluster.testCluster() is failing
> >>> (assertion
> >>>>> at
> >>>>>>>> line 294). Again, it seems like some sort of race condition with
> the
> >>>>>>>> watcher removal.
> >>>>>>>>
> >>>>>>>> When I run it in Eclipse, it fails maybe 25% of the time. When it
> >>> fails
> >>>>>>>> it seems that it's got something to do with watcher removal. When
> >>> the
> >>>>> test
> >>>>>>>> passes, this error is not logged.
> >>>>>>>>
> >>>>>>>> org.apache.zookeeper.KeeperException$NoWatcherException:
> >>>>> KeeperErrorCode
> >>>>>>>> = No such watcher for /foo/bar/lock/leases
> >>>>>>>> at
> >>>>>>>>
> >>>>>
> >>>
> org.apache.zookeeper.ZooKeeper$ZKWatchManager.containsWatcher(ZooKeeper.java:377)
> >>>>>>>> at
> >>>>>>>>
> >>>>>
> >>>
> org.apache.zookeeper.ZooKeeper$ZKWatchManager.removeWatcher(ZooKeeper.java:252)
> >>>>>>>> at
> >>>>>>>>
> >>>>>
> >>>
> org.apache.zookeeper.WatchDeregistration.unregister(WatchDeregistration.java:58)
> >>>>>>>> at
> org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:712)
> >>>>>>>> at org.apache.zookeeper.ClientCnxn.access$1500(ClientCnxn.java:97)
> >>>>>>>> at
> >>>>>>>>
> >>>>>
> >>>
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:948)
> >>>>>>>> at
> >>>>>>>>
> >>>>>
> >>>
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:99)
> >>>>>>>> at
> >>>>>>>>
> >>>>>
> >>>
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> >>>>>>>> at
> >>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1236)
> >>>>>>>>
> >>>>>>>> Is it possible it's something to do with the way that the cluster
> is
> >>>>>>>> restarted at line 282? The old cluster is not shutdown, a new one
> is
> >>>>> just
> >>>>>>>> created.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Jun 1, 2016 at 10:44 AM, Jordan Zimmerman <
> >>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>
> >>>>>>>>> I’ll try to address this as part of CURATOR-333
> >>>>>>>>>
> >>>>>>>>>> On May 31, 2016, at 7:08 PM, Cameron McKenzie <
> >>>>> mckenzie.cam@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Maybe we need to look at some way of providing a hook for tests
> to
> >>>>> wait
> >>>>>>>>>> reliably for asynch tasks to finish?
> >>>>>>>>>>
> >>>>>>>>>> The latest round of tests ran OK. One test failed on an
> unrelated
> >>>>> thing
> >>>>>>>>>> (ConnectionLoss), but this appears to be a transient thing as
> it's
> >>>>>>>>> worked
> >>>>>>>>>> ok the next time around.
> >>>>>>>>>>
> >>>>>>>>>> I will start getting a release together. Thanks for your help
> with
> >>> the
> >>>>>>>>>> updated tests.
> >>>>>>>>>> cheers
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Jun 1, 2016 at 9:12 AM, Jordan Zimmerman <
> >>>>>>>>> jordan@jordanzimmerman.com
> >>>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> The problem is in-flight watchers and async background calls.
> >>>>> There’s
> >>>>>>>>> no
> >>>>>>>>>>> way to cancel these and they can take time to occur - even
> after
> >>> a
> >>>>>>>>> recipe
> >>>>>>>>>>> instance is closed.
> >>>>>>>>>>>
> >>>>>>>>>>> -Jordan
> >>>>>>>>>>>
> >>>>>>>>>>>> On May 31, 2016, at 5:11 PM, Cameron McKenzie <
> >>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Ok, running it again now.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is the problem that the watcher clean up for the recipes is
> done
> >>>>>>>>>>>> asynchronously after they are closed?
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Jun 1, 2016 at 1:35 AM, Jordan Zimmerman <
> >>>>>>>>>>> jordan@jordanzimmerman.com
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> OK - please try now. I added a loop in the “no watchers”
> >>> checker.
> >>>>> If
> >>>>>>>>>>> there
> >>>>>>>>>>>>> are remaining watchers, it will sleep a bit and try again.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On May 31, 2016, at 1:33 AM, Cameron McKenzie <
> >>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Looks like these failures are intermittent. Running them
> >>> directly
> >>>>>>>>> in
> >>>>>>>>>>>>>> Eclipse they seem to be passing. I will run the whole thing
> >>> again
> >>>>>>>>> in
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> morning and see how it goes.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, May 31, 2016 at 2:29 PM, Cameron McKenzie <
> >>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> There are still 2 tests failing for me:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> FAILURE! - in
> >>>>>>>>>>>>>>>
> >>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>> Time elapsed: 17.488 sec  <<< FAILURE!
> >>>>>>>>>>>>>>> java.lang.AssertionError: One or more child watchers are
> >>> still
> >>>>>>>>>>>>> registered:
> >>>>>>>>>>>>>>> [/test]
> >>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.imps.TestCleanState.closeAndTestClean(TestCleanState.java:53)
> >>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(TestPathChildrenCache.java:707)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> FAILURE! - in
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> testCluster(org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster)
> >>>>>>>>>>>>>>> Time elapsed: 96.641 sec  <<< FAILURE!
> >>>>>>>>>>>>>>> java.lang.AssertionError: expected [true] but found [false]
> >>>>>>>>>>>>>>> at org.testng.Assert.fail(Assert.java:94)
> >>>>>>>>>>>>>>> at org.testng.Assert.failNotEquals(Assert.java:494)
> >>>>>>>>>>>>>>> at org.testng.Assert.assertTrue(Assert.java:42)
> >>>>>>>>>>>>>>> at org.testng.Assert.assertTrue(Assert.java:52)
> >>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster.testCluster(TestInterProcessSemaphoreCluster.java:294)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Failed tests:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testKilledSession:707 One or
> >>> more
> >>>>>>>>> child
> >>>>>>>>>>>>>>> watchers are still registered: [/test]
> >>>>>>>>>>>>>>> Run 2: PASS
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster:294 expected
> >>> [true]
> >>>>>>>>> but
> >>>>>>>>>>>>>>> found [false]
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Tests run: 495, Failures: 2, Errors: 0, Skipped: 0
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, May 31, 2016 at 12:52 PM, Cameron McKenzie <
> >>>>>>>>>>>>> mckenzie.cam@gmail.com
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks, CURATOR-332 wasn't pushed. I will run the tests
> >>> against
> >>>>>>>>> that,
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>> if it's all good will merge into CURATOR-3.0
> >>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 12:32 PM, Jordan Zimmerman <
> >>>>>>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Actually - I don’t remember if branch CURATOR-332 is
> merged
> >>>>>>>>> yet. I
> >>>>>>>>>>>>>>>>> made/pushed my changes in CURATOR-332
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> -jordan
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On May 26, 2016, at 10:04 PM, Cameron McKenzie <
> >>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I'm still seeing 6 failed tests that seem related to the
> >>> same
> >>>>>>>>> stuff
> >>>>>>>>>>>>>>>>> after
> >>>>>>>>>>>>>>>>>> merging your fix:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Failed tests:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasics(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testBasics:863 One or more
> >>> child
> >>>>>>>>>>>>> watchers
> >>>>>>>>>>>>>>>>>> are still registered: [/test]
> >>>>>>>>>>>>>>>>>> Run 2: TestPathChildrenCache.testBasics:863 One or more
> >>> child
> >>>>>>>>>>>>> watchers
> >>>>>>>>>>>>>>>>>> are still registered: [/test]
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>>> Run 1:
> >>>>>>>>>>>>>
> TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> >>>>>>>>>>>>>>>>>> One or more child watchers are still registered: [/test]
> >>>>>>>>>>>>>>>>>> Run 2:
> >>>>>>>>>>>>>
> TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> >>>>>>>>>>>>>>>>>> One or more child watchers are still registered: [/test]
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testEnsurePath(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> >>>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testEnsurePath:363 One or
> >>> more
> >>>>>>>>> child
> >>>>>>>>>>>>>>>>>> watchers are still registered: [/one/two/three]
> >>>>>>>>>>>>>>>>>> Run 2: TestPathChildrenCache.testEnsurePath:363 One or
> >>> more
> >>>>>>>>> child
> >>>>>>>>>>>>>>>>>> watchers are still registered: [/one/two/three]
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster:294
> expected
> >>>>>>>>> [true]
> >>>>>>>>>>> but
> >>>>>>>>>>>>>>>>>> found [false]
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.shared.TestSharedCount.testMultiClientVersioned(org.apache.curator.framework.recipes.shared.TestSharedCount)
> >>>>>>>>>>>>>>>>>> Run 1: PASS
> >>>>>>>>>>>>>>>>>> Run 2: TestSharedCount.testMultiClientVersioned:256 One
> or
> >>>>> more
> >>>>>>>>>>> data
> >>>>>>>>>>>>>>>>>> watchers are still registered: [/count]
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>
> >>>
> org.apache.curator.framework.recipes.shared.TestSharedCount.testSimple(org.apache.curator.framework.recipes.shared.TestSharedCount)
> >>>>>>>>>>>>>>>>>> Run 1: TestSharedCount.testSimple:174 One or more data
> >>>>>>>>> watchers are
> >>>>>>>>>>>>>>>>> still
> >>>>>>>>>>>>>>>>>> registered: [/count]
> >>>>>>>>>>>>>>>>>> Run 2: TestSharedCount.testSimple:174 One or more data
> >>>>>>>>> watchers are
> >>>>>>>>>>>>>>>>> still
> >>>>>>>>>>>>>>>>>> registered: [/count]
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Tests run: 491, Failures: 6, Errors: 0, Skipped: 0
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Fri, May 27, 2016 at 3:30 AM, Jordan Zimmerman <
> >>>>>>>>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I see the problem. The fix is not simple though so I’ll
> >>>>> spend
> >>>>>>>>> some
> >>>>>>>>>>>>>>>>> time on
> >>>>>>>>>>>>>>>>>>> it. The TL;DR is that exists watchers are still
> supposed
> >>> to
> >>>>>>>>> get
> >>>>>>>>>>> set
> >>>>>>>>>>>>>>>>> when
> >>>>>>>>>>>>>>>>>>> there is a KeeperException.NoNode and the code isn’t
> >>>>> handling
> >>>>>>>>> it.
> >>>>>>>>>>>>> But,
> >>>>>>>>>>>>>>>>>>> while I was looking at the code I realized there are
> some
> >>>>>>>>>>>>> significant
> >>>>>>>>>>>>>>>>>>> additional problems. Curator, here, is trying to mirror
> >>> what
> >>>>>>>>>>>>>>>>> ZooKeeper does
> >>>>>>>>>>>>>>>>>>> internally which is insanely complicated. In hindsight,
> >>> the
> >>>>>>>>> whole
> >>>>>>>>>>> ZK
> >>>>>>>>>>>>>>>>>>> watcher mechanism should’ve been decoupled from the
> >>> mutator
> >>>>>>>>> APIs.
> >>>>>>>>>>>>>>>>> But, of
> >>>>>>>>>>>>>>>>>>> course, that’s easy for me to say now.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On May 26, 2016, at 1:10 AM, Cameron McKenzie <
> >>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Thanks Scott,
> >>>>>>>>>>>>>>>>>>>> Those tests are now passing for me.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Jordan, testNodeCache:testBasics() is failing
> >>> consistently
> >>>>>>>>> on the
> >>>>>>>>>>>>> 3.0
> >>>>>>>>>>>>>>>>>>>> branch. It appears that this is actually potentially a
> >>> bug
> >>>>>>>>> in the
> >>>>>>>>>>>>>>>>>>>> NodeCache. It ends up leaking a Watcher reference.
> I've
> >>>>> had a
> >>>>>>>>>>> quick
> >>>>>>>>>>>>>>>>> look
> >>>>>>>>>>>>>>>>>>>> through, but I haven't dived in in any detail. It's
> the
> >>>>>>>>>>>>>>>>>>>> WatcherRemovalManager stuff I think. If you've got
> time,
> >>>>> can
> >>>>>>>>> you
> >>>>>>>>>>>>>>>>> have a
> >>>>>>>>>>>>>>>>>>>> look? If not, let me know and I'll do some more
> digging.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie <
> >>>>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Thanks Scott.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Push the fix to master and merge it into 3.0.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Then I guess, I'll push new versions of 2.11 and 3.2
> >>> onto
> >>>>>>>>> Nexus.
> >>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 11:44 AM, Scott Blum <
> >>>>>>>>>>>>> dragonsinth@gmail.com
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Alright, I have a fix, but it wants to be applied to
> >>> both
> >>>>>>>>>>> master
> >>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>> 3.0.
> >>>>>>>>>>>>>>>>>>>>>> Where should I push the fix?
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie <
> >>>>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Thanks Scott,
> >>>>>>>>>>>>>>>>>>>>>>> If you just checkout the CURATOR-3.0 branch, they
> are
> >>>>>>>>> failing
> >>>>>>>>>>>>>>>>> there.
> >>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum <
> >>>>>>>>>>>>>>>>> dragonsinth@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Sure, what SHA are they failing at Cam?
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan Zimmerman
> <
> >>>>>>>>>>>>>>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Scott can you take a look?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> -Jordan
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On May 25, 2016, at 4:35 AM, Cameron McKenzie <
> >>>>>>>>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Tree cache tests are still failing. I've tried a
> >>> few
> >>>>>>>>> times
> >>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>>> no
> >>>>>>>>>>>>>>>>>>>>>>> love:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>
> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151
> >>>>>>>>>>>>>>>>>>>>>>>>> actual 6
> >>>>>>>>>>>>>>>>>>>>>>>>>> expected -31:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I will have a look into what's going on in the
> >>>>> morning.
> >>>>>>>>>>> Given
> >>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>>>> these
> >>>>>>>>>>>>>>>>>>>>>>>>>> may take some messing about to fix up, do we
> just
> >>>>> want
> >>>>>>>>> to
> >>>>>>>>>>>>> vote
> >>>>>>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>>>>>>>>>> 2.11.0
> >>>>>>>>>>>>>>>>>>>>>>>>>> separately, as that is all ready to go?
> >>>>>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 5:34 PM, Jordan
> Zimmerman
> >>> <
> >>>>>>>>>>>>>>>>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Great news. Thanks.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> ====================
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Jordan Zimmerman
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 25, 2016, at 12:37 AM, Cameron
> McKenzie <
> >>>>>>>>>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I have fixed up the test case, all good now.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:45 PM, Cameron
> >>> McKenzie <
> >>>>>>>>>>>>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looks like it was introduced with the schema
> >>>>>>>>> validation
> >>>>>>>>>>>>>>>>> stuff.
> >>>>>>>>>>>>>>>>>>>>>> It
> >>>>>>>>>>>>>>>>>>>>>>>> now
> >>>>>>>>>>>>>>>>>>>>>>>>>>> does
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> an ACL check prior to the backgrounding call.
> >>>>>>>>> Because
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> unit
> >>>>>>>>>>>>>>>>>>>>>>> test
> >>>>>>>>>>>>>>>>>>>>>>>>>>> uses a
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> bogus ACL provider it just throws an
> exception
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> final String adjustedPath =
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> adjustPath(client.fixForNamespace(givenPath,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> createMode.isSequential()));
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> List<ACL> aclList =
> >>>>>>>>> acling.getAclList(adjustedPath);
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> data,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> aclList);
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> String returnPath = null;
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> if ( backgrounding.inBackground() )
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> {
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>   pathInBackground(adjustedPath, data,
> >>>>>>>>> givenPath);
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, I guess the answer is to get the test to
> >>>>> force a
> >>>>>>>>>>>>> failure
> >>>>>>>>>>>>>>>>>>>>>> in a
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> different way. With the
> UnhandledErrorListener,
> >>>>> the
> >>>>>>>>>>>>>>>>>>>>>> expectation is
> >>>>>>>>>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> this only gets called on backgrounding
> >>> operations?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:39 PM, Cameron
> >>> McKenzie
> >>>>> <
> >>>>>>>>>>>>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Same on the master branch, but it passes
> >>> there,
> >>>>> so
> >>>>>>>>>>> maybe
> >>>>>>>>>>>>>>>>>>>>>>> something
> >>>>>>>>>>>>>>>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> legitimately broken the test. Will let you
> >>> know
> >>>>> if
> >>>>>>>>> I
> >>>>>>>>>>> get
> >>>>>>>>>>>>>>>>>>>>>> stuck.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:36 PM, Jordan
> >>>>> Zimmerman <
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Let me know if you need help.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It might be a bad merge. Have you compared
> >>> it to
> >>>>>>>>> the
> >>>>>>>>>>>>>>>>> master
> >>>>>>>>>>>>>>>>>>>>>>>> branch?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -JZ
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 24, 2016, at 10:31 PM, Cameron
> >>>>> McKenzie <
> >>>>>>>>>>>>>>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Guys,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> There's a test
> >>>>>>>>>>>>> TestFrameworkBackground:testErrorListener
> >>>>>>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> failing
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistently on the CURATOR-3.0 branch.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I can't see how it has ever worked. It
> >>> seems to
> >>>>>>>>> try
> >>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>> provoke
> >>>>>>>>>>>>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> error
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> via a bad ACL provider.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> But this ACL provider is called by the
> >>>>>>>>>>>>> CreateBuilderImpl
> >>>>>>>>>>>>>>>>>>>>>> prior
> >>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backgrounding call, which means that the
> >>>>>>>>> exception
> >>>>>>>>>>> that
> >>>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>>>>>> throws
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> happens
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in the main Thread of the unit test. So,
> it
> >>>>> just
> >>>>>>>>>>> throws
> >>>>>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> UnsupportedOperationException which is
> >>>>>>>>> propagated up
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>> stack
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> point the unit test fails.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, I will look at fixing this, but I just
> >>>>> don't
> >>>>>>>>>>>>>>>>> understand
> >>>>>>>>>>>>>>>>>>>>>> how
> >>>>>>>>>>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>>>>>>>>>> ever
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> worked?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers
