curator-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: CURATOR-3.0 tests
Date Tue, 31 May 2016 15:35:41 GMT
OK - please try now. I added a loop in the “no watchers” checker. If there are remaining
watchers, it will sleep a bit and try again.

-Jordan

> On May 31, 2016, at 1:33 AM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
> 
> Looks like these failures are intermittent. Running them directly in
> Eclipse they seem to be passing. I will run the whole thing again in the
> morning and see how it goes.
> 
> On Tue, May 31, 2016 at 2:29 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
> wrote:
> 
>> There are still 2 tests failing for me:
>> 
>> FAILURE! - in
>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache
>> testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
>> Time elapsed: 17.488 sec  <<< FAILURE!
>> java.lang.AssertionError: One or more child watchers are still registered:
>> [/test]
>> at
>> org.apache.curator.framework.imps.TestCleanState.closeAndTestClean(TestCleanState.java:53)
>> at
>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(TestPathChildrenCache.java:707)
>> 
>> FAILURE! - in
>> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster
>> testCluster(org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster)
>> Time elapsed: 96.641 sec  <<< FAILURE!
>> java.lang.AssertionError: expected [true] but found [false]
>> at org.testng.Assert.fail(Assert.java:94)
>> at org.testng.Assert.failNotEquals(Assert.java:494)
>> at org.testng.Assert.assertTrue(Assert.java:42)
>> at org.testng.Assert.assertTrue(Assert.java:52)
>> at
>> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster.testCluster(TestInterProcessSemaphoreCluster.java:294)
>> 
>> Failed tests:
>> 
>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
>>  Run 1: TestPathChildrenCache.testKilledSession:707 One or more child
>> watchers are still registered: [/test]
>>  Run 2: PASS
>> 
>>  TestInterProcessSemaphoreCluster.testCluster:294 expected [true] but
>> found [false]
>> 
>> Tests run: 495, Failures: 2, Errors: 0, Skipped: 0
>> 
>> 
>> 
>> 
>> 
>> On Tue, May 31, 2016 at 12:52 PM, Cameron McKenzie <mckenzie.cam@gmail.com
>>> wrote:
>> 
>>> Thanks, CURATOR-332 wasn't pushed. I will run the tests against that, and
>>> if it's all good will merge into CURATOR-3.0
>>> cheers
>>> 
>>> On Tue, May 31, 2016 at 12:32 PM, Jordan Zimmerman <
>>> jordan@jordanzimmerman.com> wrote:
>>> 
>>>> Actually - I don’t remember if branch CURATOR-332 is merged yet. I
>>>> made/pushed my changes in CURATOR-332
>>>> 
>>>> -jordan
>>>> 
>>>>> On May 26, 2016, at 10:04 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
>>>> wrote:
>>>>> 
>>>>> I'm still seeing 6 failed tests that seem related to the same stuff
>>>> after
>>>>> merging your fix:
>>>>> 
>>>>> Failed tests:
>>>>> 
>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasics(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
>>>>> Run 1: TestPathChildrenCache.testBasics:863 One or more child watchers
>>>>> are still registered: [/test]
>>>>> Run 2: TestPathChildrenCache.testBasics:863 One or more child watchers
>>>>> are still registered: [/test]
>>>>> 
>>>>> 
>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
>>>>> Run 1: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
>>>>> One or more child watchers are still registered: [/test]
>>>>> Run 2: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
>>>>> One or more child watchers are still registered: [/test]
>>>>> 
>>>>> 
>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testEnsurePath(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
>>>>> Run 1: TestPathChildrenCache.testEnsurePath:363 One or more child
>>>>> watchers are still registered: [/one/two/three]
>>>>> Run 2: TestPathChildrenCache.testEnsurePath:363 One or more child
>>>>> watchers are still registered: [/one/two/three]
>>>>> 
>>>>> TestInterProcessSemaphoreCluster.testCluster:294 expected [true] but
>>>>> found [false]
>>>>> 
>>>> org.apache.curator.framework.recipes.shared.TestSharedCount.testMultiClientVersioned(org.apache.curator.framework.recipes.shared.TestSharedCount)
>>>>> Run 1: PASS
>>>>> Run 2: TestSharedCount.testMultiClientVersioned:256 One or more data
>>>>> watchers are still registered: [/count]
>>>>> 
>>>>> 
>>>> org.apache.curator.framework.recipes.shared.TestSharedCount.testSimple(org.apache.curator.framework.recipes.shared.TestSharedCount)
>>>>> Run 1: TestSharedCount.testSimple:174 One or more data watchers are
>>>> still
>>>>> registered: [/count]
>>>>> Run 2: TestSharedCount.testSimple:174 One or more data watchers are
>>>> still
>>>>> registered: [/count]
>>>>> 
>>>>> 
>>>>> Tests run: 491, Failures: 6, Errors: 0, Skipped: 0
>>>>> 
>>>>> 
>>>>> On Fri, May 27, 2016 at 3:30 AM, Jordan Zimmerman <
>>>>> jordan@jordanzimmerman.com> wrote:
>>>>> 
>>>>>> I see the problem. The fix is not simple though so I’ll spend some
>>>> time on
>>>>>> it. The TL;DR is that exists watchers are still supposed to get set
>>>> when
>>>>>> there is a KeeperException.NoNode and the code isn’t handling it.
But,
>>>>>> while I was looking at the code I realized there are some significant
>>>>>> additional problems. Curator, here, is trying to mirror what
>>>> ZooKeeper does
>>>>>> internally which is insanely complicated. In hindsight, the whole
ZK
>>>>>> watcher mechanism should’ve been decoupled from the mutator APIs.
>>>> But, of
>>>>>> course, that’s easy for me to say now.
>>>>>> 
>>>>>> -Jordan
>>>>>> 
>>>>>>> On May 26, 2016, at 1:10 AM, Cameron McKenzie <
>>>> mckenzie.cam@gmail.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>> Thanks Scott,
>>>>>>> Those tests are now passing for me.
>>>>>>> 
>>>>>>> Jordan, testNodeCache:testBasics() is failing consistently on
the 3.0
>>>>>>> branch. It appears that this is actually potentially a bug in
the
>>>>>>> NodeCache. It ends up leaking a Watcher reference. I've had a
quick
>>>> look
>>>>>>> through, but I haven't dived in in any detail. It's the
>>>>>>> WatcherRemovalManager stuff I think. If you've got time, can
you
>>>> have a
>>>>>>> look? If not, let me know and I'll do some more digging.
>>>>>>> 
>>>>>>> cheers
>>>>>>> 
>>>>>>> On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie <
>>>>>> mckenzie.cam@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Thanks Scott.
>>>>>>>> 
>>>>>>>> Push the fix to master and merge it into 3.0.
>>>>>>>> 
>>>>>>>> Then I guess, I'll push new versions of 2.11 and 3.2 onto
Nexus.
>>>>>>>> cheers
>>>>>>>> 
>>>>>>>> On Thu, May 26, 2016 at 11:44 AM, Scott Blum <dragonsinth@gmail.com
>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Alright, I have a fix, but it wants to be applied to
both master
>>>> and
>>>>>> 3.0.
>>>>>>>>> Where should I push the fix?
>>>>>>>>> 
>>>>>>>>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie <
>>>>>> mckenzie.cam@gmail.com
>>>>>>>>>> 
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Scott,
>>>>>>>>>> If you just checkout the CURATOR-3.0 branch, they
are failing
>>>> there.
>>>>>>>>>> cheers
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum <
>>>> dragonsinth@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Sure, what SHA are they failing at Cam?
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan Zimmerman
<
>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Scott can you take a look?
>>>>>>>>>>>> 
>>>>>>>>>>>> -Jordan
>>>>>>>>>>>> 
>>>>>>>>>>>>> On May 25, 2016, at 4:35 AM, Cameron
McKenzie <
>>>>>>>>>> mckenzie.cam@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Tree cache tests are still failing. I've
tried a few times but
>>>> no
>>>>>>>>>> love:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151
>>>>>>>>>>>> actual 6
>>>>>>>>>>>>> expected -31:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I will have a look into what's going
on in the morning. Given
>>>> that
>>>>>>>>>>> these
>>>>>>>>>>>>> may take some messing about to fix up,
do we just want to vote
>>>> on
>>>>>>>>>>> 2.11.0
>>>>>>>>>>>>> separately, as that is all ready to go?
>>>>>>>>>>>>> cheers
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Wed, May 25, 2016 at 5:34 PM, Jordan
Zimmerman <
>>>>>>>>>>>>> jordan@jordanzimmerman.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Great news. Thanks.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> ====================
>>>>>>>>>>>>>> Jordan Zimmerman
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On May 25, 2016, at 12:37 AM,
Cameron McKenzie <
>>>>>>>>>>> mckenzie.cam@gmail.com
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I have fixed up the test case,
all good now.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:45
PM, Cameron McKenzie <
>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Looks like it was introduced
with the schema validation
>>>> stuff.
>>>>>>>>> It
>>>>>>>>>>> now
>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>> an ACL check prior to the
backgrounding call. Because the
>>>> unit
>>>>>>>>>> test
>>>>>>>>>>>>>> uses a
>>>>>>>>>>>>>>>> bogus ACL provider it just
throws an exception
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>    final String adjustedPath
=
>>>>>>>>>>>>>>>> adjustPath(client.fixForNamespace(givenPath,
>>>>>>>>>>>>>> createMode.isSequential()));
>>>>>>>>>>>>>>>>    List<ACL> aclList
= acling.getAclList(adjustedPath);
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode,
>>>>>>>>>>>>>> data,
>>>>>>>>>>>>>>>> aclList);
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>    String returnPath = null;
>>>>>>>>>>>>>>>>    if ( backgrounding.inBackground()
)
>>>>>>>>>>>>>>>>    {
>>>>>>>>>>>>>>>>        pathInBackground(adjustedPath,
data, givenPath);
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> So, I guess the answer is
to get the test to force a failure
>>>>>>>>> in a
>>>>>>>>>>>>>>>> different way. With the UnhandledErrorListener,
the
>>>>>>>>> expectation is
>>>>>>>>>>>> that
>>>>>>>>>>>>>>>> this only gets called on
backgrounding operations?
>>>>>>>>>>>>>>>> cheers
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:39
PM, Cameron McKenzie <
>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Same on the master branch,
but it passes there, so maybe
>>>>>>>>>> something
>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>> legitimately broken the
test. Will let you know if I get
>>>>>>>>> stuck.
>>>>>>>>>>>>>>>>> cheers
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Wed, May 25, 2016
at 1:36 PM, Jordan Zimmerman <
>>>>>>>>>>>>>>>>> jordan@jordanzimmerman.com>
wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Let me know if you
need help.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> It might be a bad
merge. Have you compared it to the
>>>> master
>>>>>>>>>>> branch?
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> -JZ
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On May 24,
2016, at 10:31 PM, Cameron McKenzie <
>>>>>>>>>>>>>> mckenzie.cam@gmail.com>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Guys,
>>>>>>>>>>>>>>>>>>> There's a test
TestFrameworkBackground:testErrorListener
>>>>>>>>> that
>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> failing
>>>>>>>>>>>>>>>>>>> consistently
on the CURATOR-3.0 branch.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I can't see how
it has ever worked. It seems to try and
>>>>>>>>> provoke
>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>> error
>>>>>>>>>>>>>>>>>>> via a bad ACL
provider.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> But this ACL
provider is called by the CreateBuilderImpl
>>>>>>>>> prior
>>>>>>>>>> to
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> backgrounding
call, which means that the exception that
>>>> it
>>>>>>>>>> throws
>>>>>>>>>>>>>>>>>> happens
>>>>>>>>>>>>>>>>>>> in the main Thread
of the unit test. So, it just throws
>>>> an
>>>>>>>>>>>>>>>>>>> UnsupportedOperationException
which is propogated up the
>>>>>>>>> stack
>>>>>>>>>> at
>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>> point the unit
test fails.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> So, I will look
at fixing this, but I just don't
>>>> understand
>>>>>>>>> how
>>>>>>>>>>> it
>>>>>>>>>>>>>> ever
>>>>>>>>>>>>>>>>>>> worked?
>>>>>>>>>>>>>>>>>>> cheers
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>> 


Mime
View raw message