lucene-java-user mailing list archives

From Tomoko Uchida <tomoko.uchida.1...@gmail.com>
Subject Re: Question about usage of LuceneTestCase
Date Wed, 22 Aug 2018 15:13:15 GMT
Can I ask one more question?

4> If Mike's intuition is right that it's one of the file system
randomizations that occasionally gets hit _and_ you determine that that's
an invalid test case (and for Luke, requiring that the FS-based tests are
all that are necessary may be fine), I'm pretty sure you can disable
that randomization for your specific tests.

As you may know, Luke calls relatively low-level Lucene APIs (such as
o.a.l.i.IndexCommit or SegmentInfos) to show commit points, segment files,
etc. (The "Commits" tab does this.)
I am not sure when we could or should disable randomization; could you
give me any pointers on this? Real test cases that disable randomization
would also be helpful; I will search the Lucene/Solr code base.

Thanks,
Tomoko
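
P.S. As a note on the seed behaviour Erick describes in the quoted message below, here is a minimal sketch using plain java.util.Random. It is only an illustration under assumptions: SeedDemo and draw are hypothetical names, and the Lucene test framework's random() is a framework-managed generator driven by tests.seed, not this code. The point it shows is just that the same seed yields the same sequence of draws.

```java
import java.util.Random;

// Hypothetical demo class, not part of Lucene or Luke.
final class SeedDemo {

    // Makes the same kind of draws as in Erick's example:
    // two ints and one boolean, from a generator created with the given seed.
    static String draw(long seed) {
        Random r = new Random(seed);
        int first = r.nextInt(100);
        int second = r.nextInt(100);
        boolean flag = r.nextBoolean();
        return first + "," + second + "," + flag;
    }

    public static void main(String[] args) {
        // Same seed twice: the two lines printed are identical.
        System.out.println("seed 1:       " + draw(1L));
        System.out.println("seed 1 again: " + draw(1L));
        // A different seed starts a different sequence.
        System.out.println("seed 2:       " + draw(2L));
    }
}
```

The concrete numbers printed depend on java.util.Random and will not match the made-up values in Erick's example; only the repeatability for a fixed seed is the point.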

On Wed, Aug 22, 2018 at 21:58, Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:

> Thanks for your kind explanations,
>
> Sorry, of course I know what a randomization seed is,
> but your description and instructions are exactly what I wanted.
>
> > The randomization can cause different
> > combinations of "stuff" to happen. Say the locale is randomized to
> > Turkish and a token is also randomly generated that breaks _only_ with
> > that combination. You'd never explicitly be able to test all of those
> > kinds of combinations, thus the random() function. And there may be
> > many calls to random() by the time a test is run.
>
> My understanding at this point (though it may just repeat your words)
> is that we should first find out the combinations behind the failures.
> If there are any particular patterns, there could be bugs, so we should
> fix them.
>
> Thanks,
> Tomoko
>
> On Wed, Aug 22, 2018 at 14:59, Erick Erickson <erickerickson@gmail.com> wrote:
>
>> The pseudo-random generator in the Lucene test framework is used to
>> randomize lots of test conditions. We're talking about the file system
>> implementation here, but there are lots of others. Whenever you see a
>> call to random().whatever, that's a call to the framework's method.
>>
>> But here's the thing. The randomization can cause different
>> combinations of "stuff" to happen. Say the locale is randomized to
>> Turkish and a token is also randomly generated that breaks _only_ with
>> that combination. You'd never explicitly be able to test all of those
>> kinds of combinations, thus the random() function. And there may be
>> many calls to random() by the time a test is run.
>>
>> Here's the key. When "seeded" with the same number, the calls to
>> random() produce the exact same output every time. So say with seed 1 I
>> get
>> nextInt() - 1
>> nextInt() - 67
>> nextBool() - true
>>
>> Whenever I use 1 as the seed, I'll get exactly the above. However, if
>> I use 2 as a seed, I might get
>> nextInt() - 93
>> nextInt() - 63
>> nextBool() - false
>>
>> So the short form is
>>
>> 1. randomization is used to try out various combinations.
>>
>> 2. using a particular seed guarantees that the randomization is
>> repeatable.
>>
>> 3.  when a test fails with a particular seed, running the test with
>> the _same_ seed will produce the same conditions, hopefully allowing
>> that particular error resulting from that particular combination to be
>> reproduced reliably (and fixed).
>>
>> 4. at least that's the theory and in practice it works quite well.
>> There is no _guarantee_ that the test will fail using the same seed,
>> sometimes the failures are a result of subtle timing etc, which is not
>> under control of the randomization. I breathe a sigh of relief,
>> though, when a test _does_ reproduce with a particular seed 'cause
>> then I have a hope of knowing the issue is actually fixed ;).
>>
>>
>> Best,
>> Erick
>>
>> On Tue, Aug 21, 2018 at 3:56 PM, Tomoko Uchida
>> <tomoko.uchida.1111@gmail.com> wrote:
>> > Thanks a lot for your information & insights,
>> >
>> > I will try to reproduce the errors and investigate the results.
>> > And maybe I should learn more about the internals of the test framework;
>> > I'm not familiar with it and still do not understand what "seed" means
>> > exactly in this context.
>> >
>> > Regards,
>> > Tomoko
>> >
>> > On Wed, Aug 22, 2018 at 1:05, Erick Erickson <erickerickson@gmail.com> wrote:
>> >
>> >> Couple of things (and I know you've been around for a while, so pardon
>> >> me if it's all old hat to you):
>> >>
>> >> 1> if you run the entire "reproduce with" line and can get a
>> >> consistent failure, then you are half way there, nothing is as
>> >> frustrating as not getting failures reliably. The critical bit is
>> >> often the -Dtests.seed. As Michael mentioned, there are various
>> >> randomizations done for _many_ things in Lucene tests using a random
>> >> generator.  tests.seed, well, seeds that generator so it produces the
>> >> same numbers every time it's run with that seed. You'll see lots of
>> >> calls to a static random() method. I'll add that if you want to
>> >> use randomness in your tests, use that method and do _not_ use a local
>> >> instance of Java's Random.
>> >>
>> >> 2> Mike: You say IntelliJ succeeds. But that'll use a new random()
>> >> seed. Once you run a test, in the upper right (on my version at
>> >> least), IntelliJ will show you a little box with the test name and you
>> >> can "edit configurations" on it. I often have luck by editing the
>> >> configuration and adding the test seed to the "VM option" box for the
>> >> test, just the "-Dtests.seed=35AF58F652536895" part. You can add all
>> >> of the -D flags in the "reproduce with" line if you want, but often
>> >> just the seed works for me. If that works and you track it down, do
>> >> remember to take that seed _out_ of the "VM options" box rather than
>> >> forget it as I have done ;)
>> >>
>> >> 3> Mark Miller's beasting script can be used to run a zillion tests
>> >> overnight: https://gist.github.com/markrmiller/dbdb792216dc98b018ad
>> >>
>> >> 4> If Mike's intuition is right that it's one of the file system
>> >> randomizations that occasionally gets hit _and_ you determine that that's
>> >> an invalid test case (and for Luke, requiring that the FS-based tests are
>> >> all that are necessary may be fine), I'm pretty sure you can disable
>> >> that randomization for your specific tests.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Tue, Aug 21, 2018 at 7:47 AM, Tomoko Uchida
>> >> <tomoko.uchida.1111@gmail.com> wrote:
>> >> > Hi, Mike
>> >> >
>> >> > Thanks for sharing your experiments.
>> >> >
>> >> >> CommitsImplTest.testListCommits
>> >> >> CommitsImplTest.testGetCommit_generation_notfound
>> >> >> CommitsImplTest.testGetSegments
>> >> >> DocumentsImplTest.testGetDocumentFIelds
>> >> >
>> >> > I also found that CommitsImplTest and DocumentsImplTest fail frequently;
>> >> > CommitsImplTest in particular is unhappy with the Lucene test framework
>> >> > (I pointed that out in my previous post).
>> >> >
>> >> >> I wonder if this is somehow related to running mvn from command line vs
>> >> >> running in IntelliJ since previously I was doing the latter
>> >> >
>> >> > In my personal experience, when I was running those suspicious tests on
>> >> > IntelliJ IDEA, they were always green - but I am not sure that `mvn test`
>> >> > is the cause.
>> >> >
>> >> > Thanks,
>> >> > Tomoko
>> >> >
>> >> > On Tue, Aug 21, 2018 at 22:53, Michael Sokolov <msokolov@gmail.com> wrote:
>> >> >
>> >> >> I was running these luke tests a bunch and found the following tests fail
>> >> >> intermittently; pretty frequently. Once I @Ignore them I can get a
>> >> >> consistent pass:
>> >> >>
>> >> >>
>> >> >> CommitsImplTest.testListCommits
>> >> >> CommitsImplTest.testGetCommit_generation_notfound
>> >> >> CommitsImplTest.testGetSegments
>> >> >> DocumentsImplTest.testGetDocumentFIelds
>> >> >>
>> >> >> I did not attempt to figure out why the tests were failing, but to do that,
>> >> >> I would:
>> >> >>
>> >> >> Run repeatedly until you get a failure -- save the test "seed" from this
>> >> >> run that should be printed out in the failure message. Then you should be
>> >> >> able to reliably reproduce this failure by re-running with system property
>> >> >> "tests.seed" set to that value. This is used to initialize the
>> >> >> randomization that LuceneTestCase does.
>> >> >>
>> >> >> My best guess is that the failures may have to do with randomly using some
>> >> >> Directory implementation or other Lucene feature that Luke doesn't properly
>> >> >> handle?
>> >> >>
>> >> >> Hmm I was trying this again to see if I could get an example, and strangely
>> >> >> these tests are no longer failing for me after several runs, when
>> >> >> previously they failed quite often. I wonder if this is somehow related to
>> >> >> running mvn from command line vs running in IntelliJ since previously I was
>> >> >> doing the latter
>> >> >>
>> >> >> -Mike
>> >> >>
>> >> >> On Tue, Aug 21, 2018 at 9:01 AM Tomoko Uchida <tomoko.uchida.1111@gmail.com> wrote:
>> >> >>
>> >> >> > Hello,
>> >> >> >
>> >> >> > Could you give me some advice or comments about usage of LuceneTestCase?
>> >> >> >
>> >> >> > Some of our unit tests extending LuceneTestCase fail with an assertion
>> >> >> > error -- sometimes, randomly.
>> >> >> > I suppose we use LuceneTestCase in an inappropriate way, but cannot find
>> >> >> > out how to fix it.
>> >> >> >
>> >> >> > Here is some information about failed tests.
>> >> >> >
>> >> >> >  * The full test code is here:
>> >> >> > https://github.com/DmitryKey/luke/blob/master/src/test/java/org/apache/lucene/luke/models/commits/CommitsImplTest.java
>> >> >> >  * We run tests by `mvn test` on Mac PC or Travis CI (oracle jdk8/9/10,
>> >> >> > openjdk 8/9/10); assertion errors occur regardless of platform or jdk
>> >> >> > version.
>> >> >> >  * Stack trace of an assertion error is at the end of this mail.
>> >> >> >
>> >> >> > Any advice is appreciated. Please tell me if more information is needed.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Tomoko
>> >> >> >
>> >> >> >
>> >> >> > -------------------------------------------------------
>> >> >> >  T E S T S
>> >> >> > -------------------------------------------------------
>> >> >> > Running org.apache.lucene.luke.models.commits.CommitsImplTest
>> >> >> > NOTE: reproduce with: ant test  -Dtestcase=CommitsImplTest -Dtests.method=testGetSegmentAttributes -Dtests.seed=35AF58F652536895 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=de -Dtests.timezone=Africa/Kigali -Dtests.asserts=true -Dtests.file.encoding=UTF-8
>> >> >> > NOTE: leaving temporary files on disk at: /private/var/folders/xr/mrs6w1m15y1f4wkgfhn_x1dm0000gp/T/lucene.luke.models.commits.CommitsImplTest_35AF58F652536895-001
>> >> >> > NOTE: test params are: codec=HighCompressionCompressingStoredFields(storedFieldsFormat=CompressingStoredFieldsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=6, maxDocsPerChunk=7, blockSize=2), termVectorsFormat=CompressingTermVectorsFormat(compressionMode=HIGH_COMPRESSION, chunkSize=6, blockSize=2)), sim=RandomSimilarity(queryNorm=true): {}, locale=de, timezone=Africa/Kigali
>> >> >> > NOTE: Mac OS X 10.13.6 x86_64/Oracle Corporation 1.8.0_181 (64-bit)/cpus=4,threads=1,free=201929064,total=257425408
>> >> >> > NOTE: All tests run in this JVM: [CommitsImplTest]
>> >> >> > Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.44 sec <<< FAILURE!
>> >> >> > testGetSegmentAttributes(org.apache.lucene.luke.models.commits.CommitsImplTest) Time elapsed: 0.047 sec  <<< FAILURE!
>> >> >> > java.lang.AssertionError
>> >> >> > at __randomizedtesting.SeedInfo.seed([35AF58F652536895:AE37E8467BC01918]:0)
>> >> >> > at org.junit.Assert.fail(Assert.java:92)
>> >> >> > at org.junit.Assert.assertTrue(Assert.java:43)
>> >> >> > at org.junit.Assert.assertTrue(Assert.java:54)
>> >> >> > at org.apache.lucene.luke.models.commits.CommitsImplTest.testGetSegmentAttributes(CommitsImplTest.java:151)
>> >> >> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> >> >> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >> >> > at java.lang.reflect.Method.invoke(Method.java:498)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
>> >> >> > at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>> >> >> > at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>> >> >> > at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>> >> >> > at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>> >> >> > at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> >> >> > at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
>> >> >> > at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
>> >> >> > at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
>> >> >> > at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
>> >> >> > at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> >> >> > at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> >> >> > at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
>> >> >> > at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>> >> >> > at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>> >> >> > at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
>> >> >> > at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> >> >> > at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
>> >> >> > at java.lang.Thread.run(Thread.java:748)
>> >> >> >
>> >> >>
>> >> >
>> >> >
>> >> > --
>> >> > Tomoko Uchida
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >>
>> >>
>> >
>> > --
>> > Tomoko Uchida
>>
>
> --
> Tomoko Uchida
>


-- 
Tomoko Uchida
