lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doron Cohen (Updated) (JIRA)" <>
Subject [jira] [Updated] (LUCENE-3501) random sampler is not random (and so facet SamplingWrapperTest occasionally fails)
Date Sun, 09 Oct 2011 15:41:30 GMT


Doron Cohen updated LUCENE-3501:

    Attachment: LUCENE-3501.patch

Before applying this patch should do:
svn mv modules/facet/src/java/org/apache/lucene/facet/util/ modules/facet/src/java/org/apache/lucene/facet/search/sampling/

I looked at this, and also discussed with Gilad, and it seems that:

* The test is broken as it claims to do N trials in case of failure but it does not, because
its try/catch does not catch AssertionError, and so only one trial is attempted. (Few trials
make sense because with sampling, there is always a possibility that the selected sample set
of docs would not contain the "correct" best facets even with a high over sampling ratio (over
sampling means that for the selected set of docs more best facets are collected).

* Even after fixing the test to actually try more than once, it still fails, because there
is no randomness in RandomSample...  surprising but true.

In this patch:
* Sampler made an abstract class.
* RandomSample renamed to RepeatableSampler which extends RandomSampler.
* RandomSampler was added - it too extends Sampler - this is a simple *random* implementation,
which is now the default (used by default in SamplingWrapperAccumulator).
* The test randomly selects between the two sampler implementations. If you want to see the
behavior that created the bug, remove that latter randomness by setting to false the variable
*useRandomSampler* of *BaseSampleTestTopK.testCountUsingSamping()*.

I think this is ready to commit. 
Wasn't sure though, where should the Changes entry go?
> random sampler is not random (and so facet SamplingWrapperTest occasionally fails)
> ----------------------------------------------------------------------------------
>                 Key: LUCENE-3501
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/facet
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>         Attachments: LUCENE-3501.patch
> RandomSample is not random at all:
> It does not even import java.util.Random, and its behavior is deterministic.
> in addition, the test testCountUsingSamping() never retries as it was supposed to (for
taking care of the hoped-for randomness).

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:!default.jspa
For more information on JIRA, see:


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message