lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doron Cohen (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-3501) random sampler is not random (and so facet SamplingWrapperTest occasionally fails)
Date Sun, 09 Oct 2011 15:41:30 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Doron Cohen updated LUCENE-3501:
--------------------------------

    Attachment: LUCENE-3501.patch

Before applying this patch should do:
{noformat}
svn mv modules/facet/src/java/org/apache/lucene/facet/util/RandomSample.java modules/facet/src/java/org/apache/lucene/facet/search/sampling/RepeatableSampler.java
{noformat}

I looked at this, and also discussed with Gilad, and it seems that:

* The test is broken as it claims to do N trials in case of failure but it does not, because
its try/catch does not catch AssertionError, and so only one trial is attempted. (Few trials
make sense because with sampling, there is always a possibility that the selected sample set
of docs would not contain the "correct" best facets even with a high over sampling ratio (over
sampling means that for the selected set of docs more best facets are collected).

* Even after fixing the test to actually try more than once, it still fails, because there
is no randomness in RandomSample...  surprising but true.

In this patch:
* Sampler made an abstract class.
* RandomSample renamed to RepeatableSampler which extends RandomSampler.
* RandomSampler was added - it too extends Sampler - this is a simple *random* implementation,
which is now the default (used by default in SamplingWrapperAccumulator).
* The test randomly selects between the two sampler implementations. If you want to see the
behavior that created the bug, remove that latter randomness by setting to false the variable
*useRandomSampler* of *BaseSampleTestTopK.testCountUsingSamping()*.

I think this is ready to commit. 
Wasn't sure though, where should the Changes entry go?
                
> random sampler is not random (and so facet SamplingWrapperTest occasionally fails)
> ----------------------------------------------------------------------------------
>
>                 Key: LUCENE-3501
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3501
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/facet
>            Reporter: Doron Cohen
>            Assignee: Doron Cohen
>            Priority: Minor
>         Attachments: LUCENE-3501.patch
>
>
> RandomSample is not random at all:
> It does not even import java.util.Random, and its behavior is deterministic.
> in addition, the test testCountUsingSamping() never retries as it was supposed to (for
taking care of the hoped-for randomness).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message