lucene-dev mailing list archives

From "James Dyer (JIRA)" <>
Subject [jira] [Commented] (SOLR-5344) SpellCheckCollatorTest.testEstimatedHitCounts fails in jenkins from time to time
Date Wed, 02 Nov 2016 17:56:58 GMT


James Dyer commented on SOLR-5344:

OK, I think I know what's going on here.  This feature is supposed to estimate hit counts for
spelling corrections for cases where the client doesn't care about the exact # of hits, only
that a particular collation, if re-queried, would return something.  To get estimates, you
tell it the max # of documents you would like it to collect before quitting.  It then estimates
how many hits it would have counted with this:

maximum-doc-id * number-of-docs-collected / (# visited docs + last-doc-id + 1)

In the failing test, we ask it to collect between 5 and 20 documents.  The max-doc-id is always
17 (there are 17 documents and no deletions).

But the denominator is controlled by the # of visited documents, and also by the doc id of the
one that happened to be visited last.  In the face of randomized testing and release-specific
index behavior, I think the best we can hope for is to bound the worst-case scenario, with an
estimate between 2 and 15.  The actual correct value is 8.
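
To make the sensitivity to the denominator concrete, here is a minimal sketch that transcribes
the formula above literally.  It is not the Solr source: the function name, parameter names, and
the sample inputs are assumptions for illustration only, and only the max-doc-id of 17 comes from
the failing test.

```python
def estimate_hits(max_doc_id, docs_collected, docs_visited, last_doc_id):
    """Literal transcription of the estimate formula quoted in this message.

    All parameter names are hypothetical; they mirror the terms in the formula.
    """
    denominator = docs_visited + last_doc_id + 1
    return max_doc_id * docs_collected / denominator


MAX_DOC_ID = 17  # from the failing test: 17 documents, no deletions

# Hypothetical inputs: the same number of collected docs, but collection
# stops at a different last doc id, so only the denominator changes.
early_stop = estimate_hits(MAX_DOC_ID, 5, 0, 4)
late_stop = estimate_hits(MAX_DOC_ID, 5, 0, 16)
print(early_stop, late_stop)
```

With the numerator held fixed, the estimate moves a lot depending on where collection happened to
stop, which is why a narrow assertion range is fragile under randomized index layouts.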

So unless there are objections, I am going to relax the requirement of 6 <= hits <= 10
and use 2 <= hits <= 15 instead.  Maybe we could do better than this, but I would think
anyone using this feature probably does not need to know more than whether or not hits can
be produced, or the relative # of hits among the several collations returned.

> SpellCheckCollatorTest.testEstimatedHitCounts fails in jenkins from time to time
> --------------------------------------------------------------------------------
>                 Key: SOLR-5344
>                 URL:
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Assignee: James Dyer
> Doesn't happen very often, but maybe one I can fix. I'll look into it.
