lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: IllegalArgumentException: docID must be >= 0 and < maxDoc=48736112 (got docID=2147483647)
Date Sat, 30 May 2015 14:06:47 GMT
Hi Robert,

Great info. I prevented corner cases in similarities 
that were producing NaN or Negative Infinity scores.

All is well with -ea now.

Thanks,
Ahmet



On Friday, May 29, 2015 3:32 PM, Robert Muir <rcmuir@gmail.com> wrote:
Hi Ahmet,

Its due to the use of sentinel values by your collector in its
priority queue by default.

TopScoreDocCollector warns about this, and if you turn on assertions
(-ea) you will hit them in your tests:

* <p><b>NOTE</b>: The values {@link Float#NaN} and
* {@link Float#NEGATIVE_INFINITY} are not valid scores.  This
* collector will not properly collect hits with such
* scores.
*/
public abstract class TopScoreDocCollector extends TopDocsCollector<ScoreDoc> {

I don't think a fix is simple, I only know of the following ideas:
* somehow sneaky use of NaN as sentinels instead of -Inf, to allow
-Inf to be used. It seems a bit scary!
* remove the sentinels optimization. I am not sure if collectors could
easily have the same performance without them.

To me, such scores seem always undesirable and only bugs, and the
current assertions are a good tradeoff.


On Fri, May 29, 2015 at 8:18 AM, Ahmet Arslan <iorixxx@yahoo.com.invalid> wrote:
> Hello List,
>
> When a similarity returns NEGATIVE_INFINITY, hits[i].doc becomes 2147483647.
> Thus, exception is thrown in the following code:
>
> for (int i = 0; i < hits.length; i++) {
> int docId = hits[i].doc;
> Document doc = searcher.doc(docId);
> }
>
> I know it is an awkward to return infinity (comes from log(0)), but exception looks like
equally
> awkward and uniformative.
>
> Do you think is this something improvable? Can we do better handling here?
>
> Thanks,
> Ahmet
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org

>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message