lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doron Cohen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3390) Incorrect sort by Numeric values for documents missing the sorting field
Date Tue, 20 Sep 2011 10:03:09 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108490#comment-13108490
] 

Doron Cohen commented on LUCENE-3390:
-------------------------------------

Hi Uwe, thanks for catching this. 
I agree that this is a bug, and needs to be fixed.
Just to make sure that we agree on what the problem is, let me describe it again: in current
3x code in setNextReader() we extract the values from the cache, e.g. by {code}FieldCache.DEFAULT.getDoubles(reader,
field, parser);{code} and, if a missing value was set, we iterate the unvalued docs and set
them to that missing value. However this settings takes place at the same array just obtained
from the cache, and so this is (1) inefficient as it will happen again in the next sort with
same field, (2) incorrect as if two sorts of *same* field have different missing value they
will collide, and (3) unsafe as you indicated.
I was very happy with the reuse of the cache for caching the missing values so I would like
to try to solve this with that "frame"... More later...

> Incorrect sort by Numeric values for documents missing the sorting field
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-3390
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3390
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 3.3
>            Reporter: Gilad Barkai
>            Assignee: Doron Cohen
>            Priority: Minor
>              Labels: double, float, int, long, numeric, sort
>             Fix For: 3.4
>
>         Attachments: LUCENE-3390-fix-like-trunk.patch, LUCENE-3390-fix-like-trunk.patch,
LUCENE-3390-fix-like-trunk.patch, LUCENE-3390.patch, SortByDouble.java
>
>
> While sorting results over a numeric field, documents which do not contain a value for
the sorting field seem to get 0 (ZERO) value in the sort. (Tested against Double, Float, Int
& Long numeric fields ascending and descending order).
> This behavior is unexpected, as zero is "comparable" to the rest of the values. A better
solution would either be allowing the user to define such a "non-value" default, or always
bring those document results as the last ones.
> Example scenario:
> Adding 3 documents, 1st with value 3.5d, 2nd with -10d, and 3rd without any value.
> Searching with MatchAllDocsQuery, with sort over that field in descending order yields
the docid results of 0, 2, 1.
> Asking for the top 2 documents brings the document without any value as the 2nd result
- which seems as a bug?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message