On Wed, Jun 24, 2009 at 3:32 PM, Marcel Reutegger <marcel.reutegger@gmx.net> wrote:
Hi,

On Wed, Jun 24, 2009 at 14:47, Ard Schrijvers<a.schrijvers@onehippo.com> wrote:
> Hello,
>
> I am confused regarding the order of search results (jr 1.5.2 core).
> First of all, I have configured <param name="respectDocumentOrder"
> value="false"/>.
>
> Thus, I would expect that if I do something like:
>
> //*[jcr:contains(.,'foo')]
>
> that my ordering would be by @jcr:score (lucene score) descending.

no, it just means that the result nodes aren't necessarily in document order.


What do you mean, 'no'? I said I would expect it, and IMO it makes sense: if you do not specify an order in Lucene, you get the highest scores back first.
 


> If
> I print the scores, they seem instead to be random. So, this is not
> what I would expect.
>
> Secondly, when I do
>
> //*[jcr:contains(.,'foo')]  order by @jcr:score
>
> it makes sense to get results back in descending score order.

why? that's contrary to how order by is defined.


Yeah, I understand, but I think the definition would have been much better if it had made an exception for the score: who would ever want the results with the lowest score first? Such queries must be really rare, IMO.
 


> Unfortunately, they are in ascending order. This seems to be in line
> with the spec, 6.6.3.5 Ordering Specifier ('If neither ascending nor
> descending is specified after a property name (or jcr:score(...)
> function), the default is ascending.'), but, it does not make sense
> for the jcr:score. It is strange.

no, it isn't :) it's just what you requested. order the result by
their score value in ascending order.

If I do not specify ascending or descending, I get the lowest score first. What I am trying to say is that this is a really odd default for score. I think that is quite straightforward. (Furthermore, though I must check it, if I recall correctly it is far more expensive for a Lucene query to return the lowest scores than the highest.)
 


> And beyond that, it will lead to
> really strange behavior in combination with setLimit, I think: IIUC,
> order by @jcr:score is just the default lucene scoring.

no, that's not correct. order by @jcr:score descending is the default
lucene scoring.

Sorry, I was not clear enough. What I am saying is that if, in SearchIndex, you do:

hits = searcher.search(query);
 
then, AFAIK, you get a Lucene Hits object sorted by score (descending, which makes sense as a default). For some reason, the nodes I get back are in shuffled order, and I don't understand why. Ordering by score would make a perfect default when respectDocumentOrder is set to false.



> This means, if
> I sort on 'order by @jcr:score' then, the first hit (lowest score)
> depends on my setLimit. If I do setLimit(1), I get the first
> authorized lucene hit, which has the highest score possible.

are you sure this is the case? if yes, then this is a bug and should
be fixed. it should return the least relevant node.

I was not sure; I only got there by reasoning. But now I see:

if (NameConstants.JCR_SCORE.equals(orderProps[i])) {
    // order on jcr:score does not use the natural order as
    // implemented in lucene. score ascending in lucene means that
    // higher scores are first. JCR specs that lower score values
    // are first.
    sortFields.add(new SortField(null, SortField.SCORE, orderSpecs[i]));
}


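So my claim is incorrect: the sort direction is handed to Lucene, so with an ascending order setLimit(1) returns the lowest-scoring hit, not the highest.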
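
To make that concrete, here is a minimal stand-alone sketch in plain Java. It uses no Jackrabbit or Lucene classes; Hit and topN are hypothetical names made up for illustration, and it assumes the limit is applied only after the hits have been sorted. Under that assumption, an ascending sort with a limit of 1 yields the least relevant hit, and a descending sort yields the most relevant one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ScoreOrderSketch {

    /** Hypothetical stand-in for a search hit: a node path plus its score. */
    public static final class Hit {
        public final String path;
        public final double score;

        public Hit(String path, double score) {
            this.path = path;
            this.score = score;
        }
    }

    /**
     * Sorts the hits by score and only then applies the limit.
     * ascending == true models "order by @jcr:score" (lowest score first);
     * ascending == false models "order by @jcr:score descending".
     */
    public static List<Hit> topN(List<Hit> hits, boolean ascending, int limit) {
        Comparator<Hit> byScore = Comparator.comparingDouble(h -> h.score);
        if (!ascending) {
            byScore = byScore.reversed();
        }
        List<Hit> sorted = new ArrayList<>(hits); // leave the input untouched
        sorted.sort(byScore);
        int end = Math.max(0, Math.min(limit, sorted.size()));
        return new ArrayList<>(sorted.subList(0, end));
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("/a", 0.9));
        hits.add(new Hit("/b", 0.1));
        hits.add(new Hit("/c", 0.5));

        System.out.println(topN(hits, true, 1).get(0).path);  // prints /b
        System.out.println(topN(hits, false, 1).get(0).path); // prints /a
    }
}
```

A very large limit does not change which hit comes first; it only changes how many hits follow it.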



> If I do
> setLimit(1000000000) I first get the lowest score as spec defaults to
> inverting (ascending) the lucene order. So the combination jcr:score
> which defaults to ascending, is not useful imo, and with a setLimit()
> returns quite unexpected results.
>
> I am not sure whether jsr-283 has some changes regarding this?

no, this is still the same. higher score value means more relevant,
hence you have to sort descending to get the most relevant first.

Perhaps it is a matter of taste. I would always have opted for the highest scores first as the default, and thus made an explicit exception for score.

Bottom line, I'll default to adding 'order by @jcr:score descending' when no order is specified :-))
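
A rough sketch of that workaround, in plain Java with no JCR dependency. DefaultScoreOrder and withDefaultOrder are hypothetical names, not part of Jackrabbit, and the check for an existing "order by" clause is only a textual heuristic (a string literal containing those words would fool it), so treat it as an illustration rather than a robust query rewriter.

```java
import java.util.regex.Pattern;

public class DefaultScoreOrder {

    /** Clause appended when the statement has no ordering of its own. */
    static final String DEFAULT_ORDER = " order by @jcr:score descending";

    /** Heuristic: the words "order by" anywhere in the statement. */
    private static final Pattern ORDER_BY = Pattern.compile("\\border\\s+by\\b");

    /**
     * Returns the trimmed XPath statement unchanged if it already seems to
     * contain an "order by" clause; otherwise appends the default ordering
     * so that the most relevant hits come first.
     */
    public static String withDefaultOrder(String xpath) {
        String trimmed = xpath.trim();
        if (ORDER_BY.matcher(trimmed).find()) {
            return trimmed;
        }
        return trimmed + DEFAULT_ORDER;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultOrder("//*[jcr:contains(.,'foo')]"));
        System.out.println(withDefaultOrder("//*[jcr:contains(.,'foo')] order by @jcr:score"));
    }
}
```

The resulting string would then be passed to the query manager as usual; a statement that already carries its own ordering is left alone.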

thx for your replies Marcel

Ard



regards
 marcel