lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject content disappears in the index
Date Mon, 12 Nov 2012 13:19:17 GMT
Hi list,
a user reported wrong sorting of our search service running on solr.
While chasing this issue I traced it back through lucene into the index.
I have a text field for sorting (stored,indexed,tokenized,omitNorms,sortMissingLast)
and three docs with author names.

If I trace at org.apache.lucene.document.Document.add(IndexableField) while
indexing I can see all three author names added as field to each documents.

After searching with *:* for the three docs and doing a sort the sorting is wrong
because one of the author names is reduced to the first char, all other chars are lost.

So having the authors names (Alexander, Arslanagic, Brennmoen) indexed, the result
of sorting ascending is (Arslanagic, Alexander, Brennmoen) which is wrong.
But this happens because the author "Arslanagic" is reduced to "a" during indexing (???)
and if sorted "a" is before "alexander".

Currently I use 4.0 but have the same issue with 3.6.1.

Without tracing through tons of code:
- which is the last breakpoint for debugging to see the docs right before they go into the
index
- which is the first breakpoint for debugging to see the docs coming right out of the index

Regards
Bernd

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message