lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Storing same field twice (analyzed+not-analyzed), sorting
Date Fri, 27 Apr 2012 12:08:16 GMT
Hmmm, putting the analyzed and unanalyzed values in
the same field seems difficult to get right. In the Solr
world, the usual approach is two separate fields, one
analyzed and one not.

Sorting is right out; the results are unpredictable. What does
it mean to sort on a field with multiple tokens? For a doc
whose field contains both "aardvark" and "zebra", where
should it fall in the result list?

If you're sorting, it's best to use a single value per doc.
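
To make the ambiguity concrete, here is a minimal plain-Java sketch. It does not use Lucene at all; the class name MultiTokenSortDemo, the toy Doc type, and the two ordering rules are made up for illustration and are not how Lucene picks a sort value internally. It only shows that a doc carrying both "aardvark" and "zebra" lands first or last depending on which token happens to be used as its sort key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class MultiTokenSortDemo {

    /** Toy document (hypothetical): an id plus the terms indexed under the sort field. */
    static final class Doc {
        final String id;
        final List<String> terms;

        Doc(String id, String... terms) {
            this.id = id;
            this.terms = Arrays.asList(terms);
        }
    }

    /** Orders docs by the alphabetically smallest term of each doc; returns the ids. */
    static List<String> orderBySmallestTerm(List<Doc> docs) {
        return order(docs, Comparator.comparing((Doc d) -> Collections.min(d.terms)));
    }

    /** Orders docs by the alphabetically largest term of each doc; returns the ids. */
    static List<String> orderByLargestTerm(List<Doc> docs) {
        return order(docs, Comparator.comparing((Doc d) -> Collections.max(d.terms)));
    }

    /** Sorts a copy of the list with the given comparator and collects the ids. */
    private static List<String> order(List<Doc> docs, Comparator<Doc> cmp) {
        List<Doc> copy = new ArrayList<>(docs);
        copy.sort(cmp);
        List<String> ids = new ArrayList<>();
        for (Doc d : copy) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("A", "aardvark", "zebra"), // two tokens in the sort field
                new Doc("B", "mongoose"),          // single value
                new Doc("C", "badger"));           // single value

        System.out.println(orderBySmallestTerm(docs)); // prints [A, C, B]
        System.out.println(orderByLargestTerm(docs));  // prints [C, B, A]
    }
}
```

Docs B and C keep the same relative order under both rules because each has exactly one value; only the multi-token doc A moves, which is why a separate single-valued field is the safer thing to sort on.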

Best
Erick

On Fri, Apr 27, 2012 at 6:17 AM, Francisco A. Lozano <flozano@gmail.com> wrote:
> Hi,
>
> I'm storing a field twice, once analyzed and once not analyzed, so
> that I can query both by terms and by exact keyword:
>
>     // Analyzed version
>     d.add(new Field(key, value, Store.NO, Index.ANALYZED,
>                     TermVector.YES));
>     // Not-analyzed version
>     d.add(new Field(key, value, Store.NO, Index.NOT_ANALYZED));
>
> My first question is whether this is expected to cause problems or
> whether it's OK.
>
> The problem is that I'm getting strange results when sorting: most of
> the documents seem correctly sorted, but some of them appear at the
> end. Am I doing something wrong?
>
> Francisco A. Lozano
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
