lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Sorting
Date Tue, 05 Aug 2008 20:18:59 GMT
Your sort object has no relation to your query in terms of fields. You can
search on anything you want, but construct your Sort object with your
untokenized field.

Best
Erick

On Tue, Aug 5, 2008 at 4:12 PM, Andre Rubin <andre.rubin@gmail.com> wrote:

> Sounds like a good alternative. But how do I perform the search on the
> tokenized filed and sort on the un_tokenized one?
>
> Thanks,
>
>
> Andre
>
> On Tue, Aug 5, 2008 at 12:51 PM,  <Robert.Hastings@ancept.com> wrote:
> > This is what I did and it works fine.  My untokenized fields where named:
> > "__AMSUNTOK__" + fieldName.
> > Where fieldName was the name of the tokenized field.
> >
> >
> > Bob Hastings
> > Ancept Inc.
> >
> >
> >
> >
> > Mark Miller <markrmiller@gmail.com>
> > 08/05/2008 02:38 PM
> > Please respond to
> > java-user@lucene.apache.org
> >
> >
> > To
> > java-user@lucene.apache.org
> > cc
> >
> > Subject
> > Re: Sorting
> >
> >
> >
> >
> >
> >
> > Hey Andre,
> >
> > The reason the javadoc says the field should not be tokenized stems from
> > the issue you point out. What you want to do is possible of course, but
> > making the Lucene code change would complicate a process that can be
> > quite memory and cpu intensive on large collections. Done right, it
> > might make a good patch though.
> >
> > A compromise that you can make outside of the Lucene code is to index a
> > separate field with the same contents but untokenized. Sorting on this
> > field instead, Lucene will treat "North Carolina" as one token and sort
> > as you'd expect. The downside to this approach is that you will have to
> > juggle the two fields in the future.
> >
> > - Mark
> >
> > Andre Rubin wrote:
> >> Hi there!
> >>
> >> I'm new to Lucene, so forgive any misconceptions on my part.
> >>
> >> I created an Index and now I want to search on it based on a field.
> >> The field is a String field and Field.Store.YES and
> >> Field.Index.TOKENIZED. No problems with the search.
> >>
> >> Now, I wanted to sort the results, and according to the Sort javadoc
> >> the field "should not be tokenized". But I decided to try it anyway,
> >> and it worked. However, the results showed that the tokens were
> >> sorted, not the full string in the field.
> >>
> >> Just to make myself more clear, here's an example. Let's say I have
> >> these strings indexed:
> >>
> >> "North Carolina"
> >> "British Columbia"
> >> "Canada"
> >>
> >> Now I search (with sort) for the token 'c*'
> >>
> >> The result I get is (sorted by the token found):
> >>
> >> 1) Canada
> >> 2) North Carolina
> >> 3) British Columbia
> >>
> >> The result I wanted was (sorted by the whole String)"
> >>
> >> 1) British Columbia
> >> 2) Canada
> >> 3) North Carolina
> >>
> >> Is there a way to do this?
> >>
> >>
> >> Another option would be to sort the index itself, since this field is
> >> the only field that we'd be searching on. But I'm just guessing here,
> >> cause I have no idea if this is possible at all!
> >>
> >> Thanks,
> >>
> >>
> >> Andre
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message