lucene-java-user mailing list archives

From "Jack Krupansky"
Subject Re: Grouping and tokens
Date Tue, 19 Feb 2013 07:21:16 GMT
Okay, so, fields that would normally need to be tokenized must be stored as 
both raw strings for grouping and tokenized text for keyword search. Simply 
use copyField to copy from one to the other.

-- Jack Krupansky
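
Editor's sketch of the advice above, as a toy model in plain Python. It is not the Lucene or Solr API. Note that copyField is a Solr schema directive; in plain Lucene the rough equivalent would be adding the same value to the document twice, under two field names, one analyzed and one not. The sample records (apart from the title mentioned in the thread) and the names build_indexes and search_grouped are made up for illustration. The point is only that keyword search runs against the tokenized copy, while grouping keys come from the untouched raw copy.

```python
from collections import defaultdict

# Made-up sample data; only the first title comes from the thread.
BOOKS = [
    {"title": "Fifty Shades of Grey", "genre": "Romance Fiction"},
    {"title": "Shades of Night", "genre": "Crime Fiction"},
    {"title": "Winter Garden", "genre": "Romance Fiction"},
    {"title": "Darker Shades", "genre": "Romance Fiction"},
]


def tokenize(value):
    """Very rough stand-in for an analyzer: lowercase and split on spaces."""
    return value.lower().split()


def build_indexes(docs):
    """Keep two copies of every field: tokenized (search) and raw (grouping)."""
    tokens = defaultdict(set)  # (field, token) -> set of doc ids
    raw = {}                   # (doc id, field) -> original, untokenized value
    for doc_id, doc in enumerate(docs):
        for field, value in doc.items():
            raw[(doc_id, field)] = value
            for token in tokenize(value):
                tokens[(field, token)].add(doc_id)
    return tokens, raw


def search_grouped(tokens, raw, docs, field, term, group_by):
    """Match `term` against the tokenized copy, group hits by the raw copy."""
    groups = defaultdict(list)
    for doc_id in sorted(tokens.get((field, term.lower()), ())):
        groups[raw[(doc_id, group_by)]].append(docs[doc_id]["title"])
    return dict(groups)


tokens, raw = build_indexes(BOOKS)
result = search_grouped(tokens, raw, BOOKS, "title", "shades", "genre")
print(result)
```

The search for "shades" finds the title through its tokens, yet the group keys are whole genre strings such as "Romance Fiction" rather than single tokens like "fiction". The cost is the one raised in the original question: every groupable field is kept twice.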

-----Original Message----- 
From: Ramprakash Ramamoorthy
Sent: Monday, February 18, 2013 11:13 PM
Subject: Re: Grouping and tokens

On Mon, Feb 18, 2013 at 9:47 PM, Jack Krupansky wrote:

> Please clarify exactly what you want to group by - give a specific example
> that makes it clear what terms should affect grouping and which shouldn't.

Assume I am indexing library data. Say each book has the following
fields:
1. Published
2. Language
3. Genre
4. Author
5. Title

     At search time, the user can ask to group by any of the above
fields, which means none of them is supposed to be tokenized. As I
mentioned earlier, there is a book titled "Fifty shades of grey" and the
user searches for "shades". The book turns up in the results if the title
field is tokenized, but here it doesn't, since the field isn't tokenized.
Hope I am clear?

     In a nutshell, how do I use a groupby on a field that is also

> -- Jack Krupansky
>
> -----Original Message-----
> From: Ramprakash Ramamoorthy
> Sent: Monday, February 18, 2013 6:12 AM
> Subject: Grouping and tokens
>
> Hello all,
>     From the grouping javadoc, I read that fields that are to be
> grouped on should not be tokenized. I have a use case where the user has
> the freedom to group by any field at search time.
>     Since only non-tokenized fields are eligible for grouping, this is
> creating an issue with my search. Say for instance the book "Fifty shades
> of grey", when tokenized and searched for "shades", turns up in the
> result. However, this is not the case when I have it as a non-tokenized
> field (using StandardAnalyzer, Version 4.1).
>     How do I go about this? Is indexing a tokenized and a non-tokenized
> version of the same field the only way? I am afraid it's way too costly!
> Thanks in advance for your valuable inputs.
> --
> With Thanks and Regards,
> Ramprakash Ramamoorthy,
> India,
> +91 9626975420

With Thanks and Regards,
Ramprakash Ramamoorthy,
+91 9626975420 
