lucene-java-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Grouping and tokens
Date Tue, 19 Feb 2013 15:39:27 GMT
Well, you don't need to "store" both copies since they will be the same. 
They both need to be "indexed" (string form for grouping, text form for 
keyword search), but only one needs to be "stored".

-- Jack Krupansky
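
To make the "index twice, store once" idea concrete, here is a minimal plain-Java sketch. It is not Lucene code: the class `DualFieldSketch` and its methods are hypothetical names invented for illustration, and the tokenizer is a naive lowercase split on non-word characters. In Lucene 4.x the two copies would roughly correspond to a `StringField` (indexed as a single untokenized term) and a `TextField` (analyzed into tokens), with only one of them created with `Field.Store.YES`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Plain-Java illustration only; none of these names are Lucene API. */
final class DualFieldSketch {
    // "Stored" copy: the original field values, kept once per document.
    private final List<Map<String, String>> stored = new ArrayList<>();
    // Tokenized index: field -> lowercased token -> ids of matching documents.
    private final Map<String, Map<String, Set<Integer>>> tokenIndex = new HashMap<>();
    // Raw index: field -> exact, untokenized value -> ids of documents with that value.
    private final Map<String, Map<String, Set<Integer>>> rawIndex = new HashMap<>();

    /** Indexes every field twice (tokenized and raw) but stores each value once. */
    int add(Map<String, String> fields) {
        int docId = stored.size();
        stored.add(new HashMap<>(fields));
        for (Map.Entry<String, String> e : fields.entrySet()) {
            String field = e.getKey();
            String value = e.getValue();
            // Naive stand-in for an analyzer: lowercase, split on non-word characters.
            for (String token : value.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    post(tokenIndex, field, token, docId);
                }
            }
            // The raw copy keeps the whole value as a single key.
            post(rawIndex, field, value, docId);
        }
        return docId;
    }

    private static void post(Map<String, Map<String, Set<Integer>>> index,
                             String field, String key, int docId) {
        index.computeIfAbsent(field, f -> new HashMap<>())
             .computeIfAbsent(key, k -> new TreeSet<>())
             .add(docId);
    }

    /** Keyword search against the tokenized copy of a field. */
    Set<Integer> search(String field, String word) {
        return tokenIndex
                .getOrDefault(field, Collections.emptyMap())
                .getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    /** Groups hits by the raw copy of a field, so multi-word values stay whole. */
    Map<String, List<Integer>> groupBy(Set<Integer> hits, String field) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        Map<String, Set<Integer>> byValue =
                rawIndex.getOrDefault(field, Collections.emptyMap());
        for (Map.Entry<String, Set<Integer>> e : byValue.entrySet()) {
            List<Integer> members = new ArrayList<>();
            for (int id : e.getValue()) {
                if (hits.contains(id)) {
                    members.add(id);
                }
            }
            if (!members.isEmpty()) {
                groups.put(e.getKey(), members);
            }
        }
        return groups;
    }

    /** Returns the single stored copy of a field value. */
    String stored(int docId, String field) {
        return stored.get(docId).get(field);
    }
}
```

With a book whose title is "Fifty shades of grey", `search("title", "shades")` finds the document through the tokenized copy, while `groupBy(hits, "title")` keys the group on the whole title through the raw copy. The index holds two sets of postings per field, but the original value is kept only once.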

-----Original Message----- 
From: Ramprakash Ramamoorthy
Sent: Tuesday, February 19, 2013 1:07 AM
To: java-user@lucene.apache.org
Subject: Re: Grouping and tokens

On Tue, Feb 19, 2013 at 12:57 PM, Jack Krupansky 
<jack@basetechnology.com> wrote:

> Oops, sorry for the "Solr" answer. In Lucene you need to simply index the
> same value, once as a raw string and a second time as a tokenized text
> field. Grouping would use the raw string version of the data.
>
Yeah, thanks Jack. I was just wondering whether there is a better alternative
to storing the value twice, but I don't see one. Thanks again.

> -- Jack Krupansky
>
> -----Original Message----- From: Jack Krupansky
> Sent: Monday, February 18, 2013 11:21 PM
>
> To: java-user@lucene.apache.org
> Subject: Re: Grouping and tokens
>
> Okay, so, fields that would normally need to be tokenized must be stored as
> both raw strings for grouping and tokenized text for keyword search. Simply
> use copyField to copy from one to the other.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Ramprakash Ramamoorthy
> Sent: Monday, February 18, 2013 11:13 PM
> To: java-user@lucene.apache.org
> Subject: Re: Grouping and tokens
>
> On Mon, Feb 18, 2013 at 9:47 PM, Jack Krupansky
> <jack@basetechnology.com> wrote:
>
>> Please clarify exactly what you want to group by - give a specific example
>> that makes it clear what terms should affect grouping and which shouldn't.
>>
>>
> Assume I am indexing library data. Say there are the following fields for
> a particular book:
> 1. Published
> 2. Language
> 3. Genre
> 4. Author
> 5. Title
> 6. ISBN
>
>     At search time, the user can ask to group by any of the above fields,
> which means none of them is supposed to be tokenized. As I said earlier,
> there is a book titled "Fifty shades of grey" and the user searches for
> "shades". The result would turn up if the field were tokenized, but here it
> doesn't, since the field isn't tokenized. Hope I am clear?
>
>     In a nutshell, how do I group by a field that is also tokenized?
>
>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Ramprakash Ramamoorthy
>> Sent: Monday, February 18, 2013 6:12 AM
>> To: java-user@lucene.apache.org
>> Subject: Grouping and tokens
>>
>>
>> Hello all,
>>
>>     From the grouping javadoc, I read that fields that are supposed to be
>> grouped should not be tokenized. I have a use case where the user has the
>> freedom to group by any field at search time.
>>
>>     Since only non-tokenized fields are eligible for grouping, this is
>> creating an issue with my search. Say for instance the book "*Fifty shades
>> of grey*", when tokenized and searched for "*shades*", turns up in the
>> result. However, this is not the case when I have it as a non-tokenized
>> field (using StandardAnalyzer, Version 4.1).
>>
>>     How do I go about this? Is indexing a tokenized and a non-tokenized
>> version of the same field the only way? I am afraid it's way too costly!
>> Thanks in advance for your valuable inputs.
>>
>> --
>> With Thanks and Regards,
>> Ramprakash Ramamoorthy,
>> India,
>> +91 9626975420
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>
>
> --
> With Thanks and Regards,
> Ramprakash Ramamoorthy,
> India.
> +91 9626975420
>
>
>


-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
India,
+91 9626975420 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

