lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Case-sensitive search
Date Mon, 22 Aug 2005 20:32:54 GMT

On Aug 22, 2005, at 11:43 AM, Rajesh Munavalli wrote:

> At query time I was thinking of two queries ORed together: one with
> the user-entered query and the other a case-insensitive query. For
> example:
>
> The user query "Java Virtual machine" would be translated into
> "Java Virtual machine" OR "java virtual machine".
>
> Even though the user mistyped the case ("machine" instead of
> "Machine"), the query would retrieve documents. I am not sure about
> the performance, though. Erik would be the right person to help us
> understand the performance constraints in doing so.

Performance issues aside (as they aren't really a factor in this  
equation), I still don't understand what good ORing queries with  
various cases does in the scenario you describe.  If you're going to  
index lower case, and then OR in the lowercase, then there is really  
no need to even have the case-sensitive parts there in the index or  
in the query.

     Erik
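
The separate-fields technique mentioned earlier in this thread can be
sketched roughly as follows. This is a toy in-memory stand-in in plain
Java, not Lucene API code: the class DualCaseIndex and its methods add
and search are hypothetical names chosen for illustration, and the
whitespace tokenizer is an assumption. It handles single terms only,
not phrase queries.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for one index with two fields (hypothetical, not Lucene):
// "exact" keeps the original case, "folded" is lowercased at index time,
// mimicking the effect of a different analyzer per field.
public class DualCaseIndex {
    private final Map<String, Set<Integer>> exact = new HashMap<>();
    private final Map<String, Set<Integer>> folded = new HashMap<>();

    // Index every whitespace-separated token of the text into both fields.
    public void add(int docId, String text) {
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            exact.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            folded.computeIfAbsent(token.toLowerCase(Locale.ROOT),
                                   k -> new TreeSet<>()).add(docId);
        }
    }

    // Pick the field at query time; lowercase the term only for the folded field.
    public Set<Integer> search(String term, boolean caseSensitive) {
        Map<String, Set<Integer>> field = caseSensitive ? exact : folded;
        String key = caseSensitive ? term : term.toLowerCase(Locale.ROOT);
        return field.getOrDefault(key, Collections.emptySet());
    }

    public static void main(String[] args) {
        DualCaseIndex index = new DualCaseIndex();
        index.add(1, "Java Virtual Machine");
        index.add(2, "java virtual machine");
        System.out.println(index.search("Machine", true));   // prints [1]
        System.out.println(index.search("Machine", false));  // prints [1, 2]
    }
}
```

Against the folded field, "Machine" and "machine" reduce to the same
key, so ORing differently cased variants of a query adds nothing there;
the toggle comes from choosing which field to search.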



>
> Rajesh Munavalli
>
>
>> -----Original Message-----
>> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
>> Sent: Monday, August 22, 2005 10:20 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: Case-sensitive search
>>
>>
>> On Aug 22, 2005, at 11:10 AM, Rajesh Munavalli wrote:
>>
>>> You could also treat the case-sensitive and case-insensitive forms
>>> as synonyms and index them at the same position. This would be
>>> helpful in phrase queries.
>>>
>>
>> You wouldn't be able to selectively toggle between
>> case-sensitive and -insensitive searching this way, though,
>> so I'm not sure if there is merit in doing both cases in the
>> same position or not.
>>
>>      Erik
>>
>>
>>
>>>
>>> Rajesh Munavalli
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
>>>> Sent: Monday, August 22, 2005 10:04 AM
>>>> To: java-user@lucene.apache.org
>>>> Subject: Re: Case-sensitive search
>>>>
>>>>
>>>> On Aug 22, 2005, at 10:40 AM, tareque@controldocs.com wrote:
>>>>
>>>>
>>>>> Is there any way to index as case-sensitive and then, while
>>>>> searching, make the search case-sensitive or case-insensitive
>>>>> using the same index as needed?
>>>>>
>>>>>
>>>>
>>>> Not really.  Terms in the index are ordered lexicographically,
>>>> including case.  It certainly would be possible to write customized
>>>> Query subclasses to do this sort of thing at the expense of
>>>> performance.
>>>>
>>>> The only techniques I'm aware of are to either build separate
>>>> indexes or index the same information into separate fields of the
>>>> same documents using different analyzers per field.
>>>>
>>>>      Erik
>>>>
>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>



