lucene-solr-user mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Make syntax highlighter caseinsensitive
Date Fri, 25 Feb 2011 14:02:35 GMT
(11/02/25 18:30), Tarjei Huse wrote:
> Hi,
> On 02/25/2011 02:06 AM, Koji Sekiguchi wrote:
>> (11/02/24 20:18), Tarjei Huse wrote:
>>> Hi,
>>>
>>> I have an index with two fields, body and caseInsensitiveBody.
>>> Body is indexed and stored, while caseInsensitiveBody is only indexed.
>>>
>>> The idea is that by not storing the caseInsensitiveBody I save some
>>> space and gain some performance. So I query against the
>>> caseInsensitiveBody and generate highlighting from the case sensitive
>>> one.
>>>
>>> The problem is that, as a result, I am missing highlighted terms. For
>>> example, when I search for solr and get a match in caseInsensitiveBody,
>>> but the original document contains Solr, no highlighting
>>> is done.
>>>
>>> Is there a way around this? Currently I am using the following
>>> highlighting params:
>>>           'hl' =>   'on',
>>>           'hl.fl' =>   'header,body',
>>>           'hl.usePhraseHighlighter' =>   'true',
>>>           'hl.highlightMultiTerm' =>   'true',
>>>           'hl.fragsize' =>   200,
>>>           'hl.regex.pattern' =>   '[-\w ,/\n\"\']{20,200}',
>>
>> Tarjei,
>>
>> Maybe a silly question, but why don't you make the body field case insensitive
>> and eliminate the caseInsensitiveBody field, and then query and highlight on
>> just the body field?
> Not silly. I need to support usage scenarios where case matters as well
> as scenarios where case doesn't matter.
>
> The best part would be if I could use one field for this, store it and
> handle case sensitivity in the query phase, but as I understand it, that
> is not possible.

Hi Tarjei,

If I understand correctly, you want to highlight in a case-insensitive way.
If so, it is easy. Do you have:

body: indexed but not stored
caseInsensitiveBody: indexed and stored

and then request hl.fl=caseInsensitiveBody?
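
A minimal sketch of what such a request could look like, assuming the field layout above (caseInsensitiveBody indexed and stored). The base URL, the function name build_highlight_query, and the way the query string is assembled are illustrative assumptions, not taken from the thread; the highlighting parameters are the ones quoted in the original message, with hl.fl pointed at caseInsensitiveBody.

```python
from urllib.parse import urlencode


def build_highlight_query(user_query, base_url="http://localhost:8983/solr/select"):
    """Build a Solr select URL that queries and highlights on the same field.

    base_url is a hypothetical local Solr endpoint; adjust it for your setup.
    """
    params = {
        # Query the case-insensitive field (assumed to be indexed and stored).
        "q": f"caseInsensitiveBody:{user_query}",
        "hl": "on",
        # Highlight on the same field that was queried, so the analyzed
        # terms line up with the text being highlighted.
        "hl.fl": "caseInsensitiveBody",
        "hl.usePhraseHighlighter": "true",
        "hl.highlightMultiTerm": "true",
        "hl.fragsize": 200,
    }
    return base_url + "?" + urlencode(params)


url = build_highlight_query("solr")
print(url)
```

The point of the sketch is only that the field named in hl.fl is the same field that is searched, and that this field is stored so the highlighter has text to work on.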

Koji
-- 
http://www.rondhuit.com/en/
