lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Weiwei Wang <ww.wang...@gmail.com>
Subject Offset Problem
Date Mon, 14 Dec 2009 05:01:15 GMT
The offset is incorrect for PatternReplaceCharFilter so the hilighting
result is wrong.

How to fix it?

On Mon, Dec 14, 2009 at 11:43 AM, Weiwei Wang <ww.wang.cs@gmail.com> wrote:

> All solr souce downloaded, and I found PatternReplaceCharFilter is very
> useful for my project.
>
> Thanks
>
>
> On Mon, Dec 14, 2009 at 11:14 AM, Weiwei Wang <ww.wang.cs@gmail.com>wrote:
>
>> I need the source file not the patch file, where can i download it?
>>
>>
>> On Mon, Dec 14, 2009 at 1:15 AM, Koji Sekiguchi <koji@r.email.ne.jp>wrote:
>>
>>> Koji Sekiguchi wrote:
>>>
>>>> Paul Taylor wrote:
>>>>
>>>>> I want my search to treat 'No. 1' and 'No.1' the same, because in our
>>>>> context its one token I want 'No. 1' to become 'No.1',  I need to do
this
>>>>> before tokenizing because the tokenizer would split one value into two
terms
>>>>> and one into just one term. I already use a NormalizeMapFilter to map
&' to
>>>>> 'and' but I think it only takes literal text and I need to
>>>>>
>>>>> 1. be case insensitive (but lowercasefilter is only applied after
>>>>> tokenizing)
>>>>>
>>>>> 2. cope with all numbers e.g no. 109
>>>>>
>>>>> So I was going to subclass BaseCharFilter and do my matches with a
>>>>> regular expression like ([Nn]+[Oo]+\\.) ([0-9]+) but I'm struggling to
>>>>> understand the offset methods you have to do once you get a match. Has
>>>>> anyone already got a regular expression Charfilter OR am I approaching
this
>>>>> all wrong
>>>>>
>>>>> thanks Paul
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>>
>>>>>  Hi Paul,
>>>>
>>>> I've written a patch for this kind of purpose. See:
>>>>
>>>> https://issues.apache.org/jira/browse/SOLR-1653
>>>>
>>>> Koji
>>>>
>>>>  Oops. I thought this is solr-user list, but it was java-user. :-D
>>>
>>>
>>> Koji
>>>
>>> --
>>> http://www.rondhuit.com/en/
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>> --
>> Weiwei Wang
>> Alex Wang
>> 王巍巍
>> Room 403, Mengmin Wei Building
>> Computer Science Department
>> Gulou Campus of Nanjing University
>> Nanjing, P.R.China, 210093
>>
>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>
>
>
>
> --
> Weiwei Wang
> Alex Wang
> 王巍巍
> Room 403, Mengmin Wei Building
> Computer Science Department
> Gulou Campus of Nanjing University
> Nanjing, P.R.China, 210093
>
> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>



-- 
Weiwei Wang
Alex Wang
王巍巍
Room 403, Mengmin Wei Building
Computer Science Department
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Homepage: http://cs.nju.edu.cn/rl/weiweiwang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message