lucene-java-user mailing list archives

From "k.sayama" <sake-gin...@nifty.com>
Subject Re: Highligheter fails using JapaneseAnalyzer
Date Thu, 02 Jul 2009 22:28:18 GMT
Hi

The Tokenizer is not a standard Lucene class.
To obtain startOffset and endOffset correctly, I edited the Tokenizer.
It is operating correctly now.

I want to verify more patterns. 
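
One way to check more patterns is to compare each token's reported offsets against the original text. The following is only a minimal sketch in plain Java, not Lucene code: OffsetCheck, Tok and mismatches are hypothetical names invented for illustration, and it assumes the analyzer does nothing to a term beyond lower-casing it. It is fed the three offsets that were reported earlier in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Lucene): checks reported token
// offsets against the text they are supposed to point into.
class OffsetCheck {

    // Minimal stand-in for a token: term text plus [start, end) offsets.
    static final class Tok {
        final String term;
        final int start;
        final int end;

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    // Returns one description per token whose offsets do not select its own text.
    static List<String> mismatches(String content, List<Tok> tokens) {
        List<String> problems = new ArrayList<>();
        for (Tok t : tokens) {
            boolean inRange = 0 <= t.start && t.start <= t.end
                    && t.end <= content.length();
            String covered = inRange ? content.substring(t.start, t.end) : null;
            // Assumption: terms are only lower-cased, so compare ignoring case.
            if (covered == null || !covered.equalsIgnoreCase(t.term)) {
                String shown = covered == null
                        ? "<out of range>" : "\"" + covered + "\"";
                problems.add(t.term + ":" + t.start + ":" + t.end
                        + " covers " + shown);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String contents = "AAA :BBB CCC";
        // Offsets as printed before the Tokenizer was fixed.
        List<Tok> reported = Arrays.asList(
                new Tok("aaa", 0, 3),
                new Tok("bbb", 0, 3),
                new Tok("ccc", 4, 7));
        for (String problem : mismatches(contents, reported)) {
            System.out.println(problem);
        }
        // prints:
        // bbb:0:3 covers "AAA"
        // ccc:4:7 covers ":BB"
    }
}
```

With an analyzer that stems or otherwise rewrites terms, the term text will not equal the covered substring, so the comparison would need to be relaxed (for example, checking only that the offsets are in range and do not go backwards).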

Thanks

----- Original Message ----- 
From: "Mark Harwood" <markharw00d@yahoo.co.uk>
To: <java-user@lucene.apache.org>
Sent: Thursday, July 02, 2009 6:25 AM
Subject: Re: Highligheter fails using JapaneseAnalyzer


> 
> On 1 Jul 2009, at 17:39, k.sayama wrote:
> 
>> I could verify Token byte offsets
>>
>> The system outputs
>> aaa:0:3
>> bbb:0:3
>> ccc:4:7
>>
> 
> That explains the highlighter behaviour. Clearly BBB is not at  
> position 0-3 in the String you supplied
> 
>>>> String CONTENTS = "AAA :BBB CCC";
> 
> Looks like the Tokenizer needs fixing. Is this yours or a standard  
> Lucene class? If the latter, raising a JIRA bug with a Junit test  
> would be the best way to get things moving.
> 
> 
> Cheers
> Mark
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
>


