lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matthew Hall <mh...@informatics.jax.org>
Subject Re: Highligheter fails using JapaneseAnalyzer
Date Thu, 02 Jul 2009 12:59:33 GMT
Out of curiosity, when you try your other test string "aaa _bbb ccc" 
what do the token byte offsets show?

Matt

Mark Harwood wrote:
>
> On 1 Jul 2009, at 17:39, k.sayama wrote:
>
>> I could verify Token byte offsets
>>
>> The sytsem outputs
>> aaa:0:3
>> bbb:0:3
>> ccc:4:7
>>
>
> That explains the highlighter behaviour. Clearly BBB is not at 
> position 0-3 in the String you supplied
>
>>>> String CONTENTS = "AAA :BBB CCC";
>
> Looks like the Tokenizer needs fixing. Is this yours or a standard 
> Lucene class? If the latter, raising a JIRA bug with a Junit test 
> would be the best way to get things moving.
>
>
> Cheers
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


-- 
Matthew Hall
Software Engineer
Mouse Genome Informatics
mhall@informatics.jax.org
(207) 288-6012



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message