lucene-java-user mailing list archives

From "k.sayama" <sake-gin...@nifty.com>
Subject Re: Highligheter fails using JapaneseAnalyzer
Date Wed, 01 Jul 2009 16:39:29 GMT
I was able to verify the Token byte offsets.

The system outputs
aaa:0:3
bbb:0:3
ccc:4:7

The offset is reset to 0 after the first token instead of continuing from it.

Is this a problem in the Analyzer, or in the Tokenizer?

----- Original Message ----- 
From: "mark harwood" <markharw00d@yahoo.co.uk>
To: <java-user@lucene.apache.org>
Sent: Thursday, July 02, 2009 12:55 AM
Subject: Re: Highligheter fails using JapaneseAnalyzer



>>How should I verify  it?

Make sure the Token.startOffset and endOffset properties of Tokens produced 
by your TokenStream correctly define the location of Token.termBuffer in the 
original text.
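
As a rough illustration of that check, here is a minimal sketch in plain Java with no Lucene dependency. It takes term:start:end triples such as those reported in this thread and tests whether each offset range of the original text really contains the term. The class name OffsetCheck and the method offsetsMatch are hypothetical names introduced only for this example, and the comparison ignores case on the assumption that the analyzer lowercases terms (the reported terms are lowercase while the text is uppercase).

```java
import java.util.Locale;

/** Hypothetical helper: checks reported token offsets against the original text. */
final class OffsetCheck {

    /** True if text[start, end) equals the term, ignoring case. */
    static boolean offsetsMatch(String text, String term, int start, int end) {
        if (start < 0 || end > text.length() || start > end) {
            return false; // offsets fall outside the text
        }
        return text.substring(start, end).equalsIgnoreCase(term);
    }

    public static void main(String[] args) {
        String contents = "AAA :BBB CCC";
        // term:start:end triples as reported in the thread
        String[] reported = { "aaa:0:3", "bbb:0:3", "ccc:4:7" };
        for (String line : reported) {
            String[] parts = line.split(":");
            String term = parts[0];
            int start = Integer.parseInt(parts[1]);
            int end = Integer.parseInt(parts[2]);
            boolean ok = offsetsMatch(contents, term, start, end);
            // Position where the term actually occurs in the text, for comparison
            int actual = contents.toLowerCase(Locale.ROOT).indexOf(term);
            System.out.println(term + " reported " + start + "-" + end
                    + (ok ? " OK" : " MISMATCH") + ", found at " + actual);
        }
    }
}
```

With the values from this thread, only aaa passes the check. In the text "AAA :BBB CCC" the terms bbb and ccc start at positions 5 and 9, not at 0 and 4 as reported.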



----- Original Message ----
From: k.sayama <sake-ginjyo@nifty.com>
To: java-user@lucene.apache.org
Sent: Wednesday, 1 July, 2009 16:13:17
Subject: Re: Highligheter fails using JapaneseAnalyzer

Sorry,
I don't know how to verify the Token byte offsets produced by JapaneseAnalyzer.
How should I verify them?

----- Original Message ----- 
From: "mark harwood" <markharw00d@yahoo.co.uk>
To: <java-user@lucene.apache.org>
Sent: Wednesday, July 01, 2009 11:31 PM
Subject: Re: Highligheter fails using JapaneseAnalyzer



Can you verify the Token byte offsets produced by this particular analyzer
are correct?



----- Original Message ----
From: k.sayama <sake-ginjyo@nifty.com>
To: java-user@lucene.apache.org
Sent: Wednesday, 1 July, 2009 15:22:37
Subject: Re: Highligheter fails using JapaneseAnalyzer

Hi,

I tested with SimpleAnalyzer, StandardAnalyzer, and CJKAnalyzer,
but the problem did not occur with any of them.

I think the problem is in JapaneseAnalyzer.
Can this problem be solved?

> Does the same thing happen when you use SimpleAnalyzer, or
> StandardAnalyzer?
>
> I have a sneaking suspicion that the : in your contents string is what's
> causing your issue here, as : is a reserved character that denotes a
> field specification. But I could be wrong.
>
> Try swapping analyzers, if you no longer have the same issue with
> Simple, try Standard. Assuming the same problem shows up there, I think
> you might need to do something about the :.
>
> Matt
>
> k.sayama wrote:
>> Hello.
>>
>> I've tried to highlight a string using Highlighter (2.4.1) and
>> JapaneseAnalyzer,
>> but the following code extract shows the problem:
>>
>> String F = "f";
>> String CONTENTS = "AAA :BBB CCC";
>> JapaneseAnalyzer analyzer = new JapaneseAnalyzer();
>> QueryParser qp = new QueryParser( F, analyzer );
>> Query query = qp.parse( "BBB" );
>> Highlighter h = new Highlighter( new QueryScorer( query, F ) );
>>
>> System.out.println( h.getBestFragment( analyzer, F, CONTENTS ) );
>>
>> The system outputs
>> <B>AAA</B> :BBB CCC
>>
>> When you change CONTENTS to "AAA _BBB CCC"
>> the system outputs
>>
>> AAA _<B>BBB</B> CCC
>>
>> Are there any problems?
>> Thanks in advance
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
>
> -- 
> Matthew Hall
> Software Engineer
> Mouse Genome Informatics
> mhall@informatics.jax.org
> (207) 288-6012
>
>
>