lucene-java-user mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Unexpected highlighted text
Date Mon, 06 Apr 2009 13:41:33 GMT
This problem is filed at:

https://issues.apache.org/jira/browse/LUCENE-1489

You may want to take a look at LUCENE-1522 for highlighting N-gram tokens:

https://issues.apache.org/jira/browse/LUCENE-1522

Koji
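
For readers following the thread, here is a rough, Lucene-free sketch of why
the two results in the quoted message differ. It is a simplified model, not
the Highlighter's actual code: the class and method names (NGramHighlightSketch,
highlightPerMatch, highlightMergedGroups) are made up for illustration, and it
assumes that tokens with overlapping offsets are merged into one group which is
then marked from the first matching token to the last matching token. With
2-grams every token overlaps its neighbour, so the whole text becomes one group.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramHighlightSketch {

    /** A token plus its character offsets, as an n-gram tokenizer would report them. */
    static final class Gram {
        final String text;
        final int start;
        final int end;

        Gram(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Emits every 2-character window of s, in positional order. */
    static List<Gram> bigrams(String s) {
        List<Gram> out = new ArrayList<>();
        for (int i = 0; i + 2 <= s.length(); i++) {
            out.add(new Gram(s.substring(i, i + 2), i, i + 2));
        }
        return out;
    }

    /** What the poster expected: each matching gram is wrapped on its own. */
    static String highlightPerMatch(String text, String term) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (Gram g : bigrams(text)) {
            if (g.text.equals(term) && g.start >= pos) {
                sb.append(text, pos, g.start)
                  .append("<B>").append(text, g.start, g.end).append("</B>");
                pos = g.end;
            }
        }
        return sb.append(text.substring(pos)).toString();
    }

    /**
     * Assumed grouping model: overlapping tokens share one group, and the
     * group is marked from its first matching token to its last matching token.
     */
    static String highlightMergedGroups(String text, String term) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        int groupEnd = -1;                 // end offset of the current group
        int matchStart = -1, matchEnd = -1; // matched span inside the group
        for (Gram g : bigrams(text)) {
            if (groupEnd >= 0 && g.start >= groupEnd) {
                // Token does not overlap the group: emit the group, start a new one.
                pos = flush(sb, text, pos, matchStart, matchEnd);
                groupEnd = -1;
                matchStart = -1;
                matchEnd = -1;
            }
            groupEnd = Math.max(groupEnd, g.end);
            if (g.text.equals(term)) {
                if (matchStart < 0) {
                    matchStart = g.start;
                }
                matchEnd = Math.max(matchEnd, g.end);
            }
        }
        pos = flush(sb, text, pos, matchStart, matchEnd);
        return sb.append(text.substring(pos)).toString();
    }

    /** Writes the pending matched span, if any, and returns the new position. */
    private static int flush(StringBuilder sb, String text, int pos,
                             int matchStart, int matchEnd) {
        if (matchStart < 0) {
            return pos;
        }
        sb.append(text, pos, matchStart)
          .append("<B>").append(text, matchStart, matchEnd).append("</B>");
        return matchEnd;
    }

    public static void main(String[] args) {
        System.out.println(highlightPerMatch("abracadabra", "ab"));
        System.out.println(highlightMergedGroups("abracadabra", "ab"));
    }
}
```

Under these assumptions the first method yields the "expect" string from the
quoted message and the second yields the "actual" string. See the two JIRA
issues above for the real analysis and the proposed fixes.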


ito hayato wrote:
> Hi All, 
> My name is Hayato.
>
> I have a question about Highlighter.
>
> I indexed the following text using the following tokenizer:
>
> text     : abracadabra
> Tokenizer: NGramAnalyzer
>
> and ran the following query:
>
> query    : ab
>
> The expected and actual results are as follows:
>
> expect   : <B>ab</B>racad<B>ab</B>ra
> actual   : <B>abracadab</B>ra
>
>
> To be more specific, I tried the test case below, but it failed.
>
> Is this behavior valid?
> If it is, please explain why I get this result.
>
> -----------------
>     public static class NGramAnalyzer extends Analyzer {
>         int minGram;
>
>         int maxGram;
>
>         public NGramAnalyzer(int minGram, int maxGram) {
>             super();
>             this.maxGram = maxGram;
>             this.minGram = minGram;
>         }
>
>         public TokenStream tokenStream(String fieldName, java.io.Reader reader) {
>             return new NGramTokenizer(reader, minGram, maxGram);
>         }
>     }
>
>     @Test
>     public void testGetBestTextFragments2() throws IOException, ParseException {
>         String CONTENT = "abracadabra";
>         String QUERY_STRING = "ab";
>         String F = "f";
>         Analyzer analyzer = new NGramAnalyzer(2,2);
>         TokenStream tokenStream = analyzer.tokenStream("f", new StringReader(CONTENT));
>
>         QueryParser qp = new QueryParser(F, analyzer);
>         Query query = null;
>         query = qp.parse(QUERY_STRING);
>
>         Scorer scorer = new QueryScorer(query, F);
>         Highlighter h = new Highlighter(scorer);
>
>         Assert.assertEquals("<B>ab</B>racad<B>ab</B>ra",
>             h.getBestTextFragments(tokenStream, CONTENT, true, 10)[0].markedUpText.toString());
>
>     }
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org