lucene-java-user mailing list archives

From ito hayato <hayato_1...@yahoo.co.jp>
Subject Unexpected highlighted text
Date Mon, 06 Apr 2009 11:02:32 GMT
Hi All, 
My name is Hayato.

I have a question about the Highlighter.

I indexed the following text with the following tokenizer:

text     : abracadabra
Tokenizer: NGramAnalyzer

and ran the following query:

query    : ab

The expected and actual results are as follows:

expected : <B>ab</B>racad<B>ab</B>ra
actual   : <B>abracadab</B>ra
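
To make the offsets concrete, here is a minimal plain-Java sketch (no Lucene involved) that lists the 2-grams of "abracadabra" and where "ab" occurs. The class name BigramOffsets and the method offsetsOf are hypothetical names chosen for illustration. It only approximates what a 2-gram tokenizer would emit and is not the NGramTokenizer implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it does not use or mirror Lucene classes.
public class BigramOffsets {

    /** Returns the {start, end} offsets of every n-gram of length n in text that equals term. */
    public static List<int[]> offsetsOf(String text, String term, int n) {
        List<int[]> hits = new ArrayList<>();
        for (int start = 0; start + n <= text.length(); start++) {
            int end = start + n; // end offset is exclusive
            if (text.substring(start, end).equals(term)) {
                hits.add(new int[] {start, end});
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "abracadabra";

        // Print every 2-gram with its offsets; each one overlaps its neighbour by one character.
        for (int start = 0; start + 2 <= text.length(); start++) {
            System.out.println(text.substring(start, start + 2)
                    + " [" + start + "," + (start + 2) + ")");
        }

        // Print only the 2-grams that equal the query term "ab": [0,2) and [7,9).
        for (int[] hit : offsetsOf(text, "ab", 2)) {
            System.out.println("match [" + hit[0] + "," + hit[1] + ")");
        }
    }
}
```

The two matches sit at [0,2) and [7,9), while the actual highlight covers [0,9), that is, from the start of the first match to the end of the last one. Every 2-gram in between overlaps its neighbours, so one guess (an unverified assumption, not something confirmed against the Highlighter source) is that the overlapping tokens are being treated as a single group.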


To be more specific, I tried the test case below, but it fails.

Is this behavior expected?
If so, please explain why the Highlighter produces this result.

-----------------
    // Analyzer that splits the input into character n-grams of length minGram..maxGram.
    public static class NGramAnalyzer extends Analyzer {
        int minGram;

        int maxGram;

        public NGramAnalyzer(int minGram, int maxGram) {
            super();
            this.maxGram = maxGram;
            this.minGram = minGram;
        }

        public TokenStream tokenStream(String fieldName, java.io.Reader reader) {
            return new NGramTokenizer(reader, minGram, maxGram);
        }
    }

    @Test
    public void testGetBestTextFragments2() throws IOException, ParseException {
        String CONTENT = "abracadabra";
        String QUERY_STRING = "ab";
        String F = "f";

        // Tokenize the content into 2-grams only.
        Analyzer analyzer = new NGramAnalyzer(2, 2);
        TokenStream tokenStream = analyzer.tokenStream(F, new StringReader(CONTENT));

        QueryParser qp = new QueryParser(F, analyzer);
        Query query = qp.parse(QUERY_STRING);

        Scorer scorer = new QueryScorer(query, F);
        Highlighter h = new Highlighter(scorer);

        // Expect each occurrence of "ab" to be highlighted separately.
        Assert.assertEquals("<B>ab</B>racad<B>ab</B>ra",
            h.getBestTextFragments(tokenStream, CONTENT, true, 10)[0].markedUpText.toString());
    }

