lucene-lucene-net-user mailing list archives

From Ben Martz <benma...@gmail.com>
Subject Re: Highlighter with Field.Store.NO
Date Mon, 09 Mar 2009 18:00:53 GMT
I use the Highlighter class in a shipping product in which I do not store
field values in the index. Instead, I load the contents independently from
my own cache and pass them to Highlighter.GetBestFragments(). The only
disadvantage is that, depending on the size of your contents and the speed
of your contents cache, this can make highlighting a very expensive
operation, so pay careful attention to how and when you load your contents
data.
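
A minimal sketch of that approach, assuming the Lucene.Net 2.x API used in
the quoted code below. The contentCache dictionary is a hypothetical stand-in
for whatever external store holds your text, keyed here by the stored "ID"
field; the class and method names are also only illustrative.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Highlight; // assumption: namespace of the 2.x contrib Highlighter
using Lucene.Net.Search;

public static class CachedHighlighting
{
    // contentCache is hypothetical: any lookup from the stored "ID" value
    // to the original text (database, file cache, ...) would do.
    public static List<string[]> HighlightHits(
        Hits hits, Highlighter highlighter, Analyzer analyzer,
        IDictionary<string, string> contentCache, int maxFragments)
    {
        List<string[]> results = new List<string[]>();
        for (int i = 0; i < hits.Length(); i++)
        {
            Document doc = hits.Doc(i);
            // "ID" is indexed with Field.Store.YES, so it is still available
            // even when "CONTENT" uses Field.Store.NO.
            string id = doc.Get("ID");

            string text;
            if (id == null || !contentCache.TryGetValue(id, out text) || text == null)
            {
                results.Add(new string[0]); // nothing to highlight for this hit
                continue;
            }

            // Re-analyze the externally loaded text with the same analyzer
            // that was used at index time.
            TokenStream tokens = analyzer.TokenStream("CONTENT", new StringReader(text));
            results.Add(highlighter.GetBestFragments(tokens, text, maxFragments));
        }
        return results;
    }
}
```

Each hit costs one cache lookup plus one re-analysis of the full text, so it
is usually worth highlighting only the hits you actually display.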

On Mon, Mar 9, 2009 at 8:14 AM, Pál Barnabás <pbarni@gmail.com> wrote:

> Hi,
> I'm trying to highlight the keyword in the search result.
> This is my code:
> ------------------------------------------------------------------
> string indexdir = @"D:\temp\index_testing";
> if (System.IO.Directory.Exists(indexdir))
>     System.IO.Directory.Delete(indexdir, true);
>
> IndexWriter writer = new IndexWriter(indexdir,
>     new Lucene.Net.Analysis.Standard.StandardAnalyzer(), true);
> // demo text
> string scontent = "First, we parse the user-entered query string indicating that we want to match ...";
>
> for (int i = 0; i < 100; i++)
> {
>     Document doc = new Document();
>
>     doc.Add(new Field("ID", i.ToString(), Field.Store.YES,
>         Field.Index.UN_TOKENIZED));
>     doc.Add(new Field("CONTENT", scontent, Field.Store.YES,
>         Field.Index.TOKENIZED));
>
>     writer.AddDocument(doc);
> }
>
> writer.Close();
>
> IndexReader reader = IndexReader.Open(indexdir);
> Searcher searcher = new IndexSearcher(reader);
> Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
>
> MultiFieldQueryParser parser = new MultiFieldQueryParser(
>     new string[] { "CONTENT" }, analyzer);
>
> Query query = parser.Parse("indicating");
> query = query.Rewrite(reader);
> Trace.WriteLine("Searching for: " + query.ToString());
>
> Lucene.Net.Search.Hits hits = searcher.Search(query);
>
> SimpleHTMLFormatter formatter =
>     new SimpleHTMLFormatter("<b class='term'>", "</b>");
>
> QueryScorer scorer = new QueryScorer(query);
>
> Highlighter highlighter = new Highlighter(formatter, scorer);
> highlighter.SetTextFragmenter(new SimpleFragmenter(2000));
>
> for (int i = 0; i < hits.Length(); i++)
> {
>     Document resdoc = hits.Doc(i);
>
>     string s = resdoc.Get("CONTENT");
>     // s is null if Field.Store is NO
>     TokenStream tsTitle = analyzer.TokenStream("CONTENT",
>         new System.IO.StringReader(s));
>     string hl = highlighter.GetBestFragment(tsTitle, s);
> }
> ------------------------------------------------------------------
>
> The problem is that when the content is not stored in the index
> (Field.Store.NO), the result document does not contain the value. Is it
> possible to use the Highlighter class in this case? Or what is the best
> way to highlight the search result? Is it possible to get all tokens for
> hits.Doc(i)?
>
