Date: Mon, 9 Mar 2009 11:00:53 -0700
Message-ID: <380bd16a0903091100m567f7f98w35e5d5d140362294@mail.gmail.com>
Subject: Re: Highlighter with Field.Store.NO
From: Ben Martz
To: lucene-net-user@incubator.apache.org

I use the Highlighter class in a shipping product in which I do not store
values in the index. Instead, I load the contents independently from my own
cache and pass them to Highlighter.GetBestFragments(). The only disadvantage
is that, depending on the size of your contents and the speed of your
contents cache, this can make highlighting a very expensive operation, so pay
careful attention to how and when you load your contents data.

On Mon, Mar 9, 2009 at 8:14 AM, Pál Barnabás wrote:

> Hi,
> I'm trying to highlight the keyword in the search result.
> This is my code:
> ------------------------------------------------------------------
> string indexdir = @"D:\temp\index_testing";
> if (System.IO.Directory.Exists(indexdir))
>     System.IO.Directory.Delete(indexdir, true);
>
> IndexWriter writer = new IndexWriter(indexdir,
>     new Lucene.Net.Analysis.Standard.StandardAnalyzer(), true);
> // demo text
> string scontent = "First, we parse the user-entered query string indicating that we want to match ...";
>
> for (int i = 0; i < 100; i++)
> {
>     Document doc = new Document();
>
>     doc.Add(new Field("ID", i.ToString(), Field.Store.YES,
>         Field.Index.UN_TOKENIZED));
>     doc.Add(new Field("CONTENT", scontent, Field.Store.YES,
>         Field.Index.TOKENIZED));
>
>     writer.AddDocument(doc);
> }
>
> writer.Close();
>
> IndexReader reader = IndexReader.Open(indexdir);
> Searcher searcher = new IndexSearcher(reader);
> Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
>
> MultiFieldQueryParser parser = new MultiFieldQueryParser(
>     new string[] { "CONTENT" }, analyzer);
>
> Query query = parser.Parse("indicating");
> query = query.Rewrite(reader);
> Trace.WriteLine("Searching for: " + query.ToString());
>
> Lucene.Net.Search.Hits hits = searcher.Search(query);
>
> SimpleHTMLFormatter formatter = new SimpleHTMLFormatter(" class='term'>", "");
>
> QueryScorer scorer = new QueryScorer(query);
>
> Highlighter highlighter = new Highlighter(formatter, scorer);
> highlighter.SetTextFragmenter(new SimpleFragmenter(2000));
>
> for (int i = 0; i < hits.Length(); i++)
> {
>     Document resdoc = hits.Doc(i);
>
>     string s = resdoc.Get("CONTENT");
>     // s is null if Field.Store is NO
>     TokenStream tsTitle = analyzer.TokenStream("CONTENT",
>         new System.IO.StringReader(s));
>     string hl = highlighter.GetBestFragment(tsTitle, s);
> }
> ------------------------------------------------------------------
>
> The problem is when the content is not stored in the index (Field.Store.NO), the
> result document does not contain the value. Is it possible to use the
> Highlighter class in this case? Or what's the best way to highlight the
> search result? Is it possible to get all tokens for hits.Doc(i)?
>

-- 
13:37 - Someone stole the precinct toilet. The cops have nothing to go on.
14:37 - Officers dispatched to a daycare where a three-year-old was resisting a rest.
21:11 - Hole found in nudist camp wall. Officers are looking into it.