From: Paul Libbrecht
To: java-user@lucene.apache.org
Subject: trying to use the highlighter
Date: Fri, 3 Sep 2010 14:05:53 +0200
Message-Id: <725AEC83-D03D-41D2-B3B7-5CF670809994@activemath.org>

Hello list,

I'm struggling again with the highlighter. I don't understand why I sporadically get an InvalidTokenOffsetsException.

The mission: given a query, detect which field was matched among the names of the concepts. There can be several names for a given concept, even within one language. Concepts are documents, and names are stored in fields called name-xx, where xx is the two-letter language code.

Here's the method I'm using:

    public String computeMatchedField(int docNum, Document doc, Analyzer analyzer, Query query) throws IOException {
        //System.out.println("----- computing matched field for query " + query + " on document " + doc.get("uri"));
        query = query.rewrite(this.reader);
        String found = null;
        float maxScore = 0;
        try {
            for(Field f: (List<Field>) doc.getFields()) {
                QueryScorer scorer = new QueryScorer(query,reader,f.name());
                if(!f.name().startsWith("name-")) continue;
                //System.out.println("Measuring field " + f.name() + ": " + f.stringValue());
                String text = f.stringValue();
                TokenStream tokenStream = TokenSources.getAnyTokenStream(reader,docNum, f.name(), doc, analyzer);
                SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
                Highlighter highlighter = new Highlighter(htmlFormatter, scorer);
                TextFragment[] frags = highlighter.getBestTextFragments(tokenStream, text, false, 1);
                if(frags==null || frags.length==0) continue;
                float score = frags[0].getScore();
                //System.out.println("Score: " + score);
                if(score > maxScore) {
                    maxScore = score;
                    found = frags[0].toString();
                }
            }
        } catch(Exception ex) {ex.printStackTrace();}
        return found;
    }

Unfortunately, I have to catch InvalidTokenOffsetsException, which happens sometimes but not always.
When it occurs, it stops the highlighting (the detected field is "null") and also costs quite some time.

What am I doing wrong?
I tried building my own TokenStream, and it made no difference.

Thanks in advance,

Paul
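
One thing that may be worth checking, offered only as a hypothesis and not confirmed anywhere in this post: the Highlighter rejects a token whose start or end offset lies beyond the length of the text it is given. If a document has several name-xx fields with the same name, the token stream obtained by field name may have been built from a different value than the f.stringValue() passed as text. The sketch below uses no Lucene classes at all. OffsetCheckSketch, Span, tokenize and exceedsText are hypothetical names made up for illustration; it only shows how offsets computed on one string can fall outside another.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetCheckSketch {

    /** Character span of one token, relative to the text it was cut from. */
    public static final class Span {
        public final int start;
        public final int end;
        public Span(int start, int end) { this.start = start; this.end = end; }
    }

    /** Minimal whitespace tokenizer (a stand-in for a real analyzer). */
    public static List<Span> tokenize(String text) {
        List<Span> spans = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int start = i;
            // Consume one token.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > start) spans.add(new Span(start, i));
        }
        return spans;
    }

    /** True if any span points past the end of the given text. */
    public static boolean exceedsText(List<Span> spans, String text) {
        for (Span s : spans) {
            if (s.start > text.length() || s.end > text.length()) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Two hypothetical values stored under the same field name.
        String shortName = "circle";
        String longName = "circumference of a disc";

        // Tokens and text come from the same value: offsets fit.
        System.out.println(exceedsText(tokenize(longName), longName));  // prints false

        // Tokens from one value, text from the other: offsets overflow.
        System.out.println(exceedsText(tokenize(longName), shortName)); // prints true
    }
}
```

If this is the cause, it would also explain why the exception is sporadic: it would only show up for documents where the mismatched value happens to be shorter than the tokens' offsets.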