Subject: RE: Problem searching in the same sentence
Date: Thu, 30 Sep 2010 11:43:25 +0530
From: "Jagdish Vasani IN"

For highlighting to work, you need to store position information for each token. So when creating the field, call the following constructor:

    Field field = new Field(fieldName, validFieldValue,
        (store) ? Field.Store.YES : Field.Store.NO,
        (tokenize) ? Field.Index.ANALYZED : Field.Index.NOT_ANALYZED,
        Field.TermVector.WITH_POSITIONS_OFFSETS);

Hope this solves your issue.

Thanks,
Jagdish

-----Original Message-----
From: Sirish Vadala [mailto:sirishreddy@gmail.com]
Sent: Thursday, September 30, 2010 5:51 AM
To: java-user@lucene.apache.org
Subject: Re: Problem searching in the same sentence

Hello All:

I am performing a sentence-specific phrase search by adding the text to the same field sentence by sentence, as suggested below. Everything works fine, but when I display my results, the highlighter is not able to find the search phrase.

The following is my code:

    SentenceScanner sentenceScanner =
        new SentenceScanner(doc.getText().replaceAll("\\s+", " "));
    ArrayList<String> sentencesList = sentenceScanner.getAllSentences();
    // Each sentence becomes a separate value of the same field.
    for (String sentence : sentencesList) {
        addFieldToDocument(document, IFIELD_TEXT, sentence, true, true);
    }

    private void addFieldToDocument(Document document, String fieldName,
            String fieldValue, Boolean store, Boolean tokenize) {
        String validFieldValue = Utility.validateString(fieldValue);
        Field field = new Field(fieldName, validFieldValue,
            (store) ? Field.Store.YES : Field.Store.NO,
            (tokenize) ? Field.Index.ANALYZED : Field.Index.NOT_ANALYZED);
        document.add(field);
    }

My custom standard analyzer:

    public class MyStandardAnalyzer extends StandardAnalyzer implements IndexFields {

        public MyStandardAnalyzer(Version matchVersion) {
            super(matchVersion);
        }

        // Widen the position gap between successive values of the text field.
        public int getPositionIncrementGap(String fieldName) {
            int incrementGap = super.getPositionIncrementGap(fieldName);
            if (fieldName.equals(IFIELD_TEXT)) {
                incrementGap += 10;
            }
            return incrementGap;
        }
    }

My highlighter code:

    // analyzer instantiated as 'MyStandardAnalyzer' in the constructor
    public String highlight(String text) {
        String highlightedText = "";
        TokenStream tokenStream =
            analyzer.tokenStream(IndexFields.IFIELD_TEXT, new StringReader(text));
        highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
        try {
            return highlighter.getBestFragments(tokenStream, text, maxFragments, delimiter);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return highlightedText;
    }

Everything works fine except for the highlighter: it does not return the text snippets when I retrieve the results. Before this sentence-specific implementation, it worked well.

Any hints or help on this would be highly appreciated.
--
View this message in context: http://lucene.472066.n3.nabble.com/Problem-searching-in-the-same-sentence-tp1501269p1605904.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
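
For anyone following the thread, here is a rough sketch of why the position increment gap in the analyzer above keeps phrase matches inside one sentence. This is not Lucene code: PositionGapSketch, index and near are hypothetical names, the tokenizer is a plain whitespace split, and the rule that the first token of a later value lands at (last position + 1 + gap) is a simplifying assumption about how positions are assigned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Simplified model (NOT Lucene code) of token positions when several
 * values are added to the same field with a position increment gap.
 */
public class PositionGapSketch {

    /** A term together with the position assigned to it. */
    public static final class Token {
        public final String term;
        public final int position;

        Token(String term, int position) {
            this.term = term;
            this.position = position;
        }
    }

    /**
     * Assigns positions to lower-cased, whitespace-separated terms.
     * Assumption: each later value starts 'gap' positions further on
     * than it would if the values were simply concatenated.
     */
    public static List<Token> index(List<String> sentences, int gap) {
        List<Token> tokens = new ArrayList<Token>();
        int next = 0;
        for (String sentence : sentences) {
            String trimmed = sentence.trim();
            if (trimmed.isEmpty()) {
                continue; // an empty value contributes no tokens and no gap
            }
            if (!tokens.isEmpty()) {
                next += gap; // skip 'gap' positions between field values
            }
            for (String term : trimmed.toLowerCase().split("\\s+")) {
                tokens.add(new Token(term, next++));
            }
        }
        return tokens;
    }

    /**
     * True if 'second' follows 'first' with at most 'slop' unused or
     * intervening positions between them (an in-order proximity check).
     */
    public static boolean near(List<Token> tokens, String first, String second, int slop) {
        for (Token a : tokens) {
            if (!a.term.equals(first)) {
                continue;
            }
            for (Token b : tokens) {
                int distance = b.position - a.position;
                if (b.term.equals(second) && distance >= 1 && distance - 1 <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("the court ruled", "the appeal failed");
        List<Token> withGap = index(sentences, 10);
        List<Token> noGap = index(sentences, 0);
        // "ruled the" straddles the sentence boundary.
        System.out.println("gap 10, slop 5: " + near(withGap, "ruled", "the", 5));
        System.out.println("gap 0,  slop 5: " + near(noGap, "ruled", "the", 5));
    }
}
```

In this model a proximity check only stays inside one sentence while the slop is smaller than the gap, so the gap of 10 above would need to grow if larger slops are used. Note also that re-analyzing one concatenated string, as a highlight(text) call would if it is handed the whole document text, produces positions without any gap, so the positions seen at highlight time may differ from the indexed ones.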