Date: Tue, 28 Jun 2011 11:48:17 +0000 (UTC)
From: "Jahangir Anwari (JIRA)"
To: dev@lucene.apache.org
Message-ID: <772799380.1460.1309261697340.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (LUCENE-1824) FastVectorHighlighter truncates words at beginning and end of fragments

[
https://issues.apache.org/jira/browse/LUCENE-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056455#comment-13056455 ]

Jahangir Anwari commented on LUCENE-1824:
-----------------------------------------

Is there any chance of the patch being applied to the 3.x branch?

> FastVectorHighlighter truncates words at beginning and end of fragments
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-1824
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1824
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/highlighter
>         Environment: any
>            Reporter: Alex Vigdor
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: LUCENE-1824.patch
>
>
> FastVectorHighlighter does not take word boundaries into consideration
> when building fragments, so that in most cases the first and last word
> of a fragment are truncated. This makes the highlights less legible
> than they should be. I will attach a patch to BaseFragmentBuilder that
> resolves this by expanding the start and end boundaries of the fragment
> to the first whitespace character on either side of the fragment, or
> the beginning or end of the source text, whichever comes first. This
> significantly improves legibility, at the cost of returning a slightly
> larger number of characters than specified for the fragment size.
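
For readers who have not opened the attachment, the boundary expansion described in the issue can be sketched roughly as follows. This is not the code from LUCENE-1824.patch; the class name FragmentBoundaries and the method expandToWhitespace are hypothetical, and treating "whitespace" as whatever Character.isWhitespace accepts is an assumption. It only illustrates the idea of growing a fragment outward until it hits whitespace or the start or end of the source text.

```java
// Hypothetical illustration only; not taken from the attached patch.
final class FragmentBoundaries {

    /**
     * Expands [start, end) outward so the fragment begins and ends on a
     * whitespace boundary, or at the limits of the source text.
     * Returns a two-element array {newStart, newEnd}.
     */
    public static int[] expandToWhitespace(String source, int start, int end) {
        // Clamp the offsets so out-of-range input cannot cause an exception.
        int s = Math.max(0, Math.min(start, source.length()));
        int e = Math.max(s, Math.min(end, source.length()));

        // Move the start left until the preceding character is whitespace
        // or the beginning of the text is reached.
        while (s > 0 && !Character.isWhitespace(source.charAt(s - 1))) {
            s--;
        }
        // Move the end right until it sits on whitespace
        // or the end of the text is reached.
        while (e < source.length() && !Character.isWhitespace(source.charAt(e))) {
            e++;
        }
        return new int[] {s, e};
    }

    public static void main(String[] args) {
        String text = "FastVectorHighlighter truncates words at fragment edges";
        int[] b = expandToWhitespace(text, 25, 35);
        System.out.println(text.substring(25, 35));     // prints "ncates wor"
        System.out.println(text.substring(b[0], b[1])); // prints "truncates words"
    }
}
```

As the issue notes, the expanded fragment can be longer than the requested fragment size: here a 10-character request grows to 15 characters.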