Message-ID: <997b1bc81bd59ee219a957f192e9080d.squirrel@zenmail.co.za>
In-Reply-To: <20111211210102.GA15477@rectangular.com>
Date: Mon, 12 Dec 2011 13:17:45 +0200
From: "Henry C."
To: lucy-user@incubator.apache.org
Subject: Re: [lucy-user] Highlighting/excerpt on URLs

On Sun, December 11, 2011 23:01, Marvin Humphrey wrote:
> I'm not sure I understand exactly. Are you saying that if you've set
> excerpt_length to N, URLs which are over N characters will return an ellipsis
> rather than truncate?
>
>     # excerpt_length => 20
>     http://www.foo.com/           => http://www.foo.com/     # correct
>     http://www.foo.com/stuff.html => http://www.foo.com/…    # desired
>     http://www.foo.com/stuff.html => …                       # actual

Correct. If I explicitly specify excerpt_length:

    my $hl = Lucy::Highlight::Highlighter->new(
        searcher       => $searcher,
        query          => $query_compiler,
        field          => 'site',
        excerpt_length => 60,
    );

...and the field content is longer than 60 characters, then

    $hl->create_excerpt($hit);

returns "…" (shown as "..."). Content which is shorter than 60 characters
returns the highlighted excerpt as expected.

If I comment out "excerpt_length => 60," above, then it returns the full,
non-truncated excerpt with highlighting, as expected.

Some samples longer than 60 characters which return "…" when searching for
[iol.co.za] or [news24.com] (brackets are mine):

[www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220]
[http://www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSiteHome/0,,,00.html]
[www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSite_TravelClub/0,,,00.html]

The following return a double ellipsis ("……", shown as "......") when
searching for [adsl mweb.com]:

[http://www.mweb.co.za/helpcentre/ADSL/ADSLGeneralIdisagreewithyourusagereport.aspx]
[http://www.mweb.co.za/helpcentre/FrequentlyAskedQuestions/MWEBHelpCentreFAQsHowdoI/FAQHowdoIHowdoImigratemyADSL/tabid/661/Default.aspx]

Hope that helps.

--
Regards
Henry
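
For reference, the "desired" behaviour in the quoted example can be sketched in plain Perl. This is only an illustration of the expected output, not Lucy's implementation: truncate_with_ellipsis is a hypothetical helper that is not part of the Lucy API, and it assumes the ellipsis counts as one character towards the length limit.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lucy): return $text unchanged if it fits
# within $max_len characters; otherwise cut it so that the result, including
# a trailing ellipsis (U+2026), is exactly $max_len characters long.
sub truncate_with_ellipsis {
    my ($text, $max_len) = @_;
    return $text if length($text) <= $max_len;
    return ''    if $max_len < 1;
    return substr($text, 0, $max_len - 1) . "\x{2026}";
}

binmode(STDOUT, ':encoding(UTF-8)');

# The two URLs from the quoted example, with a limit of 20 characters.
for my $url ('http://www.foo.com/', 'http://www.foo.com/stuff.html') {
    print truncate_with_ellipsis($url, 20), "\n";
}
```

With a limit of 20, the first URL (19 characters) comes back whole and the second is cut to "http://www.foo.com/" followed by the ellipsis, which matches the "desired" line above. A real fix would presumably also have to keep the highlighted term inside the excerpt, which this sketch does not attempt.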