incubator-lucy-user mailing list archives

From "Henry C." <he...@cityweb.co.za>
Subject Re: [lucy-user] Highlighting/excerpt on URLs
Date Mon, 12 Dec 2011 11:17:45 GMT
On Sun, December 11, 2011 23:01, Marvin Humphrey wrote:
> I'm not sure I understand exactly.  Are you saying that if you've set
> excerpt_length to N, URLs which are over N characters will return an ellipsis
> rather than truncate?
>
> # excerpt_length => 20
> http://www.foo.com/            => http://www.foo.com/    # correct
> http://www.foo.com/stuff.html  => http://www.foo.com/…   # desired
> http://www.foo.com/stuff.html  => …                      # actual

Correct.

If I explicitly specify excerpt_length:

my $page_highlighter = Lucy::Highlight::Highlighter->new(
    searcher       => $searcher,
    query          => $query_compiler,
    field          => 'site',
    excerpt_length => 60,
);

...and the field content is longer than 60 characters, then

$page_highlighter->create_excerpt($hit);

returns '...'.

Content which is shorter than 60 characters returns the highlighted excerpt
as expected.

If I comment out "excerpt_length => 60," above, then it returns the full
non-truncated excerpt with highlighting as expected.

Some samples longer than 60 characters which return &#8230; ("..."), searching
for [iol.co.za] or [news24.com] (brackets are mine):

[www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220]
[http://www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSiteHome/0,,,00.html]
[www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSite_TravelClub/0,,,00.html]

The following return double-ellipses ("......" - &#8230;&#8230;), searching
for [adsl mweb.com]:

[http://www.mweb.co.za/helpcentre/ADSL/ADSLGeneralIdisagreewithyourusagereport.aspx]
[http://www.mweb.co.za/helpcentre/FrequentlyAskedQuestions/MWEBHelpCentreFAQsHowdoI/FAQHowdoIHowdoImigratemyADSL/tabid/661/Default.aspx]
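
In the meantime, a stopgap along these lines might work on the calling side:
if the excerpt comes back as nothing but ellipses, fall back to truncating the
raw field value. This is only a sketch. The helper name excerpt_or_truncate is
hypothetical and not part of Lucy, and I am assuming the bare excerpt is made
up solely of U+2026 characters, &#8230; entities, or "..." sequences.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Lucy: return the highlighter's excerpt
# unless it is empty or consists only of ellipses, in which case fall back
# to a plain truncation of the raw field value.
# Assumes $max >= 1 and that $raw is a defined string.
sub excerpt_or_truncate {
    my ( $excerpt, $raw, $max ) = @_;

    # Assumption: a "bare" excerpt is one or more U+2026 characters,
    # "&#8230;" entities, or "..." sequences, and nothing else.
    my $only_ellipses = qr/\A(?:\x{2026}|&#8230;|\.\.\.)+\z/;

    # A real excerpt is passed through unchanged.
    return $excerpt
        if defined $excerpt
        && length $excerpt
        && $excerpt !~ $only_ellipses;

    # Short values are returned untouched.
    return $raw if length($raw) <= $max;

    # Keep $max - 1 characters and spend the last one on the ellipsis.
    return substr( $raw, 0, $max - 1 ) . "\x{2026}";
}

# Reproduces the "desired" line from the quoted example (excerpt_length 20):
# the result is "http://www.foo.com/" followed by a single U+2026.
my $fallback = excerpt_or_truncate( "\x{2026}", 'http://www.foo.com/stuff.html', 20 );
```

Note that the fallback string carries no highlighting and is not HTML-escaped
here, so it is not a drop-in replacement for a proper excerpt.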


Hope that helps.

-- 
Regards
Henry

