incubator-lucy-user mailing list archives

From goran kent <gorank...@gmail.com>
Subject Re: [lucy-user] ClusterSearcher and excerpt/highlighting
Date Sun, 27 Nov 2011 08:43:58 GMT
On Sat, Nov 26, 2011 at 8:34 PM, Marvin Humphrey <marvin@rectangular.com> wrote:
> What problems are you seeing?
>
> You might try applying Nick Wellnhofer's patch for LUCY-182 to the
> ClusterSearcher node.

I'll give the patch a try.

Searching across the cluster with
ClusterSearcher+SortSpec+QueryParser+make_compiler for [safaris
vacation packages] yields (no highlighting at all):

"... Accommodation Home Hotels & Accommodation Tours & Safaris
Vacation Packages Travel Guides My Trip (0 Items) My Account
Reservations Help …"

So the excerpt is working well, with the relevant terms nice and
central, but there is no highlighting at all.  This seems to be quite
common: the first 10 results show hardly any highlighting.


Performing the same search locally with PolySearcher (r1203082, no
query compiler) yields:

"... Home Hotels & Accommodation Tours & <strong>Safaris</strong>
<strong>Vacation</strong> <strong>Packages</strong> Travel Guides My
Trip (0 Items) My Account  Reservations Help Advertise Contact Us
..."

Very similar excerpt, with correct highlighting.

I then modified the latter to also use the query compiler:

my $query_compiler = $parsed_query->make_compiler( searcher => $poly_searcher );
my $hits = $poly_searcher->hits(
    query      => $query_compiler,
    #query      => $parsed_query,
    sort_spec  => $sort_spec,
    offset     => 0,
    num_wanted => 10,
);

and highlighting goes a bit postal.

"x%" is an internal delimiter used to prevent phrase searches from
crossing borders.  It's not normally displayed, but I've retained it
here for illustrative purposes, since for some reason the highlighter
is focusing on it as well:

"... x% Home x% Hotels & Accommodation x% Tours &
<strong>Safaris</strong><strong> x%
</strong><strong>Vacation</strong><strong>
</strong><strong>Packages</strong> x% Travel Guides x% My Trip (0
Items) x% My Account  x%  Reservations Help  x%  Advertise  x%
Contact Us  x%..."

After cleanup, the excerpt above becomes:

"... Home Hotels & Accommodation Tours &
<strong>Safaris</strong><strong>
</strong><strong>Vacation</strong><strong>
</strong><strong>Packages</strong> Travel Guides My Trip (0 Items) My
Account  Reservations Help  Advertise  Contact Us ..."
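
A minimal sketch of what such a cleanup step could look like, assuming
the delimiter is the literal token "x%" and only ever appears set off by
whitespace (or directly before the trailing ellipsis), as in the excerpt
above.  The helper name strip_delimiter is hypothetical and not part of
Lucy; the actual cleanup code is not shown in this message.  Note that
this version also collapses the whitespace around each delimiter into a
single space.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Lucy API): remove the internal "x%"
# delimiter from a highlighted excerpt.
# Assumption: "x%" never occurs as part of real document text.
sub strip_delimiter {
    my ($excerpt) = @_;

    # Replace each delimiter, together with any whitespace around it,
    # with a single space so neighbouring words stay separated.
    $excerpt =~ s/\s*x%\s*/ /g;

    return $excerpt;
}

# Shortened version of the excerpt above, for illustration only.
my $raw = '... x% Tours & <strong>Safaris</strong><strong> x% </strong>'
        . '<strong>Vacation</strong> x% Travel Guides x%...';

print strip_delimiter($raw), "\n";
```

This only hides the delimiter; the stray <strong> </strong> pairs that
the highlighter wraps around it are left in place, which matches the
cleaned-up excerpt above.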

-- 
Regards,
gk
