From: goran kent
To: lucy-user@incubator.apache.org
Date: Sun, 27 Nov 2011 10:43:58 +0200
Subject: Re: [lucy-user] ClusterSearcher and excerpt/highlighting
In-Reply-To: <20111126183422.GB22818@rectangular.com>
References: <20111126183422.GB22818@rectangular.com>

On Sat, Nov 26, 2011 at 8:34 PM, Marvin Humphrey wrote:
> What problems are you seeing?
>
> You might try applying Nick Wellnhofer's patch for LUCY-182 to the
> ClusterSearcher node.

I'll give the patch a try.

Searching across the cluster with
ClusterSearcher+SortSpec+QueryParser+make_compiler for
[safaris vacation packages] yields (no highlighting at all):

"... Accommodation Home Hotels & Accommodation Tours & Safaris
Vacation Packages Travel Guides My Trip (0 Items) My Account
Reservations Help …"

So the excerpt is working well with the relevant terms nice and
central. However, no highlighting at all. This seems to be quite
common, with the first 10 results hardly showing any highlighting.

Performing the same search with PolySearcher (no query compiler)
locally (r1203082), yields:

"... Home Hotels & Accommodation Tours & Safaris Vacation Packages
Travel Guides My Trip (0 Items) My Account Reservations Help
Advertise Contact Us ..."

Very similar excerpt, with correct highlighting.

I then modified the latter to also use the query compiler:

    my $query_compiler = $parsed_query->make_compiler(
        searcher => $poly_searcher,
    );
    my $hits = $poly_searcher->hits(
        query      => $query_compiler,
        #query     => $parsed_query,
        sort_spec  => $sort_spec,
        offset     => 0,
        num_wanted => 10,
    );

and highlighting goes a bit postal.

"x%" is an internal delimiter used to prevent phrase searching
crossing borders (it's not displayed, but I've retained it here for
illustrative purposes since the highlighter is focusing on it as
well, for some reason):

"... x% Home x% Hotels & Accommodation x% Tours & Safaris x% Vacation
Packages x% Travel Guides x% My Trip (0 Items) x% My Account x%
Reservations Help x% Advertise x% Contact Us x%..."

After cleanup, the excerpt above becomes:

"... Home Hotels & Accommodation Tours & Safaris Vacation Packages
Travel Guides My Trip (0 Items) My Account Reservations Help
Advertise Contact Us ..."

-- 
Regards,
gk
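
P.S. For illustration, the cleanup step mentioned above could look roughly like the sketch below. It is only a sketch under assumptions: `strip_delimiter` is a hypothetical helper name (not part of Lucy), and it assumes the delimiter always appears as the standalone token "x%" in a plain-text excerpt with no highlight markup wrapped around it.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Lucy API): remove the internal "x%"
# delimiter from an excerpt before display.
sub strip_delimiter {
    my ($excerpt) = @_;
    # Replace each standalone "x%" token, together with the whitespace
    # around it, with a single space. The \b keeps words that merely
    # end in "x%" (such as "max%") from matching.
    $excerpt =~ s/\s*\bx%\s*/ /g;
    # Trim any whitespace left at either end.
    $excerpt =~ s/^\s+|\s+$//g;
    return $excerpt;
}

my $raw = '... x% Home x% Hotels & Accommodation x% Tours & Safaris x%...';
print strip_delimiter($raw), "\n";
```

If the highlighter has already wrapped the delimiter in markup, the pattern would need to account for those tags as well, so stripping is probably easier on the stored text than on the finished excerpt.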