incubator-lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] Collapsing search results based on a field
Date Sat, 17 Sep 2011 10:56:15 GMT
On Sat, Sep 17, 2011 at 07:46:12AM +0200, goran kent wrote:
> On Sat, Sep 17, 2011 at 12:56 AM, Marvin Humphrey
> <marvin@rectangular.com> wrote:
> > On Fri, Sep 16, 2011 at 03:00:21PM +0200, goran kent wrote:
> >> Any support for collapsing duplicate documents based on a field?
> >
> > I wrote a DedupingSearcher class for KinoSearch a while ago that did exactly
> > this, and I'd be happy to contribute it to the ASF.  It will take some
> > modernizing to get it compatible with Lucy, though.
> 
> Any possibility of squeezing that into your schedule?

Contributing it is no problem.  I won't get to the modernization myself, but
if someone wants to take it on I'll be happy to collaborate with them.

> > The algorithm is to rerun the search if there is not sufficient diversity in
> > the search results, adding exclusions to the query each time to suppress the
> > unwanted hits.
> 
> ouch, that doesn't sound good for performance.  Am I right?

Haven't measured.
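
For readers following the thread, the rerun-with-exclusions idea quoted above can be sketched roughly as follows. This is not the DedupingSearcher code and does not use the Lucy or KinoSearch API; `toy_search`, `deduping_search`, the `body` field, and the `max_rounds` cap are all hypothetical names made up for illustration, with a plain in-memory list standing in for an index.

```python
from typing import Dict, List, Set

Doc = Dict[str, str]


def toy_search(docs: List[Doc], term: str, excluded: Set[str],
               field: str, limit: int) -> List[Doc]:
    """Hypothetical stand-in for a real searcher.

    Returns up to `limit` docs whose body contains `term`, skipping any
    doc whose `field` value is in `excluded` (the "exclusions" added to
    the query on each rerun).
    """
    hits = [d for d in docs
            if term in d["body"] and d[field] not in excluded]
    return hits[:limit]


def deduping_search(docs: List[Doc], term: str, field: str,
                    wanted: int, max_rounds: int = 10) -> List[Doc]:
    """Collect up to `wanted` hits with distinct values of `field`.

    Each round reruns the search with every field value seen so far
    excluded, until there are enough distinct hits, the search comes
    back empty, or `max_rounds` (an assumed safety cap) is reached.
    """
    results: List[Doc] = []
    seen: Set[str] = set()
    for _ in range(max_rounds):
        hits = toy_search(docs, term, seen, field, wanted)
        if not hits:
            break  # nothing left to find
        for hit in hits:
            if hit[field] not in seen:
                seen.add(hit[field])  # suppress this value next round
                results.append(hit)
        if len(results) >= wanted:
            break  # sufficient diversity reached
    return results[:wanted]
```

In this sketch every rerun is a full search, so the cost grows with the number of rounds, which in turn depends on how heavily the top hits cluster on one field value. Whether that matters in practice is exactly the unmeasured question.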

Marvin Humphrey

