lucene-java-user mailing list archives

From Miles Barr <>
Subject Re: Removing similar documents from search results
Date Mon, 14 Mar 2005 18:08:59 GMT
Hi Dawid,

On Mon, 2005-03-14 at 18:55 +0100, Dawid Weiss wrote:
> I can imagine if you apply clustering to search results anyway then the 
> information about clusters can help you determine 'similar' results and 
> reorder the output list.

That's an interesting idea. How easy is it to 'tighten' the clustering
cones? Say we take a very narrow cone around each result, so that any
other documents within that cone can be considered similar enough and
hence not displayed. Then we'd take the document closest to the centre
of the cloud, make that the 'original' copy, and display it.

Or would that approach be too expensive to calculate for each search?
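
To make the idea concrete, here is a rough sketch of what I have in mind,
not a tested implementation. It assumes each hit has already been turned
into a term-weight vector of the same length (how those vectors are built
is left open), and it treats the 'cone' as a minimum cosine similarity to
the first hit in each group. The class name NearDuplicateFilter, the
method names and the minCosine parameter are all made up for
illustration; none of this is Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

public class NearDuplicateFilter {

    /** Cosine of the angle between two term-weight vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    /**
     * Groups hits (given in rank order) into narrow cones and returns, for
     * each cone, the index of the hit closest to the cone's centre.
     * minCosine is a hypothetical tuning knob: closer to 1.0 = narrower cone.
     */
    static List<Integer> representatives(double[][] docs, double minCosine) {
        List<List<Integer>> groups = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            List<Integer> home = null;
            // Join the first existing cone whose seed (its first hit) is close enough.
            for (List<Integer> g : groups) {
                if (cosine(docs[g.get(0)], docs[i]) >= minCosine) {
                    home = g;
                    break;
                }
            }
            if (home == null) {          // otherwise this hit seeds a new cone
                home = new ArrayList<>();
                groups.add(home);
            }
            home.add(i);
        }

        List<Integer> reps = new ArrayList<>();
        for (List<Integer> g : groups) {
            // Sum of the member vectors; cosine ignores scale, so no need to average.
            double[] centre = new double[docs[0].length];
            for (int idx : g) {
                for (int t = 0; t < centre.length; t++) {
                    centre[t] += docs[idx][t];
                }
            }
            int best = g.get(0);
            double bestSim = -1;
            for (int idx : g) {
                double s = cosine(docs[idx], centre);
                if (s > bestSim) {
                    bestSim = s;
                    best = idx;
                }
            }
            reps.add(best);              // the 'original' copy for this cone
        }
        return reps;
    }
}
```

Done this way, each hit is compared against one seed per cone, so the
work grows with (number of hits) x (number of cones) x (vector length).
For the first page or two of results that seems cheap; the cost I'm
unsure about is building the vectors in the first place.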

Miles Barr <>
Runtime Collective Ltd.
