lucene-dev mailing list archives

From Marvin Humphrey <>
Subject Re: Returning a minimum number of clusters
Date Mon, 01 May 2006 19:04:07 GMT

On May 1, 2006, at 10:21 AM, Grant Ingersoll wrote:

> You might be interested in the Carrot project, which has some  
> Lucene support.  I don't know if it solves your second problem, but  
> it already implements clustering and may allow you to get to an  
> answer for the second problem quicker.  I have, just recently,  
> started using it for a clustering task I am working on related to  
> search results.

I tracked down this demo...

From what I can tell, it doesn't use Lucene's term vectors.  I think
it should be possible to exploit those term vectors, perhaps yielding
a good result without having to build a summary for each document.
Dunno if the benefits justify the development effort.  :) I have to
implement host-deduping in a KinoSearch-based app anyway, though, so
I think I'll try this technique and see how well things work if I
extend it for use with non-keyword fields.
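
To make the term vector idea concrete, here is a rough sketch in Python. It is not Lucene or KinoSearch code, and it is not what Carrot does. It assumes each document's term vector is already available as a plain term-to-frequency mapping, and it swaps in a simple greedy pass with cosine similarity as the grouping rule. The names cosine_similarity, greedy_cluster, and the 0.5 threshold are all made up for illustration.

```python
import math
from collections import Counter


def cosine_similarity(tv_a, tv_b):
    """Cosine similarity between two sparse term-frequency vectors."""
    shared = set(tv_a) & set(tv_b)
    dot = sum(tv_a[t] * tv_b[t] for t in shared)
    norm_a = math.sqrt(sum(f * f for f in tv_a.values()))
    norm_b = math.sqrt(sum(f * f for f in tv_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(term_vectors, threshold=0.5):
    """Put each doc in the first cluster whose seed doc is similar enough.

    The seed is simply the first doc that opened the cluster; the
    threshold is an arbitrary illustrative value, not a tuned one.
    """
    clusters = []  # each cluster is a list of doc ids, seed first
    for doc_id, tv in term_vectors.items():
        for cluster in clusters:
            seed_tv = term_vectors[cluster[0]]
            if cosine_similarity(tv, seed_tv) >= threshold:
                cluster.append(doc_id)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([doc_id])
    return clusters


# Hypothetical term vectors standing in for what an index would store.
docs = {
    "a": Counter("lucene search index lucene".split()),
    "b": Counter("lucene index search engine".split()),
    "c": Counter("carrot cake recipe".split()),
}

print(greedy_cluster(docs))
```

Nothing here needs a per-document summary: the only input is the term frequencies themselves. Whether that holds up on non-keyword fields with large vocabularies is exactly the open question.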

Marvin Humphrey
Rectangular Research
