lucene-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: Returning a minimum number of clusters
Date Mon, 01 May 2006 19:04:07 GMT

On May 1, 2006, at 10:21 AM, Grant Ingersoll wrote:

> You might be interested in the Carrot project, which has some  
> Lucene support.  I don't know if it solves your second problem, but  
> it already implements clustering and may allow you to get to an  
> answer for the second problem quicker.  I have, just recently,  
> started using it for a clustering task I am working on related to  
> search results.

I tracked down this demo...

http://www.cs.put.poznan.pl/dweiss/tmp/carrot2-lucene.zip

From what I can tell, it doesn't use Lucene's term vectors.  I think
it should be possible to exploit those term vectors, perhaps yielding
a good result without having to build a summary for each document.
Dunno if the benefits justify the development effort.  :) I have to
implement host-deduping in a KinoSearch-based app anyway, though, so
I think I'll try this technique and see how well things work if I
extend it for use with non-keyword fields.
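
A rough sketch of the term-vector idea, for illustration only.  It
assumes each document's term vector is available as a plain mapping
from term to frequency; the names term_vector and cosine_similarity
are hypothetical and are not part of Lucene, KinoSearch, or Carrot.
Documents whose vectors score close to 1.0 would be candidates for
the same cluster, with no per-document summary needed.

```python
import math
from collections import Counter


def term_vector(tokens):
    """Build a sparse term-frequency vector (term -> count) from tokens.

    Stand-in for a stored term vector; a real index would supply this.
    """
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    if not a or not b:
        return 0.0
    # Walk the smaller vector when computing the dot product.
    if len(a) > len(b):
        a, b = b, a
    dot = sum(freq * b.get(term, 0) for term, freq in a.items())
    norm_a = math.sqrt(sum(f * f for f in a.values()))
    norm_b = math.sqrt(sum(f * f for f in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc1 = term_vector("lucene term vectors for clustering".split())
doc2 = term_vector("clustering with lucene term vectors".split())
doc3 = term_vector("unrelated words entirely".split())

similar = cosine_similarity(doc1, doc2)
different = cosine_similarity(doc1, doc3)
```

Raw term counts are the simplest weighting; a tf-idf weighting would
probably separate documents better, but that is an untested guess.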

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/



