lucene-java-user mailing list archives

From Dawid Weiss
Subject Re: Lucene search clusters
Date Wed, 08 Jun 2005 12:11:18 GMT

You should state your requirements clearly:

1. What data do you want to cluster? (the whole index or just the
search results)
2. What is the role of the extension? How is it going to be used?
(front-end clusters, query refinement, etc.)
3. Do you need the implementation itself or just an API for clustering
in the source code? I'd personally stick to the API; there are many
products out there that perform clustering. Carrot2 is no exception --
it has an excellent (in my humble opinion :) open source clustering
algorithm, Lingo, but there is also a commercial component that is much
faster and more customizable. You can start off with an open source
clusterer and switch to a commercial product later if you need higher
scalability or different functionality. I implemented such an API in
Nutch -- take a look at its source code for hints.
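
To make point 3 more concrete, here is a minimal sketch of what "sticking
to the API" could look like. All names below (Hit, Cluster, HitsClusterer,
FirstWordClusterer) are made up for illustration; this is not the actual
Nutch extension point and not the Carrot2 API. The front end would depend
only on the interface, so an open source clusterer could later be swapped
for a commercial one without touching the calling code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical search hit: only the fields a clusterer would need. */
final class Hit {
    final String title;
    final String snippet;

    Hit(String title, String snippet) {
        this.title = title;
        this.snippet = snippet;
    }
}

/** Hypothetical cluster: a label plus the hits assigned to it. */
final class Cluster {
    final String label;
    final List<Hit> hits = new ArrayList<>();

    Cluster(String label) {
        this.label = label;
    }
}

/** The only type the search front end depends on (assumed name). */
interface HitsClusterer {
    List<Cluster> cluster(List<Hit> hits);
}

/**
 * Placeholder implementation, NOT a real clustering algorithm:
 * it groups hits by the lowercased first word of the title, just to
 * show that implementations are interchangeable behind the interface.
 */
final class FirstWordClusterer implements HitsClusterer {
    @Override
    public List<Cluster> cluster(List<Hit> hits) {
        // LinkedHashMap keeps clusters in order of first appearance.
        Map<String, Cluster> byLabel = new LinkedHashMap<>();
        for (Hit hit : hits) {
            String trimmed = hit.title.trim();
            String label = trimmed.isEmpty()
                    ? "(untitled)"
                    : trimmed.split("\\s+")[0].toLowerCase(Locale.ROOT);
            byLabel.computeIfAbsent(label, Cluster::new).hits.add(hit);
        }
        return new ArrayList<>(byLabel.values());
    }
}
```

A real implementation (Lingo, or a commercial component) would sit behind
the same interface in place of FirstWordClusterer, and the code that
displays clusters would not need to change.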


Lorenzo wrote:
> I see some noise about clustering and Lucene, but I'm still waiting for 
> someone who will help me create a clustering extension.
> I know both Carrot2 and Weka (the first can be integrated with Lucene, the 
> latter may be - Falko, can you tell me?) but would like to write something 
> that could be included in the sandbox (or similar), with whichever 
> implementation we find best for a general-purpose environment. Maybe Carrot2 
> or another will be the best one (I really hope so, I'm a lazy coder ;-) ) and 
> then we will simply ask David to extend his code, but first I want to run 
> some tests.
> bye
> Lorenzo
