lucene-solr-user mailing list archives

From Stanislaw Osinski <stac...@gmail.com>
Subject Re: questions about Clustering
Date Sat, 23 May 2009 13:28:10 GMT
>
> Hmm, I saw the comment in ClusteringDocumentList.java of Carrot2:
>
> /*
>  * If you know what query generated the documents you're about to cluster, pass
>  * the query to the algorithm, which will usually increase clustering quality.
>  */
> attributes.put(AttributeNames.QUERY, "data mining");
>
> So I'm worried about clustering quality when Carrot2 gets the string
> "MatchAllDocsQuery".


The query is just a hint. Without it you should still be able to get
decent clusters (at least for English; we have not tested Carrot2 much with
Japanese).
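
Since the query is only a hint, one option is to leave it out whenever it is
not a real user query. Below is a minimal sketch of that idea, not actual
Carrot2 or Solr code: the class QueryHint, the method buildAttributes and the
key constant QUERY_KEY are hypothetical names (QUERY_KEY stands in for
AttributeNames.QUERY), and treating "*:*" and "MatchAllDocsQuery" as
"no query" is an assumption about how a match-all query might show up.

```java
import java.util.HashMap;
import java.util.Map;

public class QueryHint {
    // Hypothetical stand-in for Carrot2's AttributeNames.QUERY.
    static final String QUERY_KEY = "query";

    // Builds the attribute map, adding the query hint only when it looks
    // like a real user query. A null, blank, or match-all query is skipped.
    static Map<String, Object> buildAttributes(String userQuery) {
        Map<String, Object> attributes = new HashMap<>();
        if (userQuery == null) {
            return attributes;
        }
        String q = userQuery.trim();
        // Assumption: these two strings are how a match-all query may appear.
        boolean matchAll = q.equals("*:*") || q.equals("MatchAllDocsQuery");
        if (!q.isEmpty() && !matchAll) {
            attributes.put(QUERY_KEY, q);
        }
        return attributes;
    }

    public static void main(String[] args) {
        System.out.println(buildAttributes("data mining"));
        System.out.println(buildAttributes("*:*"));
    }
}
```

With no query in the map the algorithm simply clusters on the document
content alone.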

Cheers,

Staszek
