lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "ClusteringComponent" by StanislawOsinski
Date Tue, 08 Sep 2009 19:25:38 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by StanislawOsinski:
http://wiki.apache.org/solr/ClusteringComponent

The comment on the change is:
Updates to the Carrot2 clustering tuning procedure

------------------------------------------------------------------------------
  
   1. [http://project.carrot2.org/download.html Download Carrot2 Document Clustering Workbench] for your platform.
   2. [http://download.carrot2.org/head/manual/#section.getting-started.solr Attach] your Solr instance as a document source in the Workbench.
-  3. [http://download.carrot2.org/head/manual/#section.advanced-topics.fine-tuning Fine tune] stop words, stop labels and possibly [http://download.carrot2.org/head/manual/#section.component.lingo other attributes] of the clustering algorithms to suit your needs.
+  3. [http://download.carrot2.org/head/manual/#section.advanced-topics.fine-tuning.stop-words Fine tune stop words], [http://download.carrot2.org/head/manual/#section.advanced-topics.fine-tuning.stop-regexps stop labels] and possibly [http://download.carrot2.org/head/manual/#section.component.lingo other attributes] of the clustering algorithms to suit your needs.
+  4. To transfer the modified `stopwords.*` and `stoplabels.*` files to your Solr instance, simply make the modified files accessible in the classpath. If you're using the Solr example scripts, try putting the files in the `example/resources` folder (Jetty starter from `start.jar` adds all files from that folder to the classpath). Alternatively, you can overwrite the corresponding `stopwords.*` and `stoplabels.*` files directly in `carrot2-mini-*.jar`.
-  4. To transfer the modified stopwords.* and stoplabels.* files to your Solr instance, simply make the modified files accessible in the classpath. If you're using the Solr example scripts, try:
- 
- {{{
- java -cp <dir-with-your-modified-stopwords> -Dsolr.solr.home=./clustering/solr -jar start.jar
- }}}
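
The revised step 4 (putting the tuned files into `example/resources`) could be scripted roughly as follows. This is only a minimal sketch: the function name `install_lexical_resources` and both directory arguments are hypothetical and not part of Solr or Carrot2. It assumes the modified files sit together in one directory and follow the `stopwords.*` / `stoplabels.*` naming described in the revised page.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr or Carrot2): copy tuned lexical
# resources into a directory that ends up on Solr's classpath.
#   $1 - directory holding the modified stopwords.* / stoplabels.* files
#   $2 - target directory, e.g. example/resources for the Solr example setup
# Prints the number of files copied.
install_lexical_resources() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    copied=0
    for f in "$src_dir"/stopwords.* "$src_dir"/stoplabels.*; do
        # An unmatched glob stays literal; skip anything that is not a file.
        [ -f "$f" ] || continue
        cp "$f" "$dest_dir"/ || return 1
        copied=$((copied + 1))
    done
    echo "$copied"
}
```

With the Solr example layout, this might be called as `install_lexical_resources ./tuned example/resources`, where `./tuned` is a placeholder for wherever your modified files live. Solr would then need a restart to pick up the new files.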
  
  
  = Document Clustering =
