lucene-solr-user mailing list archives

From Wang Guangchen <guangchen...@gmail.com>
Subject Re: SOLR-769 clustering
Date Tue, 08 Sep 2009 18:20:30 GMT
Hi Staszek,

I tried your quick and dirty hack too. It didn't work either. Phrases like
"Carbon Atoms in the Group", which contain "in", still appear in my clustering labels.
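
One thing that may be worth ruling out here: as far as I understand the Carrot2 manual, each line of a stoplabels file is a regular expression that is compared against the whole cluster label, not against individual words inside it. If that assumption holds, a bare "in" entry would never suppress a longer label. The following is only a Python sketch of that matching rule, not Carrot2's actual code; the function name is hypothetical, and Java regex syntax differs from Python's in some details.

```python
import re

def is_stop_label(label, patterns):
    """Return True if the WHOLE label matches any pattern (case-insensitive).

    Assumption: this mirrors whole-label matching as described in the
    Carrot2 manual; it is an illustration, not the library's implementation.
    """
    return any(re.fullmatch(p, label, flags=re.IGNORECASE) for p in patterns)

label = "Carbon Atoms in the Group"

# A bare "in" entry only matches a label that is exactly "in".
print(is_stop_label(label, ["in"]))           # False

# Hypothetical pattern covering any label that contains the word "in".
print(is_stop_label(label, [r".*\bin\b.*"]))  # True
```

Under that assumption, an entry meant to remove labels containing "in" would need to be written as a pattern that covers the entire label.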

What I did:

1. Used the "jar uf carrot2-mini.jar stoplabels.en" command to replace the
stoplabels.en file.
2. Applied the clustering patch and recompiled Solr with the new
carrot2-mini.jar.
3. Deployed the new apache-solr-1.4-dev.war to Tomcat.
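
A quick way to check step 1 is to list what actually ended up inside the JAR. "jar uf" stores a file under the path given on the command line, so if the bundled stoplabels file lives in a subdirectory of the archive, updating from a different directory adds a second entry instead of replacing the original. The sketch below uses Python's standard zipfile module (a JAR is a ZIP archive); the helper names and the JAR location are assumptions for illustration.

```python
import os
import zipfile

def find_entries(jar_path, needle="stoplabels"):
    """Return the full in-archive paths of entries whose name contains needle."""
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist() if needle in name]

def read_entry(jar_path, entry_name):
    """Return the text of one archive entry, decoded as UTF-8."""
    with zipfile.ZipFile(jar_path) as jar:
        return jar.read(entry_name).decode("utf-8")

# Assumed location of the modified JAR; adjust to the real path.
jar_path = "carrot2-mini.jar"
if os.path.exists(jar_path):
    for name in find_entries(jar_path):
        print(name)
        # Show the start of each file to confirm it is the edited version.
        print(read_entry(jar_path, name)[:200])
```

If more than one stoplabels.en path is printed, or the printed content is not the edited version, the update did not replace the file that is actually loaded. The same check can be run against the copy of the JAR inside the deployed WAR.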

I am using a nightly build of Solr.

The following is my clustering configuration in solrconfig.xml, which is pretty standard:

<lst name="defaults">
  <str name="echoParams">explicit</str>
  <str name="clustering.engine">default</str>
  <bool name="clustering.results">true</bool>
  <str name="carrot.title">name</str>
  <str name="carrot.snippet">abstract</str>
  <str name="carrot.url">id</str>
  <bool name="carrot.produceSummary">true</bool>
  <bool name="carrot.outputSubClusters">false</bool>
</lst>


<searchComponent
    class="org.apache.solr.handler.clustering.ClusteringComponent"
    name="clustering">
  <lst name="engine">
    <str name="name">default</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
    <float name="carrot.lingo.threshold.clusterAssignment">0.150</float>
    <float name="carrot.lingo.threshold.candidateClusterThreshold">0.775</float>
  </lst>
</searchComponent>


Is there any extra setting that I need to configure in my
solrconfig.xml or schema.xml, or any special parameter that I need to
enable in solrconfig.xml?

Thanks,

-GC



On Tue, Sep 8, 2009 at 11:04 PM, Stanislaw Osinski <stachoo@gmail.com> wrote:

> Hi there,
>
> > I tried to apply the stoplabels following the instructions that you gave
> > in the Solr clustering wiki, but it didn't work.
> >
> > I am running the patched Solr on Tomcat. To enable the stop labels, I added
> > "-cp <dir-with-your-modified-stopwords>" to my system's CATALINA_OPTS. I
> > also tried changing the file name from stoplabels.txt to stoplabel.en. That
> > didn't work either.
> >
> > Then I also found that the Carrot2 manual page
> > (http://download.carrot2.org/head/manual/#section.advanced-topics.fine-tuning.stop-words)
> > suggests editing the stopwords files inside carrot2-core.jar. I
> > tried this, but it didn't work either.
> >
> > I am not sure what is wrong with my setup. Could it be caused by some sort
> > of caching?
> >
>
> A quick and dirty hack would be to simply replace the corresponding files
> (stoplabels.*) in carrot2-mini.jar.
>
> I know the packaging of the clustering contrib has changed a bit, so let me
> see how it currently works and correct the wiki if needed.
>
> Thanks,
>
> Staszek
>
