jackrabbit-dev mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: IndexingConfiguration jr 1.4 release, analyzing, searching and synonymprovider
Date Tue, 21 Aug 2007 14:29:54 GMT

> On 8/21/07, Ard Schrijvers <a.schrijvers@hippo.nl> wrote:
> > ...<analyzers>
> >         <analyzer name="fr" value="org.apache.lucene.analysis.fr.FrenchAnalyzer"/>
> >         <analyzer name="de" value="org.apache.lucene.analysis.de.GermanAnalyzer"/>
> > </analyzers>
> >
> > <index-rule nodeType="nt:unstructured">
> >         <property analyzer="fr">bode_fr</property>
> >         <property analyzer="de">bode_de</property>
> > </index-rule>...
> I prefer this variant, where you define reusable analyzer
> configurations.
> This starts to look similar to what Solr does; maybe the Solr
> schema.xml could give you some additional ideas?

Yes, that is indeed a good point, Bertrand. I happened to extend the Forrest SolrGenerator
for Cocoon a couple of weeks ago for a project :-) ...it was a very nice experience to have
everything up and running with Solr within 30 minutes. Very user friendly. The schema.xml [1] is very

> It is documented at
> http://wiki.apache.org/solr/SchemaXml, and some tutorials and articles
> are linked from http://wiki.apache.org/solr/SolrResources.

So would you like to see things like chaining of filters for indexing a property? I think that
shouldn't be too hard to implement. It would let people configure their analyzers to
be built from a set of tokenizers and filters, so they do not have to implement (reinvent) their
own analyzers. Certainly something like

<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>

would of course be easier than implementing synonyms/stopwords yourself.
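
To make the chaining idea concrete, here is a minimal sketch in plain Java of an analyzer assembled from a tokenizer plus an ordered list of filters. It does not use the Lucene, Solr or Jackrabbit APIs: the class FilterChainSketch and all of its methods are hypothetical names made up for illustration, the whitespace tokenizer is a stand-in, and the lowerCase() step only approximates what ignoreCase="true" does in the two filters above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: an "analyzer" built from a tokenizer and a chain of filters. */
public class FilterChainSketch {

    /** Splits on whitespace; a stand-in for a real tokenizer. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Lower-cases every token (assumed approximation of ignoreCase="true"). */
    public static UnaryOperator<List<String>> lowerCase() {
        return tokens -> {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                out.add(t.toLowerCase(Locale.ROOT));
            }
            return out;
        };
    }

    /** Keeps each token and appends its synonyms right after it. */
    public static UnaryOperator<List<String>> synonyms(Map<String, List<String>> table) {
        return tokens -> {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                out.add(t);
                out.addAll(table.getOrDefault(t, List.of()));
            }
            return out;
        };
    }

    /** Drops every token that appears in the stop word set. */
    public static UnaryOperator<List<String>> stopWords(Set<String> words) {
        return tokens -> {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                if (!words.contains(t)) {
                    out.add(t);
                }
            }
            return out;
        };
    }

    /** Tokenizes the text, then applies the filters in the order given. */
    @SafeVarargs
    public static List<String> analyze(String text, UnaryOperator<List<String>>... filters) {
        List<String> tokens = tokenize(text);
        for (UnaryOperator<List<String>> filter : filters) {
            tokens = filter.apply(tokens);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> out = analyze("The Quick car",
                lowerCase(),
                synonyms(Map.of("car", List.of("automobile"))),
                stopWords(Set.of("the")));
        System.out.println(out); // prints [quick, car, automobile]
    }
}
```

The point of the sketch is only that the order of the configured filters is the order in which they run, so a configuration format would have to preserve that order.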

Regards Ard

[1] http://svn.apache.org/viewvc/lucene/solr/trunk/example/solr/conf/schema.xml?content-type=text%2Fplain&view=co

> -Bertrand