lucene-dev mailing list archives

From Antony Bowesman <>
Subject Re: Analyzer thread safety; Stop words
Date Thu, 30 Nov 2006 06:24:27 GMT
Yonik Seeley wrote:
> On 11/29/06, Antony Bowesman <> wrote:
>> Yonik Seeley wrote:
> The GreekAnalyzer is just an example of how you can use existing
> Analyzers (as long as they have a default constructor), but it's not
> the recommended approach.
> TokenFilters are preferred over Analyzers... you can plug them
> together in any way you see fit to solve your analysis problem.  For
> Solr, an added bonus of using chains of filters is that Solr can
> "know" about the results after each filter and show you the results on
> an analysis web page (very useful for debugging).
> If I were to analyze greek text, I might do something like this:
> <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
>      <analyzer>
>          <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>          <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>          <filter class="solr.LowerCaseFilterFactory"/>
>          <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
>          <filter class="solr.SnowballPorterFilterFactory" language="Greek"/>
>      </analyzer>
> </fieldtype>
> If you try to put everything in Analyzer constructors, you get
> combinatorial explosion.
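
To illustrate the composition point in the quoted text, here is a minimal sketch in plain Java. It deliberately does not use the Lucene or Solr APIs: `FilterChainSketch`, `TokenStage`, `whitespaceTokenize`, `lowerCase`, `stop`, and `analyze` are all hypothetical names invented for this example, and the stages work on whole token lists rather than streams. The idea it shows is only that small, independent stages can be combined in any order, so that no single class needs a constructor variant for every combination.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

public class FilterChainSketch {
    // Hypothetical stand-in for a token filter: a function from a token
    // list to a token list. This is NOT Lucene's TokenFilter class.
    interface TokenStage extends UnaryOperator<List<String>> {}

    // Split the input on runs of whitespace, skipping empty pieces.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stage that lower-cases every token.
    static TokenStage lowerCase() {
        return tokens -> {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                out.add(t.toLowerCase(Locale.ROOT));
            }
            return out;
        };
    }

    // Stage that drops any token found in the given stop-word set.
    static TokenStage stop(Set<String> stopWords) {
        return tokens -> {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                if (!stopWords.contains(t)) {
                    out.add(t);
                }
            }
            return out;
        };
    }

    // Tokenize, then apply each stage in the order given.
    static List<String> analyze(String text, List<TokenStage> chain) {
        List<String> tokens = whitespaceTokenize(text);
        for (TokenStage stage : chain) {
            tokens = stage.apply(tokens);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The order matters: lower-casing first lets "The" match the stop word "the".
        List<TokenStage> chain = Arrays.asList(lowerCase(), stop(Set.of("the", "a")));
        System.out.println(analyze("The quick brown Fox", chain));
    }
}
```

Adding a stemming or synonym stage in this sketch would mean writing one more small function and adding it to the list, which is roughly the role the `<filter>` elements play in the quoted fieldtype definition.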

I guess you would use methods rather than, as you say, getting into constructor
hell.  Anyway, I'll have a deeper look at the Solr stuff when I get to phase 2.
Right now, I've gone as far with analysis as I need to, but I would like to
get better configuration than I've currently got.  I know it will come back to

Thanks for your comments, Yonik.
