lucene-solr-user mailing list archives

From Leonardo Dias <leona...@catho.com.br>
Subject Re: Snowball and protected words
Date Thu, 19 Feb 2009 13:37:38 GMT
Hello Walter.

We believe this kind of thing is better managed by a content team that 
works with user feedback. Building a new stemmer every time we find a 
word that brings back irrelevant results would be costly. It would be 
much better to have a simple interface that lets anyone define the 
protected words we need, based on user feedback.

Erik just said it wouldn't be hard to bring that functionality to 
Snowball. Erik, do you know what needs to be done to achieve that? Are 
there any plans for it? I'm sure I'm not the only one who has this 
problem when using Solr with Portuguese (or any other language).
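
To make the request concrete, here is a minimal sketch of the behavior we are after, not of how Solr or Snowball implement anything. The class name ProtectedStemmer, its methods, and the stemmer function passed in are all hypothetical, and I am assuming a protwords.txt-style list with one word per line and "#" starting a comment line.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical wrapper: skips stemming for any token in the protected set.
class ProtectedStemmer {
    private final Set<String> protectedWords;
    private final UnaryOperator<String> stemmer;

    // 'stemmer' stands in for whatever stemming function is in use
    // (for example a Snowball stemmer); it is an assumption of this sketch.
    public ProtectedStemmer(Set<String> protectedWords, UnaryOperator<String> stemmer) {
        this.protectedWords = new HashSet<>(protectedWords);
        this.stemmer = stemmer;
    }

    // Assumed file format: one word per line, blank lines and
    // lines starting with '#' are ignored.
    public static Set<String> parseProtectedWords(List<String> lines) {
        Set<String> words = new HashSet<>();
        for (String line : lines) {
            String word = line.trim();
            if (!word.isEmpty() && !word.startsWith("#")) {
                words.add(word);
            }
        }
        return words;
    }

    // Protected tokens pass through untouched; everything else is stemmed.
    public String stem(String token) {
        return protectedWords.contains(token) ? token : stemmer.apply(token);
    }
}
```

With a toy stemmer that only strips a trailing "s", stem("vagas") gives "vaga", while a protected "campos" comes back unchanged. The point is that the content team would only edit the word list and nobody would have to regenerate the stemmer.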

Thank you very much for your help,

Leonardo.

Walter Underwood wrote:
> You can define exceptions in the Snowball language and generate
> a new stemmer. See the examples here:
>
> http://snowball.tartarus.org/algorithms/english/stemmer.html
>
> wunder
>
> On 2/18/09 9:56 AM, "Erik Hatcher" <erik@ehatchersolutions.com> wrote:
>
>   
>> On Feb 18, 2009, at 12:40 PM, Leonardo Dias wrote:
>>     
>>> Is there a way to make the snowball algorithm work with a
>>> protwords.txt file?
>>>       
>> Currently, and unfortunately, no - the protected words feature is not
>> available in the SnowballPorterFilterFactory. It wouldn't take much
>> effort to bring that capability across though.
>>
>> Erik
>>     
>
>
>
>   



