lucene-java-user mailing list archives

From "Pete Lewis" <p...@uptima.co.uk>
Subject Re: PorterStemfilter
Date Tue, 14 Sep 2004 18:40:32 GMT
Hi George

There are lots of problems with Porter stemmers: they are not great for
English and get worse for other languages.

If you look at:

http://snowball.tartarus.org/demo.php

You'll see the Snowball demo - this is basically another instance of Porter.

If you enter "print" and "printer" and submit, the results will be
"print" and "printer", showing that the Porter-stemmed versions are
the same as the originals.  They are therefore distinct terms in their
own right, and a search on one will not hit the other.
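
As for why the "er" is kept: in Porter's published algorithm the "er"
suffix is only removed (step 4) when the remaining stem has measure
m > 1, where m counts the vowel-consonant sequences in the stem.
"print" has m = 1, so "printer" is left alone. Below is a minimal
sketch of just that one rule, not the Lucene PorterStemmer code; the
class and method names are made up for illustration.

```java
public class PorterMeasureSketch {

    // Porter's consonant test: a, e, i, o, u are vowels; 'y' counts as a
    // vowel only when the letter before it is a consonant.
    static boolean isConsonant(String w, int i) {
        switch (w.charAt(i)) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                return false;
            case 'y':
                return i == 0 || !isConsonant(w, i - 1);
            default:
                return true;
        }
    }

    // Measure m: the number of vowel-run-then-consonant-run pairs,
    // i.e. the m in Porter's form [C](VC)^m[V].
    static int measure(String stem) {
        int m = 0;
        boolean prevVowel = false;
        for (int i = 0; i < stem.length(); i++) {
            boolean cons = isConsonant(stem, i);
            if (cons && prevVowel) {
                m++; // a vowel run has just been closed by a consonant
            }
            prevVowel = !cons;
        }
        return m;
    }

    // Only the "er" rule of step 4: (m > 1) ER -> (nothing).
    // All other Porter steps are deliberately left out of this sketch.
    static String stripEr(String word) {
        if (word.endsWith("er")) {
            String stem = word.substring(0, word.length() - 2);
            if (measure(stem) > 1) {
                return stem;
            }
        }
        return word;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"print", "printer", "airliner"}) {
            System.out.println(w + " -> " + stripEr(w));
        }
    }
}
```

Here "airliner" (Porter's own example for this rule) loses its "er"
because "airlin" has m = 2, while "printer" is returned unchanged.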

Cheers

Pete Lewis

----- Original Message ----- 
From: "Honey George" <honey_george@yahoo.com>
To: <lucene-user@jakarta.apache.org>
Sent: Tuesday, September 14, 2004 6:57 PM
Subject: PorterStemfilter


> Hi,
>  This might be more of a question related to the
> PorterStemmer algorithm than to Lucene, but
> if anyone has the knowledge please share.
>
> I am using the PorterStemFilter that comes with Lucene
> and it turns out that searching for the word 'printer'
> does not return a document containing the text
> 'print'. To narrow down the problem, I have tested the
> PorterStemFilter in a standalone program and it turns
> out that the stem of 'printer' is 'printer' and not
> 'print'. That is, 'printer' is not treated as 'print' +
> 'er'; the whole word is the stem. Can somebody
> explain this behavior?
>
> Thanks & Regards,
>    George
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
