lucene-dev mailing list archives

From Murray Altheim <m.alth...@open.ac.uk>
Subject Re: Question about PorterStemFilter class
Date Fri, 29 Oct 2004 17:56:56 GMT
Erik Hatcher wrote:
> On Oct 29, 2004, at 10:39 AM, Murray Altheim wrote:
> 
>>In short, I *think* I'm using a newer version of the code than the
>>one in the repository, plus I've cleaned it up.
> 
> 
> Can you provide some tests that show differences in how it stems 
> between yours and the built-in one?

Certainly, but the test I used is the same one that Martin Porter
provides: an input file and an output file (both are just long lists
of words). Running my modified version on the input file produces
output identical to the provided output file. I made no algorithmic
changes to the code, only formatting and syntax-choice changes to
better conform to Java coding guidelines, plus the aforementioned
predeclaration of final Strings, which should affect nothing except
performance.

The test files are identical to the ones on Martin Porter's web
page:

    http://www.tartarus.org/~martin/PorterStemmer/index.html
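
The comparison described above could be sketched roughly as follows.
This is not the actual test harness, only an illustration: the class
name StemmerComparison and the method findMismatches are hypothetical,
and the stemmer is passed in as a plain function so that nothing about
the Lucene or Porter APIs is assumed. It uses modern Java syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical harness: checks a stemmer against an expected word list. */
final class StemmerComparison {

    /**
     * Stems each input word and returns one message for every position
     * where the result differs from the expected word on the same line.
     * An empty list means the outputs are identical.
     */
    static List<String> findMismatches(List<String> inputs,
                                       List<String> expected,
                                       UnaryOperator<String> stemmer) {
        if (inputs.size() != expected.size()) {
            throw new IllegalArgumentException(
                "word lists differ in length: "
                    + inputs.size() + " vs " + expected.size());
        }
        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            String actual = stemmer.apply(inputs.get(i));
            if (!actual.equals(expected.get(i))) {
                // Report 1-based line numbers, as a text editor would.
                mismatches.add("line " + (i + 1) + ": " + inputs.get(i)
                    + " -> " + actual
                    + " (expected " + expected.get(i) + ")");
            }
        }
        return mismatches;
    }
}
```

The two word lists could be loaded with java.nio.file.Files.readAllLines
from the input and output files, and the same check run once with the
built-in stemmer and once with the modified one.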

> The PorterStemFilter is not used by any built-in Analyzers, so I 
> actually think we should move it out to the Analyzers Sandbox area or 
> deprecate it in favor of the Snowball stemmer.  Thoughts?

None. As I mentioned, I'm new to this project and am not familiar
with the advantages of the Snowball stemmer. In reading through the
pages on SourceForge, e.g.,

    http://snowball.tartarus.org/texts/introduction.html

there are apparently pros and cons. But for myself, I'd leave it up
to those with more history in this project to make these kinds of
decisions.

Murray

......................................................................
Murray Altheim                    http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK               .

    [International terrorism] is a fantasy that has been exaggerated
    and distorted by politicians. It is a dark illusion that has
    spread unquestioned through governments around the world, the
    security services, and the international media. In an age when
    all the grand ideas have lost credibility, fear of a phantom
    enemy is all the politicians have left to maintain their power.

    The making of the terror myth, The Guardian
    http://www.guardian.co.uk/terrorism/story/0,12780,1327904,00.html


