lucene-dev mailing list archives

From Murray Altheim <m.alth...@open.ac.uk>
Subject Re: Question about PorterStemFilter class
Date Fri, 29 Oct 2004 13:20:34 GMT
Erik Hatcher wrote:
> On Oct 29, 2004, at 4:04 AM, PROYECTA.Fernandez Garcia, Ivan wrote:
> 
>>	We are using it in our Analyzer class and we have the following
>>questions:
>>		1. Why does it change the 'y' character to 'i' during
>>parsing?
>>		    Example: study -> studi
> 
> 
> That's what stemmers do.  This allows queries for "study" and "studies" 
> to match the same documents, for example.
> 
> 
>>		2. In our case, Lucene finds 50 hits but only the first one
>>is shown.
>>		    If I comment out new PorterStemFilter(ts) in our Analyzer
>>class, all 50 hits are shown. Why?
> 
> You haven't provided enough information.   Please provide a simple 
> short example that shows one document (that currently does not get 
> found) being indexed along with the code for your analyzer, along with 
> a sample query that should match but doesn't.
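
To make the "study" -> "studi" point above concrete, here is a minimal sketch of only two Porter rules (step 1a for plurals and step 1c for a trailing 'y'). It is a simplified illustration, not the PorterStemmer class shipped with Lucene and not the full algorithm (steps 1b and 2 through 5 are omitted); the class name MiniStemSketch and its method names are made up for this example.

```java
import java.util.Locale;

// Simplified illustration of two Porter rules only (steps 1a and 1c).
// Hypothetical class: NOT Lucene's PorterStemmer and NOT the full algorithm.
public class MiniStemSketch {

    // Porter's vowel definition: a, e, i, o, u; 'y' counts as a vowel
    // only when the letter before it is a consonant.
    static boolean isVowel(String w, int i) {
        char c = w.charAt(i);
        if ("aeiou".indexOf(c) >= 0) {
            return true;
        }
        return c == 'y' && i > 0 && !isVowel(w, i - 1);
    }

    // True if any of the first 'end' characters of w is a vowel.
    static boolean containsVowel(String w, int end) {
        for (int i = 0; i < end; i++) {
            if (isVowel(w, i)) {
                return true;
            }
        }
        return false;
    }

    // Step 1a: sses -> ss, ies -> i, ss -> ss, s -> (removed).
    static String step1a(String w) {
        if (w.endsWith("sses") || w.endsWith("ies")) {
            return w.substring(0, w.length() - 2);
        }
        if (w.endsWith("ss")) {
            return w;
        }
        if (w.endsWith("s")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Step 1c: a trailing 'y' becomes 'i' when the rest of the word has a vowel.
    static String step1c(String w) {
        int last = w.length() - 1;
        if (last > 0 && w.charAt(last) == 'y' && containsVowel(w, last)) {
            return w.substring(0, last) + "i";
        }
        return w;
    }

    static String stem(String word) {
        return step1c(step1a(word.toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        for (String w : new String[] {"study", "studies", "sky"}) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

Both "study" and "studies" come out as "studi", which is why the indexed term and the query term can match, while "sky" is left alone because nothing before the 'y' is a vowel.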

Erik,

I joined the mailing list just this week, and on this topic I thought
I'd mention that I've rewritten the PorterStemmer Java class, cleaning
up whitespace and predeclaring all the Strings for better performance.
It passes the file-in file-out test provided by Martin Porter (in other
words, no change from the existing algorithm). The source for mine was
taken from his site -- I'm not sure of the origin of the one in Lucene.
I could also add an Apache license to the top.
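
For anyone who wants to repeat that kind of check, a file-in file-out comparison could look roughly like the sketch below. It assumes two plain-text files with one word per line, an input vocabulary and the expected stems, passed as command-line arguments; the class name StemRegressionSketch is hypothetical, the identity function in main is only a stand-in for the stemmer under test, and the code uses current Java syntax rather than anything from the Lucene tree.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical harness: compares a stemmer's output with an expected-output file.
public class StemRegressionSketch {

    // Returns one message per word whose stem differs from the expected stem.
    static List<String> mismatches(List<String> words,
                                   List<String> expected,
                                   UnaryOperator<String> stemmer) {
        if (words.size() != expected.size()) {
            throw new IllegalArgumentException("input and expected lists differ in length");
        }
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String actual = stemmer.apply(words.get(i));
            if (!actual.equals(expected.get(i))) {
                problems.add(words.get(i) + ": expected " + expected.get(i)
                        + ", got " + actual);
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: StemRegressionSketch <vocabulary-file> <expected-file>");
            return;
        }
        List<String> words = Files.readAllLines(Paths.get(args[0]));
        List<String> expected = Files.readAllLines(Paths.get(args[1]));

        // Stand-in only: replace with a call to the stemmer being tested.
        UnaryOperator<String> stemmer = w -> w;

        List<String> problems = mismatches(words, expected, stemmer);
        problems.forEach(System.out::println);
        System.out.println(problems.size() + " mismatches out of " + words.size() + " words");
    }
}
```

Zero mismatches over the whole vocabulary is the "no change from the existing algorithm" result described above.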

What would I need to do to contribute this file? Just fill out the
ASF IP form and then commit the file in CVS?

Thanks,

Murray

......................................................................
Murray Altheim                    http://kmi.open.ac.uk/people/murray/
Knowledge Media Institute
The Open University, Milton Keynes, Bucks, MK7 6AA, UK               .

    "[International terrorism] is a fantasy that has been exaggerated
    and distorted by politicians. It is a dark illusion that has
    spread unquestioned through governments around the world, the
    security services, and the international media. In an age when
    all the grand ideas have lost credibility, fear of a phantom
    enemy is all the politicians have left to maintain their power."

    The making of the terror myth, The Guardian
    http://www.guardian.co.uk/terrorism/story/0,12780,1327904,00.html
