lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Question about PorterStemFilter class
Date Fri, 29 Oct 2004 14:25:47 GMT

On Oct 29, 2004, at 10:09 AM, Otis Gospodnetic wrote:
> You should open a bug entry in Bugzilla and then attach your code to
> it, with ASL on top.

However, there is a PorterStemFilter built into Lucene.  Please compare 
with that.

	Erik


>
> Thanks,
> Otis
>
> --- Murray Altheim <m.altheim@open.ac.uk> wrote:
>
>> Erik Hatcher wrote:
>>> On Oct 29, 2004, at 4:04 AM, PROYECTA.Fernandez Garcia, Ivan wrote:
>>>
>>>> 	We are using it in our Analyzer class and we have the following
>>>> questions:
>>>> 		1. Why does it change the character 'y' to 'i' when
>>>> parsing?
>>>> 		    Example: study -> studi
>>>
>>>
>>> That's what stemmers do.  This allows queries for "study" and
>>> "studies" to match the same documents, for example.
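
To make the "study" / "studies" point concrete, here is a minimal Java
sketch of just two Porter rules: step 1a (plural stripping) and step 1c
('y' -> 'i' when the stem contains a vowel). It is not Lucene's
PorterStemmer and it skips every other step of the algorithm. The class
name StemSketch and its methods are made up for this illustration, input
is assumed to be lowercase, and 'y' is never counted as a vowel, which is
a simplification of the real rule.

```java
// Illustrative only: two rules from the Porter algorithm, NOT Lucene's PorterStemmer.
// StemSketch and its method names are hypothetical.
final class StemSketch {

    // Simplification: only a, e, i, o, u count as vowels here.
    static boolean containsVowel(String s) {
        for (int i = 0; i < s.length(); i++) {
            if ("aeiou".indexOf(s.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    // Step 1a: strip plural endings.
    static String step1a(String w) {
        if (w.endsWith("sses")) return w.substring(0, w.length() - 2); // caresses -> caress
        if (w.endsWith("ies"))  return w.substring(0, w.length() - 2); // studies  -> studi
        if (w.endsWith("ss"))   return w;                              // caress   -> caress
        if (w.endsWith("s"))    return w.substring(0, w.length() - 1); // cats     -> cat
        return w;
    }

    // Step 1c: a trailing 'y' becomes 'i' if the rest of the word contains a vowel.
    static String step1c(String w) {
        if (w.endsWith("y")) {
            String stem = w.substring(0, w.length() - 1);
            if (containsVowel(stem)) {
                return stem + "i";
            }
        }
        return w;
    }

    // Apply the two rules in order; input is assumed to be lowercase.
    static String stem(String word) {
        return step1c(step1a(word));
    }

    public static void main(String[] args) {
        for (String w : new String[] {"study", "studies", "sky"}) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

Both "study" and "studies" come out as "studi", so the index term and the
query term agree as long as the same analyzer is used at index time and at
query time. "sky" is left alone because "sk" contains no vowel.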
>>>
>>>
>>>> 		2. In our case, Lucene finds 50 hits but only the first one
>>>> is shown.
>>>> 		    If I comment out new PorterStemFilter(ts) in our Analyzer
>>>> class, all 50 hits are shown. Why?
>>>
>>> You haven't provided enough information.   Please provide a simple
>>> short example that shows one document (that currently does not get
>>> found) being indexed along with the code for your analyzer, along
>>> with a sample query that should match but doesn't.
>>
>> Erik,
>>
>> I just this week joined the mailing list, and on this topic thought
>> I'd mention that I've rewritten the PorterStemmer Java class, cleaning
>> up whitespace and predeclaring all the Strings for better performance.
>> It passes the file-in file-out test provided by Martin Porter (iow,
>> no change from the existing algorithm). The source for mine was taken
>> from his site -- I'm not sure of the origin of the one in Lucene. I
>> could also add an Apache license to the top.
>>
>> What would I need to do to contribute this file? Just fill out the
>> ASF IP form and then commit the file in CVS?
>>
>> Thanks,
>>
>> Murray
>>
>>
>> ......................................................................
>> Murray Altheim
>> http://kmi.open.ac.uk/people/murray/
>> Knowledge Media Institute
>> The Open University, Milton Keynes, Bucks, MK7 6AA, UK
>>
>>
>>     "[International terrorism] is a fantasy that has been exaggerated
>>     and distorted by politicians. It is a dark illusion that has
>>     spread unquestioned through governments around the world, the
>>     security services, and the international media. In an age when
>>     all the grand ideas have lost credibility, fear of a phantom
>>     enemy is all the politicians have left to maintain their power."
>>
>>     The making of the terror myth, The Guardian
>>     http://www.guardian.co.uk/terrorism/story/0,12780,1327904,00.html
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>>
>>
>
>



