lucene-java-user mailing list archives

From Jamie <ja...@stimulussoft.com>
Subject Re: Search query problem
Date Sat, 09 Jan 2010 09:00:50 GMT
Hi All

Is there another stemmer we can use that is perhaps not as aggressive as 
the Porter Stemmer? That is, the stemming could remove "ing" and "er" 
suffixes, but not do something as significant as converting "Lowe's" to "Low".
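
To make the request concrete, here is a minimal sketch of the kind of light stemming meant above. It is plain Java and not a Lucene class: the name LightStemmer, the length thresholds, and the rule of leaving any token containing an apostrophe untouched are all assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical light stemmer: strips only "ing" and "er" suffixes.
// Not part of Lucene; the thresholds below are illustrative assumptions.
class LightStemmer {

    static String stem(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        // Assumption: tokens with an apostrophe (possessives such as
        // "lowe's") are left alone rather than reduced further.
        if (t.indexOf('\'') >= 0) {
            return t;
        }
        // Only strip a suffix when a reasonably long stem remains.
        if (t.length() > 5 && t.endsWith("ing")) {
            return t.substring(0, t.length() - 3);
        }
        if (t.length() > 4 && t.endsWith("er")) {
            return t.substring(0, t.length() - 2);
        }
        return t;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"Lowe's", "searching", "indexer", "low"}) {
            System.out.println(w + " -> " + stem(w));
        }
    }
}
```

With these rules "searching" becomes "search" and "indexer" becomes "index", while "lowe's" and "low" stay distinct terms. A rule set this small will still over-stem some words (anything that merely happens to end in "er"), so it only illustrates the idea.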

Thanks

Jamie

Will Murnane wrote:
> On Fri, Jan 8, 2010 at 16:27, Jamie <jamie@stimulussoft.com> wrote:
>   
>> Hi Ian / Will
>>
>> Thanks. Surely, the Porter Stemmer should not stem proper nouns. i.e. it
>> could check the capitalization of the first letter of a word and whether or
>> not the word is the start of a sentence. If the word is capitalized and
>> does not start a sentence, it could choose not to apply any stemming. Or am
>> I completely out of whack?
>>     
> Look again: you're downcasing the terms before the Porter filter ever
> sees them (which is, AIUI, necessary).  You might do well to combine
> the tokenizing and downcasing step with some heuristic to find proper
> nouns and not downcase or stem them.
>
> Will
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>   
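
Will's suggestion of combining the tokenizing and downcasing step with a proper-noun heuristic could be sketched roughly as follows. This is plain Java, not a Lucene Tokenizer or TokenFilter: the class name ProperNounAwareNormalizer, the whitespace tokenization, and the "capitalized and not at the start of a sentence" rule are assumptions for illustration, and the stemmer is passed in as a function so that any real stemmer could be substituted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical sketch: keep likely proper nouns as-is, and downcase and
// stem everything else. Not a Lucene API.
class ProperNounAwareNormalizer {

    static List<String> normalize(String text, UnaryOperator<String> stemmer) {
        List<String> out = new ArrayList<>();
        boolean sentenceStart = true;
        for (String raw : text.split("\\s+")) {
            if (raw.isEmpty()) {
                continue;
            }
            // Trim punctuation at both ends, keeping inner apostrophes.
            String word = raw.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "");
            if (!word.isEmpty()) {
                boolean capitalized = Character.isUpperCase(word.charAt(0));
                if (capitalized && !sentenceStart) {
                    // Heuristic: treat as a proper noun, so no downcasing
                    // and no stemming.
                    out.add(word);
                } else {
                    out.add(stemmer.apply(word.toLowerCase(Locale.ROOT)));
                }
            }
            // The next token starts a sentence if this one ended one.
            sentenceStart = raw.endsWith(".") || raw.endsWith("!") || raw.endsWith("?");
        }
        return out;
    }

    public static void main(String[] args) {
        // Identity function stands in for a real stemmer.
        System.out.println(normalize("We shop at Lowe's. Shopping is fun.", s -> s));
    }
}
```

Two caveats with this heuristic: a proper noun at the start of a sentence is still downcased and stemmed, and the preserved terms keep their capitalization, so queries would need to go through the same step for "Lowe's" to match.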


-- 
Stimulus Software - MailArchiva
Email Archiving And Compliance
USA Tel: +1-713-343-8824 ext 100
UK Tel: +44-20-80991035 ext 100
Email:  jamie@stimulussoft.com
Web: http://www.mailarchiva.com
To receive MailArchiva Enterprise Edition product announcements, send a message to: <mailarchiva-enterprise-edition-subscribe@stimulussoft.com>

