lucene-java-user mailing list archives

From Will Murnane <will.murn...@gmail.com>
Subject Re: Search query problem
Date Fri, 08 Jan 2010 21:39:07 GMT
On Fri, Jan 8, 2010 at 16:27, Jamie <jamie@stimulussoft.com> wrote:
> Hi Ian / Will
>
> Thanks. Surely the Porter stemmer should not stem proper nouns. That is, it
> could check the capitalization of the first letter of a word and whether or
> not the word is at the start of a sentence. If so, it could choose not to
> apply any stemming. Or am I completely out of whack?
Look again: you're downcasing the terms before the Porter filter ever
sees them (which is, AIUI, necessary), so by the time it runs the
capitalization is already gone.  You might do well to combine the
tokenizing and downcasing steps with some heuristic to find proper
nouns, and then neither downcase nor stem them.
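
Here is a rough sketch of what I mean, in plain Java with no Lucene
classes, so treat it as an illustration and not as analyzer code you can
drop in.  All the names here (ProperNounAwareNormalizer, toyStem,
looksLikeProperNoun) are made up for this example, and toyStem is a
trivial stand-in that only strips a trailing "s"; it is NOT the Porter
algorithm.  The heuristic is the one you describe: a capitalized word
that does not start a sentence is assumed to be a proper noun and is
passed through untouched; everything else is downcased and then stemmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative sketch only: not Lucene code, and not the real Porter stemmer. */
final class ProperNounAwareNormalizer {

    /** Hypothetical stand-in for a real stemmer: drops one trailing "s" from longer words. */
    static String toyStem(String term) {
        if (term.length() > 3 && term.endsWith("s") && !term.endsWith("ss")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    /** Heuristic (an assumption, not a rule): capitalized and not first in its sentence. */
    static boolean looksLikeProperNoun(String word, boolean sentenceStart) {
        return !sentenceStart && !word.isEmpty() && Character.isUpperCase(word.charAt(0));
    }

    /** Tokenizes on whitespace, deciding per token whether to downcase and stem. */
    static List<String> normalize(String text) {
        List<String> terms = new ArrayList<>();
        boolean sentenceStart = true;
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // only happens for empty input
            }
            // Strip leading and trailing punctuation, keeping letters and digits.
            String word = raw.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "");
            if (!word.isEmpty()) {
                if (looksLikeProperNoun(word, sentenceStart)) {
                    terms.add(word); // keep case, skip stemming
                } else {
                    terms.add(toyStem(word.toLowerCase(Locale.ROOT)));
                }
            }
            // The next token starts a sentence if this one ended with . ! or ?
            sentenceStart = raw.endsWith(".") || raw.endsWith("!") || raw.endsWith("?");
        }
        return terms;
    }
}
```

Calling normalize("Emails mention Lucene filters. Filters ignore Porter.")
gives [email, mention, Lucene, filter, filter, ignore, Porter].  Two
caveats: a proper noun that happens to open a sentence still gets
downcased and stemmed, and whatever you do at index time has to be done
to the query text too, otherwise a query for "lucene" will not match the
indexed term "Lucene".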

Will


