lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mailing Lists Account" <mli...@imorph.com>
Subject Re: Phrase query and porter stemmer
Date Thu, 13 Feb 2003 12:07:01 GMT
Hi Eric,

Thanks for the reply.  The option of custom token filter sounds good to
me. I am not sure what is the
advantage of Token.setPositionIncrement() option. Let me look into the
docs before I ask further
questions on this.

regards
Ramesh

Eric Isakson wrote:
> You won't get hits for "security" if you do not use the stemmer. The
> stem of "security" is the token that gets stored in the index.
>
> If you don't use the stemming algorithm when you create the index you
> could search for "security" and only get those documents that contain
> "security".
>
> See the FAQ
>
http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.indexi
> ng&toc=faq#q15
>
> If you have a list of terms you want to treat differently (i.e. you
> know there are certain words you don't want to stem) you could build
> a custom TokenFilter that checks the tokens for those words before
> applying the stemming algorithm then add that TokenFilter to your
> analyzer. You might also consider allowing the tokens to be stemmed
> and adding the original non-stemmed term at the same position using
> Token.setPositionIncrement(0), you might also want to figure out some
> way to boost the score on those non-stemmed tokens when you build
> your query (not sure how you might accomplish that, but some custom
> query parsing code could do the trick).
>
> Eric
>
> -----Original Message-----
> From: Mailing Lists Account [mailto:mlists@imorph.com]
> Sent: Wednesday, February 12, 2003 4:17 AM
> To: lucene-user@jakarta.apache.org
> Subject: Phrase query and porter stemmer
>
>
> Hi,
>
> I use PorterStemmer with my analyzer for indexing the documents.
> And I have been using the same analyzer for searching too.
>
> When I search for a phrase like "security" AND database, I would like
> to avoid matches for
> terms like "secure" or "securities" .  I observed that Google and
> couple of search engines do
> not return such matches.
>
> 1) In otherwords, in a single query, is it possible not to choose
> porter stemmer for phrase queries and
>     use for other queries (such as Term query etc)
>
> 2) As an alternative, is it advisable to manually construct a
> PhraseQuery by adding terms without appling porter
>    stemmer ?
>
> regards
> Ramesh
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message