lucene-java-user mailing list archives

From "Mailing Lists Account" <>
Subject Re: Phrase query and porter stemmer
Date Thu, 13 Feb 2003 12:07:01 GMT
Hi Eric,

Thanks for the reply. The custom token filter option sounds good to
me. I am not sure what the advantage of the
Token.setPositionIncrement() option is. Let me look into the docs
before I ask further questions on this.


Eric Isakson wrote:
> You won't get hits for "security" if you do not apply the stemmer to
> the query, because the stem of "security" is the token that gets
> stored in the index. If you don't use the stemming algorithm when you
> create the index, you can search for "security" and get only those
> documents that contain "security".
> See the FAQ
> ng&toc=faq#q15
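
To make the point above concrete, here is a minimal sketch of why an
unstemmed query term misses a stemmed index. It does not use Lucene:
StemMismatchSketch, index(), lookup() and toyStem() are hypothetical
names, and toyStem() is a crude suffix stripper standing in for the
Porter stemmer, not the real algorithm.

```java
import java.util.*;
import java.util.function.UnaryOperator;

// Hypothetical, Lucene-free model of an inverted index (assumption, not Lucene's API).
class StemMismatchSketch {
    // Toy stand-in for a stemmer: strips a few suffixes. NOT the Porter algorithm.
    static String toyStem(String term) {
        for (String suffix : new String[] {"ities", "ity", "e"}) {
            if (term.endsWith(suffix) && term.length() > suffix.length() + 2) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    // Build term -> document ids, normalizing every word with the given function.
    static Map<String, Set<Integer>> index(List<String> docs, UnaryOperator<String> normalize) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String word : docs.get(id).toLowerCase().split("\\s+")) {
                postings.computeIfAbsent(normalize.apply(word), k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // Exact-match lookup of a single term; no normalization happens here.
    static Set<Integer> lookup(Map<String, Set<Integer>> postings, String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    public static void main(String[] args) {
        List<String> docs = List.of("database security", "secure database", "securities database");

        Map<String, Set<Integer>> stemmed = index(docs, StemMismatchSketch::toyStem);
        System.out.println(lookup(stemmed, "security"));          // raw term against stemmed index
        System.out.println(lookup(stemmed, toyStem("security"))); // stemmed term against stemmed index

        Map<String, Set<Integer>> raw = index(docs, UnaryOperator.identity());
        System.out.println(lookup(raw, "security"));              // raw term against unstemmed index
    }
}
```

With the stemmed index, the raw term finds nothing, while the stemmed
term also matches the "secure" and "securities" documents. Only the
unstemmed index returns just the document containing "security".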
> If you have a list of terms you want to treat differently (i.e. you
> know there are certain words you don't want to stem), you could build
> a custom TokenFilter that checks the tokens for those words before
> applying the stemming algorithm, then add that TokenFilter to your
> analyzer. You might also consider allowing the tokens to be stemmed
> and adding the original non-stemmed term at the same position using
> Token.setPositionIncrement(0). You might also want to figure out some
> way to boost the score of those non-stemmed tokens when you build
> your query (I'm not sure how you might accomplish that, but some
> custom query parsing code could do the trick).
> Eric
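
The two options Eric describes can be sketched roughly as follows.
This is a Lucene-free illustration under stated assumptions:
StemFilterSketch, Tok, stemExcept() and stemAndKeepOriginal() are
hypothetical names, Tok only mimics a term plus its position
increment, and toyStem() is a crude stand-in for the Porter stemmer.
A real implementation would subclass TokenFilter and call
Token.setPositionIncrement(0) on the extra token.

```java
import java.util.*;
import java.util.function.UnaryOperator;

// Hypothetical, Lucene-free sketch (assumption, not Lucene's API).
class StemFilterSketch {
    // Minimal token: a term plus its position increment.
    static final class Tok {
        final String term;
        final int positionIncrement; // 1 = next position, 0 = same position as the previous token
        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
        @Override public String toString() { return term + "(+" + positionIncrement + ")"; }
    }

    // Toy stand-in for a stemmer: strips a few suffixes. NOT the Porter algorithm.
    static String toyStem(String term) {
        for (String suffix : new String[] {"ities", "ity", "e"}) {
            if (term.endsWith(suffix) && term.length() > suffix.length() + 2) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    // Option 1: stem every word except those in the protected list.
    static List<Tok> stemExcept(List<String> words, Set<String> protectedWords,
                                UnaryOperator<String> stemmer) {
        List<Tok> out = new ArrayList<>();
        for (String w : words) {
            out.add(new Tok(protectedWords.contains(w) ? w : stemmer.apply(w), 1));
        }
        return out;
    }

    // Option 2: emit the stem, then the original term at the same position (increment 0).
    static List<Tok> stemAndKeepOriginal(List<String> words, UnaryOperator<String> stemmer) {
        List<Tok> out = new ArrayList<>();
        for (String w : words) {
            String stem = stemmer.apply(w);
            out.add(new Tok(stem, 1));
            if (!stem.equals(w)) {
                out.add(new Tok(w, 0)); // skip the duplicate when stemming changed nothing
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = List.of("security", "database");
        // prints [security(+1), databas(+1)]
        System.out.println(stemExcept(words, Set.of("security"), StemFilterSketch::toyStem));
        // prints [secur(+1), security(+0), databas(+1), database(+0)]
        System.out.println(stemAndKeepOriginal(words, StemFilterSketch::toyStem));
    }
}
```

With the second option, an exact term and its stem occupy the same
position, so a phrase built from either form can line up against the
same document positions. The price is a larger index.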
> -----Original Message-----
> From: Mailing Lists Account []
> Sent: Wednesday, February 12, 2003 4:17 AM
> To:
> Subject: Phrase query and porter stemmer
> Hi,
> I use PorterStemmer with my analyzer for indexing the documents.
> And I have been using the same analyzer for searching too.
> When I search for a phrase like "security" AND database, I would like
> to avoid matches for terms like "secure" or "securities". I observed
> that Google and a couple of other search engines do not return such
> matches.
> 1) In other words, in a single query, is it possible to skip the
>    Porter stemmer for phrase queries and use it for other queries
>    (such as Term query etc.)?
> 2) As an alternative, is it advisable to manually construct a
>    PhraseQuery by adding terms without applying the Porter stemmer?
> regards
> Ramesh
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail: