lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: QueryParser, phrases and stopwords
Date Thu, 16 Jun 2005 02:17:34 GMT

On Jun 15, 2005, at 12:12 PM, Mike Barry wrote:

> I have a situation where a query such as "climate control" is returning
> documents with the phrase "climate of control". (I'm using QueryParser).
>
> After searching, I found a similar issue on the mailing list from
> Greg Robertson, with a patch from Steve Rowe.
>
> Looking at the source repository for StopFilter.java, the patch was applied
> in November of 2003 and then reverted in Dec 2003 (by Erik), with the note:
>
> revert position increment change due to conflict with PhraseQuery
>
> (the patch incremented the token position to inhibit exact matching across
> removed stopword(s)).
>
> I couldn't find any info on how/why this approach conflicted with
> PhraseQuery.
> Can anyone enlighten me on this? Does anyone know of a way to inhibit
> exact matching across removed stopword(s)?

PhraseQuery originally did not account for gaps left in the terms of
the phrase.

PhraseQuery was modified last year to allow for this though:

r150509 | goller | 2004-09-15 05:38:50 -0400 (Wed, 15 Sep 2004) | 5 lines

PhraseQuery and PhrasePrefixQuery are extended. It's now
possible to specify the relative position of a term within
a phrase. This allows gaps and multiple terms at the same
position.
-----

So I think we could now safely change StopFilter to put the gaps back in.
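
To make the effect of the gaps concrete, here is a minimal, self-contained sketch. It is a toy model and not Lucene code: the class StopGapDemo, the tiny stopword set, and the analyze/phraseMatches helpers are all hypothetical names invented for this illustration. It only assumes that both the document and the query go through the same stopword-removing analysis, and that a phrase matches when every term sits at the same relative position in both.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopGapDemo {
    // Hypothetical stopword list, just large enough for the example.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("of", "the", "a"));

    /** A term together with its position in the token stream. */
    static final class Tok {
        final String term;
        final int pos;
        Tok(String term, int pos) { this.term = term; this.pos = pos; }
    }

    /**
     * Splits on whitespace and drops stopwords. With keepGaps, each removed
     * stopword still advances the position counter, leaving a hole behind.
     */
    static List<Tok> analyze(String text, boolean keepGaps) {
        List<Tok> out = new ArrayList<>();
        int pos = -1;
        int skipped = 0;
        for (String w : text.toLowerCase().split("\\s+")) {
            if (w.isEmpty()) continue;
            if (STOP_WORDS.contains(w)) { skipped++; continue; }
            pos += 1 + (keepGaps ? skipped : 0);
            skipped = 0;
            out.add(new Tok(w, pos));
        }
        return out;
    }

    static boolean hasTermAt(List<Tok> doc, String term, int pos) {
        for (Tok t : doc) {
            if (t.pos == pos && t.term.equals(term)) return true;
        }
        return false;
    }

    /** Exact phrase match: every query term must occur at the same offset from the first term. */
    static boolean phraseMatches(List<Tok> doc, List<Tok> query) {
        if (query.isEmpty()) return false;
        Tok first = query.get(0);
        for (Tok start : doc) {
            if (!start.term.equals(first.term)) continue;
            boolean ok = true;
            for (Tok q : query) {
                if (!hasTermAt(doc, q.term, start.pos + (q.pos - first.pos))) { ok = false; break; }
            }
            if (ok) return true;
        }
        return false;
    }

    public static boolean matches(String doc, String phrase, boolean keepGaps) {
        return phraseMatches(analyze(doc, keepGaps), analyze(phrase, keepGaps));
    }

    public static void main(String[] args) {
        String doc = "climate of control";
        System.out.println(matches(doc, "climate control", false));   // true: the unwanted hit
        System.out.println(matches(doc, "climate control", true));    // false: the gap blocks it
        System.out.println(matches(doc, "climate of control", true)); // true: the query has the same gap
    }
}
```

The last case is the one that needs a phrase query able to place terms at explicit relative positions: once the analyzer leaves a hole where "of" was, the query has to carry the same hole or it stops matching the very text it was built from.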

Thoughts?

     Erik

