lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: The logic of QueryParser
Date Mon, 13 Dec 2010 19:35:45 GMT
On Mon, Dec 13, 2010 at 2:10 PM, Brian Hurt <bhurt42@gmail.com> wrote:
> I just encountered an unexpected behavior in query parser.  So, if you pass
> in a query that is multiple terms, like "cat hat", the query that is
> returned uses an or between the two term searches, instead of an and.  That
> is, the query will return all documents with the given field containing
> either "cat" or "hat".  Now, I know about phrase queries, using "\"cat
> hat\"", and I know about +, "+cat +hat".  So there are ways to work around
> the problem- the behavior was just unintuitive for me and several others.  I
> was just wondering what the logic was for defaulting to or instead of and.
>
> I have googled the mailing list archives and didn't find anything.  But if
> this has been discussed to death, please just point me to the threads in the
> archive, rather than stirring up some old flame war.  Or just tell me what
> to google for (the terms I've tried haven't yielded anything useful).
> Thanks.
>

Well, it's not quite a pure OR query, since it also incorporates
Similarity.coord(), which boosts documents that contain more of the
query terms.
But to understand the default, imagine a more natural query such as
"where is the cat in the hat".
The default OR query will still give good results, including boosting
documents that contain both 'cat' and 'hat', but with AND a document
would be excluded entirely if even one of those low-value terms
('where', 'is', 'the', 'in') for some reason were not in it.
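
To make that concrete, here is a rough sketch in plain Java (no Lucene on
the classpath). The class and method names are made up for illustration,
the documents are invented, and analysis, stop words and tf-idf are all
ignored. The only piece taken from Lucene is the coord formula, which I
believe matches DefaultSimilarity: overlap / maxOverlap.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of Lucene.
class CoordSketch {

    // Fraction of query terms the document matched
    // (assumed to mirror DefaultSimilarity.coord()).
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Number of query terms that occur in the document.
    static int overlap(List<String> queryTerms, Set<String> docTerms) {
        int n = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                n++;
            }
        }
        return n;
    }

    // OR semantics: at least one query term must be present.
    static boolean matchesOr(List<String> queryTerms, Set<String> docTerms) {
        return overlap(queryTerms, docTerms) > 0;
    }

    // AND semantics: every query term must be present.
    static boolean matchesAnd(List<String> queryTerms, Set<String> docTerms) {
        return overlap(queryTerms, docTerms) == queryTerms.size();
    }

    public static void main(String[] args) {
        // Distinct terms of "where is the cat in the hat".
        List<String> query = Arrays.asList("where", "is", "the", "cat", "in", "hat");

        String[] docs = { "the cat sat in the hat", "where is the dog" };
        for (String text : docs) {
            Set<String> terms = new HashSet<>(Arrays.asList(text.split(" ")));
            int n = overlap(query, terms);
            System.out.println(text
                    + " | OR=" + matchesOr(query, terms)
                    + " AND=" + matchesAnd(query, terms)
                    + " coord=" + coord(n, query.size()));
        }
    }
}
```

Neither document contains every term, so AND rejects both, while OR keeps
both and coord ranks the one with 'cat' and 'hat' higher.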

However, if your queries are more restricted, maybe you want to either:

1) adjust Similarity.coord() to make this boost better for your app
(for example, maybe only give a boost if overlap == maxOverlap, and
maybe play with the amount of boost)
 or
2) set your QueryParser's default operator to AND with the
setDefaultOperator() method, but realize this could exclude very
relevant results that happen to be missing some useless keywords.
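
For option 1, the "boost only on a full match" idea could look roughly like
the sketch below. It is plain Java so it runs on its own; in a real app the
same logic would go into an override of Similarity.coord(int overlap, int
maxOverlap). The class name, method name and the boost value of 2.0 are
assumptions for illustration, not anything Lucene ships.

```java
// Hypothetical illustration class; not part of Lucene.
class StrictCoordSketch {

    // Arbitrary example value; tune for your own app.
    static final float FULL_MATCH_BOOST = 2.0f;

    // Boost only when the document matched every query term;
    // partial matches get a neutral factor of 1.0.
    static float strictCoord(int overlap, int maxOverlap) {
        return overlap == maxOverlap ? FULL_MATCH_BOOST : 1.0f;
    }

    public static void main(String[] args) {
        int maxOverlap = 3;
        for (int overlap = 1; overlap <= maxOverlap; overlap++) {
            System.out.println(overlap + " of " + maxOverlap
                    + " terms matched, coord=" + strictCoord(overlap, maxOverlap));
        }
    }
}
```

Unlike switching the default operator to AND, this keeps partial matches in
the result set and only changes how they rank against full matches.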
