lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Elmer <evanchaste...@gmail.com>
Subject MultiFieldQueryParser with default AND and stopfilter
Date Wed, 08 Jun 2011 08:52:57 GMT
Hi,

I have a use case in which I use the MultiFieldQueryParser (MFQP) on
some fields that use and some fields that don't use a stopfilter. The
default operator of the MFQP is set to AND.
For example, if the search query is 'the project' (with 'the' included
in the stoplist) and the search fields are:

title - not using a stopfilter,
desc - using a stopfilter,

the parsed query becomes:

'+(title:the) +(title:project desc:project)'.

So, the problem is that docs that have the term 'the' only appearing in
their desc field are excluded from the results. So every query, with AND
as default operator, that has a stop word in it that only appears in
fields that use a stop filter will have this problem (or similar, if
there is at least one field X not using a stopfilter -> no match if a
stopword from query doesn't appear in field X). Thus, in this example, a
document with title: 'Lucene project' and desc: 'the open source search
software from Apache' will not be matched. In my opinion this is not the
expected behavior. What I'd like to see is that this doc is matched by
the given query. So, for each token in the query, that appears to be a
stopword in a field (i.e. some filter filters the token out), I want it
to be matched instead of not.

Anyone who knows a way to deal with this? I would prefer to keep using
the MFQP, since I need to support multiple fields, querytime boosting
and lucene syntax. Or is there a disadvantage by doing this?

Thanks in advance.

BR,
Elmer van Chastelet 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message