lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: MultiFieldQueryParser with default AND and stopfilter
Date Wed, 08 Jun 2011 13:27:26 GMT
I guess the base problem is that MFQP only accepts one analyzer.
Presumably you are using different analyzers for your title and desc
fields, and it might do what you wanted if you could pass in a list of
analyzers along with a list of fields.  Sounds like something that
might not be too hard to code, although there may be complications and
catches that I haven't thought of.

You can pass an analyzer to the parse() methods, so you could
perhaps have something like

BooleanQuery bq = new BooleanQuery();
MultiFieldQueryParser mfqp = new ...(...);
Query q1 = mfqp.parse(... title-type-fields[], ..., title-type-analyzer);
Query q2 = mfqp.parse(... desc-type-fields[], ..., desc-type-analyzer);
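// BooleanQuery.add() needs an Occur; SHOULD here is only a guess
bq.add(q1, BooleanClause.Occur.SHOULD);
bq.add(q2, BooleanClause.Occur.SHOULD);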

Failing that, I think you'd have to do it the hard way, building up
the query in code.  Generally not that difficult.
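
As a rough, library-free sketch of that "hard way" (not tested against
Lucene itself): each field's analyzer is modelled as a plain stop-word
set, and the result is a string in Lucene query syntax rather than real
Query objects. The class and method names (StopwordAwareQuery, analyze,
build) are made up for illustration. The idea is that a term stays
required only if no field's analyzer drops it; otherwise its clause
becomes optional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class StopwordAwareQuery {

    // Hypothetical stand-in for a per-field analyzer:
    // lowercases the token and drops it if it is a stop word for that field.
    static Optional<String> analyze(String token, Set<String> stopWords) {
        String t = token.toLowerCase(Locale.ROOT);
        return stopWords.contains(t) ? Optional.<String>empty() : Optional.of(t);
    }

    // Builds a query string with one group per query token.
    // A group is required ("+") only when every field kept the token.
    static String build(String userQuery, Map<String, Set<String>> stopWordsByField) {
        List<String> groups = new ArrayList<>();
        for (String token : userQuery.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input
            }
            List<String> clauses = new ArrayList<>();
            boolean droppedSomewhere = false;
            for (Map.Entry<String, Set<String>> field : stopWordsByField.entrySet()) {
                Optional<String> term = analyze(token, field.getValue());
                if (term.isPresent()) {
                    clauses.add(field.getKey() + ":" + term.get());
                } else {
                    droppedSomewhere = true;
                }
            }
            if (clauses.isEmpty()) {
                continue; // stop word in every field: nothing to search for
            }
            String group = "(" + String.join(" ", clauses) + ")";
            groups.add(droppedSomewhere ? group : "+" + group);
        }
        return String.join(" ", groups);
    }
}
```

With title mapped to an empty stop-word set and desc mapped to {"the"},
the query 'the project' comes out as
'(title:the) +(title:project desc:project)', so the 'Lucene project'
document from the example below is no longer excluded. In real code the
same loop would presumably add TermQuery clauses to nested BooleanQuery
objects instead of building strings.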


--
Ian.


On Wed, Jun 8, 2011 at 9:52 AM, Elmer <evanchastelet@gmail.com> wrote:
> Hi,
>
> I have a use case in which I use the MultiFieldQueryParser (MFQP) on
> some fields that use and some fields that don't use a stopfilter. The
> default operator of the MFQP is set to AND.
> For example, if the search query is 'the project' (with 'the' included
> in the stoplist) and the search fields are:
>
> title - not using a stopfilter,
> desc - using a stopfilter,
>
> the parsed query becomes:
>
> '+(title:the) +(title:project desc:project)'.
>
> So, the problem is that docs that have the term 'the' only appearing in
> their desc field are excluded from the results. So every query, with AND
> as default operator, that has a stop word in it that only appears in
> fields that use a stop filter will have this problem (or similar, if
> there is at least one field X not using a stopfilter -> no match if a
> stopword from query doesn't appear in field X). Thus, in this example, a
> document with title: 'Lucene project' and desc: 'the open source search
> software from Apache' will not be matched. In my opinion this is not the
> expected behavior. What I'd like to see is that this doc is matched by
> the given query. So, for each token in the query, that appears to be a
> stopword in a field (i.e. some filter filters the token out), I want it
> to be matched instead of not.
>
> Does anyone know a way to deal with this? I would prefer to keep using
> the MFQP, since I need to support multiple fields, query-time boosting
> and Lucene syntax. Or is there a disadvantage in doing this?
>
> Thanks in advance.
>
> BR,
> Elmer van Chastelet
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
