lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: MultiFieldQueryParser with default AND and stopfilter
Date Wed, 08 Jun 2011 14:22:35 GMT
You're right, that's a better place to start....

Erick

On Wed, Jun 8, 2011 at 9:42 AM, Ian Lea <ian.lea@gmail.com> wrote:
> Except that I think he has loads of other fields and wants to keep it simple.
>
> But how about passing a PerFieldAnalyzerWrapper instance as the
> analyzer to MFQP?  Worth a try.
>
>
> --
> Ian.
>
>
> On Wed, Jun 8, 2011 at 2:38 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>> Could you just construct a BooleanQuery with the
>> terms against different fields instead of using MFQP?
>> e.g.
>>
>> bq.add(qp.parse("title:(the AND project)", SHOULD))
>> bq.add(qp.parse("desc:(the AND project)", SHOULD))
>>
>> etc...? If your QueryParser was created with a
>> PerFieldAnalyzerWrapper I think you might get what you
>> want....
>>
>> Note, bad pseudo code there...
>>
>> Best
>> Erick
>>
>> On Wed, Jun 8, 2011 at 4:52 AM, Elmer <evanchastelet@gmail.com> wrote:
>>> Hi,
>>>
>>> I have a use case in which I use the MultiFieldQueryParser (MFQP) on
>>> some fields that use and some fields that don't use a stopfilter. The
>>> default operator of the MFQP is set to AND.
>>> For example, if the search query is 'the project' (with 'the' included
>>> in the stoplist) and the search fields are:
>>>
>>> title - not using a stopfilter,
>>> desc - using a stopfilter,
>>>
>>> the parsed query becomes:
>>>
>>> '+(title:the) +(title:project desc:project)'.
>>>
>>> So, the problem is that docs that have the term 'the' only appearing in
>>> their desc field are excluded from the results. So every query, with AND
>>> as default operator, that has a stop word in it that only appears in
>>> fields that use a stop filter will have this problem (or similar, if
>>> there is at least one field X not using a stopfilter -> no match if a
>>> stopword from query doesn't appear in field X). Thus, in this example, a
>>> document with title: 'Lucene project' and desc: 'the open source search
>>> software from Apache' will not be matched. In my opinion this is not the
>>> expected behavior. What I'd like to see is that this doc is matched by
>>> the given query. So, for each token in the query, that appears to be a
>>> stopword in a field (i.e. some filter filters the token out), I want it
>>> to be matched instead of not.
>>>
>>> Anyone who knows a way to deal with this? I would prefer to keep using
>>> the MFQP, since I need to support multiple fields, querytime boosting
>>> and lucene syntax. Or is there a disadvantage by doing this?
>>>
>>> Thanks in advance.
>>>
>>> BR,
>>> Elmer van Chastelet
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message