lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Elmer <evanchaste...@gmail.com>
Subject Re: MultiFieldQueryParser with default AND and stopfilter
Date Wed, 08 Jun 2011 14:35:49 GMT
Thank you,

I already use the PerFieldAnalyzerWrapper (by Hibernate Search) ;)
And that's where the problem comes in: different fields using different
analyzers (some with, some without a stopfilter). For each term
(tokenized by MFQP itself?), it applies the given analyzer on each
field. If the analyzer returns no token (occurs on 'the' when using the
PerFieldAnalyzerWrapper for the desc field), that field will not be
included in the clause for that term. (see/re-read the example, maybe
it's more clear what I mean now).

Unfortunately, the solution that Erick gave won't do the trick
> > bq.add(qp.parse("title:(the AND project)", SHOULD))
> > bq.add(qp.parse("desc:(the AND project)", SHOULD))
This still won't match documents where both 'the' and 'project' appear
in DIFFERENT fields (i.e. a document with title: 'Lucene project' and
desc: 'the open source search software from Apache')

I hope it's clear what I mean :) Otherwise, let me know!

BR,
Elmer



On Wed, 2011-06-08 at 14:42 +0100, Ian Lea wrote:
> Except that I think he has loads of other fields and wants to keep it simple.
> 
> But how about passing a PerFieldAnalyzerWrapper instance as the
> analyzer to MFQP?  Worth a try.
> 
> 
> --
> Ian.
> 
> 
> On Wed, Jun 8, 2011 at 2:38 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> > Could you just construct a BooleanQuery with the
> > terms against different fields instead of using MFQP?
> > e.g.
> >
> > bq.add(qp.parse("title:(the AND project)", SHOULD))
> > bq.add(qp.parse("desc:(the AND project)", SHOULD))
> >
> > etc...? If your QueryParser was created with a
> > PerFieldAnalyzerWrapper I think you might get what you
> > want....
> >
> > Note, bad pseudo code there...
> >
> > Best
> > Erick
> >
> > On Wed, Jun 8, 2011 at 4:52 AM, Elmer <evanchastelet@gmail.com> wrote:
> >> Hi,
> >>
> >> I have a use case in which I use the MultiFieldQueryParser (MFQP) on
> >> some fields that use and some fields that don't use a stopfilter. The
> >> default operator of the MFQP is set to AND.
> >> For example, if the search query is 'the project' (with 'the' included
> >> in the stoplist) and the search fields are:
> >>
> >> title - not using a stopfilter,
> >> desc - using a stopfilter,
> >>
> >> the parsed query becomes:
> >>
> >> '+(title:the) +(title:project desc:project)'.
> >>
> >> So, the problem is that docs that have the term 'the' only appearing in
> >> their desc field are excluded from the results. So every query, with AND
> >> as default operator, that has a stop word in it that only appears in
> >> fields that use a stop filter will have this problem (or similar, if
> >> there is at least one field X not using a stopfilter -> no match if a
> >> stopword from query doesn't appear in field X). Thus, in this example, a
> >> document with title: 'Lucene project' and desc: 'the open source search
> >> software from Apache' will not be matched. In my opinion this is not the
> >> expected behavior. What I'd like to see is that this doc is matched by
> >> the given query. So, for each token in the query, that appears to be a
> >> stopword in a field (i.e. some filter filters the token out), I want it
> >> to be matched instead of not.
> >>
> >> Anyone who knows a way to deal with this? I would prefer to keep using
> >> the MFQP, since I need to support multiple fields, querytime boosting
> >> and lucene syntax. Or is there a disadvantage by doing this?
> >>
> >> Thanks in advance.
> >>
> >> BR,
> >> Elmer van Chastelet
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message