lucene-dev mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: MultiFieldQueryParser and the NOT operator
Date Tue, 29 Jun 2010 14:58:30 GMT
Could you repost this on the users list? This is really the list for
discussing internal development issues. You'll probably get
better/faster answers there...

java-user@lucene.apache.org

Best
Erick

On Mon, Jun 28, 2010 at 10:00 PM, Ivan Brusic <ivan@brusic.com> wrote:

> Thanks for the quick answer.
>
> The regex used to capitalize "not" was not set up properly for the case
> where it appears at the beginning.  Fixed, and the parsing issue is gone.
>
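The regex itself is not shown in the thread. As a minimal sketch of the kind of preprocessing described, assuming the goal is simply to upper-case a standalone "not" wherever it occurs (including at the very start of the input), something like the following would do. The class name, method name, and pattern are hypothetical and use only java.util.regex, not Lucene:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the original code.
final class NotOperatorNormalizer {
    // \b matches at the start of the input as well as between words, so a
    // leading "not" is caught. CASE_INSENSITIVE also covers "Not" and "nOt".
    private static final Pattern NOT_WORD =
            Pattern.compile("\\bnot\\b", Pattern.CASE_INSENSITIVE);

    // Returns the keywords with every standalone "not" replaced by "NOT".
    static String upperCaseNot(String keywords) {
        return NOT_WORD.matcher(keywords).replaceAll("NOT");
    }

    public static void main(String[] args) {
        System.out.println(upperCaseNot("not microsoft"));       // NOT microsoft
        System.out.println(upperCaseNot("apple not microsoft")); // apple NOT microsoft
        System.out.println(upperCaseNot("notebook"));            // notebook
    }
}
```

Note that this sketch also rewrites a "not" inside a quoted phrase, which may or may not be wanted.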
> Now the resulting query is something like:
> +(-(title:microsoft content:microsoft author.name:microsoft
> source.name:microsoft^0.5)) +date:[20100328 TO 20100628]
>
> which is a BooleanQuery with two MUST clauses.  The first clause is itself a
> BooleanQuery with only one clause, which happens to be MUST_NOT.  I guess
> that cannot work.  How can I rearrange the query so that it works?  Do I
> need one clause that matches the records that another clause then
> negates?  e.g. "-title:microsoft +title:apple".  That works, but
> something like "-title:apple +date:[20100328 TO 20100628]" does not.
>
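A BooleanQuery that contains only MUST_NOT clauses matches no documents, because prohibited clauses only subtract from what the other clauses select. The toy model below illustrates that rule with plain sets of document ids. It is not Lucene code: evaluate(), docs(), and the sample ids are made up for illustration, and SHOULD clauses are left out. In Lucene itself, the usual way to give the inner query something to subtract from is to add a MatchAllDocsQuery as a MUST clause next to the MUST_NOT clause.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: a "query" is represented by the set of doc ids it matches.
final class PureNegativeDemo {
    // Required clauses select candidates; prohibited clauses only remove them.
    static Set<Integer> evaluate(List<Set<Integer>> must, List<Set<Integer>> mustNot) {
        Set<Integer> result = new TreeSet<>();
        if (must.isEmpty()) {
            return result; // nothing selected, so there is nothing to subtract from
        }
        result.addAll(must.get(0));
        for (Set<Integer> hits : must) {
            result.retainAll(hits);
        }
        for (Set<Integer> hits : mustNot) {
            result.removeAll(hits);
        }
        return result;
    }

    static Set<Integer> docs(Integer... ids) {
        return new TreeSet<>(Arrays.asList(ids));
    }

    public static void main(String[] args) {
        Set<Integer> allDocs = docs(1, 2, 3, 4);   // stands in for a match-all clause
        Set<Integer> microsoft = docs(2, 3);       // docs matching "microsoft"
        Set<Integer> dateRange = docs(1, 2, 4);    // docs inside the date range
        List<Set<Integer>> none = Collections.emptyList();

        // +(-(microsoft)) +date : the inner query has nothing to subtract from.
        Set<Integer> pureNegative = evaluate(none, Collections.singletonList(microsoft));
        Set<Integer> broken = evaluate(Arrays.asList(pureNegative, dateRange), none);

        // +(+all -(microsoft)) +date : the inner query now selects docs first.
        Set<Integer> allButMicrosoft = evaluate(
                Collections.singletonList(allDocs), Collections.singletonList(microsoft));
        Set<Integer> fixed = evaluate(Arrays.asList(allButMicrosoft, dateRange), none);

        System.out.println(broken); // []
        System.out.println(fixed);  // [1, 4]
    }
}
```

An alternative with the same effect in this model is to lift the negation to the top level, so that the date clause is the required clause and the keyword clause is the prohibited one.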
> Ivan
>
> On Mon, 28 Jun 2010 14:41:50 -0400, Erick Erickson
> <erickerickson@gmail.com> wrote:
> > I'm pretty sure the "not" should be "NOT". All operators need to
> > be all caps according to:
> >
> > http://lucene.apache.org/java/2_4_0/queryparsersyntax.html#Boolean operators
> >
> > Best
> > Erick
> >
> > On Mon, Jun 28, 2010 at 10:54 AM, Ivan Brusic <ivan@brusic.com> wrote:
> >
> >> I inherited some code that does a multi field search based on several
> >> form values.  One field in particular is used by MultiFieldQueryParser.
> >> This field should be able to support a single "not" query.  The result of
> >> the MultiFieldQueryParser is then appended with other clauses before a
> >> search is actually executed.
> >>
> >> // BEGIN CODE
> >>
> >> // Fields searched by the multi-field parser.
> >> private static String[] multiFieldQueryFields = new String[] {
> >>            "title", "content", "author.name", "source.name"};
> >>
> >> // Per-field boosts passed to the parser.
> >> private static Map<String, Float> multiFieldQueryFieldsBoosts =
> >>            new HashMap<String, Float>() {{put("source.name", 0.5f);}};
> >>
> >> QueryParser luceneQueryParser = new MultiFieldQueryParser(
> >>            multiFieldQueryFields, analyzer, multiFieldQueryFieldsBoosts);
> >>
> >> luceneQueryParser.setDefaultOperator(MultiFieldQueryParser.AND_OPERATOR);
> >> luceneQueryParser.setAllowLeadingWildcard(true);
> >>
> >> // Lowercase "not" is parsed as an ordinary term, not as the NOT operator.
> >> String escapedKeywords = "not microsoft";
> >> Query luceneQuery = luceneQueryParser.parse(escapedKeywords);
> >>
> >> // END CODE
> >>
> >> The result is a BooleanQuery with two clauses: one for "not" and one
> >> for "microsoft":
> >>
> >> luceneQuery = {org.apache.lucene.search.BooleanQuery@8497}"+(title:not
> >> content:not author.name:not source.name:not^0.5) +(title:microsoft
> >> content:microsoft author.name:microsoft source.name:microsoft^0.5)"
> >>
> >> I went through the code and it seems like the issues are with
> >> QueryParser.parse() (technically inside TopLevelQuery and Query).  I
> >> know that Lucene does not support NOT queries with only one term, but am
> >> I correct to expect "not microsoft" to be one clause and not two?  I do
> >> think I have other issues with the AND_OPERATOR, but I am trying to solve
> >> the parsing issue first.
> >>
> >> Cheers,
> >>
> >> Ivan
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >>
>
