lucene-dev mailing list archives

From Ivan Brusic <i...@brusic.com>
Subject Re: MultiFieldQueryParser and the NOT operator
Date Tue, 29 Jun 2010 02:00:00 GMT
Thanks for the quick answer.

The regex used to capitalize "not" was not set up properly for the case where
it appears at the beginning of the input.  Fixed, and the parsing issue is gone.
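
For illustration, a minimal sketch of such a normalization step. The class
and method names (NotOperatorNormalizer, normalize) are hypothetical and not
part of Lucene or of the inherited code; the sketch only assumes that the
keywords are rewritten before being handed to the parser.  The word boundary
\b also matches at the very start of the input, so a leading "not" is
handled the same way as one in the middle.

```java
import java.util.regex.Pattern;

public class NotOperatorNormalizer {
    // Hypothetical helper: matches the standalone word "not" in any case.
    // \b matches at the start and end of the input as well as between
    // word and non-word characters, so "not microsoft" is covered.
    private static final Pattern NOT_WORD =
            Pattern.compile("\\bnot\\b", Pattern.CASE_INSENSITIVE);

    // Replace every standalone "not" with the upper-case operator "NOT".
    static String normalize(String keywords) {
        return NOT_WORD.matcher(keywords).replaceAll("NOT");
    }

    public static void main(String[] args) {
        System.out.println(normalize("not microsoft"));       // NOT microsoft
        System.out.println(normalize("apple not microsoft")); // apple NOT microsoft
        System.out.println(normalize("notebook"));            // notebook
    }
}
```

Note that this simple pattern would also capitalize a "not" inside a quoted
phrase, which may or may not be wanted.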

Now the resulting query is something like:
+(-(title:microsoft content:microsoft author.name:microsoft
source.name:microsoft^0.5)) +date:[20100328 TO 20100628]

which is a BooleanQuery with two MUST clauses.  The first clause is itself a
BooleanQuery with only one clause, which happens to be MUST_NOT.  I guess
that cannot work.  How can I rearrange the query so that it works?  Do I
need a clause that matches the records that another clause then negates?
ex: "-title:microsoft +title:apple".  The preceding works, but
something like "-title:apple +date:[20100328 TO 20100628]" does not.
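
To illustrate why the nested form above cannot match anything, here is a toy
model in plain Java.  It is not Lucene code: a "query" is reduced to the set
of document ids it matches, and the class name, method name and document ids
are all made up for the example.  The only assumption is that a boolean
query starts from its MUST clauses and then subtracts its MUST_NOT clauses,
so a query with no positive clause has nothing to subtract from.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NegativeClauseModel {
    // Toy model of a boolean query: intersect all MUST sets, then remove
    // every MUST_NOT set.  With no MUST clause there is no starting set,
    // so the result is empty.
    static Set<Integer> booleanQuery(List<Set<Integer>> must,
                                     List<Set<Integer>> mustNot) {
        if (must.isEmpty()) {
            return new HashSet<>();
        }
        Set<Integer> result = new HashSet<>(must.get(0));
        for (Set<Integer> m : must.subList(1, must.size())) {
            result.retainAll(m);
        }
        for (Set<Integer> n : mustNot) {
            result.removeAll(n);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> microsoft = Set.of(1, 2);    // docs mentioning "microsoft"
        Set<Integer> dateRange = Set.of(2, 3, 4); // docs inside the date range

        // Inner query: only a MUST_NOT clause, as in +(-(...)).
        Set<Integer> inner = booleanQuery(List.of(), List.of(microsoft));

        // Outer query: +inner +date, both MUST.
        Set<Integer> nested = booleanQuery(List.of(inner, dateRange), List.of());

        System.out.println(inner);  // []
        System.out.println(nested); // []
    }
}
```

In this model the inner query is already empty, so requiring it as a MUST
clause empties the outer query as well, whatever the date range matches.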

Ivan

On Mon, 28 Jun 2010 14:41:50 -0400, Erick Erickson
<erickerickson@gmail.com> wrote:
> I'm pretty sure the "not" should be "NOT". All operators need to
> be all caps according to:
> 
> http://lucene.apache.org/java/2_4_0/queryparsersyntax.html#Boolean operators
>
> Best
> Erick
> 
> On Mon, Jun 28, 2010 at 10:54 AM, Ivan Brusic <ivan@brusic.com> wrote:
> 
>> I inherited some code that does a multi field search based on several form
>> values.  One field in particular is used by MultiFieldQueryParser.  This
>> field should be able to support a single "not" query.  The result of the
>> MultiFieldQueryParser is then appended with other clauses before a search
>> is actually executed.
>>
>> // BEGIN CODE
>>
>> private static String[] multiFieldQueryFields = new String[] {
>>            "title", "content", "author.name", "source.name"};
>>
>> private static Map multiFieldQueryFieldsBoosts = new HashMap()
>> {{put("source.name", 0.5f);}};
>>
>> QueryParser luceneQueryParser = new MultiFieldQueryParser(
>>         multiFieldQueryFields, analyzer, multiFieldQueryFieldsBoosts);
>>
>> luceneQueryParser.setDefaultOperator(MultiFieldQueryParser.AND_OPERATOR);
>> luceneQueryParser.setAllowLeadingWildcard(true);
>>
>> String escapedKeywords = "not microsoft";
>> Query luceneQuery = luceneQueryParser.parse(escapedKeywords);
>>
>> // END CODE
>>
>> The result is a BooleanQuery with two clauses: one for "not" and one for
>> "microsoft"
>>
>> luceneQuery = {org.apache.lucene.search.BooleanQuery@8497}"+(title:not
>> content:not author.name:not source.name:not^0.5) +(title:microsoft
>> content:microsoft author.name:microsoft source.name:microsoft^0.5)"
>>
>> I went through the code and it seems like the issues are with
>> QueryParser.parse() (technically inside TopLevelQuery and Query).  I know
>> that Lucene does not support NOT queries with only one term, but am I
>> correct to expect "not microsoft" to be one clause and not two?  I do think
>> I have other issues with the AND_OPERATOR, but I am trying to solve the
>> parsing issue first.
>>
>> Cheers,
>>
>> Ivan
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
