Message-ID: <54EDBEA6.7070501@elyograg.org>
Date: Wed, 25 Feb 2015 05:23:02 -0700
From: Shawn Heisey
To: solr-user@lucene.apache.org
Subject: Re: Problem with queries that includes NOT

On 2/25/2015 4:04 AM, david.davila@correo.aeat.es wrote:
> We have problems with some queries. All of them include the tag NOT, and
> in my opinion, the results don't make any sense.
>
> First problem:
>
> This query " NOT Proc:ID01 " returns 95806 results, however this one "
> NOT Proc:ID01 OR FileType:PDF_TEXT" returns 11484 results. But it's
> impossible that adding a tag OR the query has less number of results.
>
> Second problem. Here the problem is because of the brackets and the NOT
> tag:
>
> This query:
>
> (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT) AND sys_FileType:PROTOTIPE
> returns 0 documents.
>
> But this query:
>
> (NOT Proc:"ID01" AND NOT FileType:PDF_TEXT AND sys_FileType:PROTOTIPE)
> returns 53 documents, which is correct. So, the problem is the position of
> the bracket. I have checked the same query without NOTs, and it works fine
> returning the same number of results in both cases. So, I think the
> problem is the combination of the bracket positions and the NOT tag.

For the first query, there is a difference between "NOT condition1 OR
condition2" and "NOT (condition1 OR condition2)" ... I can imagine the
first one increasing the document count compared to just "NOT
condition1" ... the second one wouldn't increase it.

Boolean queries in Solr (and very likely Lucene as well) do not always
do what people expect.

http://robotlibrarian.billdueber.com/2011/12/solr-and-boolean-operators/
https://lucidworks.com/blog/why-not-and-or-and-not/

As mentioned in the second link above, you'll get better results if you
use the prefix operators with explicit parentheses. One word of
warning, though -- the prefix operators do not work correctly if you
change the default operator to AND.

Thanks,
Shawn
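
Note: both symptoms reported in this thread can be reproduced with a toy model of clause-based boolean matching, in which each clause is required (+), optional, or prohibited (-), and a group with only prohibited clauses matches nothing unless it is the top-level query. The sketch below is an illustration under that assumption, not Solr or Lucene code. The corpus `DOCS` and the helpers `term` and `boolean_query` are hypothetical names invented for this example, and the document counts are made up.

```python
# Toy corpus; field names follow the thread, values are invented.
DOCS = [
    {"id": 1, "Proc": "ID01", "FileType": "PDF_TEXT", "sys_FileType": "PROTOTIPE"},
    {"id": 2, "Proc": "ID01", "FileType": "HTML", "sys_FileType": "PROTOTIPE"},
    {"id": 3, "Proc": "ID02", "FileType": "PDF_TEXT", "sys_FileType": "OTHER"},
    {"id": 4, "Proc": "ID02", "FileType": "HTML", "sys_FileType": "PROTOTIPE"},
    {"id": 5, "Proc": "ID03", "FileType": "HTML", "sys_FileType": "OTHER"},
]
ALL_IDS = {d["id"] for d in DOCS}


def term(field, value):
    """Return the ids of documents whose field equals value."""
    return {d["id"] for d in DOCS if d.get(field) == value}


def boolean_query(must=(), should=(), must_not=(), top_level=False):
    """Evaluate one clause group (hypothetical helper, simplified model).

    - With required clauses, a hit must match all of them.
    - With only optional clauses, a hit must match at least one.
    - With only prohibited clauses, there is nothing to subtract from, so
      the group matches nothing, unless it is the top-level query, where
      this model (as an assumption) starts from all documents.
    """
    if must:
        hits = set(ALL_IDS)
        for clause in must:
            hits &= clause
    elif should:
        hits = set().union(*should)
    elif top_level:
        hits = set(ALL_IDS)
    else:
        hits = set()
    for clause in must_not:
        hits -= clause
    return hits


proc = term("Proc", "ID01")
pdf = term("FileType", "PDF_TEXT")
proto = term("sys_FileType", "PROTOTIPE")

# First problem: "NOT Proc:ID01" versus "NOT Proc:ID01 OR FileType:PDF_TEXT".
only_not = boolean_query(must_not=[proc], top_level=True)
not_or = boolean_query(should=[pdf], must_not=[proc], top_level=True)
set_logic = (ALL_IDS - proc) | pdf  # what textbook boolean logic would give

# Second problem: the purely negative group is nested in the first form.
nested = boolean_query(must=[boolean_query(must_not=[proc, pdf]), proto],
                       top_level=True)
flat = boolean_query(must=[proto], must_not=[proc, pdf], top_level=True)

print(sorted(only_not), sorted(not_or), sorted(set_logic))
print(sorted(nested), sorted(flat))
```

In this model the OR form reads as "match FileType:PDF_TEXT, excluding Proc:ID01", so it returns fewer documents than the bare NOT, and the nested negative group returns nothing on its own, so the enclosing AND returns nothing either. Writing the queries with prefix operators and an explicit positive clause makes the intended grouping visible.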