Date: Mon, 8 Dec 2008 14:06:05 -0500
From: "no spam"
To: java-user@lucene.apache.org
Subject: Re: lucene search options

Yes, I've seen that syntax used to search for null values too. You can do
-(reporter:* AND -reporter:[* to *]) which says all values minus docs with
a value. Your suggestion did the trick, thanks!

On Mon, Dec 8, 2008 at 11:40 AM, Erick Erickson wrote:

> That'll teach me to scan e-mail. You can't use MatchAllDocsQuery
> that way.
> What you're actually searching for is the word "matchalldocsquery"
> in the field "summary". Which returns nothing. Then you're subtracting
> any documents with reporter *mark*. That isn't what you're after at all.
>
> If you're doing this programmatically, you want something in the
> Lucene code like:
>
> BooleanQuery bq = new BooleanQuery();
> bq.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
> bq.add(<query to exclude>, BooleanClause.Occur.MUST_NOT);
>
>
> Now pass bq to the search method. This will require some work on your
> part to detect when it's appropriate and when it's not. But presumably you
> have the ability to know that.
>
> I've seen referenced (but haven't used) something like
> reporter:(* TO *) -reporter:*mark*
>
> WARNING: I've seen this referenced in, I believe, the SOLR mailing
> list. I don't know how it plays in straight Lucene, and I have no idea
> what the gotchas are, nor what version of Lucene supports this syntax
> efficiently. Furthermore, I'm unclear what the behavior for
> a document without the reporter field is...
>
> But I do know that you can't do what your example does...
>
> FWIW
> Erick
>
>
> On Mon, Dec 8, 2008 at 10:24 AM, no spam wrote:
>
> > The reason our users want to do this is because they want to search for
> > instances where certain negative conditions are true. My client is the
> > news industry and this is metadata for things like reporter, type, etc.
> > Sometimes you want -reporter:mark for example and this is the only
> > criterion to search against the index.
> >
> > Am I thinking about this wrong?
> >
> > I did try using the MatchAllDocsQuery class and it expands to something
> > like this:
> >
> > summary:matchalldocsquery -reporter:*mark*
> >
> > but I don't get any results, which is not what I expect for my "does not
> > contain" query above.
> >
> > On Sat, Dec 6, 2008 at 11:06 PM, Anshum wrote:
> >
> > > Hi,
> > >
> > > An easy way to do that would be to index a particular term with all docs
> > > e.g. "dummyword" could be indexed for all documents as a value for a
> > > dummyfield or an existing field.
> > > This way, let's assume you want to fetch results for -filed1:jakarta
> > > You could search for dummyfield:"dummyword" AND NOT filed1:jakarta
> > >
> > > This is just one of the solutions, though I still would not understand if
> > > there's a logical reason for fetching such results. :)
> > >
> > > --
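
The set logic behind the advice in this thread can be sketched in plain Java. This is a toy in-memory model, not Lucene API: the class name, method names, and sample documents are all hypothetical and only illustrate why a prohibited clause needs a positive clause (match-all, or a dummy term indexed on every document) to subtract from.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model (assumption: NOT Lucene itself) of MUST / MUST_NOT set semantics. */
final class NegativeQuerySketch {
    // Hypothetical sample data: doc id -> reporter value; null means no reporter field.
    static final Map<Integer, String> REPORTERS = new LinkedHashMap<>();
    static {
        REPORTERS.put(0, "mark");
        REPORTERS.put(1, "jane");
        REPORTERS.put(2, null);
        REPORTERS.put(3, "markus");
    }

    /** Every doc id; plays the role of a match-all clause or a dummy term on all docs. */
    static Set<Integer> matchAll() {
        return new TreeSet<>(REPORTERS.keySet());
    }

    /** Docs whose reporter contains the substring, in the spirit of reporter:*mark*. */
    static Set<Integer> reporterContains(String needle) {
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<Integer, String> e : REPORTERS.entrySet()) {
            if (e.getValue() != null && e.getValue().contains(needle)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Docs that have any reporter value at all. */
    static Set<Integer> hasReporter() {
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<Integer, String> e : REPORTERS.entrySet()) {
            if (e.getValue() != null) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Positive matches minus prohibited matches. */
    static Set<Integer> mustMinusMustNot(Set<Integer> must, Set<Integer> mustNot) {
        Set<Integer> result = new TreeSet<>(must);
        result.removeAll(mustNot);
        return result;
    }

    public static void main(String[] args) {
        // A purely negative query has nothing to subtract from.
        System.out.println(mustMinusMustNot(new TreeSet<Integer>(), reporterContains("mark"))); // []
        // Match-all as the positive clause: docs that do not contain "mark".
        System.out.println(mustMinusMustNot(matchAll(), reporterContains("mark"))); // [1, 2]
        // Match-all minus "has a value": docs with no reporter at all.
        System.out.println(mustMinusMustNot(matchAll(), hasReporter())); // [2]
    }
}
```

In this model the dummy-term approach and the match-all approach are the same thing: both supply the full set of documents as the positive side of the subtraction.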