From: Paul Elschot
To: lucene-user@jakarta.apache.org
Subject: Re: Searching for doc without a field
Date: Fri, 4 Feb 2005 22:03:02 +0100
Message-Id: <200502042203.02758.paul.elschot@xs4all.nl>

On Friday 04 February 2005
17:29, Bill Tschumy wrote:
>
> On Feb 4, 2005, at 10:19 AM, Bill Tschumy wrote:
>
> >
> > On Feb 3, 2005, at 2:04 PM, Paul Elschot wrote:
> >
> >> On Thursday 03 February 2005 20:18, Bill Tschumy wrote:
> >>> Is there any way to construct a query to locate all documents
> >>> without a specific field?  By this I mean the Document was
> >>> created without ever having that field added to it.
> >>
> >> One way is to add an extra document field containing the field
> >> names of all (other) indexed fields in the document.
> >> Assuming there is always a primary key field the query is then:
> >>
> >> +fieldnames:primarykeyfield -fieldnames:specificfield
> >>
> >> Regards,
> >> Paul Elschot
> >
> > Paul,
> >
> > Thanks for the suggestion, but I need to do this on an existing
> > database as it is.
> >
> > It just occurred to me that I should try a query on the field with a
> > value of NULL.  Don't know if that will work or not.
>
> Nope, using null as a search value just results in a
> NullPointerException.

It's not impossible, but the problem is that the term index is sorted
first by field name, then by term text, then by document number, and
then by term position within the document. That means the index order
is no help for a query on field name and document number: you have to
check all indexed terms in between.

Lucene "only" lets you find whether an indexed field exists, the
indexed terms (field name + term text) in sorted order starting from a
given term, and the indexed documents of a term, possibly combined
with the term positions within each document.

The solution above shortcuts the index path by putting the field name
in place of the term text for a special field.

Regards,
Paul Elschot.
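
To make the sort-order argument concrete, here is a rough sketch in
plain Java. It is only a toy model and does not use the Lucene API or
its file format: the class FieldNamesSketch, its methods, and the
sample documents are all made up for illustration. It keeps postings
in a sorted map keyed by field name and then term text. Finding the
documents that have a field then means walking every term of that
field, while the extra "fieldnames" field turns the same question into
two single-term lookups.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: this is NOT the Lucene API or its file format.
// Every class and method name below is hypothetical.
final class FieldNamesSketch {

    // Key is field name + NUL + term text, so keys sort by field name first,
    // then by term text. Each value holds the sorted document numbers.
    private final TreeMap<String, TreeSet<Integer>> postings = new TreeMap<>();

    private void addTerm(int doc, String field, String text) {
        postings.computeIfAbsent(field + '\0' + text, k -> new TreeSet<>()).add(doc);
    }

    // Index each whitespace-separated token of every field, and also record
    // the field's name as a term of the extra "fieldnames" field.
    void addDocument(int doc, Map<String, String> fields) {
        for (Map.Entry<String, String> e : fields.entrySet()) {
            for (String token : e.getValue().split("\\s+")) {
                addTerm(doc, e.getKey(), token);
            }
            addTerm(doc, "fieldnames", e.getKey());
        }
    }

    // Single-term lookup: the operation the sort order supports directly.
    TreeSet<Integer> docs(String field, String text) {
        return postings.getOrDefault(field + '\0' + text, new TreeSet<>());
    }

    // Without the extra field: visit every term of the field and merge
    // the document numbers found along the way.
    TreeSet<Integer> docsWithFieldByScan(String field) {
        TreeSet<Integer> result = new TreeSet<>();
        for (TreeSet<Integer> d : postings.subMap(field + '\0', field + '\1').values()) {
            result.addAll(d);
        }
        return result;
    }

    // Equivalent of: +fieldnames:required -fieldnames:missing
    List<Integer> docsWithoutField(String required, String missing) {
        List<Integer> result = new ArrayList<>(docs("fieldnames", required));
        result.removeAll(docs("fieldnames", missing));
        return result;
    }

    // Three made-up documents; only document 2 lacks an "author" field.
    static FieldNamesSketch sample() {
        FieldNamesSketch index = new FieldNamesSketch();
        index.addDocument(0, Map.of("id", "a1", "title", "first title", "author", "alice"));
        index.addDocument(1, Map.of("id", "a2", "author", "bob"));
        index.addDocument(2, Map.of("id", "a3", "title", "untitled notes"));
        return index;
    }

    public static void main(String[] args) {
        FieldNamesSketch index = sample();
        System.out.println(index.docsWithFieldByScan("author"));    // prints [0, 1]
        System.out.println(index.docsWithoutField("id", "author")); // prints [2]
    }
}
```

The scan visits one map entry per distinct term of the field, which is
the "check all indexed terms in between" cost; the fieldnames variant
touches only two entries, at the price of re-indexing every document
with the extra field.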