From: "Dragon Fly"
To: java-user@lucene.apache.org
Subject: Re: Empty fields ...
Date: Wed, 19 Jul 2006 10:13:09 -0400

Thank you very much.

>From: "Erick Erickson"
>Reply-To: java-user@lucene.apache.org
>To: java-user@lucene.apache.org
>Subject: Re: Empty fields ...
>Date: Wed, 19 Jul 2006 09:48:04 -0400
>
>Try something like
>
>TermDocs termDocs = reader.termDocs();
>termDocs.seek(new Term("", ""));
>while (termDocs.next()) {
>    bits.set(termDocs.doc());
>}
>
>I *think* (and I'm remembering things folks wrote, haven't done this
>myself) that the empty string for the Term matches all terms. If not, you
>might have to wrap it in an outer loop that loops through all the elements,
>something like
>
>    bits = new BitSet(reader.maxDoc());
>
>    TermDocs termDocs = reader.termDocs();
>    FilteredTermEnum fEnum = new FilteredTermEnum(reader, new Term(field, ""));
>
>    for (Term term = null; (term = fEnum.term()) != null; fEnum.next()) {
>        termDocs.seek(new Term(field, term.text()));
>
>        while (termDocs.next()) {
>            bits.set(termDocs.doc());
>        }
>    }
>
>That said, it may be best for you to loop through each document and add
>that doc to the relevant filters if it has the fields you're interested in.
>You'd only be fetching each document once, so it'd only be one loop. I don't
>know enough about relative efficiencies to make a call here; it probably
>depends upon how many docs you're dealing with. I'd stop at the first
>solution that works with acceptable performance unless you expect your
>corpus to grow significantly. And since this is done in off hours, there's
>no pressing reason to go with the very most efficient solution unless it
>takes too long or you expect to have orders of magnitude more documents in
>your index eventually.
>
>Best
>Erick
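
For readers following the thread, the two strategies described in the quoted reply (walking every term of a field and marking the documents in its postings, versus visiting each document once and updating one bit set per field) can be compared on a toy in-memory index. This is only a minimal sketch built on java.util, not the Lucene API: the class name FieldPresenceSketch, its methods, and the sample documents are all hypothetical. In Lucene itself, FilteredTermEnum is an abstract class, so the term walk would more likely start from IndexReader.terms(Term) and stop once the field name changes; that variant is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FieldPresenceSketch {

    /** Toy inverted index (not Lucene's): field -> term text -> ids of docs containing the term. */
    static Map<String, TreeMap<String, List<Integer>>> invert(List<Map<String, String>> docs) {
        Map<String, TreeMap<String, List<Integer>>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (Map.Entry<String, String> e : docs.get(docId).entrySet()) {
                for (String token : e.getValue().trim().split("\\s+")) {
                    if (token.isEmpty()) continue; // an empty field value contributes no terms
                    index.computeIfAbsent(e.getKey(), f -> new TreeMap<>())
                         .computeIfAbsent(token, t -> new ArrayList<>())
                         .add(docId);
                }
            }
        }
        return index;
    }

    /** Approach 1: walk every term of one field and mark each document in its postings. */
    static BitSet docsWithField(Map<String, TreeMap<String, List<Integer>>> index,
                                String field, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        TreeMap<String, List<Integer>> terms = index.get(field);
        if (terms == null) return bits; // no document has any term in this field
        for (List<Integer> postings : terms.values()) {
            for (int docId : postings) bits.set(docId);
        }
        return bits;
    }

    /** Approach 2: visit each document once and update one BitSet per field of interest. */
    static Map<String, BitSet> docsWithFields(List<Map<String, String>> docs, List<String> fields) {
        Map<String, BitSet> filters = new HashMap<>();
        for (String field : fields) filters.put(field, new BitSet(docs.size()));
        for (int docId = 0; docId < docs.size(); docId++) {
            Map<String, String> doc = docs.get(docId);
            for (String field : fields) {
                String value = doc.get(field);
                if (value != null && !value.trim().isEmpty()) filters.get(field).set(docId);
            }
        }
        return filters;
    }

    /** Builds one hypothetical document from alternating field names and values. */
    static Map<String, String> doc(String... fieldValuePairs) {
        Map<String, String> d = new HashMap<>();
        for (int i = 0; i + 1 < fieldValuePairs.length; i += 2) {
            d.put(fieldValuePairs[i], fieldValuePairs[i + 1]);
        }
        return d;
    }

    /** Three made-up documents: one with an author, one with an empty author, one with none. */
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("title", "lucene filters", "author", "erick"));
        docs.add(doc("title", "empty fields", "author", ""));
        docs.add(doc("title", "bit sets"));
        return docs;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = sampleDocs();
        BitSet viaTerms = docsWithField(invert(docs), "author", docs.size());
        BitSet viaDocs = docsWithFields(docs, Arrays.asList("author", "title")).get("author");
        System.out.println("author via terms: " + viaTerms + ", via documents: " + viaDocs);
    }
}
```

Both approaches should mark the same documents for a given field. The term walk touches only the postings of that field, while the per-document pass fills the bit sets for every field of interest in a single loop, which is the trade-off the reply leaves open.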