From: "Erick Erickson"
To: java-user@lucene.apache.org
Date: Mon, 8 Dec 2008 09:28:16 -0500
Subject: Re: TopDocs - Get all docs?
Message-ID: <359a92830812080628h644e4a5bofd28d7860953b84e@mail.gmail.com>
References: <5d53d5770812051457q76c3c47fw820e48ad7c7520d0@mail.gmail.com>

Is empid indexed? If it is, this should run *much* faster if you use
TermEnum/TermDocs to fetch all the empids...

FWIW
Erick

On Mon, Dec 8, 2008 at 9:17 AM, Donna L Gresh wrote:

> I have a need to get the list of all "empid"s (defined by me) in the index
> so that I can remove the ones that are "stale" by my definition; in this
> snippet I'm returning all the "empids" for later processing, but the core
> is very simple.
>
> public Vector getIndexIds() throws Exception {
>
>     Vector vec = new Vector();
>     IndexReader ireader = IndexReader.open(directoryName);
>     int numdocs = ireader.numDocs();
>     for (int i=0; i<numdocs; i++) {
>         Document doc = ireader.document(i);
>         Field field = doc.getField("empid");
>         if (field==null) {
>             continue;
>         }
>         String contents = field.stringValue();
>         vec.add(contents);
>     }
>     return vec;
> }
>
> Donna L. Gresh
> Business Analytics and Mathematical Sciences
> IBM T.J. Watson Research Center
> (914) 945-2472
> http://www.research.ibm.com/people/g/donnagresh
> gresh@us.ibm.com
>
>
> "Ian Vink" wrote on 12/05/2008 05:57:20 PM:
>
> > Is there an easy way to get all the documents in the index?
> > Kinda like this:
> >
> > TopDocs everything = ???.GetAllDocuments();
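
To illustrate why enumerating terms can beat loading every document, here is a minimal toy model in plain Java. It does not use the Lucene API: TinyIndex, idsByScanningDocuments, idsFromTermDictionary and docsFor are hypothetical names invented for this sketch. It assumes, as the reply does, that empid is indexed as a single untokenized term per document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: plain Java collections, not the Lucene API. */
class TinyIndex {
    // Stored documents, addressed by document number.
    private final List<Map<String, String>> stored = new ArrayList<>();
    // Term dictionary: field -> sorted term text -> numbers of the documents containing it.
    private final Map<String, TreeMap<String, List<Integer>>> terms = new HashMap<>();

    void add(Map<String, String> doc) {
        int docId = stored.size();
        stored.add(new HashMap<>(doc));
        for (Map.Entry<String, String> e : doc.entrySet()) {
            terms.computeIfAbsent(e.getKey(), k -> new TreeMap<>())
                 .computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                 .add(docId);
        }
    }

    /** Slow path: load every stored document and pull out one field. */
    List<String> idsByScanningDocuments(String field) {
        List<String> ids = new ArrayList<>();
        for (Map<String, String> doc : stored) {
            String value = doc.get(field);
            if (value != null) {
                ids.add(value);
            }
        }
        return ids;
    }

    /** Fast path: walk the sorted terms of one field; no document is loaded. */
    List<String> idsFromTermDictionary(String field) {
        TreeMap<String, List<Integer>> fieldTerms = terms.get(field);
        return fieldTerms == null ? new ArrayList<>() : new ArrayList<>(fieldTerms.keySet());
    }

    /** Document numbers that contain one term, e.g. to find a stale id. */
    List<Integer> docsFor(String field, String text) {
        TreeMap<String, List<Integer>> fieldTerms = terms.get(field);
        if (fieldTerms == null || !fieldTerms.containsKey(text)) {
            return new ArrayList<>();
        }
        return new ArrayList<>(fieldTerms.get(text));
    }
}

class EmpIdTermDemo {
    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        for (String id : new String[] {"e2", "e1"}) {
            Map<String, String> doc = new HashMap<>();
            doc.put("empid", id);
            index.add(doc);
        }
        System.out.println(index.idsByScanningDocuments("empid"));
        System.out.println(index.idsFromTermDictionary("empid"));
    }
}
```

The two paths differ in more than speed: the scan returns one value per document in document order, while the term walk returns each distinct value once, in sorted order. In this model the term walk plays the role the reply assigns to TermEnum, and docsFor the role of TermDocs.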