From: Andrzej Bialecki
Date: Fri, 31 Oct 2008 09:06:50 +0100
To: java-user@lucene.apache.org
Subject: Re: Read all the data from an index
Message-ID: <490ABC9A.7080201@getopt.org>
In-Reply-To: <359a92830810301657m42f727peb56b9f8bc7ec219@mail.gmail.com>

Erick Erickson wrote:
> I'm not sure what *could* be easier than looping with IndexSearcher.doc(),
> looping from 1 to maxDoc. Of course you'll have to pay some attention to
> whether you get a document back or not, and I'm not quite sure whether you'd
> have to worry about getting deleted documents. But I don't think either of
> these really count if the index was optimized

Document numbers start at 0. You will never get a document marked
"deleted" from either IndexReader or IndexSearcher.

Why use IndexSearcher and not IndexReader?

    IndexReader reader = IndexReader.open(....);
    for (int i = 0; i < reader.maxDoc(); i++) {
        // skip the numbers of deleted documents
        if (reader.isDeleted(i)) {
            continue;
        }
        Document doc = reader.document(i);
        ...
    }

Hint: if you have an unoptimized index with deleted documents, and you
also want to retrieve the content of these deleted documents, first call
IndexReader.undeleteAll().

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
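
For readers who want to try the loop without an index at hand, here is a rough, dependency-free sketch of the same pattern. ToyReader is a hypothetical in-memory stand-in and is not the Lucene API: only the method names maxDoc(), isDeleted(), document() and undeleteAll() are borrowed from the message above, and the choice to throw on a deleted document is an assumption made for illustration. In a real index, undeleteAll() can presumably only bring back documents whose data has not yet been removed, which this toy does not model.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Sketch of "scan every document number, skip deleted ones". Not Lucene code. */
public class DocScanSketch {

    /** Hypothetical stand-in for an index reader: fixed slots plus a deletion bitmap. */
    public static final class ToyReader {
        private final String[] docs;
        private final BitSet deleted = new BitSet();

        public ToyReader(String... docs) {
            this.docs = docs.clone();
        }

        /** One greater than the largest document number, deleted slots included. */
        public int maxDoc() {
            return docs.length;
        }

        public boolean isDeleted(int n) {
            return deleted.get(n);
        }

        public void delete(int n) {
            deleted.set(n);
        }

        public void undeleteAll() {
            deleted.clear();
        }

        /** Assumption for this toy: asking for a deleted document is an error. */
        public String document(int n) {
            if (isDeleted(n)) {
                throw new IllegalArgumentException("document " + n + " is deleted");
            }
            return docs[n];
        }
    }

    /** Same shape as the loop in the message: numbers run from 0 to maxDoc() - 1. */
    public static List<String> readAll(ToyReader reader) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (reader.isDeleted(i)) {
                continue; // a deleted slot still occupies a number, so skip it
            }
            out.add(reader.document(i));
        }
        return out;
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader("a", "b", "c", "d");
        reader.delete(1);
        System.out.println(readAll(reader)); // prints [a, c, d]

        reader.undeleteAll();
        System.out.println(readAll(reader)); // prints [a, b, c, d]
    }
}
```

Starting the loop at 1, as in the quoted text, would silently miss document 0, and dropping the isDeleted() check would make the toy throw at the first deleted slot.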