From: Michael McCandless
To: java-user@lucene.apache.org
Subject: Re: IndexWriter.deleteDocuments(Query query)
Date: Wed, 1 Apr 2009 14:27:48 -0400
Message-ID: <9ac0c6aa0904011127n35373d62h6ede92d30e4369d2@mail.gmail.com>
In-Reply-To: <8837fb770904011104lb86eaccr7809239f7467d709@mail.gmail.com>

On Wed, Apr 1, 2009 at 2:04 PM, John Wang wrote:
> My test essentially this. I took out the reader.deleteDocuments call from
> both scenarios. I took a index of 5m docs. a batch of 10000 randomly
> generated uids.
>
> Compared the following scenarios:
> 1)
> * open index reader
> * for each uid in the batch, find the corresponding docid and add to an
>   IntList.
> * close reader

How exactly do you find the corresponding docid? TermDocs?

> 2)
> * open index reader
> * load uid array from payload field
> * iterate uid array, and check to see if uid is in deleted set, and add to
>   an IntList

In this case, each doc has a dedicated field that only has a payload
that stores the one uid for that doc?

But I'm confused how you then map from uid -> docID. I must be missing
something.

> The datastructure holding deleted set is IntOpenHashSet from fastutil.
>
> 1) took about 3500 - 4500 ms
> 2) took about 815 ms

Mike
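
For readers following the thread, here is one possible reading of scenario 2 as quoted above, written as a minimal sketch. The thread does not say how the uid array is laid out. The sketch assumes it is indexed by docID (so uidByDocId[docId] holds that document's uid), which would make the array position itself the uid -> docID mapping. The class and method names are hypothetical, the Lucene payload loading is left out, and plain JDK collections stand in for the fastutil IntOpenHashSet and IntList mentioned in the quote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not code from the thread.
class UidScan {

    /**
     * Scans a uid array that is ASSUMED to be indexed by docID and
     * returns every docID whose uid appears in the batch.
     */
    public static List<Integer> matchingDocIds(long[] uidByDocId, Set<Long> batch) {
        List<Integer> hits = new ArrayList<>();
        for (int docId = 0; docId < uidByDocId.length; docId++) {
            // One hash lookup per document; no per-uid term lookup is needed.
            if (batch.contains(uidByDocId[docId])) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up data: docIDs 0..3 carry uids 1001..1004.
        long[] uidByDocId = {1001L, 1002L, 1003L, 1004L};
        Set<Long> batch = new HashSet<>(Arrays.asList(1002L, 1004L, 9999L));
        System.out.println(matchingDocIds(uidByDocId, batch)); // prints [1, 3]
    }
}
```

Under that assumption the scan is a single linear pass over all documents with one set lookup each, whereas scenario 1 performs a separate lookup per uid in the batch.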