From: Toke Eskildsen
Reply-to: te@statsbiblioteket.dk
Organization: Statsbiblioteket
Date: Fri, 07 Mar 2008 15:17:29 +0100
To: java-user@lucene.apache.org
Subject: RE: Swapping between indexes

On Thu, 2008-03-06 at 18:40 +0100, spring@gmx.eu wrote:
> > > With a commit after every add: 30 min.
> > > With a commit after 100 add: 23 min.
> > > Only one commit: 20 min.
[...]
> I think it is a real world scenario because one has always the read the docs
> from somewhere and offen has to store the index state somewhere else.

Very true, but the time it takes to create the documents varies greatly
between systems. I tried repeating your test by creating a simple 14 MB
index with 10,000 documents on my desktop machine. Each document was
made up of
- one non-tokenized, unique, stored, indexed field
- one non-tokenized, indexed, stored field with one of 9 terms
- one tokenized field with 930 random characters, including spaces

With a commit after every add: 4 min, 46 sec.
With a commit after every 100 adds: 12 sec.
Only one commit: 8 sec.

Guesstimating the amortized time spent on adding each document to such a
small corpus, while blatantly ignoring the overhead of creating the
documents, gives us the following:

With a commit after every add: (286 sec / 10,000 docs) 28.6 ms.
With a commit after every 100 adds: (12 sec / 10,000 docs) 1.2 ms.
Only one commit: (8 sec / 10,000 docs) 0.8 ms.
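
For anyone who wants to redo the per-document arithmetic with their own timings, here is a minimal sketch in plain Java. It does not touch Lucene at all; it only divides the measured wall-clock time by the number of documents. The class name AmortizedCommitCost and the method millisPerDoc are made up for this illustration and are not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
class AmortizedCommitCost {

    /** Average time per document in milliseconds for a run that took totalSeconds. */
    static double millisPerDoc(double totalSeconds, int docCount) {
        if (docCount <= 0) {
            throw new IllegalArgumentException("docCount must be positive");
        }
        return totalSeconds * 1000.0 / docCount;
    }

    public static void main(String[] args) {
        int docs = 10_000; // corpus size used in the test above

        // Measured totals in seconds, taken from the three runs above.
        Map<String, Integer> runs = new LinkedHashMap<>();
        runs.put("commit after every add", 4 * 60 + 46); // 4 min, 46 sec = 286 sec
        runs.put("commit after every 100 adds", 12);
        runs.put("only one commit", 8);

        // Print the amortized cost per document for each commit strategy.
        for (Map.Entry<String, Integer> run : runs.entrySet()) {
            System.out.printf("%-28s %5.1f ms/doc%n",
                    run.getKey(), millisPerDoc(run.getValue(), docs));
        }
    }
}
```

Note that these figures include everything that happened during the run, so they are only an upper bound on the cost of the commits themselves.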