From: "Michael McCandless"
To: java-dev@lucene.apache.org
Subject: Re: Post mortem kudos for (LUCENE-843) :)
Date: Tue, 17 Jul 2007 18:52:23 -0400

"Peter Keegan" wrote:

> I did some performance comparison testing of Lucene 2.0 vs. trunk (with
> LUCENE-843). I'm seeing at least a 4X increase in indexing rate with the new
> DocumentsWriter in LUCENE-843 (still doing single-threaded indexing). Better
> yet, the total time to build the index is much shorter because I can now
> build the entire 3GB index (900K docs) in one segment in RAM (using
> FSDirectory) and flush it to disk at the end. Before, I had to build smaller
> segments (20K docs), merge after 20 segments and then optimize at the end.

Awesome :)

> The memory usage with LUCENE-843 is much lower, presumably because stored
> fields and term vectors no longer sit in RAM.

Right, not buffering the stored fields & term vectors in RAM is a big
win. In addition, storing the postings in RAM as a single shared hash
table backed by a pool of large byte[] arrays, rather than in separate
1 KB buffers for the files of each single-document segment, also
improves RAM efficiency. In my tests, using Europarl content with
small docs (~100 terms = ~550 bytes per doc) and with stored fields &
term vectors enabled, the RAM efficiency is 44X better than before.

> I also observed a 20-25% gain by reusing the Field objects. Implementing my
> own Fieldable class was too complicated, so I simply extended the Field
> class (after removing final) and added 2 setter methods:
>
>   public void setValue(String value) {
>     this.fieldsData = value;
>   }
>   public void setValue(byte[] value) {
>     this.fieldsData = value;
>   }
>
> Since this improved performance significantly, I would vote to either add
> setters to Field or make it extendable.
OK I've opened LUCENE-963 for this & attached a patch.

> Kudos to Mike for this huge improvement!

Thanks!

Mike
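
For readers following the thread, here is a minimal sketch of the field-reuse pattern Peter describes. It does not use Lucene: ReusableField is a hypothetical stand-in for Field, and FieldReuseDemo and indexAll are names invented for this illustration. Only the two setValue signatures and the fieldsData name come from the quoted message. The idea is to allocate one field object up front and overwrite its value for each document, instead of creating a new object per document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Lucene's Field class; NOT the real API.
class ReusableField {
    private final String name;
    private Object fieldsData; // holds either a String or a byte[] value

    ReusableField(String name, String value) {
        this.name = name;
        this.fieldsData = value;
    }

    String name() {
        return name;
    }

    // Setters shaped like the two proposed in the quoted message.
    void setValue(String value) {
        this.fieldsData = value;
    }

    void setValue(byte[] value) {
        this.fieldsData = value;
    }

    // Returns the value if it is currently a String, otherwise null.
    String stringValue() {
        return fieldsData instanceof String ? (String) fieldsData : null;
    }
}

class FieldReuseDemo {
    // Simulates an indexing loop: one field instance, one value swap per document.
    static List<String> indexAll(List<String> bodies) {
        List<String> indexed = new ArrayList<>();
        ReusableField body = new ReusableField("body", "");
        for (String text : bodies) {
            body.setValue(text); // reuse the same object, replace only the value
            // Stand-in for handing the document to an indexer, which must
            // consume the value before the next setValue call.
            indexed.add(body.name() + ":" + body.stringValue());
        }
        return indexed;
    }

    public static void main(String[] args) {
        System.out.println(indexAll(Arrays.asList("first doc", "second doc")));
    }
}
```

The sketch only shows the shape of the loop. Reuse like this is safe only when the consumer has finished with the current value before the next setValue call, and any speedup would depend on the real indexer, not on this toy.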