Date: Thu, 9 Jun 2005 21:58:14 -0700 (PDT)
From: Chris Collins
Subject: Re: Optimizing indexes with mulitiple processors?
To: java-user@lucene.apache.org, Bill Au

To follow up: I was surprised by what I found in an experiment indexing 4k
documents to local disk (Dell PE with onboard RAID with 256MB cache). I got
the following data from my profile:

70% of the time was spent inverting the document
30% in merge

OK, that part isn't surprising. However, only about 1% of the 30% spent in
merge was in the OS.flush call (not very IO bound at all with this
controller). And almost all of the invert time was in the StandardAnalyzer,
pegged in the JavaCC-generated code.

The profile was based on duration, not CPU time. The profiler was JProbe. I
was using a lower case analyzer, and this was a slightly hacked lucene-1.4.3
source line in which I had swapped out some of the synchronized data
structures (Hashtable -> HashMap, Vector -> ArrayList).

--- Chris Collins wrote:

> I found that with a fast RAID controller I can easily be CPU bound; some of
> the IO is related to latency. You can hide the latency by having overlapping
> IO (you get that with multiple indexers going on at the same time).
>
> I think there is possibly more horsepower to be had from the inverter and
> merge aspects of the indexing. I am JProbing this at the moment.
>
> If you're using high latency disks (such as a filer) during merge, you may
> want to consider increasing the size of the buffers to reduce the number of
> RPCs to the filer. However, my previous attempts to change this failed.
>
> C
>
> --- Bill Au wrote:
>
> > Optimize is disk I/O bound. So I am not sure what multiple CPUs will buy
> > you.
> >
> > Bill
> >
> > On 6/9/05, Kevin Burton wrote:
> > > Is it possible to get Lucene to do an index optimize on multiple
> > > processors?
> > >
> > > It's a single-threaded algorithm currently, right?
> > >
> > > It's a shame since I have a quad machine but I'm only using 1/4th of the
> > > capacity. That's a heck of a performance hit.
> > >
> > > Kevin
> > >
> > > --
> > >
> > > Use Rojo (RSS/Atom aggregator)! - visit http://rojo.com.
> > > See irc.freenode.net #rojo if you want to chat.
> > >
> > > Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
> > >
> > > Kevin A. Burton, Location - San Francisco, CA
> > > AIM/YIM - sfburtonator, Web - http://peerfear.org/
> > > GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
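
The point made earlier in the thread about running multiple indexers at the same time and then merging can be sketched without any Lucene code. What follows is a toy illustration only, under loud assumptions: `ParallelInvertSketch`, `invert`, `merge`, and `index` are hypothetical names, the "index" is a plain in-memory map rather than a Lucene segment, and the tokenizer is a simple lower-casing split rather than StandardAnalyzer. It only shows the shape of the idea: each worker inverts its own slice of documents into an unsynchronized HashMap that no other thread touches, and a single thread merges the partial results afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Toy stand-in for "several indexers in parallel, then one merge". Not Lucene code. */
public class ParallelInvertSketch {

    /** Inverts one slice of documents into term -> sorted document ids. */
    public static Map<String, TreeSet<Integer>> invert(List<String> docs, int firstDocId) {
        // An unsynchronized HashMap is fine here: each worker owns its own map.
        Map<String, TreeSet<Integer>> postings = new HashMap<>();
        for (int i = 0; i < docs.size(); i++) {
            // Assumption: a crude lower-casing tokenizer, not StandardAnalyzer.
            for (String token : docs.get(i).toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    postings.computeIfAbsent(token, t -> new TreeSet<>()).add(firstDocId + i);
                }
            }
        }
        return postings;
    }

    /** Merges the partial indexes on the calling thread (the single-threaded step). */
    public static Map<String, TreeSet<Integer>> merge(List<Map<String, TreeSet<Integer>>> parts) {
        Map<String, TreeSet<Integer>> merged = new HashMap<>();
        for (Map<String, TreeSet<Integer>> part : parts) {
            part.forEach((term, ids) ->
                merged.computeIfAbsent(term, t -> new TreeSet<>()).addAll(ids));
        }
        return merged;
    }

    /** Splits docs into contiguous slices, inverts them in parallel, then merges. */
    public static Map<String, TreeSet<Integer>> index(List<String> docs, int workers)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int sliceSize = (docs.size() + workers - 1) / workers; // ceiling division
            List<Future<Map<String, TreeSet<Integer>>>> futures = new ArrayList<>();
            for (int start = 0; start < docs.size(); start += sliceSize) {
                final int from = start;
                final int to = Math.min(start + sliceSize, docs.size());
                futures.add(pool.submit(() -> invert(docs.subList(from, to), from)));
            }
            List<Map<String, TreeSet<Integer>>> parts = new ArrayList<>();
            for (Future<Map<String, TreeSet<Integer>>> f : futures) {
                parts.add(f.get()); // wait for each worker to finish
            }
            return merge(parts);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = Arrays.asList(
            "Optimize is disk bound", "Merge is CPU bound", "Invert the document");
        System.out.println(index(docs, 2).get("bound")); // prints [0, 1]
    }
}
```

In this sketch only the inversion step is spread across threads; the merge still runs on one thread, which mirrors the question that started the thread. Whether a real indexer gains anything from this depends on where the time actually goes, which is what the profile above was trying to establish.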