Return-Path:
Delivered-To: apmail-lucene-java-dev-archive@www.apache.org
Received: (qmail 54915 invoked from network); 10 Jun 2005 05:00:22 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (209.237.227.199) by minotaur.apache.org with SMTP; 10 Jun 2005 05:00:22 -0000
Received: (qmail 75919 invoked by uid 500); 10 Jun 2005 05:00:19 -0000
Delivered-To: apmail-lucene-java-dev-archive@lucene.apache.org
Received: (qmail 75861 invoked by uid 500); 10 Jun 2005 05:00:18 -0000
Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: java-dev@lucene.apache.org
Delivered-To: mailing list java-dev@lucene.apache.org
Received: (qmail 75830 invoked by uid 99); 10 Jun 2005 05:00:18 -0000
X-ASF-Spam-Status: No, hits=1.8 required=10.0 tests=DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_POST
X-Spam-Check-By: apache.org
Received-SPF: pass (hermes.apache.org: local policy)
Received: from web50609.mail.yahoo.com (HELO web50609.mail.yahoo.com) (206.190.38.248) by apache.org (qpsmtpd/0.28) with SMTP; Thu, 09 Jun 2005 22:00:16 -0700
Received: (qmail 34775 invoked by uid 60001); 10 Jun 2005 05:00:03 -0000
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:Received:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding;
  b=0UVcgRkkYcKDOaTw5qtY3bNPfTBDYzEzIGV9psnQBMFSh+vnFp8bbbIkQhN6Cc/mBOwD4RQuvFU0nESUC7tt8XaiOqKjXLyZLNMNW1wjJbA3hM/VJ0BARAQeQSK9uxe13UuYtQz7yn4aGR6tfr7+1GAnbuBzUnyFtNYIX1qPOhY= ;
Message-ID: <20050610050003.34773.qmail@web50609.mail.yahoo.com>
Received: from [67.188.144.134] by web50609.mail.yahoo.com via HTTP; Thu, 09 Jun 2005 22:00:03 PDT
Date: Thu, 9 Jun 2005 22:00:03 -0700 (PDT)
From: Chris Collins
Subject: Fwd: Re: Optimizing indexes with mulitiple processors?
To: java-dev@lucene.apache.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="0-1566655556-1118379603=:29894"
Content-Transfer-Encoding: 8bit
X-Virus-Checked: Checked
X-Spam-Rating: minotaur.apache.org 1.6.2 0/1000/N

--0-1566655556-1118379603=:29894
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Content-Id:
Content-Disposition: inline

Forwarding to the dev list as I don't know whether this is useful data. Tell me
to shut up if it isn't.

Chris

Note: forwarded message attached.

--0-1566655556-1118379603=:29894
Content-Type: message/rfc822
Content-Transfer-Encoding: 8bit

X-Apparently-To: chris_j_collins@yahoo.com via 206.190.38.91; Thu, 09 Jun 2005 21:58:53 -0700
X-Originating-IP: [209.237.227.199]
Return-Path:
Authentication-Results: mta137.mail.scd.yahoo.com from=yahoo.com; domainkeys=fail (bad sig)
Received: from 209.237.227.199 (HELO mail.apache.org) (209.237.227.199) by mta137.mail.scd.yahoo.com with SMTP; Thu, 09 Jun 2005 21:58:53 -0700
Received: (qmail 73074 invoked by uid 500); 10 Jun 2005 04:58:38 -0000
Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: java-user@lucene.apache.org
Delivered-To: mailing list java-user@lucene.apache.org
Received: (qmail 73060 invoked by uid 99); 10 Jun 2005 04:58:38 -0000
X-ASF-Spam-Status: No, hits=1.8 required=10.0 tests=DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_POST
X-Spam-Check-By: apache.org
Received-SPF: pass (hermes.apache.org: local policy)
Received: from web50608.mail.yahoo.com (HELO web50608.mail.yahoo.com) (206.190.38.95) by apache.org (qpsmtpd/0.28) with SMTP; Thu, 09 Jun 2005 21:58:36 -0700
Received: (qmail 92272 invoked by uid 60001); 10 Jun 2005 04:58:14 -0000
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:Received:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding;
  b=lPcBV6JEMtxU+40flSR81Na1TInSLflTQk3TNfG94TZiH1ZJnSkaxfvinKCtDfca72XUtALnktqNajq3ESdrbzbRwfJduI4E+lW0pFs+22oGKMUq0s7AwoIo8KeIZmMQcW4sR2zpyqgOVKS+UQXNeF18j0yT5jIiQT2biFDci6U= ;
Received: from [67.188.144.134] by web50608.mail.yahoo.com via HTTP; Thu, 09 Jun 2005 21:58:14 PDT
Date: Thu, 9 Jun 2005 21:58:14 -0700 (PDT)
From: Chris Collins
Subject: Re: Optimizing indexes with mulitiple processors?
To: java-user@lucene.apache.org, Bill Au
In-Reply-To: <20050610005238.82891.qmail@web50601.mail.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Virus-Checked: Checked
Content-Length: 1455

To follow up: I was surprised by the results of an experiment indexing 4k
documents to local disk (Dell PE with onboard RAID with 256MB cache). I got the
following data from my profile:

70% of the time was spent inverting the documents
30% was spent in merge

OK, that part isn't surprising. However, only about 1% of the 30% spent in
merge went to the OS.flush call (not very IO bound at all with this
controller), and almost all of the invert time was in the StandardAnalyzer,
pegged in the JavaCC-generated code.

The profile was based on duration, not CPU time. The profiler was JProbe. I was
using a lower case analyzer, and this was a slightly hacked lucene-1.4.3 source
line in which I swapped out some of the synchronized data structures
(Hashtable -> HashMap, Vector -> ArrayList).

<>

--- Chris Collins wrote:

> I found that with a fast RAID controller I can easily be CPU bound; some of
> the IO time is due to latency. You can hide the latency by overlapping IO
> (you get that with multiple indexers running at the same time).
>
> I think there could be more horsepower to get out of the inverter and merge
> aspects of the indexing. I am profiling this with JProbe at the moment.
>
> If you're using high-latency disks (such as a filer) during merge, you may
> want to consider increasing the size of the buffers to reduce the number of
> RPCs to the filer. However, my previous attempts to change this failed.
>
> C
>
> --- Bill Au wrote:
>
> > Optimize is disk I/O bound. So I am not sure what multiple CPUs will buy
> > you.
> >
> > Bill
> >
> > On 6/9/05, Kevin Burton wrote:
> > > Is it possible to get Lucene to do an index optimize on multiple
> > > processors?
> > >
> > > It's a single-threaded algorithm currently, right?
> > >
> > > It's a shame since I have a quad machine but I'm only using 1/4th of the
> > > capacity. That's a heck of a performance hit.
> > >
> > > Kevin
> > >
> > > --
> > >
> > >
> > > Use Rojo (RSS/Atom aggregator)! - visit http://rojo.com.
> > > See irc.freenode.net #rojo if you want to chat.
> > >
> > > Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
> > >
> > > Kevin A. Burton, Location - San Francisco, CA
> > > AIM/YIM - sfburtonator, Web - http://peerfear.org/
> > > GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> >
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

--0-1566655556-1118379603=:29894
Content-Type: text/plain; charset=us-ascii

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

--0-1566655556-1118379603=:29894--
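
For readers of the archive, the "multiple indexers going on at the same time" idea from the thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not Lucene code and not the setup that was profiled: `PartitionedInverter`, `invert` and `index` are hypothetical names, a simple lower-casing regex split stands in for the analyzer, and in-memory maps stand in for index segments. Each worker inverts its own slice of documents into a thread-confined `HashMap`/`ArrayList` (so no synchronized collections are needed), and a single-threaded merge combines the partial results afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: none of these names come from Lucene.
final class PartitionedInverter {

    // Invert one slice of documents into term -> sorted list of doc ids.
    // The map is only touched by the calling thread, so unsynchronized
    // HashMap/ArrayList are safe here.
    static Map<String, List<Integer>> invert(List<String> docs, int firstDocId) {
        Map<String, List<Integer>> postings = new HashMap<>();
        for (int i = 0; i < docs.size(); i++) {
            int docId = firstDocId + i;
            // Stand-in for an analyzer: lower-case, split on non-word characters.
            for (String token : docs.get(i).toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.isEmpty()) {
                    continue;
                }
                List<Integer> ids = postings.computeIfAbsent(token, t -> new ArrayList<>());
                // Record each document at most once per term.
                if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
                    ids.add(docId);
                }
            }
        }
        return postings;
    }

    // Run the invert step on several workers, then merge on the calling thread.
    static Map<String, List<Integer>> index(List<String> docs, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int sliceSize = (docs.size() + workers - 1) / workers;
            List<Future<Map<String, List<Integer>>>> partials = new ArrayList<>();
            for (int start = 0; start < docs.size(); start += sliceSize) {
                final int from = start;
                final int to = Math.min(start + sliceSize, docs.size());
                Callable<Map<String, List<Integer>>> task =
                        () -> invert(docs.subList(from, to), from);
                partials.add(pool.submit(task));
            }
            // Merge step: slices are visited in document order, so each
            // posting list stays sorted by doc id.
            Map<String, List<Integer>> merged = new TreeMap<>();
            for (Future<Map<String, List<Integer>>> partial : partials) {
                for (Map.Entry<String, List<Integer>> e : partial.get().entrySet()) {
                    merged.computeIfAbsent(e.getKey(), t -> new ArrayList<>())
                          .addAll(e.getValue());
                }
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("indexing task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "Lucene merges segments",
                "segments are merged on optimize",
                "optimize Lucene indexes");
        System.out.println(index(docs, 4));
    }
}
```

In this sketch only the invert step runs in parallel, and the merge is still single threaded. Going by the 70/30 split reported in the forwarded message, that is where most of the time went in that particular experiment, but whether the same holds for a real Lucene index on other hardware is an open question.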