Date: Tue, 22 Nov 2005 16:33:27 +0200
From: Oren Shir
Reply-To: shoren@alum.cs.huji.ac.il
To: java-user@lucene.apache.org
Subject: Re: Throughput doesn't increase when using more
concurrent threads

Hi,

There are two synchronization points: one on the stream and one on the
reader. Using separate FSDirectory and IndexReader instances should solve
this. I'll let you know once I code it.

Right now I'm checking whether making my Documents store less data moves
the bottleneck somewhere else.

Thanks again,
Oren Shir

On 11/21/05, Doug Cutting wrote:
>
> Jay Booth wrote:
> > I had a similar problem with threading, the problem turned out to be that in
> > the back end of the FSDirectory class I believe it was, there was a
> > synchronized block on the actual RandomAccessFile resource when reading a
> > block of data from it... high-concurrency situations caused threads to stack
> > up in front of this synchronized block and our CPU time wound up being spent
> > thrashing between blocked threads instead of doing anything useful.
>
> This is correct. In Lucene, multiple streams per file are created by
> cloning, and all clones of an FSDirectory input stream share a
> RandomAccessFile and must synchronize input from it. MmapDirectory does
> not have this limitation. If your indexes are less than a few GB or you
> are using 64-bit hardware, then MmapDirectory should work well for you.
> Otherwise it would be simple to write an nio-based Directory that does
> not use mmap that is also unsynchronized. Such a contribution would be
> welcome.
>
> > Making multiple IndexSearchers and FSDirectories didn't help because in the
> > back end, lucene consults a singleton HashMap of some kind (don't remember
> > implementation) that maintained a single FSDirectory for any given index
> > being accessed from the JVM... multiple calls to FSDirectory.getDirectory
> > actually return the same FSDirectory object with synchronization at the same
> > point.
>
> This does not make sense to me. FSDirectory does keep a cache of
> FSDirectory instances, but i/o should not be synchronized on these. One
> should be able to open multiple input streams on the same file from an
> FSDirectory. But this would not be a great solution, since file handle
> limits would soon become a problem.
>
> Doug
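
The contention described in the quoted text, and the nio-based alternative it suggests, can be sketched in plain Java without Lucene. This is only a rough illustration under stated assumptions: the class and method names (SharedFileReads, readSynchronized, readPositional) are hypothetical and are not Lucene APIs, and the sketch does not show how Lucene's FSDirectory is actually written. It relies only on the documented behaviour of java.nio.channels.FileChannel, whose positional read takes an explicit offset and leaves the channel's own position untouched.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration only; nothing here is part of Lucene.
final class SharedFileReads {

    // Pattern described in the thread: all readers share one RandomAccessFile.
    // seek() followed by read() is a two-step operation on shared state,
    // so callers have to serialize on the file object.
    static byte[] readSynchronized(RandomAccessFile file, long pos, int len) throws IOException {
        byte[] out = new byte[len];
        synchronized (file) {
            file.seek(pos);
            file.readFully(out);
        }
        return out;
    }

    // nio alternative: the offset is passed with each call and the channel's
    // own position is not modified, so no external lock is taken here.
    static byte[] readPositional(FileChannel channel, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            // buf.position() is the number of bytes read so far.
            int n = channel.read(buf, pos + buf.position());
            if (n < 0) {
                throw new EOFException("read past end of file");
            }
        }
        return buf.array();
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("shared-reads", ".bin");
        try {
            byte[] data = new byte[1024];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(tmp, data);
            try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r");
                 FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                byte[] a = readSynchronized(raf, 100, 16);
                byte[] b = readPositional(ch, 100, 16);
                System.out.println("same bytes: " + Arrays.equals(a, b));
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Whether positional reads actually run in parallel depends on the JVM and operating system, so this needs measuring under the real query load before drawing conclusions about throughput.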