From: "David Johnson"
To: dev@jackrabbit.apache.org, tobias.bocanegra@day.com
Date: Tue, 13 Mar 2007 14:05:11 -0700
Subject: Re: Threading and Query Performance

So if I increase the # of threads to a worst case scenario - 100 threads
running 1 query each - I would expect to see the worst case in
synchronization and scheduling overhead - i.e., 100 threads would run
significantly slower than the 25 thread run. The 100 thread run took
37070 ms. That is not far off from the 25 thread run.

Only a 10% speed increase between the 1 thread and 2 thread runs seems
off (running on a 4 core box) - unless, as you mention, the searching is
hitting a synchronization issue. I am no threading expert by any means;
it just seems off.

Again, there are no writes happening to this repository during these
tests - it is a read only repository.

-Dave

On 3/13/07, Tobias Bocanegra wrote:
>
> well, from a first glance: the more threads you add, the faster the
> queries are until you reach the number of processors.
> if you expect linear improvement - that does not work, since the
> actual 'searching' is synchronized (marcel, correct me if i'm saying
> something wrong here).
> when you add more and more threads, the synchronization and scheduling
> overhead gets bigger and you lose overall speed again.
>
> what you must ask yourself: is '~350ms' per query fast enough for my
> application? what do my queries look like? can i optimize the query
> or the data structure?
>
> regards, toby
>
> On 3/12/07, David Johnson wrote:
> > This is related to two ongoing list threads - one on synchronization
> > and the other on query performance.
> >
> > As I have mentioned in previous posts, I have been running a variety
> > of query tests. I am using a suite of 100 queries and running them
> > against Jackrabbit in several different threading scenarios - i.e., I
> > change the # of threads used to run sub-sets of the 100 queries. To
> > be clear - if I run a single thread case, it will run all 100
> > queries, one after the other. If I run 2 threads - one thread will
> > run 50 queries, while the other thread will run the other 50 queries.
> > In all cases, the 100 queries are the same; the only thing that
> > changes is the number of threads used to run them. Also, in all
> > tests, the repository is read only - nothing is making any writes to
> > the repository.
> >
> > Here are some results:
> >
> > 1 thread: 100 queries in 41139 ms
> > 2 threads: 50 queries in 37828 ms, 50 queries in 38622 ms - total
> > time for all threads to complete 38960 ms
> > 4 threads: 25 queries in 25895 ms, 25 queries in 28034 ms, 25 queries
> > in 32335 ms, 25 queries in 32391 ms - total time 32801 ms
> > 10 threads: 10 queries in 18733 ms, 10 queries in 19894 ms, ... , 10
> > queries in 33798 ms, 10 queries in 34924 ms - total time 35286 ms
> > 25 threads: 4 queries in 2413 ms, 4 queries in 11725 ms, 4 queries in
> > 18294 ms, ... , 4 queries in 36059 ms, 4 queries in 36222 ms
> >
> > Some details on the box that I am running these tests on: it is a
> > dual Xeon running Linux - /proc/cpuinfo shows 4 processors, so I am
> > assuming it is a dual core. I am running Jackrabbit 1.2.3 with the
> > Bundle Persistence Manager.
> >
> > I am not sure what the numbers above are really saying, although they
> > don't really look right :-) We have a multi-user use case - a large
> > web site with many ongoing reads and occasional writes. I am using
> > the multiple threads to "test" multiple users. I am hoping that the
> > developers with more understanding of the internals can explain
> > what's going on above.
> >
> > I am wondering if I am hitting the synchronization issue that is
> > being discussed in other posts? Thoughts?
> >
> > -Dave
> >
>
> --
> -----------------------------------------< tobias.bocanegra@day.com >---
> Tobias Bocanegra, Day Management AG, Barfuesserplatz 6, CH - 4001 Basel
> T +41 61 226 98 98, F +41 61 226 98 97
> -----------------------------------------------< http://www.day.com >---
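
For reference, the test setup described in the quoted message (a fixed
set of queries split evenly across N threads, timing each thread and the
whole run) could be sketched roughly as below. This is only a minimal
sketch, not the harness actually used for the numbers in this thread:
the class name QueryBenchmark, the WorkerResult type and the partition
and run methods are all hypothetical, and each Runnable is a placeholder
standing in for one query against the read only repository.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical harness: splits a fixed list of tasks across N threads
// and reports how long each thread and the whole run took.
public class QueryBenchmark {

    /** Timing for one worker thread: how many tasks it ran and how long it took. */
    public static final class WorkerResult {
        public final int taskCount;
        public final long elapsedMs;

        WorkerResult(int taskCount, long elapsedMs) {
            this.taskCount = taskCount;
            this.elapsedMs = elapsedMs;
        }
    }

    /** Splits tasks into threadCount contiguous slices whose sizes differ by at most one. */
    public static List<List<Runnable>> partition(List<Runnable> tasks, int threadCount) {
        List<List<Runnable>> slices = new ArrayList<>();
        int base = tasks.size() / threadCount;
        int extra = tasks.size() % threadCount;
        int start = 0;
        for (int i = 0; i < threadCount; i++) {
            int size = base + (i < extra ? 1 : 0);
            slices.add(new ArrayList<>(tasks.subList(start, start + size)));
            start += size;
        }
        return slices;
    }

    /** Runs each slice on its own thread and returns one timing per thread. */
    public static List<WorkerResult> run(List<Runnable> tasks, int threadCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Callable<WorkerResult>> workers = new ArrayList<>();
            for (List<Runnable> slice : partition(tasks, threadCount)) {
                workers.add(() -> {
                    long t0 = System.nanoTime();
                    for (Runnable task : slice) {
                        task.run();
                    }
                    long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
                    return new WorkerResult(slice.size(), ms);
                });
            }
            List<WorkerResult> results = new ArrayList<>();
            // invokeAll blocks until every worker has finished.
            for (Future<WorkerResult> f : pool.invokeAll(workers)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            tasks.add(() -> { /* placeholder: execute one query here */ });
        }
        for (int threads : new int[] {1, 2, 4, 10, 25, 100}) {
            long t0 = System.nanoTime();
            List<WorkerResult> results = run(tasks, threads);
            long totalMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
            System.out.println(threads + " threads: total time " + totalMs + " ms");
            for (WorkerResult r : results) {
                System.out.println("  " + r.taskCount + " queries in " + r.elapsedMs + " ms");
            }
        }
    }
}
```

With placeholder tasks the timings are meaningless; the point is only
the shape of the measurement. The total time is wall-clock time for all
threads to complete, so it will be at least as long as the slowest
thread, which matches how the figures above are reported.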