Return-Path: Delivered-To: apmail-db-derby-user-archive@www.apache.org Received: (qmail 92843 invoked from network); 19 Jun 2008 15:39:54 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 19 Jun 2008 15:39:54 -0000 Received: (qmail 76812 invoked by uid 500); 19 Jun 2008 15:39:44 -0000 Delivered-To: apmail-db-derby-user-archive@db.apache.org Received: (qmail 76789 invoked by uid 500); 19 Jun 2008 15:39:44 -0000 Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: List-Id: Reply-To: "Derby Discussion" Delivered-To: mailing list derby-user@db.apache.org Received: (qmail 76774 invoked by uid 99); 19 Jun 2008 15:39:44 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 19 Jun 2008 08:39:44 -0700 X-ASF-Spam-Status: No, hits=-1.0 required=10.0 tests=RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: local policy) Received: from [192.18.6.21] (HELO gmp-eb-inf-1.sun.com) (192.18.6.21) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 19 Jun 2008 15:38:51 +0000 Received: from fe-emea-10.sun.com (gmp-eb-lb-2-fe3.eu.sun.com [192.18.6.12]) by gmp-eb-inf-1.sun.com (8.13.7+Sun/8.12.9) with ESMTP id m5JFcrF2000331 for ; Thu, 19 Jun 2008 15:38:53 GMT Received: from conversion-daemon.fe-emea-10.sun.com by fe-emea-10.sun.com (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) id <0K2P00I01UJLPN00@fe-emea-10.sun.com> (original mail from Kristian.Waagan@Sun.COM) for derby-user@db.apache.org; Thu, 19 Jun 2008 16:38:53 +0100 (BST) Received: from [129.159.112.237] by fe-emea-10.sun.com (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) with ESMTPSA id <0K2P00MWZVGIH990@fe-emea-10.sun.com> for derby-user@db.apache.org; Thu, 19 Jun 2008 16:38:42 +0100 (BST) Date: Thu, 19 Jun 2008 17:38:41 +0200 From: Kristian Waagan Subject: Re: Speed of using Derby DB In-reply-to: <1c6e34b0806190733t64366d1eufcb10290f3c22f1d@mail.gmail.com> Sender: Kristian.Waagan@Sun.COM To: Derby Discussion Message-id: <485A7D81.2010800@Sun.com> Organization: Sun Microsystems Inc. MIME-version: 1.0 Content-type: text/plain; format=flowed; charset=UTF-8 Content-transfer-encoding: 7BIT References: <1c6e34b0806110131q12339641j841b8c5861f0085a@mail.gmail.com> <48506269.3010107@sbcglobal.net> <1c6e34b0806160319kc7d46c7q63a8666ccfdd96f@mail.gmail.com> <1c6e34b0806180433w38f1660fp9ac46dc44accbe9@mail.gmail.com> <48593A29.2030202@Sun.com> <1c6e34b0806190142p17f7c917sf679c15b430d1b17@mail.gmail.com> <485A5673.90301@Sun.com> <1c6e34b0806190733t64366d1eufcb10290f3c22f1d@mail.gmail.com> User-Agent: Mozilla/5.0 (X11; U; SunOS i86pc; no-NO; rv:1.8.1.14) Gecko/20080519 Thunderbird/2.0.0.14 Mnenhy/0.7.5.0 X-Virus-Checked: Checked by ClamAV on apache.org vodarus vodarus wrote: [ snip ] > > I could also get down to these times by using a HashMap to store the > intermediate totals in 'calculateTotalCommon'. This does of course > use more memory and might cause trouble if you don't know the number > of clients in your table (i.e. whether you need 25 thousand or 100 > million entries in the map). > > also not applicable, because data amount is 10-100 times more than RAM. Note that you only need to keep the intermediate results in memory, and you don't have to calculate everything before you write data back into the database. In this case, it is the number of unique clients that matter. I agree it is still a non-optimal solution though. > > > It seems what happens is that the log buffer goes full. By > increasing the log buffer, I was able to get a little better > performance. As always with tuning, it's about balance and > tradeoffs. If your IO system is really good, maybe you can run with > a big log buffer and get better performance. However, the effect you > see from this also depends on how often you have commits (then the > complete buffer is flushed anyway, at least in Derby). > > So, in short, experiment with the following, using either the > "insert into..." query or your client code modified to somehow store > the totals in memory: > a) Log buffer size > b) Page cache size (and JVM heap) > c) Page size > > One of my attempts looked like this: > java -Xmx512M -Dderby.storage.pageSize=32768 > -Dderby.storage.logBufferSize=524288 > -Dderby.storage.pageCacheSize=2500 -cp .:${JDB10413} derbytest.FatTest > > > this show 9,5 sec time. Best is Oracle with 1,5 sec. > also is this parameters work with old database or you need to recreate > database to get this parameters work? Except for the pageSize options, the options take effect when you start Derby. Note that the what I posted was just an example. You might still have to experiment to find the best fit for your environment / data. > > > > Using your original test code I haven't been able to get lower than > around 5 seconds (best), the average being somewhere around 6 seconds. > > i could get 9,5 sec best. what another improvements can you do with > application? Maybe that's your hardware, operating system or JVM version? I was using Derby 10.4.1.3 on Solaris 10 and Java SE 6 on a dual CPU (AMD 2.4 GHz) machine. But if 6 seconds is still too much, and you can't use another algorithm, I don't think I have anything more that can help you. > > > > As always, you have to do your own tests on your own system to see > if it is good enough for your use :) > > :) > > > Often there are other things to consider besides performance, for > instance installation and ease of use. > > :) yes, but performance issue is very important. more important than > installation. > Because with Oracle analysis will take near 100 (algorithms) * 100 (data > is 100 time more than in experiment) * 1,5 sec = 4.17 hours each. So > near 6 data parts a day per server. > > Java: analysis 100 * 100 * 9,5 = 26,4 hours each. So less than ONE data > part a day per server. So company need to buy and use 6 time more > servers than it use now. Also disadvantages is Java don't have integrate > SQL into language, so SQL can be validated at runtime, not compile time > (unlike PL/SQL). > I don't think it is correct to blame this on Java. There are other database products using Java, and for all I know they might be faster for the specific case we are discussing. Also, I'm not sure how comparable the Java stored procedure code and the PL/SQL are. Could you post the latter? Does anyone know if (or rather how well maybe?) Oracle is capable of optimizing the PL/SQL? regards, -- Kristian > > Does anyone have any ideas on other possible tunings? > > -- > Kristian > >