From derby-user-return-6107-apmail-db-derby-user-archive=db.apache.org@db.apache.org Tue Feb 06 14:42:54 2007 Return-Path: Delivered-To: apmail-db-derby-user-archive@www.apache.org Received: (qmail 78700 invoked from network); 6 Feb 2007 14:42:53 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 6 Feb 2007 14:42:53 -0000 Received: (qmail 72187 invoked by uid 500); 6 Feb 2007 14:42:59 -0000 Delivered-To: apmail-db-derby-user-archive@db.apache.org Received: (qmail 72153 invoked by uid 500); 6 Feb 2007 14:42:58 -0000 Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: List-Id: Reply-To: "Derby Discussion" Delivered-To: mailing list derby-user@db.apache.org Received: (qmail 72142 invoked by uid 99); 6 Feb 2007 14:42:58 -0000 Received: from herse.apache.org (HELO herse.apache.org) (140.211.11.133) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 06 Feb 2007 06:42:58 -0800 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests=UNPARSEABLE_RELAY X-Spam-Check-By: apache.org Received-SPF: pass (herse.apache.org: local policy) Received: from [192.18.1.36] (HELO gmp-ea-fw-1.sun.com) (192.18.1.36) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 06 Feb 2007 06:42:48 -0800 Received: from d1-emea-09.sun.com ([192.18.2.119]) by gmp-ea-fw-1.sun.com (8.13.6+Sun/8.12.9) with ESMTP id l16EgQwi011219 for ; Tue, 6 Feb 2007 14:42:26 GMT Received: from conversion-daemon.d1-emea-09.sun.com by d1-emea-09.sun.com (Sun Java System Messaging Server 6.2-6.01 (built Apr 3 2006)) id <0JD100D01Q2WLX00@d1-emea-09.sun.com> (original mail from Oystein.Grovlen@Sun.COM) for derby-user@db.apache.org; Tue, 06 Feb 2007 14:42:26 +0000 (GMT) Received: from [129.159.112.239] by d1-emea-09.sun.com (Sun Java System Messaging Server 6.2-6.01 (built Apr 3 2006)) with ESMTPSA id <0JD100308Q6LUW6J@d1-emea-09.sun.com> for derby-user@db.apache.org; Tue, 06 Feb 2007 14:42:23 +0000 (GMT) Date: Tue, 06 Feb 2007 15:42:21 +0100 From: Oystein Grovlen - Sun Norway Subject: Re: keeping the table ordered In-reply-to: <31FDC2DD-D2BE-4FF8-A2CB-867F1380BFCD@yahoo.com> Sender: Oystein.Grovlen@Sun.COM To: Derby Discussion Message-id: <45C893CD.4050400@sun.com> MIME-version: 1.0 Content-type: text/plain; format=flowed; charset=ISO-8859-1 Content-transfer-encoding: 8BIT References: <31FDC2DD-D2BE-4FF8-A2CB-867F1380BFCD@yahoo.com> User-Agent: Thunderbird 1.5.0.9 (X11/20061229) X-Virus-Checked: Checked by ClamAV on apache.org Nurullah Akkaya wrote: > i am using derby in embedded mode. > i have a table of 100 million records when i do a select i get 600k to > 1million records table structure is as follows Are you saying that a single select query may return 1 million records? I am not surprised if that takes more than 1 minute. > > stmt.executeUpdate("CREATE TABLE POSTINGLIST (" > +"WORDID INTEGER NOT NULL," > +"DOCID INTEGER NOT NULL," > +"ANCHORID INTEGER NOT NULL," > +"DOCPOSITION SMALLINT NOT NULL," > +"FLAG SMALLINT NOT NULL)"); > > > stmt.executeUpdate("CREATE INDEX WORDID ON POSTINGLIST(WORDID)"); > stmt.executeUpdate("CREATE INDEX DOCID ON POSTINGLIST(DOCID)"); > stmt.executeUpdate("CREATE INDEX ANCHORID ON POSTINGLIST(ANCHORID)"); > > select docId , docPosition , anchorId, flag from postingList where > wordId = ? > > 99 percent of the time i select ( above query ) based on the wordId i > have implemented all the tuning tips in the manual but it stil takes so > much time( more than a minute ) disk i/o seems to be the bottleneck( > no swapping occurs cpu is idle during select derby uses index) what i > want to do is keep the tables sorted by wordId so that i can avoid > random reads and do a sequential read. rigth now insert performance is > faster than i expected so i can trade some write for read is this > possible? if this is not possible out of the box can you give me some > tips as to how can i implement this in to the source code? > > Thanks for your time... It is not quite clear to me what you are trying to achieve. Why do you want a sequential read? Scanning the entire table of 100 million records should take longer time than looking up a record using a index on wordid. Have you retrieved the query plan and made sure the index on wordid is used? Or are you talking about doing a lookup of many different wordids in sorted order? -- Øystein