db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Oystein Grovlen - Sun Norway <Oystein.Grov...@Sun.COM>
Subject Re: keeping the table ordered
Date Tue, 06 Feb 2007 14:42:21 GMT
Nurullah Akkaya wrote:
> i am using derby in embedded mode.
> i have a table of 100 million records when i do a select i get 600k to 
> 1million records table structure is as follows

Are you saying that a single select query may return 1 million records? 
  I am not surprised if that takes more than 1 minute.

> 
> stmt.executeUpdate("CREATE TABLE POSTINGLIST ("
>    +"WORDID INTEGER NOT NULL,"
>    +"DOCID INTEGER NOT NULL,"
>    +"ANCHORID INTEGER NOT NULL,"
>    +"DOCPOSITION SMALLINT NOT NULL,"
>    +"FLAG SMALLINT NOT NULL)");
> 
> 
> stmt.executeUpdate("CREATE INDEX WORDID ON POSTINGLIST(WORDID)");
> stmt.executeUpdate("CREATE INDEX DOCID ON POSTINGLIST(DOCID)");
> stmt.executeUpdate("CREATE INDEX ANCHORID ON POSTINGLIST(ANCHORID)");
> 
> select docId , docPosition , anchorId, flag from postingList  where 
> wordId = ?
> 
> 99 percent of the time i select ( above query ) based on the wordId i 
> have implemented all the tuning tips in the manual but it stil takes so 
> much time( more than a minute ) disk i/o seems to be the bottleneck( 
> no swapping occurs  cpu is idle during select derby uses index) what i 
> want to do is keep the tables sorted by wordId so that i can avoid 
> random reads and do a sequential read. rigth now insert performance is 
> faster than i expected so i can trade some write for read is this 
> possible? if this is not possible out of the box can you give me some 
> tips as to how can i implement this in to the source code?
> 
> Thanks for your time...

It is not quite clear to me what you are trying to achieve.  Why do you 
want a sequential read?  Scanning the entire table of 100 million 
records should take longer time than looking up a record using a index 
on wordid.  Have you retrieved the query plan and made sure the index on 
wordid is used?  Or are you talking about doing a lookup of many 
different wordids in sorted order?

-- 
Øystein

Mime
View raw message