db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Army <qoz...@gmail.com>
Subject Re: join over more than 3 tables
Date Fri, 25 May 2007 20:58:14 GMT
Paulo Jesus wrote:
> Looking to the querys from "read" and "query" (A and B)  i see a inner
> join over 5 tables. The query plan is  confusing but i can see a table
> scan over table barcelona3. It shouldn't be necessary, why is the
> optimizer doing it?

Both of the queries in question have a "select *", which means that all columns 
for each table are returned.  If the optimizer chose to do an index scan on 
barcelona3 (for example), it would still need to access the table itself (in 
addition to the index) in order to retrieve all of the non-index columns.  So 
given that we have to go to the table for every row anyways, it's possible that 
the optimizer estimated that a straight table scan would be cheaper.

That's just a guess, though.  It's hard to say without more information: ex. DDL 
of the tables in question, and the actual query plan(s) that you are seeing.

Does the optimizer choose a table scan for barcelona3 for both query A *and* 
query B, or just for query A?  Does it choose index scans for all of the other 

> I have indexes over columns id and barcelona2.b2

I assume this means you have an index on the "id" column of _each_ table, plus 
an index on the "b2" column of the "barcelona2" table.  Is that correct?

Did you create the indexes before or after you populated the tables? There are 
known problems with creating an index before populating the table: see esp. 
DERBY-269.  Given that, it's generally a good idea to populate the tables first 
and *then* create the indexes afterward.

> All columns are integers, but B2 can be null.

Generally speaking, how many rows will have a "null" for b2?  Does "b2" exist 
for each table, or just for barcelona2?  Again, it would be good if you could 
send the DDL for your database.  Ex.:

   java org.apache.derby.tools.dblook -d jdbc:derby:mydb

For more see:


> Running query A and B 100 times, the execution of query B is always more
> efficient than A. In order ~8seg to ~19seg. It´s a big difference, i
> can´t explain it. Does anyone have some idea?

It's possible that the optimizer chooses the barcelona2.b2 index for query B, 
which is perhaps better than the barcelona2.id index.  But again, I'm just 
guessing.  If you are able to provide an example database (or script/program to 
create and populate the database) that shows the difference, that might be 
useful.  Otherwise, the DDL and query plans in question would be good, as they 
might allow for better investigation...


View raw message