db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Knut Anders Hatlen <Knut.Hat...@Sun.COM>
Subject Re: When is multi field index preferable over single field ?
Date Sun, 13 Jun 2010 11:24:36 GMT
On 06/12/10 07:58 AM, Damian Carey wrote:
> Hello,
>
> We have a simple database, only 3 tables with a few fields in each)
> that contains data supplied by a third party. Almost entirely read
> only.
>
> The table of concern has about 380K rows, and on each retrieval we
> will retrieve between 10 and 500 rows.
>
> I am looking for some BASIC advice on which indexes should be created
> to assist with standard queries.
>
> I am just looking for reasonable advice, and am not worried about any
> super optimal solution.
>
> The two basic queries are as follows ...
>
> SELECT * FROM MYTABLE qt WHERE qt.scheduleId = 'ABC' AND qt.body = '4DR'
>
> and
>
> SELECT DISTINCT qt.body FROM MYTABLE qt WHERE qt.scheduleId = 'ABC'
>
> Are "multi field" indexes preferable or would single field indexes work OK.
>
> Perhaps something like this for a multi field ....
>
> CREATE INDEX T_SCHED_BODY_IDX ON MYTABLE (scheduleId, body)
>
> Or perhaps something like this for a single field ....
> CREATE INDEX T_SCHED_IDX ON MYTABLE (scheduleId)
> CREATE INDEX T_BODY_IDX ON MYTABLE (body)
>
> Any guidance, advice, clues, suggestions, experience would be most appreciated.
>   

Hi Damian,

I think the multi-field index would be preferable in both of these cases.

1) SELECT * FROM MYTABLE qt WHERE qt.scheduleId = 'ABC' AND qt.body = '4DR'


Here, if we only have the two single-field indexes, the query will only
use one of them. That means (if the first index is picked) that the
query will have to look at all the rows that have scheduleId='ABC' and
check if they also have body='4DR'. If many of those rows don't have
body='4DR', we have wasted time looking at rows that shouldn't be
included in the result.

With the multi-field index, we'll only have to look at the rows that
have both scheduleId='ABC' and body='4DR', so we don't waste time
looking at rows that don't qualify.

2) SELECT DISTINCT qt.body FROM MYTABLE qt WHERE qt.scheduleId = 'ABC'


In this query, a single-field index on scheduleId and a multi-field
index with scheduleId as the first field will end up looking at the same
number of rows. However, the multi-field index gives these additional
benefits:

    * Since the index also contains the field body, all the data needed
      to produce the result can be found in the index. This saves one
      lookup in the base table per returned row.
    * The multi-field index will return the rows ordered on body, so we
      don't need to sort the result or build a hash table in order to
      eliminate duplicates as required by the DISTINCT keyword.


Hope this helps,

-- 
Knut Anders


Mime
View raw message