db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Damian Carey <jami...@gmail.com>
Subject Re: When is multi field index preferable over single field ?
Date Sun, 13 Jun 2010 12:12:05 GMT
Knut,

Awesome!

All sort of really obvious once I read your response.  Thanks for
providing "what" and "why". That has really helped my understanding of
indexes.

Many thanks,
-Damian


On Sun, Jun 13, 2010 at 9:24 PM, Knut Anders Hatlen <Knut.Hatlen@sun.com> wrote:
> On 06/12/10 07:58 AM, Damian Carey wrote:
>> Hello,
>>
>> We have a simple database, only 3 tables with a few fields in each)
>> that contains data supplied by a third party. Almost entirely read
>> only.
>>
>> The table of concern has about 380K rows, and on each retrieval we
>> will retrieve between 10 and 500 rows.
>>
>> I am looking for some BASIC advice on which indexes should be created
>> to assist with standard queries.
>>
>> I am just looking for reasonable advice, and am not worried about any
>> super optimal solution.
>>
>> The two basic queries are as follows ...
>>
>> SELECT * FROM MYTABLE qt WHERE qt.scheduleId = 'ABC' AND qt.body = '4DR'
>>
>> and
>>
>> SELECT DISTINCT qt.body FROM MYTABLE qt WHERE qt.scheduleId = 'ABC'
>>
>> Are "multi field" indexes preferable or would single field indexes work OK.
>>
>> Perhaps something like this for a multi field ....
>>
>> CREATE INDEX T_SCHED_BODY_IDX ON MYTABLE (scheduleId, body)
>>
>> Or perhaps something like this for a single field ....
>> CREATE INDEX T_SCHED_IDX ON MYTABLE (scheduleId)
>> CREATE INDEX T_BODY_IDX ON MYTABLE (body)
>>
>> Any guidance, advice, clues, suggestions, experience would be most appreciated.
>>
>
> Hi Damian,
>
> I think the multi-field index would be preferable in both of these cases.
>
> 1) SELECT * FROM MYTABLE qt WHERE qt.scheduleId = 'ABC' AND qt.body = '4DR'
>
>
> Here, if we only have the two single-field indexes, the query will only
> use one of them. That means (if the first index is picked) that the
> query will have to look at all the rows that have scheduleId='ABC' and
> check if they also have body='4DR'. If many of those rows don't have
> body='4DR', we have wasted time looking at rows that shouldn't be
> included in the result.
>
> With the multi-field index, we'll only have to look at the rows that
> have both scheduleId='ABC' and body='4DR', so we don't waste time
> looking at rows that don't qualify.
>
> 2) SELECT DISTINCT qt.body FROM MYTABLE qt WHERE qt.scheduleId = 'ABC'
>
>
> In this query, a single-field index on scheduleId and a multi-field
> index with scheduleId as the first field will end up looking at the same
> number of rows. However, the multi-field index gives these additional
> benefits:
>
>    * Since the index also contains the field body, all the data needed
>      to produce the result can be found in the index. This saves one
>      lookup in the base table per returned row.
>    * The multi-field index will return the rows ordered on body, so we
>      don't need to sort the result or build a hash table in order to
>      eliminate duplicates as required by the DISTINCT keyword.
>
>
> Hope this helps,
>
> --
> Knut Anders
>
>

Mime
View raw message