db-derby-dev mailing list archives

From "Mike Matrigali (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-6784) change optimizer to choose in list multiprobe more often
Date Thu, 18 Dec 2014 17:48:13 GMT

    [ https://issues.apache.org/jira/browse/DERBY-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251938#comment-14251938 ]

Mike Matrigali commented on DERBY-6784:

Good questions.  I am somewhat out of my depth in this code, so I don't have definite answers.

I do know that some part of the code sorts the values in the IN LIST.  This was originally
done to implement a first-level optimization of IN LISTs, before the multi-probe work.  The
code would sort the IN LIST and then, rather than doing a full scan of the index, it would
use the sorted values to set the start and stop parameters for the scan, hopefully
eliminating part of the index scan.  I am hoping this still happens, since it should also
lead to better localized caching for subsequent probes in the many-term multi-probe case.
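
To make the start/stop idea concrete, here is a minimal sketch in Python.  It is not
Derby code: the sorted list standing in for an index, the function name
bounded_in_list_scan, and the choice of the smallest and largest IN LIST values as the
start and stop keys are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def bounded_in_list_scan(index_keys, in_list):
    """Scan only the slice of a sorted index that can contain IN LIST values.

    index_keys is a sorted Python list standing in for an index (hypothetical,
    not a Derby structure). Returns (matching keys, number of keys visited).
    """
    if not in_list:
        return [], 0

    # Sort the IN LIST; assume its ends serve as start and stop keys.
    probes = sorted(set(in_list))
    start_key, stop_key = probes[0], probes[-1]

    # Position the scan at the start key and end it after the stop key.
    lo = bisect_left(index_keys, start_key)
    hi = bisect_right(index_keys, stop_key)

    # Every key inside the bounded range is still visited and qualified.
    wanted = set(probes)
    visited = index_keys[lo:hi]
    matches = [key for key in visited if key in wanted]
    return matches, len(visited)
```

With an index holding the keys 1 through 100 and the list (42, 40, 45), this visits only
the six keys from 40 to 45 instead of all 100.  The bounded scan still reads keys between
the IN LIST values that do not match, which is the part a per-value probe avoids.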

> change optimizer to choose in list multiprobe more often
> --------------------------------------------------------
>                 Key: DERBY-6784
>                 URL: https://issues.apache.org/jira/browse/DERBY-6784
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:
>            Reporter: Mike Matrigali
>            Assignee: Mike Matrigali
> Using the  multi-probe join strategy is an obvious performance win when
> the optimizer chooses it.  There are cases currently where the costing 
> makes the optimizer choose other plans which do not perform as well as
> the multi-probe strategy.
> The class of queries that are affected are those where the number of terms
> in the IN LIST is large relative to the number of rows in the table, and there
> is a useful index to probe for the column that is referenced by the IN LIST.
> There are multiple benefits to choosing the multi-probe strategy, including
> the following:
> 1) Often better execution time, where the alternative is to do a full table
>     merge on the column.
> 2) The multi-probe strategy results in "pushing" the work into the store,
>      and this may result in more concurrent behavior (see DERBY-6300 and DERBY-6301).
>      First, fewer rows may be locked by probing than by a full table scan (in the
>      worst case the same number, if the query manages to probe on every value
>      in the table).
>      Second, depending on the isolation level of the query, the store will only lock
>      matching rows, while in the current implementation all rows that are returned by
>      the store for qualification above the store will remain locked whether they
>      qualify or not.  Especially in small-table cases, other query plan choices
>      have been changed to favor probing indexes rather than full table scans,
>      even if pure CPU cost is better with a table scan.
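
The row-count argument in point 2 can be sketched as follows.  This is an illustration
only, not Derby's implementation: the sorted list standing in for an index and the
function names full_scan and multi_probe are hypothetical, and "touched" is just a rough
proxy for rows that might be locked.  Real locking depends on the isolation level and on
store internals that are not modeled here.

```python
from bisect import bisect_left, bisect_right


def full_scan(index_keys, in_list):
    """Visit every key and qualify each one against the IN LIST."""
    wanted = set(in_list)
    matches = [key for key in index_keys if key in wanted]
    return matches, len(index_keys)  # every key is touched


def multi_probe(index_keys, in_list):
    """Probe the sorted index once per distinct IN LIST value."""
    matches = []
    touched = 0
    for value in sorted(set(in_list)):
        # Locate the run of keys equal to this value, if there is one.
        lo = bisect_left(index_keys, value)
        hi = bisect_right(index_keys, value)
        touched += hi - lo  # only matching keys are touched
        matches.extend(index_keys[lo:hi])
    return matches, touched
```

For the keys [1, 2, 2, 3, 5, 8, 13] and the list (2, 8, 21), both functions return the
same three matching keys, but the full scan touches all seven keys while the probes touch
three.  If the list covered every value in the index, the two counts would be equal, which
is the worst case described above.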

This message was sent by Atlassian JIRA
