db-derby-dev mailing list archives

From "Mike Matrigali (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-6784) change optimizer to choose in list multiprobe more often
Date Fri, 19 Dec 2014 23:54:14 GMT

     [ https://issues.apache.org/jira/browse/DERBY-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-6784:
    Attachment: DERBY_6784_diff_1.txt

Preliminary patch, not ready for commit.

This change models the cost of the full IN LIST as the cost of just one
term.  In the previous benchmark of parameter-marker queries, performance
in trunk fell off when the number of terms reached about 8% of the number
of rows.  With this change, performance stayed best until somewhere
between 10% and 20% of rows.  I still need to verify this, but I assume the
drop-off there is again the system choosing not to probe.
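
For readers unfamiliar with the query shape being benchmarked, here is a
minimal sketch of how an IN LIST query with parameter markers might be
built for a given fraction of the table's rows.  The class name, the table
name T, and the column name ID are hypothetical and do not come from the
patch or the benchmark; only the general "IN (?, ?, ...)" form is assumed.

```java
import java.util.Collections;

public class InListQuery {
    // Number of IN LIST terms for a given fraction of the table's rows
    // (for example 0.08 for "around 8% of number of rows"); at least 1.
    static int termsFor(long rows, double fraction) {
        return (int) Math.max(1L, Math.round(rows * fraction));
    }

    // Builds "SELECT * FROM <table> WHERE <column> IN (?, ?, ...)"
    // with one parameter marker per term.
    static String build(String table, String column, int terms) {
        if (terms < 1) {
            throw new IllegalArgumentException("need at least one term");
        }
        String markers = String.join(", ", Collections.nCopies(terms, "?"));
        return "SELECT * FROM " + table + " WHERE " + column
                + " IN (" + markers + ")";
    }

    public static void main(String[] args) {
        // Hypothetical table T with an indexed column ID.
        System.out.println(build("T", "ID", 3));
        // prints: SELECT * FROM T WHERE ID IN (?, ?, ?)
    }
}
```

The resulting string would typically be passed to a JDBC PreparedStatement,
with each marker bound to one probe value.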

I need to investigate more; it seems likely there is a cost issue somewhere
else as well.
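
To make the idea of "cost of the full IN LIST" versus "cost of one term"
concrete, here is a toy comparison.  It is NOT Derby's cost model: the
class, the method names, and the per-row and per-probe unit costs are all
made up for illustration, and the only thing it shows is how charging the
probe plan for every term can push a cost-based choice toward a table scan
as the list grows.

```java
public class ToyInListCost {
    // Toy cost of scanning every row of the table (made-up units).
    static double scanCost(long rows, double perRowCost) {
        return rows * perRowCost;
    }

    // Toy cost when the probe plan is charged once per IN LIST term.
    static double probeCostAllTerms(int terms, double perProbeCost) {
        return terms * perProbeCost;
    }

    // Toy cost when the probe plan is charged for a single term only.
    static double probeCostOneTerm(double perProbeCost) {
        return perProbeCost;
    }

    // A cost-based choice: probe only if it is estimated to be cheaper.
    static boolean choosesProbe(double probeCost, double scanCost) {
        return probeCost < scanCost;
    }

    public static void main(String[] args) {
        long rows = 1000;          // hypothetical table size
        double perRow = 1.0;       // hypothetical unit costs
        double perProbe = 10.0;
        double scan = scanCost(rows, perRow);

        for (int terms : new int[] {50, 200}) {
            System.out.println(terms + " terms, charged per term: probe="
                    + choosesProbe(probeCostAllTerms(terms, perProbe), scan));
            System.out.println(terms + " terms, charged one term: probe="
                    + choosesProbe(probeCostOneTerm(perProbe), scan));
        }
    }
}
```

In this toy, charging per term flips the choice to a scan once the list is
large enough, while charging a single term never does.  Real costing has
more inputs than this, so the sketch says nothing about where an actual
threshold would fall.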

> change optimizer to choose in list multiprobe more often
> --------------------------------------------------------
>                 Key: DERBY-6784
>                 URL: https://issues.apache.org/jira/browse/DERBY-6784
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:
>            Reporter: Mike Matrigali
>            Assignee: Mike Matrigali
>         Attachments: DERBY_6784_diff_1.txt
> Using the multi-probe join strategy is an obvious performance win when
> the optimizer chooses it.  There are currently cases where the costing
> makes the optimizer choose other plans which do not perform as well as
> the multi-probe strategy.
> The affected queries are those where the number of terms in the IN LIST
> is large relative to the number of rows in the table, and there is a
> useful index to probe for the column that is referenced by the IN LIST.
> There are multiple benefits to choosing the multi-probe strategy, including
> the following:
> 1) Often better execution time, where the alternative is to do a full table
>     merge on the column.
> 2) The multi-probe strategy results in "pushing" the work into the store,
>      and this may result in more concurrent behavior (see DERBY-6300 and
>      DERBY-6301).  First, fewer rows may be locked by probing than by a full
>      table scan (and in the worst case the same number, if the query manages
>      to probe on every value in the table).  Second, depending on the
>      isolation level of the query, store will lock only matching rows, while
>      in the current implementation all rows that store returns for
>      qualification above store remain locked whether they qualify or not.
>      Especially in small table cases, other query plan choices have been
>      changed to favor probing indexes rather than full table scans, even if
>      pure CPU cost is better with a table scan.

This message was sent by Atlassian JIRA
