From "A B (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-47) Some possible improvements to IN optimization
Date Thu, 22 Feb 2007 00:42:06 GMT

    [ https://issues.apache.org/jira/browse/DERBY-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12474895 ]

A B commented on DERBY-47:
--------------------------

In one of my previous comments I mentioned that d47_engine_doNotCommit_v1.patch has a problem
where the addition of an InListResultSet to the FromList causes all of the SUR (Scrollable
Updatable Result Set) tests that use an IN clause to fail. More specifically, the presence
of the InListResultSet causes Derby to downgrade the cursor concurrency to CONCUR_READ_ONLY.


I was eventually able to track down the cause of this behavior: whether or not Derby downgrades
a result set to CONCUR_READ_ONLY depends on the value returned by the execution-time result
set's "isForUpdate()" method.  The default (as defined in NoPutResultSetImpl) is to return
false.

In the case of the SUR tests (prior to my changes) we were getting a ScrollInsensitiveResultSet
on top of a TableScanResultSet.  The former gets its "isForUpdate()" value from the latter,
and the latter correctly returns "true" to indicate that the result set is for update and
thus no downgrade is needed. With d47_engine_doNotCommit_v1.patch applied, though, we add
an InListResultSet to the FromList, which ultimately gives us a JoinResultSet at execution
time.  The JoinResultSet class does not define an "isForUpdate()" method, so we just return
the default--which is "false".  That causes the cursor concurrency to be downgraded to CONCUR_READ_ONLY.
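
To make that concrete, here is a minimal sketch of the pattern (illustrative stand-ins only,
not the real Derby classes):

  // Minimal sketch of the isForUpdate() pattern described above; these classes
  // are abbreviated stand-ins, not the actual Derby implementations.
  abstract class NoPutResultSetSketch {
      // Default: a result set is assumed NOT to be opened for update.
      public boolean isForUpdate() {
          return false;
      }
  }

  class TableScanResultSetSketch extends NoPutResultSetSketch {
      // A table scan opened for an updatable cursor overrides the default,
      // so the cursor concurrency is left alone.
      @Override
      public boolean isForUpdate() {
          return true;
      }
  }

  class JoinResultSetSketch extends NoPutResultSetSketch {
      // No override here: isForUpdate() falls back to "false", which is why
      // the cursor concurrency gets downgraded to CONCUR_READ_ONLY when a
      // join sits under the cursor's result set tree.
  }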
 

To see what would happen I forced JoinResultSet to return "true" and then there was an ASSERT
failure because JoinResultSets are not expected to be used for update/delete.  The relevant
code is in execute/JoinResultSet.java; in the "getRowLocation()" method we see the following
comment:

   * A join is combining rows from two sources, so it has no
   * single row location to return; just return a null.

My guess is that, since the decision to disallow update/delete on a JoinResultSet was intentional,
trying to code around that restriction is a bad idea.  Or at the very least, it would require
a lot more investigation and/or work.

Instead of pursuing that potentially treacherous path, I was able to come up with some logic
that checks to see if a result set is updatable at compile time and, if so, skips the IN-list
rewrite.  Early testing suggests that this is a viable solution.

HOWEVER, as the list of "exceptions" to the IN-list (_v1.patch) rewrite grew, I started to
wonder if there wasn't some other solution to DERBY-47 that would accomplish the same thing,
minus all of the exception cases.

A few days later I came across some queries involving multiple IN-list predicates for a single
SELECT statement.  It turns out that many of those queries return incorrect results (duplicate
rows) and/or run much more slowly with the _v1 patch than without.

The fact that there are so many "exceptions to the rule" combined with the undesirable behavior
in the face of multiple IN-lists prompted me to abandon my initial idea of internally adding
an InListResultSetNode to the user's FROM list (d47_engine_doNotCommit_v1.patch).

Instead I have been working on an alternate approach to the problem.  This second approach,
like the first, is based on a comment made by someone else on this issue.  This time, though,
the comment in question is from James Synge and is as follows:

> The goal of the re-write is to avoid the source of DERBY-47,
> which is that we don't have a multi-probe index scan, only
> range index scans, and worse still, when there are parameters
> involved, the range is the entire index.

In short, I decided to work with the execution-time result sets to see if it is possible to
enforce some kind of "multi-probing" for indexes.  That is to say, instead of scanning a range
of values on the index, we want to make it so that Derby probes the index N times, where N
is the size of the IN-list.  For each probe, then, we'll get all rows for which the target
column's value equals the corresponding value in the IN-list.

Effectively, this comes down to the "Marker" strategy in Derby47PerformanceTest.java, where we
evaluate an equality predicate, "column = ?", N times.  Except of course that, unlike with
the Marker strategy, we do the N evaluations internally instead of making the user do it.
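
For reference, the Marker strategy amounts to something like the following client-side loop
(a rough sketch only; the table and column names are simply the ones from the admin.changes
example later in this comment):

  import java.sql.Connection;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;
  import java.sql.SQLException;

  // Rough client-side illustration of the "Marker" strategy: instead of writing
  // "id IN (...)", the application evaluates the equality predicate "id = ?"
  // once per IN-list value.  The multi-probing approach does the same N
  // evaluations internally, against the index, without the user having to
  // restructure the query.
  public class MarkerStrategySketch {
      static void probeEach(Connection conn, int[] inListValues) throws SQLException {
          try (PreparedStatement ps = conn.prepareStatement(
                  "SELECT * FROM admin.changes WHERE id = ?")) {
              for (int value : inListValues) {
                  ps.setInt(1, value);           // plug the i-th IN-list value into "id = ?"
                  try (ResultSet rs = ps.executeQuery()) {
                      while (rs.next()) {
                          // process all rows whose ID equals the current value
                      }
                  }
              }
          }
      }
  }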
 

A high-level description of how this approach will work is as follows:

  - If any value in the IN-list is not a constant AND is not a parameter,
    then do processing as usual (no optimizations, no rewrite).  Notice
    that with this approach we *do* still perform the optimization if
    IN-list values are parameters.  That is not the case for the current
    Derby rewrite (i.e. without any changes for DERBY-47).

  Otherwise:

  - During preprocessing, replace the IN-list with an equality predicate of
    the form "column = ?".  I.e. the right operand will be a parameter node,
    the left operand will be whatever column reference belongs to the IN-list
    (hereafter referred to as the "target column").  We call this new, internal
    predicate an IN-list "probe predicate" because it will be the basis of
    the multi-probing logic at execution time.

  - During costing, the equality predicate "column = ?" will be treated as
    a start/stop key by normal optimizer processing (no changes necessary).
    If the predicate is deemed a start/stop key for the *first* column in
    an index, then we'll multiply the estimated cost of "column = ?" by
    the size of the IN-list, to account for the fact that we're actually
    evaluating the predicate N times (not just once).

    If the predicate is not a start/stop key for the first column in
    the index, then we do not adjust the cost.  See below for why.

  - After we have found the best plan (including a best conglomerate choice),
    check to see if the IN-list probe predicate is a start/stop key on the
    first column in the chosen conglomerate.  In order for that to be the
    case the conglomerate must be an index (because we don't have start/stop
    keys on table conglomerates).  If the probe predicate is not a
    start/stop key for the *first* column in the index, then we "revert" the probe
    predicate back to its original IN-list form and evaluate it as a "normal"
    InListOperatorNode.  In this way we are effectively "giving up" on the multi-
    probing approach.  This is why we don't change the cost for such probe
    predicates (mentioned above).

    If the probe predicate *is* a start/stop key on the first column in
    the index conglomerate, then leave it as it is.

  - When it comes time to generate the byte code, look to see if we have
    a probe predicate that is a start/stop key on the first column in the
    chosen conglomerate.  If so, generate an array of DataValueDescriptors
    to hold the values from the corresponding IN-list and pass that array
    to the underlying execution-time result set (i.e. to TableScanResultSet).
    Then generate the probe predicate as a "normal" start/stop key for the
    scan.  This will serve as a "place-holder" for the IN-list values
    at execution time.

  - Finally, at execution time, instead of using a single start key and a
    single stop key to define a scan range, we iterate through the IN-list
    values and for each non-duplicate value V[i] (0 <= i < N) we treat V[i]
    as both a start key *and* a stop key.  Or put another way, we plug
    V[i] into the "column = ?" predicate and retrieve all matching rows.
    In this way we are "probing" the index for all rows having value V[i]
    in the target column.  Once all rows matching V[i] have been returned,
    we then grab the next IN-list value, V[i+1], and we reposition the
    scan (by calling "reopenCore()"), this time using V[i+1] as the start
    and stop key (i.e. plugging V[i+1] into "column = ?").  This will
    return all rows having value V[i+1] in the target column.

    Continue this process until all values in the IN-list have been
    exhausted.  When we reach that point, we are done (a rough sketch
    of this loop follows the list).
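
Put into rough pseudo-Java, that last step looks something like this.  This is a sketch only:
ProbeableScan, Row, open, reopen, and next are placeholders standing in for the real
TableScanResultSet machinery (compare "reopenCore()" above), not Derby identifiers.

  import java.util.List;
  import java.util.stream.Collectors;

  // Sketch of the execution-time multi-probing loop: each non-duplicate IN-list
  // value is used as BOTH the start key and the stop key of the scan, and the
  // scan is repositioned once the current probe is exhausted.
  class MultiProbeSketch<K> {

      interface Row { }

      interface ProbeableScan<K> {
          void open(K startAndStopKey);    // position the scan on rows equal to the key
          void reopen(K startAndStopKey);  // reposition the same scan on a new key
          Row next();                      // null when the current probe is exhausted
      }

      void scanAll(ProbeableScan<K> scan, List<K> inListValues) {
          // Probe each non-duplicate value V[i] exactly once.
          List<K> values = inListValues.stream().distinct().collect(Collectors.toList());
          boolean first = true;
          for (K value : values) {
              if (first) {
                  scan.open(value);        // first probe: open on V[0]
                  first = false;
              } else {
                  scan.reopen(value);      // later probes: reposition on V[i]
              }
              for (Row row = scan.next(); row != null; row = scan.next()) {
                  // Each probe yields all rows whose target column equals 'value';
                  // result sets higher in the tree just see one stream of rows.
              }
          }
      }
  }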

As a simple example, assume our query is of the form:

  select ... from admin.changes where id in (1, 20000)

During preprocessing we will effectively change this to be:

  select ... from admin.changes where id = ?

where "id = ?" is our IN-list "probe predicate".  Note that we must make sure the optimizer
recognizes "id = ?" as a disguised IN-list operator, as opposed to a true relational operator.
 The reason is that, while we do treat the probe predicate as a "fake" relational operator
so that it can be seen as a start/stop key for the relevant indexes, there are many operations
(ex. transitive closure) that should *not* be done on a probe predicate.  So we have to be
able to distinguish a probe predicate from other relational predicates.
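
One way to carry that distinction around is to let the generated "=" node remember the
IN-list it came from.  The sketch below is purely illustrative; the field and method names
are hypothetical and not part of Derby's compile-time node API.

  // Illustrative only: a "column = ?" node that remembers it is really a
  // disguised IN-list, so that transformations such as transitive closure
  // can skip it.
  class ProbePredicateSketch {
      // The InListOperatorNode this equality node replaced, or null if the
      // user actually wrote "column = ?" themselves.
      private Object sourceInList;

      void setSourceInList(Object inListNode) {
          this.sourceInList = inListNode;
      }

      // True only for internally generated IN-list probe predicates.
      boolean isInListProbePredicate() {
          return sourceInList != null;
      }
  }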

With the probe predicate in place the optimizer will determine that it is a start/stop key
for any index having column "ID" as the first column, and thus the optimizer is more likely
to choose one of those indexes over a table scan.  If we assume the optimizer decides to use
an index on ID for the query, then we'll generate the IN-list values (1, 20000) and we will
pass them to the underlying index scan. We'll then generate the probe predicate "column =
?" as a start and stop key for the scan.  

At execution, then, the scan will first open up the index by using "1" as the start and stop
key for the scan (or put another way, by plugging "1" into the probe predicate, "column =
?").  That will return all rows having ID equal to "1". Then when that scan ends, we'll reopen
the scan a second time, this time using "20000" as the start and stop key, therefore returning
all the 20000's.  Meanwhile, all result sets higher up in the result set tree will just see
a stream of rows from the index where ID only equals 1 or 20000.   

This works for IN-lists with parameters, as well, because by the time execution starts we
know what values the parameters hold.
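
For example, an IN-list written entirely with parameter markers would still qualify for
multi-probing.  Here is a small client-side usage sketch against the admin.changes example
table used above:

  import java.sql.Connection;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;
  import java.sql.SQLException;

  // Usage sketch: the IN-list values arrive as parameters.  By the time the
  // statement executes, the values (1 and 20000 here) are known, so the index
  // on ID can still be probed once per value.
  public class ParameterizedInListSketch {
      static void run(Connection conn) throws SQLException {
          try (PreparedStatement ps = conn.prepareStatement(
                  "SELECT * FROM admin.changes WHERE id IN (?, ?)")) {
              ps.setInt(1, 1);
              ps.setInt(2, 20000);
              try (ResultSet rs = ps.executeQuery()) {
                  while (rs.next()) {
                      // process rows whose ID is 1 or 20000
                  }
              }
          }
      }
  }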

The first draft of code for implementing all of this is pretty much done, but I plan to post
it in increments for ease of review.  If all goes well, nothing should change functionally
until the final patch, which will be the preprocessing patch that does the actual creation
of the "probe predicate".  At that point all of the machinery should be in place to recognize
the probe predicate and function accordingly.

I ran Derby47PerformanceTest with this "multi-probing" approach and the numbers were similar
to what I saw with the _v1 approach (more details coming later).  Unlike the _v1 approach,
though, the multi-probing approach works with subqueries, left outer joins, and scrollable
updatable result sets.  In addition, all of the test queries that I came up with involving
multiple IN-lists run correctly with the multi-probing approach, and in many cases run quite
a bit more quickly--neither of which was true for the _v1 approach.  And yes, I do hope to
add those test cases to the nightlies as part of my work on this issue.

Incremental patches are forthcoming.  In the meantime, if anyone has any comments/suggestions
on the approach outlined above, I would appreciate the feedback...

> Some possible improvements to IN optimization
> ---------------------------------------------
>
>                 Key: DERBY-47
>                 URL: https://issues.apache.org/jira/browse/DERBY-47
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.0.2.0
>         Environment: all
>            Reporter: Sunitha Kambhampati
>         Assigned To: A B
>         Attachments: d47_engine_doNotCommit_v1.patch, d47_engine_doNotCommit_v1.stat, derby-47-performance-data.txt, derby-47-performance-data.txt, Derby47PerformanceTest.java, Derby47PerformanceTest.java, InListOperatorNode.java, QueryPlanUniqueIndexAndWordIndexOneTerm.txt, QueryPlanUniqueIndexAndWordIndexTwoTerms.txt, QueryPlanUniqueIndexOnlyOneTerm.txt, QueryPlanUniqueIndexOnlyTwoTerms.txt, readlocks.diff, readlocks_withContext.diff
>
>
> Consider a simple case of  - 
> A table tbl has 10000 rows, there is a primary key index on i1
> and the query in question is 
>  select * from tbl where i1 in (-1,100000)
> derby does a table scan of the entire table even though the "IN" list has only two values and the comparison is on a field that has an index.
> Briefly looking at the code, it seems like we insert a between and use the IN list to get the start and stop values for the scan. Thus the range of the values in the "IN" list here plays an important role.
> Thus if the query was changed to select * from tbl where i1 in (-1, 1), an index scan would be chosen.
> It would be nice if we could do something clever in this case where there is clearly an index on the field and the number of values in the IN list is known. Maybe use the rowcount estimate and the IN list size to do some optimizations.
> - consider the length of the "IN" list to do searches on the table.  ie use the IN list values to do index key searches on the table,
> - or try to convert it to a join. Use the "IN" list values to create a temporary table and do a join. It is most likely that the optimizer will choose the table with "IN" list here as the outer table in the join and thus will do key searches on the larger table.
> -------------------------------------------------------------------
> some query plans that I logged using derby.language.logQueryPlan=true for some similar queries:
> Table has ascending values from 0 - 9999 for i1. primary key index on i1.
> GMT Thread[UT0,5,main] (XID = 19941), (SESSIONID = 0), select * from scanfixed where i1 in (-1,9999,9998,9997,9996,9995,9994,9993,9992,9991,9990) ******* Project-Restrict ResultSet (2):
> Number of opens = 1
> Rows seen = 10000
> Rows filtered = 9990
> restriction = true
> projection = false
> 	constructor time (milliseconds) = 0
> 	open time (milliseconds) = 0
> 	next time (milliseconds) = 0
> 	close time (milliseconds) = 0
> 	restriction time (milliseconds) = 0
> 	projection time (milliseconds) = 0
> 	optimizer estimated row count:          750.38
> 	optimizer estimated cost:         8579.46
> Source result set:
> 	Table Scan ResultSet for SCANFIXED at read committed isolation level using instantaneous share row locking chosen by the optimizer
> 	Number of opens = 1
> 	Rows seen = 10000
> 	Rows filtered = 0
> 	Fetch Size = 16
> 		constructor time (milliseconds) = 0
> 		open time (milliseconds) = 0
> 		next time (milliseconds) = 0
> 		close time (milliseconds) = 0
> 		next time in milliseconds/row = 0
> 	scan information: 
> 		Bit set of columns fetched=All
> 		Number of columns fetched=9
> 		Number of pages visited=417
> 		Number of rows qualified=10000
> 		Number of rows visited=10000
> 		Scan type=heap
> 		start position: null
> 		stop position: null
> 		qualifiers:
> Column[0][0] Id: 0
> Operator: <=
> Ordered nulls: false
> Unknown return value: false
> Negate comparison result: false
> Column[0][1] Id: 0
> Operator: <
> Ordered nulls: false
> Unknown return value: true
> Negate comparison result: true
> 		optimizer estimated row count:          750.38
> 		optimizer estimated cost:         8579.46
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 2004-10-14 18:59:47.577 GMT Thread[UT0,5,main] (XID = 19216), (SESSIONID = 0), select * from scanfixed where i1 in (9999,9998,9997,9996,9995,9994,9993,9992,9991,9990) ******* Project-Restrict ResultSet (3):
> Number of opens = 1
> Rows seen = 10
> Rows filtered = 0
> restriction = true
> projection = true
> 	constructor time (milliseconds) = 0
> 	open time (milliseconds) = 0
> 	next time (milliseconds) = 0
> 	close time (milliseconds) = 0
> 	restriction time (milliseconds) = 0
> 	projection time (milliseconds) = 0
> 	optimizer estimated row count:            4.80
> 	optimizer estimated cost:           39.53
> Source result set:
> 	Index Row to Base Row ResultSet for SCANFIXED:
> 	Number of opens = 1
> 	Rows seen = 10
> 	Columns accessed from heap = {0, 1, 2, 3, 4, 5, 6, 7, 8}
> 		constructor time (milliseconds) = 0
> 		open time (milliseconds) = 0
> 		next time (milliseconds) = 0
> 		close time (milliseconds) = 0
> 		optimizer estimated row count:            4.80
> 		optimizer estimated cost:           39.53
> 		Index Scan ResultSet for SCANFIXED using index SCANFIXEDX at read committed isolation level using instantaneous share row locking chosen by the optimizer
> 		Number of opens = 1
> 		Rows seen = 10
> 		Rows filtered = 0
> 		Fetch Size = 16
> 			constructor time (milliseconds) = 0
> 			open time (milliseconds) = 0
> 			next time (milliseconds) = 0
> 			close time (milliseconds) = 0
> 			next time in milliseconds/row = 0
> 		scan information: 
> 			Bit set of columns fetched=All
> 			Number of columns fetched=2
> 			Number of deleted rows visited=0
> 			Number of pages visited=2
> 			Number of rows qualified=10
> 			Number of rows visited=10
> 			Scan type=btree
> 			Tree height=2
> 			start position: 
> 	>= on first 1 column(s).
> 	Ordered null semantics on the following columns: 
> 			stop position: 
> 	> on first 1 column(s).
> 	Ordered null semantics on the following columns: 
> 			qualifiers:
> None
> 			optimizer estimated row count:            4.80
> 			optimizer estimated cost:           39.53

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

