db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "msatoorPotter ." <msat...@gmail.com>
Subject Re: DERBY-6301 - In list predicate not getting moved down to store for a subset of in list queries
Date Tue, 11 Feb 2014 07:57:17 GMT
Dag, thanks for sharing the reverse scenario you had to deal with. Just
want to write down my understanding of what is happening for my specific
case.

Spend some time in code generation and run time code to see what happens
for the query below
select * from t1 where c11  in
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30);
when table has only 200 rows and there is a primary key defined on column
c11.
At code generation time, ProjectRestrictNode has in list predicate
classified as start and stop predicate but since we found that bulk table
scan is more appropriate for this query instead of index scan(because
number of elements in the in list are 30 and table has only200 rows), we
remove the predicate from the list of start/stop predicate and put it as
restriction at ProjectRestrictNode level(the stack trace of that code
sequence is as follows). This is where we loose the information about
predicate between start/stop predicate which is right anyways because
start/stop predicate only makes sense for index scan and not for table scan.
Thread [main] (Suspended)
 ProjectRestrictNode.generateMinion(ExpressionClassBuilder, MethodBuilder,
boolean) line: 1365
 ProjectRestrictNode.generate(ActivationClassBuilder, MethodBuilder) line:
1301
 ProjectRestrictNode.generateMinion(ExpressionClassBuilder, MethodBuilder,
boolean) line: 1348
 ProjectRestrictNode.generate(ActivationClassBuilder, MethodBuilder) line:
1301
 ScrollInsensitiveResultSetNode.generate(ActivationClassBuilder,
MethodBuilder) line: 86
 CursorNode.generate(ActivationClassBuilder, MethodBuilder) line: 628
 CursorNode(StatementNode).generate(ByteArray) line: 317
 GenericStatement.prepMinion(LanguageConnectionContext, boolean, Object[],
SchemaDescriptor, boolean) line: 547
 GenericStatement.prepare(LanguageConnectionContext, boolean) line: 99
 GenericLanguageConnectionContext.prepareInternalStatement(SchemaDescriptor,
String, boolean, boolean) line: 1116
 EmbedStatement.execute(String, boolean, boolean, int, int[], String[])
line: 682
 EmbedStatement.execute(String) line: 631
 ij.executeImmediate(String) line: 367
 utilMain.doCatch(String) line: 527
 utilMain.runScriptGuts() line: 372
 utilMain.go(LocalizedInput[], LocalizedOutput) line: 245
 Main.go(LocalizedInput, LocalizedOutput) line: 229
 Main.mainCore(String[], Main) line: 184
 Main.main(String[]) line: 75
 ij.main(String[]) line: 59
restriction at ProjectRestrictNode looks like following
restriction BinaryRelationalOperatorNode  (id=1139)
 beginOffset -1
 betweenSelectivity false
 btnVis BaseTableNumbersVisitor  (id=1708)
 cm ContextManager  (id=1993)
 constantActionFactory null
 dataTypeServices DataTypeDescriptor  (id=1165)
 endOffset -1
 forQueryRewrite false
 inListProbeSource InListOperatorNode  (id=1533)
  beginOffset 22
  cm ContextManager  (id=1993)
  constantActionFactory null
  dataTypeServices DataTypeDescriptor  (id=1861)
  endOffset 113
  isOrdered true
  isPrivilegeCollectionRequired true
  lcc GenericLanguageConnectionContext  (id=1194)
  leftOperand ColumnReference  (id=1215)
  methodName "in" (id=1122)
  operator "IN" (id=1974)
  rightOperandList ValueNodeList  (id=2443)
  sortDescending false
  transformed false
  visitableTags null
 isPrivilegeCollectionRequired true
 kind 2
 kind 0
 lcc GenericLanguageConnectionContext  (id=1194)
 leftInterfaceType "org.apache.derby.iapi.types.DataValueDescriptor"
(id=1089)
 leftOperand ColumnReference  (id=1215)
 methodName "equals" (id=2102)
 operator "=" (id=1349)
 optBaseTables JBitSet  (id=1080)
 receiver null
 relOpType 1
 resultInterfaceType null
 rightInterfaceType "org.apache.derby.iapi.types.DataValueDescriptor"
(id=1089)
 rightOperand ParameterNode  (id=1783)
 transformed false
 valNodeBaseTables JBitSet  (id=1065)
 visitableTags null
 xmlQuery null
We continue in ProjectRestrictNode.generateMinion and convert the
restriction to a method call at runtime. This restriction method will be
run on every row returned from the store since store does not know anything
about restrictions since no start/stop/qualifier has been passed to bulk
table scan calls to store. So at the end of code generation, we have
generated two kinds of resultsets,
1)BulkTableScanResultSet with no qualifers/start/stop keys, so all the rows
from the table will be returned by store and
2)ProjectRestrictResultSet which will use BulkTableScanResultSet generated
earlier as the source. And for each row returned by BulkTableScanResultSet,
it will apply the restriction method generated earlier which will go
through the row returned and see if returns true for the inlist elements.
So, by this point, we have lost all the information about inlist. It is
only available indirectly as restriction method but there is no way to get
to it at this point for the way the code is written. Following is the
relevant code from ProjectRestrictResultSet.getNextRowCore
    do
    {
 candidateRow = source.getNextRowCore(); //this is the call to
BulkTableScanResultSet to get next row
 if (candidateRow != null)
 {
  beginRT = getCurrentTimeMillis();
  /* If restriction is null, then all rows qualify */
  if (restriction == null)
  {
   restrict = true;
  }
  else
  {
   setCurrentRow(candidateRow);
                ////Following is the call to restriction method to see if
row returned by
                        //BulkTableScanResultSet qualifies or not
          restrictBoolean = (DataValueDescriptor)
                                   restriction.invoke(activation);
The above blurb just explains what is happening at code generation and
execution time for an in list which was identified by optimizer as not a
good candidate for multi-probing and subsequently, table scan was found to
be the most efficient plan for it.


On Fri, Feb 7, 2014 at 3:04 AM, Dag H. Wanvik <dag.wanvik@oracle.com> wrote:

> For what it's worth, I just had the opposite problem: I want the
> ProjectRestrictNode to *not* push qualifiers down to store, and I found I
> could avoid that by making sure this test in PRN was extended with my case,
> ca line 707:
>
>         if ((restrictionList != null) && !alreadyPushed &&
> !hashJoinWithThisPRN && !validatingDeferredCheckConstraints)
>         {
>             restrictionList.pushUsefulPredicates((Optimizable)
> childResult);
>         }
>
> so somehow you'd need to get the pushDown to happen in your case.
>
> Dag
>
>
>
> On 06. feb. 2014 07:51, Mamta Satoor wrote:
>
>> Hi,
>> I have spent sometime looking through the code for DERBY-6301(SQL layer
>> should push down IN list predicates to store when doing a scan). I have
>> found that if the number of elements in the in list are fairly large,
>> optimizer will find that table scan is better than using an available
>> index. This is a good optimization since it is indeed better to use table
>> scan for such an in list query. But the problem is when we talk to the
>> store about doing table scan, we do not pass any qualifiers to the store
>> based on the in list. This causes store to lock more rows than really
>> necessary. As Mike noted down in DERBY-6301 "In addition to performance
>> considerations this presents a locking problem with respect to the
>> repeatable read isolation level. It is optimal in repeatable read to not
>> maintain locks on those
>> rows that do not qualify. Currently this locking optimization only takes
>> place for those rows that are qualified in the store vs. those qualified in
>> the upper SQL layer. So in the case of a non-multi-probe IN-LIST plan all
>> non-qualified rows looked at as part of the execution will remain locked in
>> repeatable read. "
>> During my debugging, following is where I see that there are no
>> qualifiers sent to BulkTableScanResultSet.
>> BulkTableScanResultSet(TableScanResultSet).openCore() line: 246
>>  BulkTableScanResultSet.openCore() line: 286
>>  ProjectRestrictResultSet.openCore() line: 174
>>  ProjectRestrictResultSet(BasicNoPutResultSetImpl).open() line: 266
>>  GenericPreparedStatement.executeStmt(Activation, boolean, long) line:
>> 439
>>  GenericPreparedStatement.execute(Activation, long) line: 320
>>  EmbedStatement.executeStatement(Activation, boolean, boolean) line: 1337
>>  EmbedStatement.execute(String, boolean, boolean, int, int[], String[])
>> line: 704
>>  EmbedStatement.execute(String) line: 631
>>  ij.executeImmediate(String) line: 367
>>  utilMain.doCatch(String) line: 527
>>  utilMain.runScriptGuts() line: 372
>>  utilMain.go(LocalizedInput[], LocalizedOutput) line: 245
>>  Main.go(LocalizedInput, LocalizedOutput) line: 229
>>  Main.mainCore(String[], Main) line: 184
>>  Main.main(String[]) line: 75
>>  ij.main(String[]) line: 59
>> I tried a simple query as shown below to verify that there indeed are
>> other places in the code where we pass qualifiers when doing table scan and
>> we need to mimic something similar for in list when table scan seems a
>> better option for in list. An example query which does table scan and
>> passes qualifier is as below
>> select * from t2 where c21>=1 and c21<=30;
>> Table t2 above has 4 columns with no index on any column. It has 200 rows
>> with values ranging from 1 to 200 in column c21. When i run this query in
>> ij with log query plan, I see following query plan for it.
>> Wed Feb 05 14:50:19 PST 2014 Thread[main,5,main] (XID = 1082), (SESSIONID
>> = 1), select * from t2 where c21>=1 and c21<=30 ******* Table Scan
>> ResultSet for T2 at read committed isolation level using instantaneous
>> share row locking chosen by the optimizer
>> Number of opens = 1
>> Rows seen = 30
>> Rows filtered = 0
>> Fetch Size = 16
>>  constructor time (milliseconds) = 0
>>  open time (milliseconds) = 0
>>  next time (milliseconds) = 0
>>  close time (milliseconds) = 0
>>  next time in milliseconds/row = 0
>> scan information:
>>  Bit set of columns fetched=All
>>  Number of columns fetched=4
>>  Number of pages visited=3
>>  Number of rows qualified=30
>>  Number of rows visited=200
>>  Scan type=heap
>>  start position:
>>   null
>>  stop position:
>>   null
>>  qualifiers:
>>   Column[0][0] Id: 0
>>   Operator: <
>>   Ordered nulls: false
>>   Unknown return value: true
>>   Negate comparison result: true
>>   Column[0][1] Id: 0
>>   Operator: <=
>>   Ordered nulls: false
>>   Unknown return value: false
>>   Negate comparison result: false
>>  optimizer estimated row count: 25.16
>>  optimizer estimated cost: 81.76
>> I will start investigating how are the qualifiers sent for the above
>> query using AND and see if I get the language layer to do the same for in
>> list cases with table scan and an index on the column on which in list is
>> being used. If anyone has any ideas on how to do this, I will highly
>> appreciate that
>> thanks,
>> Mamta
>>
>
>

Mime
View raw message