db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta A. Satoor (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3872) NullPoinerException thrown when INTEGER function used as a predicate in a WHERE clause of a SELECT .. GROUP BY .. HAVING statement
Date Mon, 29 Sep 2008 16:53:44 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635443#action_12635443
] 

Mamta A. Satoor commented on DERBY-3872:
----------------------------------------

I have spent more time on the execution side of the query in question to see why exactly we
are running into NPE and following is what I found. At the high level, when a query is executed
at the JDBC layer, Derby execution code goes through opening the language resultsets associated
with the query. This code is in the method open() for the topmost language resultset and for
all the language resultsets underneath the topmost resultset, openCore() method gets called.
When the user natigates throught the JDBC resultset after the query execution, Derby execution
code goes through getNextRow() method for the topmost language resultset and getNextRowCore()
method for the resultsets underneath the topmost resultset. There are some exceptions in Derby
code where the row fetching on the resultsets happen during the query execution (ie even before
the user has requested say "next" on the JDBC resultset). One such example is when we have
GroupedAggregateRS which has it's isInSortedOrder flag set to true. In this specific query,
this flag is set to true indicating a performance optimization to avoid sorting and this pre-fetching
of the rows at the time of JDBC query execution is resulting in NPE. More detailed information
below.

In the following discussion 
	RS stands for generated language resultset.
	PRS stands for ProjectRestrictedRS.
	RSN stands for resultset number associated with the RS.
	LeftRS stands for left RS.
	RightRS stands for right RS.

The query in question which results in NPE is as follows
select q1."DEPTNO" from DEPTTAB q1, EMPTAB q2 -- DERBY-PROPERTIES joinStrategy = HASH 
where ( q2."DEPT_DEPTNO" = q1."DEPTNO") 
GROUP BY q1."DEPTNO" HAVING 1 = 
  ( select q3."SALARY" from EMPTAB q3 where ( q3."EMPID" = q1."DEPTNO") ); 


The topmost generated RS for the entire query is a PRS(RSN = 6) and the source for this topmost
PRS is GroupedAggregateRS (RSN = 1). The topmost PRS has restriction associated with it. The
restriction handles just the HAVING clause of the query. The rest of the query (before the
HAVING clause) is handled through GroupedAggregateRS (RSN = 1).

To be specific, GroupedAggregateRS (RSN=1) covers the following part of the query
select q1."DEPTNO" from DEPTTAB q1, EMPTAB q2 -- DERBY-PROPERTIES joinStrategy = HASH 
where ( q2."DEPT_DEPTNO" = q1."DEPTNO") 
GROUP BY q1."DEPTNO" 
The generated RSs for this part of the query look as follows
   The topmost generated RS for the entire query is a PRS(RSN = 6)
	The source for PRS is GroupedAggregateRS (RSN = 1)
		The source for GroupedAggregateRS is PRS (RSN = 5)
			The source for PRS is HashJoinRS (RSN = 2)
				LeftRS is BulkTableScanRS (RSN = 3) DEPTTAB q1***
				RightRS is HashScanRS (RSN = 4) EMPTAB q2 ***Has qualifier associated with it.

The remaining part of the query as shown below is covered by the "restriction" field in PRS
(RSN=6)
HAVING 1 = ( select q3."SALARY" from EMPTAB q3 where ( q3."EMPID" = q1."DEPTNO") ); 
The generated RSs for the "restriction" on PRS (RSN=6) looks as follows
  The topmost RS for 2nd part of query is OnceRS (RSN = 10)
     The source for OnceRS is PRS (RSN = 9)
        The source for PRS is IndexRowToBaseRowRS (RSN = 7) for EMPTAB q3
           The source for IndexRowToBaseRowRS is TableScanRS (RSN = 8)**has startKeyGetter
associated with it.

When the query is executed from JDBC layer (say using PreparedStatement.execute), we start
by calling the open() method (which calls openCore) on the topmost RS which is PRS (RSN=6).
On all the subsequent RSs underneath the topmost RS, we will be calling the openCore() method.
The openCore method on PRS (RSN=6) calls the openCore method on GroupedAggregateRS(RSN=1)
which calls openCore on PRS (RSN=5) which makes a call to openCore on HashJoinRS(RSN=2). The
**interesting thing here is** HashJoinRS calls openCore on the leftRS and then it **gets the
next row on that leftRS**. This results in call to BulkTableScanRS(RSN=3).getNextRowCore The
row that is obtained here is set as the current row for this RS in the activation object.
So, the current rows for various RSs in activation object at this point looks as follows
[null, null, null, { 1 }, null, null, null, null, null, null, null]

Then the openCore method on HashJoinRS(RSN=2) calls openCore method on the rightRS(RSN=4)
which is a HashScanRS. HashScanRS in it's openCore method deals with the qualifiers associated
with it. In this particular query, the qualifier code looks at the current row associated
with the leftRS in the activation object. The call to look at current row for leftRS in activation
object happens in BaseActivation.getColumnFromRow(int, int) line: 1419	
Once this work is done for the rightRS(RSN=4), the control goes back to openCore in GroupedAggregateRS
(RSN=1). We check if isInSortedOrder is set to true for GroupedAggregateRS which in this particular
case is set to true. Because of that, we execute following piece of code
	if (isInSortedOrder)
	{
		currSortedRow = getNextRowFromRS();
		if (currSortedRow != null)
		{
			currSortedRow = (ExecIndexRow) currSortedRow.getClone();
			initializeVectorAggregation(currSortedRow);
		}
	}
Notice that the above piece of code is causing us to start getting rows from the resultset
even before the JDBC user has requested a next on the JDBC ResultSet. The code above causes
us to get next row in HashJoinRS rightRS(RSN=4) because we have already gotten the next row
on the leftRS earlier. Once we have the current row for HashJoinRS, we set it as the current
row for this RS in the activation object. So, the current rows for various RSs in activation
object at this point looks as follows 
[null, null, null, { 1 }, { 1 }, null, null, null, null, null, null]

Now that we have the current row for the leftRS and the rightRS, we merge the 2 rows and establish
the current row for HashJoinRS(RSN=2) and once again, we set this current row for HashJoinRS
in the activation object. So, the current rows for various RSs in activation object at this
point looks as follows
[null, null, { 1, 1 }, { 1 }, { 1 }, null, null, null, null, null, null]

Next, we set the current row of the PRS(RSN=5) and the current rows for various RSs in activation
object at this point looks as follows
[null, null, { 1, 1 }, { 1 }, { 1 }, { 1 }, null, null, null, null, null]

Once the current rows for various RSs associated with GroupedAggregateRS and RSs underneath
are set, the openCore processing is over. We do not set the current row for the GroupedAggregateRS
in openCore, it wil lbe done when the JDBC user invokes next on the JDBC resultset. The code
associated with HAVING clause also gets run when the JDBC user requests next on the JDBC ResultSet.
The code path for that next request is described below and that is where we run into NPE.


All the language RSs have methods called getNextRow() and getNextRowCore(). getNextRow() gets
called only on the topmost RS. getNextRowCore() gets called on RSs underneath the topmost
RS. We start out with following code flow
	PRN(RSN=6).getNextRow
	  PRN(RSN=6).getNextRowCore
	    GroupedAggregateRS(RSN=1).getNextRowCore
	      PRN(RSN=5).getNextRowCore
	        HashJoinRS(RSN=2).getNextRowCore
	          rightRS which is HashScanRS(RSN=4).getNextRowCore

There are no more rows in EMPTAB this time which is what HashScanRS (RSN=4) is dealing with
and hence the current row for it will be null and it will set to that in the activation object
as well as shown below.
[null, null, { 1, 1 }, { 1 }, null, { 1 }, null, null, null, null, null]
Since there is no row on the rightRS, we check if there is a row on the leftRS(BulkTableScanRS
(RSN=3)). There is no row in the leftRS also. We mark it so in the activation object.
[null, null, { 1, 1 }, null, null, { 1 }, null, null, null, null, null]

Absence of row on the leftRS results in closing the rightRS. Also, since there is no row on
the leftRS and rightRS, we clear the current row for the HashJoinRS and also mark current
row null for it in the activation object as shown below.
[null, null, null, null, null, { 1 }, null, null, null, null, null]
This current row null setting goes up to PRS(RSN=5) which is on top of HashJoinRS as shown
below
[null, null, null, null, null, null, null, null, null, null, null]
GroupedAggreateRS.getNextRowCore code sees that there are no more rows that could be used
aggregation and hence it settles with the row prior to finding the null row. The activation
object reflects that as follows
[null, { 1 }, null, null, null, null, null, null, null, null, null]

At this point, the only RS that has a current row set on it is the GroupedAggregateRS and
control of the code at this point is in the topmost RS PRS(RSN=6) 
	ProjectRestrictResultSet.getNextRowCore() line: 266
Here we check if there is any restriction associated with this PRS and in this particular
query, the HAVING clause is implemented as the restriction on the PRS. We first set the current
row for this PRS to be what we got from the GroupedAggregateRS and next we will apply restriction
to that row. The activation object looks as follows
[null, { 1 }, null, null, null, null, { 1 }, null, null, null, null]
The code for applying the restriction is as follows in PRS.getNextRowCore() line: 267
            restrictBoolean = (DataValueDescriptor) 
			restriction.invoke(activation);
The execution of the code above causes us the create the RSs associated with the HAVING clause
which are as follows	
The topmost RS for 2nd part of query is OnceRS (RSN = 10)
  The source for OnceRS is PRS (RSN = 9)
    The source for PRS is IndexRowToBaseRowRS (RSN = 7) for EMPTAB q3
      The source for IndexRowToBaseRowRS is TableScanRS (RSN = 8)**has startKeyGetter associated
with it.

We go through the process of opening each one of these RSs by calling the openCore methods
on them. The NullPointerException happens when we are opening the TableScanRS(RSN=8) which
has startKeyGetter associated with it. We try to find the startPosition using the startKeyGetter
as shown below
		if (startKeyGetter != null)
		{
			startPosition = (ExecIndexRow) startKeyGetter.invoke(activation);
			if (sameStartStopPosition)
			{
				stopPosition = startPosition;
			}
		}
The startKeyGetter.invoke causes us to look at the current row of the RSN=3 which is the RS
associated with the BulkTableScanRS for "DEPTTAB q1" but since the current row for it in activation
object has been set to null, we run into NPE. 
Thread [main] (Suspended (breakpoint at line 1419 in BaseActivation))	
	acf81e0010x011cxa07fxbcfex0000003f01400(BaseActivation).getColumnFromRow(int, int) line:
1419	
	acf81e0010x011cxa07fxbcfex0000003f01400.e6() line: not available	
	DirectCall.invoke(Object) line: 151	
	TableScanResultSet.openCore() line: 250	
	IndexRowToBaseRowResultSet.openCore() line: 202	
	ProjectRestrictResultSet.openCore() line: 168	
	OnceResultSet.openCore() line: 119	
	acf81e0010x011cxa07fxbcfex0000003f01400.g0() line: not available [local variables unavailable]

	acf81e0010x011cxa07fxbcfex0000003f01400.e4() line: not available	
	DirectCall.invoke(Object) line: 147	
	ProjectRestrictResultSet.getNextRowCore() line: 267	
	ProjectRestrictResultSet(BasicNoPutResultSetImpl).getNextRow() line: 460	
	EmbedResultSet40(EmbedResultSet).movePosition(int, int, String) line: 423	
	JDBCDisplayUtil.indent_DisplayResults(PrintWriter, ResultSet, Connection, int, int[], int[])
line: 338	
	JDBCDisplayUtil.DisplayResults(PrintWriter, Statement, Connection) line: 229	
	utilMain.doCatch(String) line: 509	
	JDBCDisplayUtil.indent_DisplayResults(PrintWriter, Statement, Connection, int, int[], int[])
line: 241	
	JDBCDisplayUtil.DisplayResults(PrintWriter, Statement, Connection) line: 229	
	utilMain.displayResult(LocalizedOutput, ijResult, Connection) line: 435	
	utilMain.go(LocalizedInput[], LocalizedOutput, Properties) line: 248	
	utilMain.runScriptGuts() line: 350	
	Main.mainCore(String[], Main) line: 181	
	Main.go(LocalizedInput, LocalizedOutput, Properties) line: 215	
	Main.mainCore(String[], Main) line: 181	
	Main.main(String[]) line: 73	
	ij.main(String[]) line: 59	

Now, that I understand what is causing the NPE to happen, the next step is to figure out should
the current row for the RS for DEPTTAB be null at this point when we know from the query that
it needs to be available for us to enforce HAVING clause.

I realize that this is a very long comment but I wanted to share the code flow during the
query execution to pinpoint what is exactly happening. 

As always, will appreciate any insight in understanding should the row which is needed to
enforce the HAVING clause be null. Is the issue with code generation where we are trying to
avoid sorting when the rows used for sorting are actually needed for the HAVING clause enforcement.

> NullPoinerException thrown when INTEGER function used as a predicate in a WHERE clause
of a SELECT .. GROUP BY ..  HAVING statement
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3872
>                 URL: https://issues.apache.org/jira/browse/DERBY-3872
>             Project: Derby
>          Issue Type: Bug
>    Affects Versions: 10.3.3.0
>            Reporter: Stan Bradbury
>            Assignee: Mamta A. Satoor
>         Attachments: NPE_Reproduction.sql, QueryWithoutTruePred.out, QueryWithTruePred.out
>
>
> Use attached SQL script to create two tables , execute the following SQL and throw the
exception and stack trace below.  NOTE:  removing the 'always true' clause '.. ( integer (1.1)
= 1) .." from the SQL and the query does not fail.  Releated??
> select  q1."DEPTNO" from DEPTTAB q1, EMPTAB q2 where  ( integer (1.1) = 1)  and  ( q2."DEPT_DEPTNO"
=  q1."DEPTNO")  
> GROUP BY q1."DEPTNO" 
> HAVING  max( q2."SALARY") >=  ( select  q3."SALARY" from EMPTAB q3 where  ( q3."EMPID"
=  q1."DEPTNO") ) 
> ERROR 38000: The exception 'java.lang.NullPointerException' was thrown while evaluating
an expression.
> . . .derby.iapi.error.StandardException.newException
> . . .derby.iapi.error.StandardException.unexpectedUserException
> . . .derby.impl.services.reflect.DirectCall.invoke
> . . .derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore
> . . .derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow
> . . .derby.impl.jdbc.EmbedResultSet.movePosition
> . . .derby.impl.jdbc.EmbedResultSet.next
> . . .derby.tools.JDBCDisplayUtil.indent_DisplayResults
> . . .derby.tools.JDBCDisplayUtil.indent_DisplayResults
> . . .derby.tools.JDBCDisplayUtil.indent_DisplayResults
> . . .derby.tools.JDBCDisplayUtil.DisplayResults
> . . .derby.impl.tools.ij.utilMain.displayResult
> . . .derby.impl.tools.ij.utilMain.doCatch
> . . .derby.impl.tools.ij.utilMain.runScriptGuts
> . . .derby.impl.tools.ij.utilMain.go
> . . .derby.impl.tools.ij.Main.go
> . . .derby.impl.tools.ij.Main.mainCore
> . . .derby.impl.tools.ij.Main14.main
> . . .derby.tools.ij.main
> Caused by: java.lang.NullPointerException
> . . .derby.iapi.types.NumberDataType.compare
> . . .derby.impl.store.access.btree.ControlRow.compareIndexRowFromPageToKey
> . . .derby.impl.store.access.btree.ControlRow.searchForEntry
> . . .derby.impl.store.access.btree.LeafControlRow.search
> . . .derby.impl.store.access.btree.BTreeScan.positionAtStartForForwardScan
> . . .derby.impl.store.access.btree.BTreeForwardScan.positionAtStartPosition
> . . .derby.impl.store.access.btree.BTreeForwardScan.fetchRows
> . . .derby.impl.store.access.btree.BTreeScan.fetchNext
> . . .derby.impl.sql.execute.TableScanResultSet.getNextRowCore
> . . .derby.impl.sql.execute.IndexRowToBaseRowResultSet.getNextRowCore
> . . .derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore
> . . .derby.impl.sql.execute.OnceResultSet.getNextRowCore
> . . .derby.exe.ac601a400fx011cx480cx5eacx00000010d8100.g0
> . . .derby.exe.ac601a400fx011cx480cx5eacx00000010d8100.e6
> 	... 17 more
> ============= begin nested exception, level (1) ===========
> java.lang.NullPointerException
> . . .derby.iapi.types.NumberDataType.compare
> . . .derby.impl.store.access.btree.ControlRow.compareIndexRowFromPageToKey
> . . .derby.impl.store.access.btree.ControlRow.searchForEntry
> . . .derby.impl.store.access.btree.LeafControlRow.search
> . . .derby.impl.store.access.btree.BTreeScan.positionAtStartForForwardScan
> . . .derby.impl.store.access.btree.BTreeForwardScan.positionAtStartPosition
> . . .derby.impl.store.access.btree.BTreeForwardScan.fetchRows
> . . .derby.impl.store.access.btree.BTreeScan.fetchNext
> . . .derby.impl.sql.execute.TableScanResultSet.getNextRowCore
> . . .derby.impl.sql.execute.IndexRowToBaseRowResultSet.getNextRowCore
> . . .derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore
> . . .derby.impl.sql.execute.OnceResultSet.getNextRowCore
> . . .derby.exe.ac601a400fx011cx480cx5eacx00000010d8100.g0
> . . .derby.exe.ac601a400fx011cx480cx5eacx00000010d8100.e6
> . . .derby.impl.services.reflect.DirectCall.invoke
> . . .derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore
> . . .derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow
> . . .derby.impl.jdbc.EmbedResultSet.movePosition
> . . .derby.impl.jdbc.EmbedResultSet.next
> . . .derby.tools.JDBCDisplayUtil.indent_DisplayResults
> . . .derby.tools.JDBCDisplayUtil.indent_DisplayResults
> . . .derby.tools.JDBCDisplayUtil.indent_DisplayResults
> . . .derby.tools.JDBCDisplayUtil.DisplayResults
> . . .derby.impl.tools.ij.utilMain.displayResult
> . . .derby.impl.tools.ij.utilMain.doCatch
> . . .derby.impl.tools.ij.utilMain.runScriptGuts
> . . .derby.impl.tools.ij.utilMain.go
> . . .derby.impl.tools.ij.Main.go
> . . .derby.impl.tools.ij.Main.mainCore
> . . .derby.impl.tools.ij.Main14.main
> . . .derby.tools.ij.main
> ============= end nested exception, level (1) ===========
> Cleanup action completed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message