db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Nielsen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3603) 'IN' clause ignores valid results, incorrect qualifier handling suspected
Date Fri, 11 Apr 2008 09:18:05 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12587894#action_12587894
] 

Thomas Nielsen commented on DERBY-3603:
---------------------------------------

Looking at the queryplans I see something quite interesting.

Both queries S2 and S3 end up with the same query plan. The top looking like this:

                              Scalar
                                    |
                                 PRN
                                    |
                     NestedLoop ExistsJoin
                              /                     \
NestedLoopExistsJoin        IndexScanResultSet
   /         \                                          |
...           ...                                      ...

The interesting stuff happens in the lower, right-hand IndexScanResultSet (let's call it ISR1)

In the correct 3-row query S3, ISR1 sees 3 rows and qualifies 3 rows.
In the incorrect 2-row query S2, ISR1 seens 3 rows, but only qualifies 2 of them.

So for S2, this becomes
                              Scalar
                                 ("2")
                                    |
                                 PRN
                               (2 rows)
                                    |
                     NestedLoop ExistsJoin
                    (3 rows L, 2 rows R)
                              /                     \
NestedLoopExistsJoin        IndexScanResultSet
(3 rows L, 3 rows R)             (sees 3 rows, qualifies 2 rows)
   /         \                                          |
...           ...                                      ...

The query plan for both S2 and S3 state that ISR1 is an
"Index Scan ResultSet for ADMIN_UNITS using constraint ADMIN_UNIT_ID_PK at read committed
isolation level using share row locking chosen by the optimizer".

Next task would be to find out why ISR1 rejects/discards the third row in S2, and not in S3.

> 'IN' clause ignores valid results, incorrect qualifier handling suspected
> -------------------------------------------------------------------------
>
>                 Key: DERBY-3603
>                 URL: https://issues.apache.org/jira/browse/DERBY-3603
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.3.2.1, 10.4.1.1
>            Reporter: David Butterworth
>         Attachments: derbydb.jar, derbydb.tar.bz2
>
>
> Derbys' 'IN' clause is returning different results depending on which side of a joined
table
> I am doing my 'IN' comparison against. This only occurs when the number of items within
the 'IN' clause is greater then 1.
> This behaviour was also confirmed by Bryan Pendleton in this thread:
> http://mail-archives.apache.org/mod_mbox/db-derby-user/200804.mbox/%3c47FA5974.2060705@amberpoint.com%3e
> Using the test database attatched the following 2 queries produce the issue:
> ij>  select count(*) FROM spike.accounts account, spike.admin_units admin_unit,
>     spike.bookings booking
>     WHERE booking.child_id = 2 AND
>     admin_unit.admin_unit_id IN (1,21) AND
>     booking.booking_date_time_out >= 20080331000000 AND
>     booking.booking_date_time_in <= 20080406235900 AND
>     account.account_id = booking.account_id AND
>     admin_unit.admin_unit_id = account.admin_unit_id;
> 1          
> -----------
> 2          
> 1 row selected
> ij> select count(*) FROM spike.accounts account, spike.admin_units admin_unit,
>     spike.bookings booking
>     WHERE booking.child_id = 2 AND
>     account.admin_unit_id IN (1,21) AND
>     booking.booking_date_time_out >= 20080331000000 AND
>     booking.booking_date_time_in <= 20080406235900 AND
>     account.account_id = booking.account_id AND
>     admin_unit.admin_unit_id = account.admin_unit_id;
> 1          
> -----------
> 3          
> 1 row selected
> ij> 
> The only difference between the 2 statements is which side of a join the 'IN' clause
is matched against.
> Bryan performed some initial testing and stated the following:
> --------------------- SNIP ------------------------
> Interestingly, although the actual results do NOT contain any values
> for admin_unit_id = 21, if I change the query to:
>     admin_unit.admin_unit_id IN (1)
> or
>     account.admin_unit_id IN (1)
> then the problem disappears -- I get 3 rows for both queries.
> I also ran query plans for both the queries (in the IN (1,21) case)
> and have pasted the (simplified) query plans at the end of this message.
> I notice that in the case where the query gives 2 rows, which is
> when we specify admin_unit.admin_unit_id in (1,21), the admin_unit_id
> index scan output in the query plan contains:
>            qualifiers:
> Column[0][0] Id: 0
> Operator: =
> Ordered nulls: false
> Unknown return value: false
> Negate comparison result: false
> However, in the case where the query gives 3 rows, which is
> when we specify account.admin_unit_id in (1,21), the admin_unit_id
> index scan output in the query plan contains:
>            qualifiers:
> None
> I think it is the presence/absence of this qualifier on the query
> scan which is causing the different results in the query, as in
> the first case we see:
>            Number of rows qualified=2
>            Number of rows visited=3
> but in the second case we see:
>            Number of rows qualified=3
>            Number of rows visited=3
> I definitely don't have any explanation for why you are getting
> this odd behavior; it certainly seems like a bug to me.
> -------------END SNIP -----------------------

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message