db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Craig Chaney (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-6819) incorrect results from a multi-column group by query
Date Thu, 18 Jun 2015 00:14:01 GMT

    [ https://issues.apache.org/jira/browse/DERBY-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14590898#comment-14590898
] 

Craig Chaney commented on DERBY-6819:
-------------------------------------

I was able to work around this by changing the order of the joins, like:

select e.PART_NAME as ENGINE, b.PART_NAME as BODY, count(distinct m.ID) as NUM_MODELS from
CARS.MODELS m
left outer join CARS.TOP_SPEED s on m.ID=s.ID
left outer join CARS.PART_NAMES e on m.ENGINE_ID=e.ID
left outer join CARS.PART_NAMES b on m.BODY_ID=b.ID
where s.SPEED>50
group by e.PART_NAME, b.PART_NAME

> incorrect results from a multi-column group by query
> ----------------------------------------------------
>
>                 Key: DERBY-6819
>                 URL: https://issues.apache.org/jira/browse/DERBY-6819
>             Project: Derby
>          Issue Type: Bug
>          Components: JDBC, SQL
>    Affects Versions: 10.10.2.0, 10.11.1.1
>         Environment: Ubuntu Linux, Java 6, JDBC
>            Reporter: Craig Chaney
>         Attachments: sample_db.sql
>
>
> I'm getting some strange results from a specific pattern of queries related to finding
a count of columns grouped in two dimensions.
> The following query works as I would expect:
> select e.PART_NAME as ENGINE, b.PART_NAME as BODY, count(distinct m.ID) as NUM_MODELS
from CARS.MODELS m
> left outer join CARS.PART_NAMES e on m.ENGINE_ID=e.ID
> left outer join CARS.PART_NAMES b on m.BODY_ID=b.ID
> group by e.PART_NAME, b.PART_NAME
> This returns something like:
> ENGINE	BODY	NUM_MODELS
> electric	compact	1
> gas	compact	2
> gas	sedan	1
> gas	truck	2
> hybrid	compact	1
> hybrid	sedan	2
> So this fictitious car company sells 2 different hybrid sedans, one gas sedan etc.
> If I add a filter to the query that should not actually change the output, I see output
that doesn't make sense.
> This query filters out any car whose top speed is less than 50 (and all cars have a top
speed higher than this):
> select e.PART_NAME as ENGINE, b.PART_NAME as BODY, count(distinct m.ID) as NUM_MODELS
from CARS.MODELS m
> left outer join CARS.PART_NAMES e on m.ENGINE_ID=e.ID
> left outer join CARS.PART_NAMES b on m.BODY_ID=b.ID
> left outer join CARS.TOP_SPEED s on m.ID=s.ID
> where s.SPEED>50
> group by e.PART_NAME, b.PART_NAME
> The results show the wrong values in column 2:
> ENGINE	BODY	NUM_MODELS
> electric	electric	1
> gas	gas	2
> gas	gas	1
> gas	gas	2
> hybrid	hybrid	1
> hybrid	hybrid	2
> I've tried the same query on DB2 with the same data and I get the results that I expect
-- that is, both queries return the same result that I showed on the first query here.
> I'll attach a script that creates a database with the sample data used above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message