ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dmitriy Setrakyan (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (IGNITE-6865) Wrong map query build when using group by in both the outer and inner queries
Date Mon, 13 Nov 2017 08:47:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Dmitriy Setrakyan reassigned IGNITE-6865:
-----------------------------------------

    Assignee: Vladimir Ozerov

> Wrong map query build when using group by in both the outer and inner queries
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-6865
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6865
>             Project: Ignite
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone) 
>          Components: sql
>    Affects Versions: 2.3
>            Reporter: Alin Andrei Corodescu
>            Assignee: Vladimir Ozerov
>             Fix For: 2.4
>
>
> The Ignite SQL engine builds an incorrect SQL query for the "map" part of a distributed
query when the select statements contains 2 nested group by's. I initiated a discussion on
the ignite users mailing list : [http://apache-ignite-users.70518.x6.nabble.com/Incorrect-map-query-built-when-joining-with-a-subquery-with-group-by-statement-td17784.html]
> To reproduce the error:
> Consider a simple table with only one column: Persons(name), and two Ignite nodes, each
containing 2 rows for this table, for example :
> Node 1 :
> {code}
> { p1 : name = "A"; p2 : name = "B"}
> {code}
> Node 2 :
> {code}
> { p3 : name = "A"; p4 : name = "B"}
> {code}
> Given the query :
> {code}
> SELECT t1.name, count(1)
> FROM "default".Persons t1
> JOIN (SELECT name from "default".Persons group by name) t2
> on t1.name = t2.name
> group by t1.name
> {code}
> The query won't be executed because a wrong map query is being built :
> {code}
> SELECT
> T1__Z0.NAME AS __C0_0,
> COUNT(1) AS __C0_1
> FROM "default".PERSONS T1__Z0
> /* "default".PERSONS._SCAN */
> {code}
> The map query uses an aggregate function even though the group by has been dropped, thus
the query can't be run on any of the nodes.
> I tested with the distributedJoins=true flag, and the behaviour is still the same. When
running with collocated=true however, it works (as it is expected since the query is passed
directly to the underlying H2 engine), but only retrieves data from the current node (as it
is expected) (result = (A,1 ; B,1).
> The workaround I found for this problem is to re-write the query as follows:
> {code}
> SELECT t1.name, count(1)
> FROM (select * from "default".Persons) t1
> JOIN (SELECT name from "default".Persons group by name) t2
> on t1.name = t2.name
> group by t1.name
> {code}
> This form is completely equivalent with the previous one and works as expected, and the
data returned is correctly calculated (A, 2 ; B, 2).
> I also found out that the problem doesn't arise when Ignite decides to use indexes instead
of a table scan, so the problem seems to be related with table scans.
> Please mind the example given is simply to reproduce the error. It was identified using
real production queries with much more complicated structure, but I was able to reproduce
it using this simple table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message