kylin-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "lianle987 (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-2983) Support for INNER JOINS in any order in the FROM clause
Date Sun, 25 Nov 2018 09:05:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698113#comment-16698113
] 

lianle987 commented on KYLIN-2983:
----------------------------------

also need this feature.

> Support for INNER JOINS in any order in the FROM clause
> -------------------------------------------------------
>
>                 Key: KYLIN-2983
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2983
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Query Engine
>    Affects Versions: v2.0.0, v2.1.0
>            Reporter: Roberto Tardío Olmos
>            Assignee: liyang
>            Priority: Major
>              Labels: features, scope
>
> Kylin versions after Kylin 1.6 (2.0 and 2.1) do not support queries over dimensions with
INNER JOINS if you do not put first the fact table in the FROM clause. I think this can be
related with new snowflake schema support. 
> For example. I defined a data model over a hive star schema. Therefore I defined INNER
JOINS between fact table (F_RENDIMIENTO) and dimension tables (D_CURSO_ACADEMICO_VK). I also
defined a cube and built it. 
> The issue is, the following query goes:
> _select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, sum(CREDITOS)
>  from F_RENDIMIENTO JOIN D_CURSO_ACADEMICO_VK ON F_RENDIMIENTO.ID_CURSO_ACADEMICO = D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
>  group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO_
> But susprisingly if I change the INNER JOIN order the following query does not go
> _
> select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO, sum(CREDITOS)
> from D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON F_RENDIMIENTO.ID_CURSO_ACADEMICO = D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
> group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO_
> {color:red}Error while executing SQL "select D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO,
sum(CREDITOS) from D_CURSO_ACADEMICO_VK JOIN F_RENDIMIENTO ON F_RENDIMIENTO.ID_CURSO_ACADEMICO
= D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO group by D_CURSO_ACADEMICO_VK.ID_CURSO_ACADEMICO
LIMIT 50000": No realization found for rel#7393:OLAPTableScan.OLAP.[](table=[DM_ACAD_KYLIN_ORC,
D_CURSO_ACADEMICO_VK],fields=[0, 1]), JoinDesc [type=INNER, primary_key=[ID_CURSO_ACADEMICO],
foreign_key=[ID_CURSO_ACADEMICO]]{color}
> Since this issue is possible related to snowflake schema support, I suggest to differentiate
in metadata and then, kylin query engine, between this two types of schemas, in order to allow
this kind of queries over a Star Schema. Another possibility is to check on query engine if
there is a INNER JOIN, independently of the kind of Hive Source Schema. What do you think
about?
> This issue could lead to human errors writing ANSI-92 SQL queries. Furthermore some BI
tools, specially old ones, can generate this kind of queries not supported by Kylin 2.0 and
2.1.
> Thanks!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message