ignite-dev mailing list archives

From Roman Kondakov <kondako...@mail.ru.INVALID>
Subject Re: New SQL execution engine
Date Fri, 27 Sep 2019 12:10:50 GMT
Hello Nikolay,

Please see IEP-37 [1]. The issues are listed there.


[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130028084


-- 
Kind Regards
Roman Kondakov

On 27.09.2019 14:20, Nikolay Izhikov wrote:
> Hello, Roman.
>
>> Also Apache Calcite is commonly used in popular Apache projects
> I don't think it's a good point.
> H2 is also commonly used.
> But, it doesn't conform to Ignite requirements.
>
> Could you please write down the issues and engine requirements in the IEP?
> So we can discuss each point separately.
>
>
> On Fri, 27/09/2019 at 13:56 +0300, Roman Kondakov wrote:
>> Hello Nikolay.
>>
>> You've asked very good questions. I'll try to answer.
>>
>>> 1. What exactly are the issues with the H2 integration?
>>> Can you send ticket links?
>>> Can we label all H2 integration issues in JIRA? I propose to use the "h2" label.
>> The current SQL engine is confined to a single-pass map-reduce algorithm.
>> This makes it impossible to execute complex queries that cannot be
>> expressed with a single map-reduce pass, such as subqueries with
>> aggregates [1]. Another problem is that the H2 optimizer is very
>> primitive and cannot perform many useful optimizations [2].
>>
>> Also, Apache Calcite is commonly used in popular Apache projects like
>> Hive, Drill, Flink and others [3]. So it is a mature and well
>> battle-tested framework, while H2 is a toy database which is hardly
>> ever used in real production systems.
>>
>>> 2. What are the requirements for the new SQL engine?
>>> We should write it down and discuss.
>> The main requirement is to fix the problems listed above. The new SQL
>> engine should be able to *effectively* execute SQL queries of
>> *arbitrary complexity*. For example, the new engine will be able to
>> perform distributed joins in multiple ways [4], whereas the current
>> engine can do it in only two ways: collocated and distributed (the
>> latter is usually not very efficient and has to be enabled manually).
>>
>>> 3. What options do we have?
>>> Are there any alternatives to Calcite on the market?
>>> We once made a wrong choice that looked obvious at the time.
>>> So we should be careful to avoid that this time.
>> I know of only one open source implementation of an efficient query
>> optimization strategy, and that is Apache Calcite. The alternative
>> is to write our own query optimizer from scratch, which is not a
>> trivial task at all.
>>
>>
>>> 4. What improvements to Ignite do we want to make with the new engine?
>> Ignite will be able to execute complex queries using an optimal
>> strategy. I think this is quite a good improvement.
>>
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-11448
>> [2] https://issues.apache.org/jira/browse/IGNITE-6085
>> [3] https://calcite.apache.org/docs/powered_by.html
>> [4] https://www.memsql.com/blog/scaling-distributed-joins/
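
The single-pass map-reduce limitation mentioned in the thread can be illustrated with a toy model. The sketch below is not Ignite or H2 code: the node layout, the data, and the function names are all hypothetical. It assumes rows are not collocated by the grouping key and evaluates something like SELECT AVG(cnt) FROM (SELECT COUNT(*) AS cnt FROM t GROUP BY customer), once with a single map plus reduce and once with an extra pass that merges per-customer counts before the outer aggregate runs.

```python
from collections import Counter

# Hypothetical data: each inner list holds the customer ids stored on one node.
# Customers "a" and "b" appear on both nodes, so the data is NOT collocated.
NODES = [
    ["a", "a", "b"],  # node 1
    ["a", "b", "c"],  # node 2
]


def map_step(rows):
    """Inner query on one node: COUNT(*) ... GROUP BY customer (partial counts)."""
    return Counter(rows)


def single_pass_avg(nodes):
    """One map + one reduce: the outer AVG is fed by per-node partial groups."""
    total = 0
    groups = 0
    for rows in nodes:
        partial = map_step(rows)
        total += sum(partial.values())
        groups += len(partial)  # a customer present on two nodes is counted twice
    return total / groups


def two_pass_avg(nodes):
    """Pass 1 merges partial counts per customer; pass 2 runs the outer AVG."""
    merged = Counter()
    for rows in nodes:
        merged.update(map_step(rows))
    return sum(merged.values()) / len(merged)


if __name__ == "__main__":
    print("single pass:", single_pass_avg(NODES))
    print("two passes: ", two_pass_avg(NODES))
```

In this toy setup the two functions disagree because the single pass never sees the complete per-customer counts. That is the kind of query shape that needs more than one map-reduce step.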
