ignite-dev mailing list archives

From Roman Kondakov <kondako...@mail.ru.INVALID>
Subject Re: New SQL execution engine
Date Fri, 27 Sep 2019 10:56:28 GMT
Hello Nikolay.

You've asked very good questions. I'll try to answer.

> 1. What the exact issues with the H2 integration?
> Can you send a tickets links?
> Can we label all H2 integration issues in JIRA? I propose to use "h2" label.
The current SQL engine is confined to a single-pass map-reduce algorithm. 
This makes it impossible to execute complex queries that cannot be 
expressed with a single map-reduce pass, such as subqueries with aggregates 
[1]. Another problem is that the H2 optimizer is very primitive and is not 
able to perform many useful optimizations [2].
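
To illustrate the single-pass limitation, here is a minimal sketch in plain Python. It is not Ignite or H2 code: the data, the function names and the query (roughly SELECT MAX(cnt) FROM (SELECT city, COUNT(*) AS cnt FROM person GROUP BY city)) are all hypothetical. It assumes the rows are not collocated by city, so one city's rows live on several nodes.

```python
from collections import Counter

# Hypothetical data: each inner list holds the "city" column of the
# person rows stored on one node. Rows are NOT collocated by city.
partitions = [
    ["Moscow", "Moscow", "Paris"],
    ["Moscow", "Paris", "Paris", "Paris"],
]


def single_pass_max_count(parts):
    """One map-reduce pass: every node evaluates the whole query locally
    (map), then the reducer merges the local answers with MAX."""
    local_maxima = [max(Counter(rows).values()) for rows in parts]
    return max(local_maxima)


def two_pass_max_count(parts):
    """Two passes: first fully reduce the inner GROUP BY across nodes,
    then apply the outer aggregate to the merged counts."""
    total = Counter()
    for rows in parts:
        total.update(rows)      # pass 1: merge per-city counts
    return max(total.values())  # pass 2: outer MAX over complete counts


single_pass_result = single_pass_max_count(partitions)
two_pass_result = two_pass_max_count(partitions)
```

The single pass sees only partial per-city counts on each node, so its answer differs from the two-pass one, which is the correct result for this query.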

Also, Apache Calcite is commonly used in popular Apache projects such as 
Hive, Drill, Flink and others [3]. So it is a mature and well 
battle-tested framework, while H2 is a toy database which is hardly ever 
used in real production systems.

> 2. What are the requirements for the new SQL engine?
> We should write it down and discuss.
The main requirement is to fix the problems listed above. The new SQL 
engine should be able to *efficiently* execute SQL queries of 
*arbitrary complexity*. For example, the new engine will be able to 
perform distributed joins in multiple ways [4], whereas the current engine 
can do it in only two ways: collocated and distributed (the latter is 
usually not very efficient and has to be enabled manually).
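
As a rough illustration of what "multiple ways" can mean, here is a toy Python sketch of two commonly described distributed join strategies, broadcast and repartition (shuffle). It is not taken from Ignite, Calcite or [4]; the tables, node layout and function names are hypothetical, and rows are (join_key, value) tuples with integer keys.

```python
def local_join(left, right):
    """Plain hash join of two row lists on their first column."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(k, lv, rv) for k, lv in left for rv in index.get(k, [])]


def broadcast_join(left_parts, right_parts):
    """Copy the whole right table to every node, then join locally."""
    n = len(left_parts)
    right_all = [row for part in right_parts for row in part]
    moved = len(right_all) * (n - 1)  # each row goes to the n-1 other nodes
    out = []
    for part in left_parts:
        out.extend(local_join(part, right_all))
    return sorted(out), moved


def repartition_join(left_parts, right_parts):
    """Re-distribute both tables by join key so matching rows meet."""
    n = len(left_parts)
    left_buckets = [[] for _ in range(n)]
    right_buckets = [[] for _ in range(n)]
    moved = 0
    for parts, buckets in ((left_parts, left_buckets),
                           (right_parts, right_buckets)):
        for node, part in enumerate(parts):
            for row in part:
                target = row[0] % n  # toy partitioning function
                buckets[target].append(row)
                moved += target != node  # count rows that change node
    out = []
    for lb, rb in zip(left_buckets, right_buckets):
        out.extend(local_join(lb, rb))
    return sorted(out), moved


# Hypothetical layout: two nodes, "orders" on the left, "cities" on the right.
orders = [[(1, "o1"), (2, "o2")], [(1, "o3"), (3, "o4")]]
cities = [[(1, "A")], [(2, "B")]]

broadcast_rows, broadcast_moved = broadcast_join(orders, cities)
shuffle_rows, shuffle_moved = repartition_join(orders, cities)
```

Both strategies return the same rows but move different amounts of data, so which one is cheaper depends on the table sizes. Making that choice per query is the kind of decision a cost-based optimizer would be expected to make.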

> 3. What options do we have?
> Are there any alternatives to Calcite on the market?
> We did the wrong choice that looked obvious one time.
> So we should carefully avoid it at this time.
I know of only one open source implementation of an efficient query 
optimization strategy, and that is Apache Calcite. The alternative 
is to write our own query optimizer from scratch, which is not a trivial 
task at all.


> 4. What is improvements of Ignite we want to make with the new engine?
Ignite will be able to execute complex queries using an optimal strategy. I 
think this is quite a good improvement.


[1] https://issues.apache.org/jira/browse/IGNITE-11448
[2] https://issues.apache.org/jira/browse/IGNITE-6085
[3] https://calcite.apache.org/docs/powered_by.html
[4] https://www.memsql.com/blog/scaling-distributed-joins/
-- 
Kind Regards
Roman Kondakov

On 27.09.2019 12:20, Nikolay Izhikov wrote:
> Hello, Igor.
>
> Thanks for starting this discussion.
>
> I think we should take a step back in it and answer the following questions:
>
> 1. What the exact issues with the H2 integration?
> Can you send a tickets links?
> Can we label all H2 integration issues in JIRA? I propose to use "h2" label.
>
> 2. What are the requirements for the new SQL engine?
> We should write it down and discuss.
>
> 3. What options do we have?
> Are there any alternatives to Calcite on the market?
> We did the wrong choice that looked obvious one time.
> So we should carefully avoid it at this time.
>
> 4. What is improvements of Ignite we want to make with the new engine?
>
>
> On Fri, 27/09/2019 at 08:44 +0000, Igor Seliverstov wrote:
>> Hi Igniters!
>>
>> As you might know currently we have many open issues relating to current H2 based
>> engine and its execution flow.
>>
>> Some of them are critical (like impossibility to execute particular queries), some
>> of them are majors (like impossibility to execute particular queries without pre-preparation
>> your data to have a collocation) and many minors.
>>
>> Most of the issues cannot be solved without whole engine redesign.
>>
>> So, here the proposal: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130028084
>>
>> I'll appreciate if you share your thoughts on top of that.
>>
>> Regards,
>> Igor
