ignite-dev mailing list archives

From Seliverstov Igor <gvvinbl...@gmail.com>
Subject Re: New SQL execution engine
Date Fri, 27 Sep 2019 15:08:11 GMT
Nikolay,

At last we have better questions.

There is no decision yet; this is where we should make one.

Doing nothing isn’t a decision, it’s just doing nothing.

Spark Catalyst is a good example, but under the hood it is built on exactly the same idea,
adapted to Spark. Calcite is the same, but general-purpose. That’s why it’s a better starting point.

Implementing an engine from scratch is really cool, but it looks like reinventing the wheel; I don’t
think it makes sense. At least, I am against this option.

I added the requirements to the IEP (as you asked). As you can see, it’s in DRAFT state and will be
supplemented with details.

We have some thoughts on how to make the replacement smooth, but first we should decide what
to replace and what to replace it with.

At the moment the Calcite-based engine lives in a separate module. We checked that it can build
an execution graph for both local and distributed cases, and it has good extensibility.
We talked to the Calcite community to identify possible future issues, and everything points to
the fact that it’s the best option.
It’s possible to develop it as an experimental extension at first (not a replacement) until
we make sure that it works as expected. This way there are no risks for anybody who uses Ignite
in a production environment.

Regards,
Igor


> On 27 Sep 2019, at 17:25, Nikolay Izhikov <nizhikov@apache.org> wrote:
> 
> Igor.
> 
>> The main issue - there is no *selection*.
> 
> 1. I don't remember a community decision about this.
> 
> 2. We should avoid making such a long-term decision so quickly.
> We made this kind of decision with H2 and have come to the point where we have to review it.
> 
>> 1) Implementing white papers from scratch
>> 2) Adapting Calcite to our needs.
> 
> The third option doesn't fix the issues we have with H2.
> The fourth option I know of is using spark-catalyst.
> 
> What is wrong with writing an engine from scratch?
> 
> I ask you to start with the engine requirements.
> Can we please discuss them?
> 
>> If you have an alternative - you're welcome, I'll gratefully listen to you.
> 
> We have an alternative for now: the H2-based engine.
> 
>> The main question isn't "WHAT" but "HOW" - that's the discussion topic from my point of view.
> 
> Once we make a decision about the engine, we can discuss a roadmap for the replacement.
> One more time - replacing the SQL engine with a more customizable one makes sense to me.
> But this kind of decision needs careful discussion.
> 
> On Fri, 27/09/2019 at 17:08 +0300, Seliverstov Igor wrote:
>> Nikolay,
>> 
>> The main issue - there is no *selection*.
>> 
>> There is a field of knowledge - relational algebra - which describes how to transform
>> relational expressions while preserving their semantics, and a couple of implementations
>> (Calcite is the only one written in Java).
>> 
>> There are only two alternatives:
>> 
>> 1) Implementing white papers from scratch
>> 2) Adapting Calcite to our needs.
>> 
>> The second way was chosen by several other projects; there is experience and there is
>> a list of known issues (like using indexes), so almost everything is already done for us.
>> 
>> Implementing a planner is a big deal; I think everybody here understands that. That's
>> why our proposal to reuse others' experience is obvious.
>> 
>> If you have an alternative - you're welcome, I'll gratefully listen to you.
>> 
>> The main question isn't "WHAT" but "HOW" - that's the discussion topic from my point of view.
>> 
>> Regards,
>> Igor
>> 
>>> On 27 Sep 2019, at 16:37, Nikolay Izhikov <nizhikov@apache.org> wrote:
>>> 
>>> Roman.
>>> 
>>>> Nikolay, Maxim, I understand that our arguments may not be as obvious
>>>> to you as they are to the SQL team. So, please arrange your questions in
>>>> a more constructive way.
>>> 
>>> What is the SQL team?
>>> I only know the Ignite community :)
>>> 
>>> Please share your knowledge in the IEP.
>>> I want to join the process of engine *selection*.
>>> It should start with the requirements for such an engine.
>>> Can you write them in the IEP, please?
>>> 
>>> My point is very simple:
>>> 
>>> 1. We made the wrong decision with H2
>>> 2. We should make a well-thought decision about the new engine.
>>> 
>>>> How many tickets would satisfy you?
>>> 
>>> You write about "issueS" with the H2.
>>> All I see is one open ticket.
>>> IEP doesn't provide enough information.
>>> So it's not about the number of tickets, it's about
>>> 
>>>> These two points (single map-reduce execution and inflexible optimizer) 
>>>> are the main problems with the current engine.
>>> 
>>> We may come to the point where Calcite (or any other engine) brings us a third and
>>> further "main problems".
>>> This is how it happened with H2.
>>> 
>>> Let's start from what we want to get with the engine and move forward from this base.
>>> What do you think?
>>> 
>>> 
>>> 
>>> On Fri, 27/09/2019 at 16:15 +0300, Roman Kondakov wrote:
>>>> Maxim, Nikolay,
>>>> 
>>>> I've listed two issues which show the ideological flaws of the current 
>>>> engine.
>>>> 
>>>> 1. IGNITE-11448 - Open. This ticket describes the impossibility of 
>>>> executing queries that do not fit into the hardcoded one-pass 
>>>> map-reduce paradigm.
>>>> 
>>>> 2. IGNITE-6085 - Closed (won't fix). This ticket describes the second 
>>>> major problem with the current engine: the H2 query optimizer is very 
>>>> primitive and cannot perform many useful optimizations.
>>>> 
>>>> These two points (single map-reduce execution and inflexible optimizer) 
>>>> are the main problems with the current engine. It means that our engine 
>>>> is currently suitable for executing only a very limited subset of 
>>>> typical SQL queries. For example, it cannot even run most of the TPC-H 
>>>> benchmark queries because they don't fit into the simple map-reduce paradigm.
>>>> 
>>>>> All I see is links to two tickets:
>>>> 
>>>> How many tickets would satisfy you? I named two. And it looks like it is
>>>> not enough from your point of view. Ok, so how many is enough? The set 
>>>> of problems caused by the tickets listed above is infinite, therefore I 
>>>> cannot create a ticket for each of them.
>>>>> Tech details also should be added.
>>>> 
>>>> Tech details are in the tickets.
>>>> 
>>>>> We can't discuss such a huge change as an execution engine replacement
>>>>> with descriptions like:
>>>>> "No data co-location control, i.e. arbitrary data can be returned silently" or
>>>>> "Low control on how query executes internally, as a result we have limited
>>>>> possibility to implement improvements/fixes."
>>>> 
>>>> Why not? Don't you understand these problems? Or don't you think this is
>>>> a problem?
>>>> 
>>>>> Let's make these descriptions more specific.
>>>> 
>>>> What do you mean by "more specific"? What are the criteria for a 
>>>> specific description?
>>>> 
>>>> 
>>>> 
>>>> Nikolay, Maxim, I understand that our arguments may not be as obvious
>>>> to you as they are to the SQL team. So, please arrange your questions in
>>>> a more constructive way.
>>>> 
>>>> Thank you!
>> 
>> 

