ignite-user mailing list archives

From Ilya Kasnacheev <ilya.kasnach...@gmail.com>
Subject Re: SQL query and Indexes architecture
Date Mon, 17 Sep 2018 14:57:27 GMT

I recommend starting with the H2TreeIndex class. You could also post
precise questions to the developer list.

Ilya Kasnacheev

Mon, 17 Sep 2018 at 16:25, eugene miretsky <eugene.miretsky@gmail.com>:

> Thanks!
> I am curious about the process of loading data from Ignite into H2 on the
> fly, and about H2 creating indexes but storing them in Ignite. Can you
> point me to some JIRAs that discuss it, or to the part of the code that is
> responsible for that?
> On Mon, Sep 17, 2018 at 9:18 AM Ilya Kasnacheev <ilya.kasnacheev@gmail.com>
> wrote:
>> Hello!
>> 1.1. H2 executes the query, during which it has to load rows from
>> tables, and Ignite does the row loading part. Then Ignite collects the
>> query results from all nodes and aggregates them on a single node.
>> 1.2. The index is created by H2, but it is stored in Ignite pages (?).
>> 2. Maybe you're right; I have to admit I'm unfamiliar with the precise
>> details here.
>> Regards,
>> --
>> Ilya Kasnacheev
>> Mon, 17 Sep 2018 at 16:02, eugene miretsky <eugene.miretsky@gmail.com
>> >:
>>> Thanks!
>>>    1.
>>>       1. "Ignite feeds H2 rows that it asks for, and H2 creates indexes
>>>       on them and executes queries on them." - what exactly do you mean by that?
>>>       Do you mean that all parts of a query that use indexes are executed by
>>>       H2, then the actual data is retrieved from Ignite pages, and the final
>>>       (non-indexed) parts of the query are executed by Ignite?
>>>       2. What happens when I create an index on a new column? Is the
>>>       index created in Ignite (and stored in Ignite pages?), or is it created
>>>       in H2?
>>>    2. The reason I was asking about the AFFINITY_KEY, _key_PK and
>>>    _key_PK_hash indexes is that in this code
>>>    <https://github.com/apache/ignite/blob/56975c266e7019f307bb9da42333a6db4e47365e/modules/indexing/src/main/java/org/apache/ignite/internal/processors/query/h2/H2TableDescriptor.java>
>>>    it looks like they are created in H2.
>>> On Mon, Sep 17, 2018 at 8:36 AM Ilya Kasnacheev <
>>> ilya.kasnacheev@gmail.com> wrote:
>>>> Hello!
>>>> 1. H2 does not store data but, as far as my understanding goes, it
>>>> creates SQL indexes from the data. Ignite feeds H2 the rows that it asks
>>>> for, and H2 creates indexes on them and executes queries on them.
>>>> 2. Ignite always has a special index on your key (since it's a key-value
>>>> storage, it can always find a tuple by key). Ignite is also aware of the
>>>> key's hash code, and an affinity key value always maps to one partition of
>>>> data (of 1024 by default). Those are not H2 indexes and they're mostly used
>>>> at the planning stage. E.g. you can map a query to one node if the affinity
>>>> key is present in the request.
>>>> 3. Data is brought onto the heap to read any fields from a row. GROUP BY
>>>> will hold its tuples on heap. Ignite has configurable index inlining, with
>>>> which you can avoid reading objects from heap just to access indexed fields.
>>>> 4. With GROUP BY, lazy evaluation will not help you much. It will still
>>>> have to hold all data on heap at some point. Lazy evaluation mostly helps
>>>> with "SELECT * FROM table" type queries, which produce very large and boring
>>>> result sets.
>>>> Hope this helps.
>>>> --
>>>> Ilya Kasnacheev
>>>> Fri, 14 Sep 2018 at 17:39, eugene miretsky <
>>>> eugene.miretsky@gmail.com>:
>>>>> Hello,
>>>>> I am trying to understand how exactly SQL queries are executed in Ignite.
>>>>> A few questions:
>>>>>    1. To what extent is H2 used? Does it store the data? Does it
>>>>>    create the indexes? Is it used only for generating execution plans? I
>>>>>    believe that all the data used to be stored in H2, but with the new
>>>>>    memory architecture, I believe that's no longer the case.
>>>>>    2. Which indexes are used? Ignite creates B+ tree indexes and
>>>>>    stores them in index pages, but I also see AFFINITY_KEY, _key_PK and
>>>>>    _key_PK_hash indexes created in H2.
>>>>>    3. When is data brought onto the heap? I am assuming that GROUP BY
>>>>>    and aggregates require all the matching rows to first be copied
>>>>>    from off-heap to heap.
>>>>>    4. How does lazy evaluation work? For example, for GROUP BY, does
>>>>>    it bring batches of matching records with the same GROUP BY key onto
>>>>>    heap?
>>>>> I am not necessarily looking for exact answers, but rather for pointers
>>>>> in the right direction (documentation, code, JIRAs).
>>>>> Cheers,
>>>>> Eugene
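
Editorial sketch for point 2 of the first reply in the thread (an affinity key value always maps to one partition, out of 1024 by default). The mapping can be pictured as a hash code reduced modulo the partition count. This is a simplified illustration under that assumption and is not Ignite's actual affinity function; the class name `AffinitySketch` and the method `partitionFor` are hypothetical.

```java
import java.util.Objects;

public class AffinitySketch {
    // Default partition count mentioned in the thread.
    static final int DEFAULT_PARTITIONS = 1024;

    // Hypothetical helper: reduces the key's hash code to a partition
    // number in [0, parts). floorMod keeps the result non-negative even
    // when the hash code is negative.
    static int partitionFor(Object affinityKey, int parts) {
        int hash = Objects.hashCode(affinityKey);
        return Math.floorMod(hash, parts);
    }

    public static void main(String[] args) {
        // The same key always lands in the same partition, so a query that
        // filters on the affinity key could be routed to a single node.
        System.out.println(partitionFor("customer-42", DEFAULT_PARTITIONS));
        System.out.println(partitionFor("customer-42", DEFAULT_PARTITIONS));
    }
}
```

Because the result depends only on the key, a planner that sees an equality condition on the affinity key can compute the partition up front, which is the "map query to one node" case described in the reply.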

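Editorial sketch for point 4 of the first reply in the thread (lazy evaluation helps plain scans but not GROUP BY). The toy code below contrasts a scan that hands rows over one at a time with an aggregation that cannot produce any result until the whole input has been read. It is not Ignite or H2 code; the class `LazyVsGroupBy`, its methods, and the `{groupKey, value}` row layout are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LazyVsGroupBy {
    // Lazy scan: each row is consumed as soon as it is produced, so only
    // one row needs to be in memory at a time. Returns the row count.
    static long lazyScan(Iterator<int[]> rows) {
        long emitted = 0;
        while (rows.hasNext()) {
            rows.next(); // a real client would process the row here
            emitted++;
        }
        return emitted;
    }

    // Grouped sum: the accumulator map must stay in memory until the input
    // is exhausted, because any later row may belong to any group.
    static Map<Integer, Long> groupBySum(Iterator<int[]> rows) {
        Map<Integer, Long> sums = new TreeMap<>();
        while (rows.hasNext()) {
            int[] row = rows.next(); // row = {groupKey, value}
            sums.merge(row[0], (long) row[1], Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
            new int[]{1, 10}, new int[]{2, 5}, new int[]{1, 7});
        System.out.println(lazyScan(rows.iterator()));   // prints 3
        System.out.println(groupBySum(rows.iterator())); // prints {1=17, 2=5}
    }
}
```

In this toy version only one accumulator per group is retained; the reply in the thread describes Ignite's GROUP BY as holding its tuples on heap, so the real memory cost may be higher than this sketch suggests.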