ignite-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-6019) SQL: client node should not hold the whole data set in-memory when possible
Date Thu, 10 Aug 2017 17:34:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-6019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121964#comment-16121964 ]

ASF GitHub Bot commented on IGNITE-6019:

GitHub user alexpaschenko opened a pull request:




You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-6019

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2430

commit 12a479fe7d1973bdba5dcdb488d3ff109667d107
Author: Alexander Paschenko <alexander.a.paschenko@gmail.com>
Date:   2017-08-10T12:48:38Z

    IGNITE-6019 Merge indexes iterator

commit 51a62134345731c5464c854c62e8be59fb79a9fd
Author: Alexander Paschenko <alexander.a.paschenko@gmail.com>
Date:   2017-08-10T14:06:08Z


commit 652713cba2498cfa85ad89bf9f0aaf96784cae76
Author: Alexander Paschenko <alexander.a.paschenko@gmail.com>
Date:   2017-08-10T14:19:08Z


commit 76c9e2ad6d2e7d0ab4e73b85fa6c2cda9fb913b3
Author: Alexander Paschenko <alexander.a.paschenko@gmail.com>
Date:   2017-08-10T17:31:59Z

    Added a test.


> SQL: client node should not hold the whole data set in-memory when possible
> ---------------------------------------------------------------------------
>                 Key: IGNITE-6019
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6019
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>            Assignee: Alexander Paschenko
>            Priority: Critical
>              Labels: performance
>             Fix For: 2.2
> Our SQL engine requests data from server nodes in pieces called "pages". This
allows us to control memory consumption on the client side. However, our client code is
currently designed so that all pages are requested from all servers before a single cursor
row is returned to the user. This defeats the whole idea of "cursor" and "page", and could easily
crash the client node with an OOME.
> We need to fix that and request further pages in a kind of sliding window, keeping no
more than "N" pages in memory simultaneously. Note that this is not always possible, e.g.
in the case of {{DISTINCT}} or non-collocated {{GROUP BY}}. In those cases we would have to build
the whole result set first anyway. So let's focus on the scenario where the whole result set
is not needed.
> As currently everything is requested synchronously page-by-page, in the first version
it would be enough to distribute synchronous page requests between cursor reads, without any
> Implementation details:
> 1) The optimization should be applied only to {{skipMergeTbl=true}} cases, where the complete
result set of map queries is not needed.
> 2) The starting point is {{GridReduceQueryExecutor#query}}; see the {{skipMergeTbl=true}} branch,
which is where we currently get all pages eagerly.
> 3) Get no more than one page from the server at a time. We request the page, iterate
over it, then request another page.
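
The page-at-a-time behaviour described in step 3 can be sketched roughly as below. This is only an illustration, not Ignite code: PageSource, LazyPagedIterator and CountingPageSource are hypothetical names, and the sketch assumes that an empty page signals the end of the result set.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical page source: returns the next page of rows, or an empty list when exhausted. */
interface PageSource<T> {
    List<T> nextPage();
}

/** Iterator that keeps at most one page in memory and requests the next page only on demand. */
final class LazyPagedIterator<T> implements Iterator<T> {
    private final PageSource<T> src;
    private Iterator<T> cur = Collections.emptyIterator();
    private boolean exhausted;

    LazyPagedIterator(PageSource<T> src) {
        this.src = src;
    }

    @Override public boolean hasNext() {
        // Request a new page only after the current one is fully consumed.
        while (!cur.hasNext() && !exhausted) {
            List<T> page = src.nextPage(); // synchronous request (assumption)
            if (page.isEmpty())
                exhausted = true; // assumption: empty page means no more data
            else
                cur = page.iterator();
        }
        return cur.hasNext();
    }

    @Override public T next() {
        if (!hasNext())
            throw new NoSuchElementException();
        return cur.next();
    }
}

/** In-memory stand-in for a server node; counts how many pages were actually requested. */
final class CountingPageSource implements PageSource<Integer> {
    private final List<List<Integer>> pages;
    private int requested;

    CountingPageSource(List<List<Integer>> pages) {
        this.pages = pages;
    }

    @Override public List<Integer> nextPage() {
        if (requested >= pages.size())
            return Collections.emptyList();
        return pages.get(requested++);
    }

    int requestedPages() {
        return requested;
    }
}
```

With such an iterator, the first row becomes available after a single page request, and the client never holds more than one page per source. The "N pages" sliding window mentioned in the description would need a bounded buffer on top of this, which the sketch does not attempt.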

This message was sent by Atlassian JIRA
