Mailing-List: contact issues-help@ignite.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@ignite.apache.org
Date: Fri, 28 Oct 2016 19:12:58 +0000 (UTC)
From: "Andrew Mashenkov (JIRA)" <jira@apache.org>
To: issues@ignite.apache.org
Message-ID: <JIRA.13014679.1477305110000.120459.1477681978519@Atlassian.JIRA>
In-Reply-To: <JIRA.13014679.1477305110000@Atlassian.JIRA>
References: <JIRA.13014679.1477305110000@Atlassian.JIRA> <JIRA.13014679.1477305110527@arcas>
Subject: [jira] [Commented] (IGNITE-4106) SQL: parallelize sql queries over
 cache local partitions
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Fri, 28 Oct 2016 19:13:00 -0000


    [ https://issues.apache.org/jira/browse/IGNITE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616263#comment-15616263 ] 

Andrew Mashenkov commented on IGNITE-4106:
------------------------------------------

I've implemented 2 prototypes. In both, I try to speed up the SQL query Map phase with a multi-threaded approach.
Compared scenarios: 1 node with splitting into 4 threads vs 4 nodes without splitting.

1) The first one. MapQuery message processing is split across several threads. Each thread runs the same query over a subset of the cache's local partitions. When all threads have finished, the results are merged and returned to the Reducer. This approach shows a significant speedup, but throughput is 10-15% lower than if we just add more nodes to the grid. The code is far from ideal, and I believe we can fix this 10-15% slowdown.
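
The first approach could be sketched roughly as follows. This is only an illustration of the idea, not the prototype code: the class and method names (LocalPartitionSplitSketch, splitPartitions, runMapQuery) are hypothetical and are not Ignite APIs, and the query is modelled as a plain function from a list of partition ids to result rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch; none of these names are Ignite APIs.
class LocalPartitionSplitSketch {
    /** Distributes partition ids round-robin into at most {@code threads} non-empty groups. */
    static List<List<Integer>> splitPartitions(List<Integer> parts, int threads) {
        List<List<Integer>> groups = new ArrayList<>();
        int n = Math.min(threads, parts.size());

        for (int i = 0; i < n; i++)
            groups.add(new ArrayList<>());

        for (int i = 0; i < parts.size(); i++)
            groups.get(i % n).add(parts.get(i));

        return groups;
    }

    /** Runs the same query over each partition group in its own thread, then merges the rows. */
    static <R> List<R> runMapQuery(List<Integer> localParts, int threads,
        Function<List<Integer>, List<R>> qry) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        try {
            List<Future<List<R>>> futs = new ArrayList<>();

            for (List<Integer> grp : splitPartitions(localParts, threads)) {
                Callable<List<R>> task = () -> qry.apply(grp);

                futs.add(pool.submit(task));
            }

            // Merge step: wait for every thread before replying to the reducer.
            List<R> merged = new ArrayList<>();

            for (Future<List<R>> f : futs)
                merged.addAll(f.get());

            return merged;
        }
        catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Map query failed", e);
        }
        finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> parts = new ArrayList<>();

        for (int p = 0; p < 8; p++)
            parts.add(p);

        // Stand-in "query": returns the partition ids it was asked to scan.
        List<Integer> rows = runMapQuery(parts, 4, grp -> grp);

        System.out.println("Merged rows: " + rows.size());
    }
}
```

Note that the merge step waits for the slowest thread, so an uneven partition split delays the whole reply.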

2) The second one. I split the query by sending more Map query messages from the query initiator node, with each message specifying a subset of the target node's primary partitions. So, remote nodes process these messages in parallel. This approach gives worse results: throughput is 50% lower than if we just add more nodes to the grid.
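
The request-building side of the second approach might look something like the sketch below. Again, this is an illustration under assumptions, not the prototype: InitiatorSplitSketch, MapRequest and buildRequests are hypothetical names, nodes are identified by plain strings, and the partition-to-node mapping is passed in as a map rather than taken from cache affinity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names are Ignite APIs.
class InitiatorSplitSketch {
    /** One Map query message: the target node plus the partitions it should scan. */
    static final class MapRequest {
        final String nodeId;
        final List<Integer> parts;

        MapRequest(String nodeId, List<Integer> parts) {
            this.nodeId = nodeId;
            this.parts = parts;
        }
    }

    /** Builds up to {@code perNode} requests per node, each covering a slice of its primary partitions. */
    static List<MapRequest> buildRequests(Map<String, List<Integer>> primaryParts, int perNode) {
        List<MapRequest> reqs = new ArrayList<>();

        for (Map.Entry<String, List<Integer>> e : primaryParts.entrySet()) {
            List<Integer> parts = e.getValue();
            int n = Math.min(perNode, parts.size());

            for (int i = 0; i < n; i++) {
                // Contiguous slice i of n; slice sizes differ by at most one.
                int from = i * parts.size() / n;
                int to = (i + 1) * parts.size() / n;

                reqs.add(new MapRequest(e.getKey(), new ArrayList<>(parts.subList(from, to))));
            }
        }

        return reqs;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> primary = new TreeMap<>();

        primary.put("nodeA", List.of(0, 1, 2, 3));
        primary.put("nodeB", List.of(4, 5, 6));

        for (MapRequest r : buildRequests(primary, 2))
            System.out.println(r.nodeId + " -> " + r.parts);
    }
}
```

Each request here becomes a separate message to the remote node, so the per-message cost is paid once per slice instead of once per node.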

> SQL: parallelize sql queries over cache local partitions
> --------------------------------------------------------
>
>                 Key: IGNITE-4106
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4106
>             Project: Ignite
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6, 1.7
>            Reporter: Andrew Mashenkov
>            Assignee: Andrew Mashenkov
>              Labels: performance
>
> If we run an SQL query on a cache partitioned over several cluster nodes, it will be split into several queries running in parallel. But in practice we will have only one thread per query on each node.
> So, for now, to improve SQL query performance we need to run more Ignite instances or split caches manually.
> It seems better to split local SQL queries over cache partitions, so that we can parallelize an SQL query on every single node and utilize the CPU more efficiently.

