ignite-issues mailing list archives

From "Andrew Mashenkov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-4106) SQL: parallelize sql queries over cache local partitions
Date Fri, 28 Oct 2016 19:12:58 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616263#comment-15616263 ]

Andrew Mashenkov commented on IGNITE-4106:

I've implemented two prototypes. In both I try to speed up the Map phase of an SQL query with multi-threading.
Compared scenarios: 1 node with the query split into 4 threads vs. 4 nodes without splitting.

1) The first one. MapQuery message processing is split into several threads. Each thread runs
the same query over a certain subset of the cache's local partitions. When all threads have finished, the results are merged
and returned to the Reducer. This approach shows a significant speedup, but throughput is 10-15% lower
than if we just add more nodes to the grid. The code is far from ideal; I believe we can fix this
10-15% slowdown.
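
To illustrate the first approach, here is a minimal sketch in plain Java, not the prototype code itself. It assumes the local partitions are given as a list of integer ids and that the query can be expressed as a function from a partition subset to a list of rows. The class name LocalMapSplitSketch, the methods chunk and runSplit, and the queryOverPartitions parameter are all hypothetical and are not Ignite APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch: split one local Map query over partition subsets. */
public class LocalMapSplitSketch {
    /** Splits partition ids into at most {@code parts} contiguous, near-equal chunks. */
    static List<List<Integer>> chunk(List<Integer> partitions, int parts) {
        List<List<Integer>> chunks = new ArrayList<>();
        int n = partitions.size();
        for (int i = 0; i < parts; i++) {
            int from = (int) ((long) n * i / parts);
            int to = (int) ((long) n * (i + 1) / parts);
            if (from < to) // skip empty chunks when there are fewer partitions than threads
                chunks.add(new ArrayList<>(partitions.subList(from, to)));
        }
        return chunks;
    }

    /** Runs the same query over each chunk in its own thread, then merges the rows. */
    static <R> List<R> runSplit(List<Integer> localPartitions, int threads,
            Function<List<Integer>, List<R>> queryOverPartitions) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<R>>> futures = new ArrayList<>();
            for (List<Integer> c : chunk(localPartitions, threads)) {
                Callable<List<R>> task = () -> queryOverPartitions.apply(c);
                futures.add(pool.submit(task));
            }
            // Wait for every thread, merging results in chunk order
            // before they would be returned to the Reducer.
            List<R> merged = new ArrayList<>();
            for (Future<List<R>> f : futures)
                merged.addAll(f.get());
            return merged;
        } finally {
            pool.shutdown();
        }
    }
}
```

The merge step here is a plain concatenation; a real Map phase would have to hand the Reducer whatever result form it expects, which this sketch does not model.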

2) The second one. I split the query by sending more Map query messages from the query initiator
node, each message specifying a subset of the target node's primary partitions.
The remote nodes then process these messages in parallel. This approach gives worse results: throughput
is 50% lower than if we just add more nodes to the grid.
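
The second approach can be sketched the same way, again as an illustration under assumptions rather than the prototype code. The initiator is assumed to know each node's primary partitions as a map from a node id string to a list of partition ids. The MapRequest class and the buildRequests method are hypothetical stand-ins, not Ignite classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: the initiator sends several Map requests per node. */
public class InitiatorSplitSketch {
    /** Stand-in for a Map query message carrying an explicit partition subset. */
    static final class MapRequest {
        final String nodeId;
        final List<Integer> partitions;

        MapRequest(String nodeId, List<Integer> partitions) {
            this.nodeId = nodeId;
            this.partitions = partitions;
        }
    }

    /** Builds up to {@code splitsPerNode} requests per node, each over a partition subset. */
    static List<MapRequest> buildRequests(Map<String, List<Integer>> primaryByNode,
            int splitsPerNode) {
        List<MapRequest> requests = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : primaryByNode.entrySet()) {
            List<Integer> parts = e.getValue();
            int n = parts.size();
            for (int i = 0; i < splitsPerNode; i++) {
                int from = (int) ((long) n * i / splitsPerNode);
                int to = (int) ((long) n * (i + 1) / splitsPerNode);
                if (from < to) // a node with few partitions gets fewer requests
                    requests.add(new MapRequest(e.getKey(),
                        new ArrayList<>(parts.subList(from, to))));
            }
        }
        return requests;
    }
}
```

Compared with the first sketch, the splitting moves from the target node to the initiator, so each subset costs an extra message.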

> SQL: parallelize sql queries over cache local partitions
> --------------------------------------------------------
>                 Key: IGNITE-4106
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4106
>             Project: Ignite
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6, 1.7
>            Reporter: Andrew Mashenkov
>            Assignee: Andrew Mashenkov
>              Labels: performance
> If we run SQL query on cache partitioned over several cluster nodes, it will be split
> into several queries running in parallel. But really we will have one thread per query on
> each node.
> So, for now, to improve SQL query performance we need to run more Ignite instances or
> split caches manually.
> It seems to be better to split local SQL queries over cache partitions, so we would be
> able to parallelize SQL query on every single node and utilize CPU more efficiently.

This message was sent by Atlassian JIRA
