Date: Fri, 28 Oct 2016 19:12:58 +0000 (UTC)
From: "Andrew Mashenkov (JIRA)"
To: issues@ignite.apache.org
Reply-To: dev@ignite.apache.org
Subject: [jira] [Commented] (IGNITE-4106) SQL: parallelize sql queries over cache local partitions

[ https://issues.apache.org/jira/browse/IGNITE-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616263#comment-15616263 ]

Andrew Mashenkov commented on IGNITE-4106:
------------------------------------------

I've implemented 2 prototypes.
In both I try to speed up the SQL query Map phase with multi-threading. Compared scenarios: 1 node with the query split across 4 threads vs. 4 nodes without splitting.

1) The first one: MapQuery message processing is split across several threads. Each thread runs the same query over a certain subset of the cache's local partitions. When all threads have finished, the results are merged and returned to the Reducer. This approach shows a significant speedup, but throughput is 10-15% lower than if we just add more nodes to the grid. The code is far from ideal; I believe we can fix this 10-15% slowdown.

2) The second one: I split the query by sending more Map query messages from the query initiator node. Each message specifies a subset of the target node's primary partitions, so remote nodes process these messages in parallel. This approach gives worse results: throughput is 50% lower than if we just add more nodes to the grid.

> SQL: parallelize sql queries over cache local partitions
> --------------------------------------------------------
>
> Key: IGNITE-4106
> URL: https://issues.apache.org/jira/browse/IGNITE-4106
> Project: Ignite
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.6, 1.7
> Reporter: Andrew Mashenkov
> Assignee: Andrew Mashenkov
> Labels: performance
>
> If we run SQL query on cache partitioned over several cluster nodes, it will be split into several queries running in parallel. But really we will have one thread per query on each node.
> So, for now, to improve SQL query performance we need to run more Ignite instances or split caches manually.
> It seems to be better to split local SQL queries over cache partitions, so we would be able to parallelize SQL query on every single node and utilize CPU more efficiently.
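
For illustration only, the idea behind the first prototype (split the node's local partitions into groups, run the same query over each group in its own thread, then merge the partial results before replying to the Reducer) could be sketched roughly as below. This is not Ignite code and not the prototype itself: the class `ParallelMapSketch`, the methods `splitPartitions` and `runMapPhase`, the round-robin split, and the `queryOverPartitions` callback are all hypothetical names and choices made for this sketch, and a real implementation would run the SQL query against the local index rather than call a plain function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch of a multi-threaded Map phase; not part of Ignite. */
final class ParallelMapSketch {
    /** Splits partition ids round-robin into at most {@code segments} non-empty groups. */
    static List<List<Integer>> splitPartitions(List<Integer> parts, int segments) {
        List<List<Integer>> groups = new ArrayList<>();
        for (int i = 0; i < Math.min(segments, parts.size()); i++)
            groups.add(new ArrayList<>());
        for (int i = 0; i < parts.size(); i++)
            groups.get(i % groups.size()).add(parts.get(i));
        return groups;
    }

    /**
     * Runs the same query once per partition group, each in its own thread,
     * and merges the partial results into one list.
     * {@code queryOverPartitions} is an assumed stand-in for the local query.
     */
    static <R> List<R> runMapPhase(List<Integer> localParts, int threads,
                                   Function<List<Integer>, List<R>> queryOverPartitions) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<R>>> futures = new ArrayList<>();
            for (List<Integer> group : splitPartitions(localParts, threads)) {
                Callable<List<R>> task = () -> queryOverPartitions.apply(group);
                futures.add(pool.submit(task));
            }
            // Merge step: wait for every thread, then concatenate its rows.
            List<R> merged = new ArrayList<>();
            for (Future<List<R>> f : futures)
                merged.addAll(f.get());
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> localParts = new ArrayList<>();
        for (int p = 0; p < 16; p++)
            localParts.add(p);
        // Stand-in for "run the same SQL query restricted to these partitions".
        List<String> rows = runMapPhase(localParts, 4, group -> {
            List<String> out = new ArrayList<>();
            for (int p : group)
                out.add("row-from-partition-" + p);
            return out;
        });
        System.out.println(rows.size() + " rows merged");
    }
}
```

The sketch waits for all threads before merging, which matches the description in the comment. It does not model where the reported 10-15% slowdown comes from.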