From user-return-24508-archive-asf-public=cust-asf.ponee.io@ignite.apache.org Thu Jan 24 16:10:47 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id 626D818077A for ; Thu, 24 Jan 2019 16:10:47 +0100 (CET) Received: (qmail 91814 invoked by uid 500); 24 Jan 2019 15:10:46 -0000 Mailing-List: contact user-help@ignite.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@ignite.apache.org Delivered-To: mailing list user@ignite.apache.org Received: (qmail 91804 invoked by uid 99); 24 Jan 2019 15:10:46 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 24 Jan 2019 15:10:46 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id E7C59C0156 for ; Thu, 24 Jan 2019 15:10:45 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.918 X-Spam-Level: X-Spam-Status: No, score=0.918 tagged_above=-999 required=6.31 tests=[RCVD_IN_DNSWL_NONE=-0.0001, SPF_FAIL=0.919, SPF_HELO_PASS=-0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id CVJGNaDgkdxZ for ; Thu, 24 Jan 2019 15:10:43 +0000 (UTC) Received: from n6.nabble.com (n6.nabble.com [162.255.23.37]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 8060A5FAEF for ; Thu, 24 Jan 2019 15:10:43 +0000 (UTC) Received: from n6.nabble.com (localhost [127.0.0.1]) by n6.nabble.com (Postfix) with ESMTP id 24BDFBE6F6B1 for ; Thu, 24 Jan 2019 08:10:43 -0700 (MST) Date: Thu, 24 Jan 2019 08:10:43 -0700 (MST) From: gourav10041996 To: user@ignite.apache.org Message-ID: <1548342643147-0.post@n6.nabble.com> Subject: Performance problems while running SQL query involving JOINS and ORDER BY eventually leading to heap OOME. MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit To simplify our use-case, we created two caches using the SQL query and loaded data consisting of about 4 million records and 60k records approximately, in the respective caches with INDEX created on all the columns. Ignite is set up to run on a single node, meaning all the data is present on the same node. The query used for testing/the one we are facing issue with is of the type - SELECT * FROM CACHE1 C1, CACHE2 C2 WHERE C1.JOINCol = C2.JOINCol AND C1.COL1 = 'someValue' ORDER BY C1.COL2 The above query execution leads to the Ignite thread memory rising extensively, eventually leading to heap OOME. When the heap memory was increased to about 14GB, we were able to get the results back, but the processing time of the query was too long, about 2-4 minutes ( with CPUs =2). We ran an EXPLAIN for the above query and found out that INDEX was created on COL1 for C1 cache and on JOINCol for C2 cache. There was no index on the sorted column. We think the problem of 'slow querying and huge heap memory requirement' is because of the absence of an index in the sorted column. Whenever there is a condition present in the WHERE clause ( in our example C1.COL1='someValue'), Ignite is using an INDEX for that column and there is no INDEX being created on the ORDER BY column. And for our use-case, it is imperative that we have a condition in the where clause ( to filter out the data) and a join condition apart from the order by clause. We tried the multiple column indexing strategy on the COL1, COL2 as per our use case. In case of a composite index with the order as (COL1, COL2), INDEX was created only for the COL1. While for the composite index order as (COL2, COL1), INDEX was getting created for both COL1 and COL2 and the results were index sorted. ( But only in case of the absence of an INDEX for COL1, it looks for the ORDER BY clause column and uses a composite index). But, if we don't have a separate INDEX for COL1, it again poses a problem as COL1 is something which is heavily used for filtering in all other queries. So an INDEX on COL1 is necessary. To summarize, In case there is a condition present in the WHERE clause, Ignite uses the WHERE clause column for indexing, and therefore there is no INDEX in the sorting column, resulting in severe query performance, which can eventually lead us to our system going down. -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/