Date: Tue, 6 Dec 2016 12:40:58 +0000 (UTC)
From: "Milan Majercik (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-12796) Heap exhaustion when rebuilding secondary index over a table with wide partitions

[ https://issues.apache.org/jira/browse/CASSANDRA-12796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725358#comment-15725358 ]

Milan Majercik edited comment on CASSANDRA-12796 at 12/6/16 12:40 PM:
----------------------------------------------------------------------

[~beobal], the patch for branch _3.0_ works fine; however, the page size for the single-partition pager appears to be calculated incorrectly. My table's average partition size is around *7GB*, and yet the page size was calculated as *1*.

{code:java}
    private int calculateIndexingPageSize()
    {
        double averageRowSize = baseCfs.getMeanPartitionSize();
        if (averageRowSize <= 0)
            return DEFAULT_PAGE_SIZE;

        return (int) Math.max(1, Math.min(DEFAULT_PAGE_SIZE, 4 * 1024 * 1024 / averageRowSize));
    }
{code}

This made the index rebuild extremely slow, because registering a read/write order group carries significant performance overhead; for this reason the page size should have a reasonable value.

I think there is no harm in setting the page size to {{DEFAULT_PAGE_SIZE}}, since the pager does not span different partitions when a partition is small ([https://github.com/mmajercik/cassandra/commit/3fc016e73d3032f4d04584a45945141151a49213])

[12796-3.0|https://github.com/mmajercik/cassandra/tree/12796-3.0]


was (Author: mmajercik):
[~beobal], the patch for branch _3.0_ works fine, however the page size for single partition pager appears to be calculated incorrectly. My table's average partition size is around *7GB* and yet the page size got calculated as *1*.

{code:java}
    private int calculateIndexingPageSize()
    {
        double averageRowSize = baseCfs.getMeanPartitionSize();
        if (averageRowSize <= 0)
            return DEFAULT_PAGE_SIZE;

        return (int) Math.max(1, Math.min(DEFAULT_PAGE_SIZE, 4 * 1024 * 1024 / averageRowSize));
    }
{code}

This rendered index rebuild extremely slow as registering read/write order group implies significant performance overhead and for this reason the page size should have reasonable size.
I think there is no harm if we set page size to {{DEFAULT_PAGE_SIZE}} as the pager doesn't span across different partitions in case the partition is small ([https://github.com/mmajercik/cassandra/commit/3fc016e73d3032f4d04584a45945141151a49213])

[12796-3.0|https://github.com/mmajercik/cassandra/tree/12796-3.0]

> Heap exhaustion when rebuilding secondary index over a table with wide partitions
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12796
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12796
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Milan Majercik
>            Priority: Critical
>
> We have a table with rather wide partitions and a secondary index defined over it. As soon as we try to rebuild the index, we observe exhaustion of the Java heap and an eventual OOM error. After a lengthy investigation we found the culprit, which appears to be the wrong granularity of barrier issuance in method {{org.apache.cassandra.db.Keyspace.indexRow}}:
> {code}
>         try (OpOrder.Group opGroup = cfs.keyspace.writeOrder.start())
>         {
>             Set<SecondaryIndex> indexes = cfs.indexManager.getIndexesByNames(idxNames);
>             Iterator<ColumnFamily> pager = QueryPagers.pageRowLocally(cfs, key.getKey(), DEFAULT_PAGE_SIZE);
>             while (pager.hasNext())
>             {
>                 ColumnFamily cf = pager.next();
>                 ColumnFamily cf2 = cf.cloneMeShallow();
>                 for (Cell cell : cf)
>                 {
>                     if (cfs.indexManager.indexes(cell.name(), indexes))
>                         cf2.addColumn(cell);
>                 }
>                 cfs.indexManager.indexRow(key.getKey(), cf2, opGroup);
>             }
>         }
> {code}
> Please note that the operation group granule is a partition of the source table. This poses a problem for tables with wide partitions, because the flush runnable ({{org.apache.cassandra.db.ColumnFamilyStore.Flush.run()}}) won't proceed with flushing the secondary index memtable until the operations started before the most recently issued barrier have completed.
> In our situation the flush runnable waits until the whole wide partition has been indexed into the secondary index memtable before flushing it. This exhausts the heap and eventually causes an OOM error.
> After we changed the granule of barrier issue in method {{org.apache.cassandra.db.Keyspace.indexRow}} to a query page as opposed to a table partition (see [https://github.com/mmajercik/cassandra/commit/7e10e5aa97f1de483c2a5faf867315ecbf65f3d6?diff=unified]), the secondary index rebuild started to work without heap exhaustion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
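
The page-size arithmetic quoted in the comment can be reproduced outside Cassandra. The sketch below is a standalone copy of the quoted formula, not the project's code: the class name {{PageSizeDemo}}, the constant {{TARGET_PAGE_BYTES}} and the value 10000 for {{DEFAULT_PAGE_SIZE}} are assumptions made for illustration (the message does not state the real value), and the mean partition size is passed in as a parameter instead of being read from {{baseCfs}}. It illustrates why a mean partition size of about 7GB yields a page size of 1: 4MB divided by 7GB is far below 1, so {{Math.max}} clamps the result.

```java
final class PageSizeDemo {
    // Assumed value for illustration only; the message does not state it.
    static final int DEFAULT_PAGE_SIZE = 10000;

    // The 4 * 1024 * 1024 byte budget that appears in the quoted method.
    static final long TARGET_PAGE_BYTES = 4L * 1024 * 1024;

    // Same arithmetic as the quoted calculateIndexingPageSize(), with the
    // mean partition size supplied by the caller (hypothetical signature).
    static int calculateIndexingPageSize(double meanPartitionSize) {
        if (meanPartitionSize <= 0)
            return DEFAULT_PAGE_SIZE;

        // budget / mean size, capped at DEFAULT_PAGE_SIZE and floored at 1
        return (int) Math.max(1, Math.min(DEFAULT_PAGE_SIZE, TARGET_PAGE_BYTES / meanPartitionSize));
    }

    public static void main(String[] args) {
        double sevenGiB = 7.0 * 1024 * 1024 * 1024;

        // Wide partitions: the quotient is about 0.00056, so the floor of 1 applies.
        System.out.println(calculateIndexingPageSize(sevenGiB));

        // 1 KiB mean partition size: 4 MiB / 1 KiB = 4096, below the cap.
        System.out.println(calculateIndexingPageSize(1024));

        // No size estimate available: falls back to the default.
        System.out.println(calculateIndexingPageSize(0));
    }
}
```

Under these assumptions the three calls print 1, 4096 and 10000. The first case is the one reported in the comment: with a page size of 1, every row of the partition is read as its own page.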