Date: Wed, 31 Aug 2016 04:23:20 +0000 (UTC)
From: "Michael Kjellman (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-9754) Make index info heap friendly for large CQL partitions
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
archived-at: Wed, 31 Aug 2016 04:23:22 -0000

    [ https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451066#comment-15451066 ]

Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------

I pushed a rebased commit that
addresses many additional comments by [~jasobrown] from review, adds additional unit tests, and has many further improvements to documentation. This is still 2.1 based; however, the review and improvements made in the org.apache.cassandra.db.index.birch package are agnostic to whether the patch is based on trunk or 2.1.

https://github.com/mkjellman/cassandra/commit/3d686799a0e79c23d86881bb041b5408dcfda014
https://github.com/mkjellman/cassandra/tree/CASSANDRA-9754-2.1

Some Highlights:
* Fix a bug in KeyIterator identified by [~jjirsa] that would cause the iterator to return nothing when the backing SegmentedFile contains exactly 1 key/segment.
* Add unit tests for KeyIterator
* Add SSTable version ka support to LegacySSTableTest. Actually test something in LegacySSTableTest.
* Add additional unit tests around PageAlignedReader, PageAlignedWriter, BirchWriter, and BirchReader
* Remove word lists and refactor all unit tests to use TimeUUIDTreeSerializableIterator instead
* Improve the documentation and fix it where required so that it parses and formats properly during javadoc generation
* Remove reset() functionality from BirchReader.BirchIterator
* Fix many other nits

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9754
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects are IndexInfo and its ByteBuffers. This is especially bad in endpoints with large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K IndexInfo objects and 200K ByteBuffers. This will create a lot of churn for GC. Can this be improved by not creating so many objects?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
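
For reference, the 100K / 200K figures in the issue description can be reproduced with a rough back-of-the-envelope sketch. It assumes one IndexInfo entry per column index block, a 64 KB block size (the usual column_index_size_in_kb default, not stated in the ticket), and two ByteBuffers per IndexInfo (first and last name). The class and method names are hypothetical and are not part of the patch.

```java
// Rough estimate only; IndexInfoEstimate is a hypothetical helper, not Cassandra code.
public class IndexInfoEstimate {

    // Assumption: one IndexInfo entry is created per column index block.
    static long indexEntries(long partitionBytes, long columnIndexBytes) {
        return partitionBytes / columnIndexBytes;
    }

    // Assumption: each IndexInfo holds two ByteBuffers (first and last name).
    static long byteBuffers(long entries) {
        return 2 * entries;
    }

    public static void main(String[] args) {
        long partitionBytes = 6_400_000_000L; // the 6.4GB partition from the ticket
        long columnIndexBytes = 64L * 1024;   // assumed 64 KB index block size

        long entries = indexEntries(partitionBytes, columnIndexBytes);
        System.out.println("IndexInfo objects: ~" + entries);
        System.out.println("ByteBuffers:       ~" + byteBuffers(entries));
    }
}
```

Under those assumptions the estimate comes out just under 100K IndexInfo objects and just under 200K ByteBuffers, which matches the order of magnitude quoted in the description.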