Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 110EE117D8 for ; Tue, 22 Jul 2014 07:37:42 +0000 (UTC)
Received: (qmail 41362 invoked by uid 500); 22 Jul 2014 07:37:40 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 41320 invoked by uid 500); 22 Jul 2014 07:37:40 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 41218 invoked by uid 99); 22 Jul 2014 07:37:40 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Jul 2014 07:37:40 +0000
Date: Tue, 22 Jul 2014 07:37:40 +0000 (UTC)
From: "graham sanderson (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069938#comment-14069938 ]

graham sanderson commented on CASSANDRA-7546:
---------------------------------------------

{quote}
However whether it is one-way or not is somewhat unimportant for me. This flip would only last the lifetime of a memtable, which is not super lengthy (under heavily load probably only a few minutes), and would not have dramatically negative consequences if it got it slightly wrong
{quote}

Cool, that's what I was asking/thinking.
As for the tree size/rebalancing, I have no particular proof... when things go wrong we are hinting massively, so maybe there are hundreds of hint mutation threads, each with its own in-progress rebalance, pinning a lot of nodes across young GC. That said, the memory allocation rate is truly spectacular, even given the excessive hinting, so I have to suspect the spinning (and, as you say, probably some of the in-arena allocation it does too), though that would also be surprising since these are hint updates, each of which is a single-cell update.

Anyway... we can track cost in the Holder, I guess, to avoid any atomic operations, and maybe factor in the tree size there too.

Note, as an aside, that we are partly to blame for this issue (there are best practices to be learned, and ways we can mitigate), but the result is surprising enough (because things go bad at random, and usually when we are inserting hundreds of times less data than we can easily handle) that others might easily get bitten. I would describe everything that I think is going on in the snowballing of problems, but it is a bit of a comedy of errors.

> AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: graham sanderson
>            Assignee: graham sanderson
>         Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_alt.txt, suggestion1.txt, suggestion1_21.txt
>
>
> In order to preserve atomicity, this code attempts to read, clone/update, then CAS the state of the partition.
> Under heavy contention for updating a single partition this can cause some fairly staggering memory growth (the more cores on your machine, the worse it gets).
> Whilst many usage patterns don't do highly concurrent updates to the same partition, hinting today does, and in this case wild (order(s) of magnitude more than expected) memory allocation rates can be seen (especially when the updates being hinted are small updates to different partitions, which can happen very fast on their own); see CASSANDRA-7545.
> It would be best to eliminate/reduce/limit the spinning memory allocation whilst not slowing down the very common un-contended case.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
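
For illustration only, the read, clone/update, CAS pattern described in the issue can be sketched roughly as follows. This is not the actual AtomicSortedColumns code: the class `CasSpinSketch`, its `Holder` and `Result` types, and the whole-map copy (standing in for the real tree update) are all assumptions made for the sketch. It only shows why every failed CAS attempt throws away a freshly allocated copy, so allocation grows with contention.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of a "read, clone/update, CAS" spin loop.
final class CasSpinSketch {

    // Hypothetical immutable snapshot of a partition's cells (not Cassandra's Holder).
    static final class Holder {
        final TreeMap<Integer, Integer> cells;
        Holder(TreeMap<Integer, Integer> cells) { this.cells = cells; }
    }

    // Outcome of one run: final cell count and total clone attempts made.
    static final class Result {
        final int finalSize;
        final long attempts;
        Result(int finalSize, long attempts) {
            this.finalSize = finalSize;
            this.attempts = attempts;
        }
    }

    static Result run(int threads, int updatesPerThread) throws InterruptedException {
        AtomicReference<Holder> ref = new AtomicReference<>(new Holder(new TreeMap<>()));
        AtomicLong attempts = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * updatesPerThread; // distinct keys per thread
            workers[t] = new Thread(() -> {
                for (int i = 0; i < updatesPerThread; i++) {
                    while (true) {
                        Holder current = ref.get();                  // read
                        TreeMap<Integer, Integer> copy =
                                new TreeMap<>(current.cells);        // clone: allocates on every attempt
                        copy.put(base + i, i);                       // update
                        attempts.incrementAndGet();
                        if (ref.compareAndSet(current, new Holder(copy))) {
                            break;                                   // CAS won, update published
                        }
                        // CAS lost: the copy just built is garbage, so spin and allocate again
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return new Result(ref.get().cells.size(), attempts.get());
    }

    public static void main(String[] args) throws InterruptedException {
        Result r = run(4, 500);
        System.out.println("cells=" + r.finalSize
                + " attempts=" + r.attempts
                + " wasted clones=" + (r.attempts - r.finalSize));
    }
}
```

The number of wasted clones varies from run to run and tends to rise with the number of threads competing for the same reference, which is the effect the issue describes for a single hot partition.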