Date: Fri, 28 Mar 2014 20:19:18 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-6945) Calculate liveRatio on per-memtable basis, non per-CF

    [ https://issues.apache.org/jira/browse/CASSANDRA-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13951320#comment-13951320 ]

Jonathan Ellis commented on CASSANDRA-6945:
-------------------------------------------

Ship it!

> Calculate liveRatio on per-memtable basis, non per-CF
> -----------------------------------------------------
>
>                 Key: CASSANDRA-6945
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6945
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.0.7
>
>
> Currently we recalculate the live ratio on every doubling of write ops to the CF, not to an individual memtable. The value itself is also CF-bound, not memtable-bound. This causes at least the following issues:
> 1. Depending on what stage the current memtable is at, the calculated live ratio can vary *a lot*
> 2. That calculated live ratio can stay in place for quite a while - the longer the C* process has been running, the longer it stays incorrect
> 3. An incorrect live ratio makes MeteredFlusher inefficient - flushing less or more often than needed, picking bad candidates for flushing, etc.
> 4. An incorrect live ratio means an incorrect size is returned to the metrics consumers
> 5. Compaction strategies that rely on memtable size estimation are affected
> 6. All of the above is slightly amplified by the fact that all the memtables pending flush also use that one incorrect value
> Depending on the stage the current memtable is at when the live ratio is recalculated, the calculated value can be *extremely* wrong (say, a recently created, fresh memtable would have a much higher than average live ratio).
> The suggested fix is to bind the live ratio to individual memtables, not to column families as a whole, with some optimizations to make recalculations run less often by inheriting the previous memtable's stats.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
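
The fix suggested in the issue description can be illustrated with a minimal, single-threaded Java sketch. It is not the actual patch. The class MemtableSketch, its methods, the constants, and the reading of "live ratio" as measured heap bytes divided by serialized bytes are all assumptions made for illustration. It shows a ratio owned by each memtable, a recalculation trigger that fires on each doubling of that memtable's own write ops, and a new memtable inheriting the previous one's ratio.

```java
/** Hypothetical sketch; none of these names come from the Cassandra code base. */
final class MemtableSketch {
    // Assumed values, chosen only to keep the example small.
    static final double DEFAULT_LIVE_RATIO = 10.0;
    static final double MIN_LIVE_RATIO = 1.0;
    static final double MAX_LIVE_RATIO = 64.0;
    static final long FIRST_RECALC_AT_OPS = 4;

    private double liveRatio;       // bound to this memtable, not to the CF
    private long ops;               // write ops applied to this memtable only
    private long serializedBytes;   // serialized size of the data written so far
    private long nextRecalcAt = FIRST_RECALC_AT_OPS;

    /** A new memtable inherits the previous memtable's ratio, if there is one. */
    MemtableSketch(MemtableSketch previous) {
        this.liveRatio = previous == null ? DEFAULT_LIVE_RATIO : previous.liveRatio;
    }

    /** Records one write; returns true when a recalculation is due. */
    boolean recordWrite(long bytes) {
        ops++;
        serializedBytes += bytes;
        if (ops < nextRecalcAt) {
            return false;
        }
        nextRecalcAt = ops * 2;     // next recalculation after the op count doubles
        return true;
    }

    /** Applies a freshly measured heap size (assumed to come from a memory meter). */
    void updateLiveRatio(long measuredHeapBytes) {
        if (serializedBytes > 0) {
            double ratio = (double) measuredHeapBytes / serializedBytes;
            liveRatio = Math.max(MIN_LIVE_RATIO, Math.min(MAX_LIVE_RATIO, ratio));
        }
    }

    double liveRatio() {
        return liveRatio;
    }

    /** Estimated in-memory size: serialized bytes scaled by this memtable's ratio. */
    long estimatedLiveSize() {
        return (long) (serializedBytes * liveRatio);
    }

    public static void main(String[] args) {
        MemtableSketch first = new MemtableSketch(null);
        for (int i = 0; i < 4; i++) {
            if (first.recordWrite(100)) {
                first.updateLiveRatio(2000); // pretend the meter reported 2000 bytes
            }
        }
        MemtableSketch second = new MemtableSketch(first);
        System.out.println(first.liveRatio() + " " + second.liveRatio());
    }
}
```

Because each memtable counts its own ops, a freshly created memtable gets early recalculations again no matter how long the process has been running, and memtables pending flush keep the ratio that was measured for them.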