Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 837081001A for ; Thu, 19 Dec 2013 11:39:23 +0000 (UTC)
Received: (qmail 61839 invoked by uid 500); 19 Dec 2013 11:39:13 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 61780 invoked by uid 500); 19 Dec 2013 11:39:09 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 61752 invoked by uid 99); 19 Dec 2013 11:39:08 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 19 Dec 2013 11:39:08 +0000
Date: Thu, 19 Dec 2013 11:39:08 +0000 (UTC)
From: "Aleksey Yeschenko (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (CASSANDRA-6504) counters++
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/CASSANDRA-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-6504:
-----------------------------------------

    Description:

Continuing CASSANDRA-4775 here.

We are changing the counter write path to explicit lock-read-modify-unlock-replicate, thus getting rid of the previously used 'local' (deltas) and 'remote' shard distinction. Unfortunately, we can't simply start using 'remote' shards exclusively, since the shard merge rules prioritize 'local' shards. This is why we are introducing a third shard type, 'global', the only shard type to be used in 2.1+.
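As an illustration, the three shard types and the merge rules the description goes on to list can be sketched roughly as follows. This is a hypothetical, simplified model (the Shard and ShardType names are invented here for clarity), not the actual Cassandra CounterContext code:

```java
// Simplified sketch of the shard merge rules from this issue.
// Hypothetical names; not the real CounterContext implementation.
enum ShardType { GLOBAL, LOCAL, REMOTE }

final class Shard {
    final ShardType type;
    final long clock;  // logical clock
    final long count;

    Shard(ShardType type, long clock, long count) {
        this.type = type;
        this.clock = clock;
        this.count = count;
    }

    /** Merge two shards belonging to the same counter id. */
    static Shard merge(Shard a, Shard b) {
        // global + global = keep the shard with the highest logical clock
        if (a.type == ShardType.GLOBAL && b.type == ShardType.GLOBAL)
            return a.clock >= b.clock ? a : b;
        // global + local or remote = keep the global one
        if (a.type == ShardType.GLOBAL)
            return a;
        if (b.type == ShardType.GLOBAL)
            return b;
        // local + local = sum counts (and logical clocks)
        if (a.type == ShardType.LOCAL && b.type == ShardType.LOCAL)
            return new Shard(ShardType.LOCAL, a.clock + b.clock, a.count + b.count);
        // local + remote = keep the local one
        if (a.type == ShardType.LOCAL)
            return a;
        if (b.type == ShardType.LOCAL)
            return b;
        // remote + remote = keep the shard with the highest logical clock
        return a.clock >= b.clock ? a : b;
    }
}
```

For example, merging two global shards with clocks 5 and 7 keeps the clock-7 shard, while merging two local shards sums both counts and clocks, matching the rule list in the description. The global rules are what has to be backported to 2.0 for the live upgrade to work.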
The updated merge rules will look like this:

global + global = keep the shard with the highest logical clock
global + local or remote = keep the global one
local + local = sum the counts (and logical clocks)
local + remote = keep the local one
remote + remote = keep the shard with the highest logical clock

This is required for backward compatibility with pre-2.1 counters. To make a 2.0-2.1 live upgrade possible, the 'global' shard merge logic will have to be backported to 2.0. 2.0 will not produce global shards, but will be able to understand those coming from 2.1 nodes during the live upgrade. See CASSANDRA-6505.

Other changes introduced in this issue:
1. replicate_on_write is gone. From now on we only avoid replication at RF 1.
2. The REPLICATE_ON_WRITE stage is gone.
3. Counter mutations now run in their own COUNTER_MUTATION stage.
4. Counter mutations have a separate counter_write_request_timeout setting.
5. The mergeAndRemoveOldShards() code is gone, for now, until/unless a better solution is found.
6. We now replicate only the fresh global shard, not the complete (potentially quite large) counter context.
7. To help with concurrency and reduce lock contention, we cache the node's global shards in a new counter cache ({cf id, partition key, cell name} -> {count, clock}). The cache is only used by counter writes, to help with 'hot' counters being updated simultaneously.

Improvements to be handled by separate JIRA issues:
1. Split the counter context into separate cells - one shard per cell. See CASSANDRA-6506. This goes into either 2.1 or 3.0.

Potential improvements still being debated:
1. Coalesce the mutations in the COUNTER_MUTATION stage if they share the same partition key, and apply them together, to improve the locking situation when updating different counter cells in one partition. See CASSANDRA-6508. Will go into 2.1 or 3.0, if deemed beneficial.

was:

Continuing CASSANDRA-4775 here.
We are changing the counter write path to explicit lock-read-modify-unlock-replicate, thus getting rid of the previously used 'local' (deltas) and 'remote' shard distinction. Unfortunately, we can't simply start using 'remote' shards exclusively, since the shard merge rules prioritise 'local' shards. This is why we are introducing a third shard type, 'global', the only shard type to be used in 2.1+.

The updated merge rules will look like this:

global + global = keep the shard with the highest logical clock (the {count, clock} pair will actually be replaced with an {increment count, decrement count} tuple - see CASSANDRA-6507)
global + local or remote = keep the global one
local + local = sum the counts (and logical clocks)
local + remote = keep the local one
remote + remote = keep the shard with the highest logical clock

This is required for backward compatibility with pre-2.1 counters. To make a 2.0-2.1 live upgrade possible, the 'global' shard merge logic will have to be backported to 2.0. 2.0 will not produce global shards, but will be able to understand those coming from 2.1 nodes during the live upgrade. See CASSANDRA-6505.

Other changes introduced in this issue:
1. replicate_on_write is gone. From now on we only avoid replication at RF 1.
2. The REPLICATE_ON_WRITE stage is gone.
3. Counter mutations now run in their own COUNTER_MUTATION stage.
4. Counter mutations have a separate counter_write_request_timeout setting.
5. The mergeAndRemoveOldShards() code is gone, for now, until/unless a better solution is found.
6. We now replicate only the fresh global shard, not the complete (potentially quite large) counter context.
7. To help with concurrency and reduce lock contention, we cache the node's global shards in a new counter cache ({cf id, partition key, cell name} -> {count, clock}). The cache is only used by counter writes, to help with 'hot' counters being updated simultaneously.

Improvements to be handled by separate JIRA issues:
1. Replace {count, clock} with an {increment count, decrement count} tuple. When merging two global shards, the maximums of both will be picked. See CASSANDRA-6507. This goes into 2.1, and makes the new implementation match the PN-Counters from the http://hal.inria.fr/docs/00/55/55/88/PDF/techreport.pdf white paper.
2. Split the counter context into separate cells - one shard per cell. See CASSANDRA-6506. This goes into either 2.1 or 3.0.

Potential improvements still being debated:
1. Coalesce the mutations in the COUNTER_MUTATION stage if they share the same partition key, and apply them together, to improve the locking situation when updating different counter cells in one partition. See CASSANDRA-6508. Will go into 2.1 or 3.0, if deemed beneficial.


> counters++
> ----------
>
>                 Key: CASSANDRA-6504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6504
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.1
>

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)