Date: Wed, 19 Mar 2014 17:43:45 +0000 (UTC)
From: "Sylvain Lebresne (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-6506) counters++ split counter context shards into separate cells

    [ https://issues.apache.org/jira/browse/CASSANDRA-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940736#comment-13940736 ]

Sylvain Lebresne commented on CASSANDRA-6506:
---------------------------------------------

bq. With CASSANDRA-6717 this should become a non-issue, but that's 3.0

Well, that's one more reason why rushing this in 2.1 is possibly not the best option. But truly, I'm not sure we absolutely need CASSANDRA-6717 for this if we really don't want to. It's simple enough to have a boolean flag to distinguish between dense and non-dense in 2.1 if we really want to.
Even if that flag gets replaced by something else in CASSANDRA-6717, that still feels cleaner to me than having specific CellNameType implementations, originalType(), etc...

bq. (see the last two sets of graphs in CASSANDRA-6553, where 6556 writes are a lot smoother than writes w/out them). That and getting rid of CASSANDRA-6405 for good.

Fair enough, I'm not saying the end goal is wrong. But tbh, it feels like the patch currently adds more complexity than it removes, and I'm really bugged about having special cellName and cellNameType implementations for counters. We have also already made lots of changes to counters in 2.1; I'm just not sure that adding another (definitely-not-small) layer of changes on top of that a short time before release is the best strategy to minimize the chance of breaking things. Let's say that my gut feeling is that we leave that for 3.0, and use that opportunity to try to simplify things further (CASSANDRA-6717, maybe getting rid of local/remote).

bq. but I seriously don't see how this could be accomplished

I think the first step is CASSANDRA-6888. If we have that, then in 3.0 we can at least detect if there are remaining local/remote shards in the system. If there are, one idea could be to ask (force really) people to run some offline upgrade tool. Or run it for them at startup really. I know it's not ideal but well, this is just to say that it's possible, and this may truly be worth the effort.

bq. hence not holding my breath for implementing counters as maps

That's not related to the local/remote shards, is it? Seems to me that we could do that even if we still have CounterCell.

bq. That's because using the timestamp field for the logical clock breaks the re-adding of previously dropped counter cells

Good point, but that's kind of not a detail imo. But now that we do read-modify-write, we could really use the current time for the timestamp, can't we?
We'd just have to make sure the times assigned by a local node never go back in time, and add +1 here and there, but that's not too hard. This could also help make writeTime() work with counters, which I believe works currently and is broken by this patch.

> counters++ split counter context shards into separate cells
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-6506
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6506
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.1 beta2
>
>
> This change is related to, but somewhat orthogonal to CASSANDRA-6504.
> Currently all the shard tuples for a given counter cell are packed, in sorted order, in one binary blob. Thus reconciling N counter cells requires allocating a new byte buffer capable of holding the union of the two contexts' shards N-1 times.
> For writes, in a post-CASSANDRA-6504 world, it also means reading more data than we have to (the complete context, when all we need is the local node's global shard).
> Splitting the context into separate cells, one cell per shard, will help to improve this. We did a similar thing with super columns for CASSANDRA-3237. Incidentally, doing this split is now possible thanks to CASSANDRA-3237.
> Doing this would also simplify counter reconciliation logic. Getting rid of old contexts altogether can be done trivially with upgradesstables.
> In fact, we should be able to put the logical clock into the cell's timestamp, and use regular Cell-s and regular Cell reconcile() logic for the shards, especially once we get rid of the local/remote shards some time in the future (until then we still have to differentiate between global/remote/local shards and their priority rules).

--
This message was sent by Atlassian JIRA
(v6.2#6252)
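
Illustrative note on the issue description above: the idea of one cell per shard, with the logical clock stored in the cell's timestamp and reconciled like a regular cell, can be sketched roughly as follows. This is a minimal Python sketch and not Cassandra code. `ShardCell`, `reconcile`, `merge` and `total` are hypothetical names, and the sketch assumes only global shards, so it ignores the global/remote/local priority rules the description mentions. How equal clocks are resolved is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ShardCell:
    """Hypothetical one-shard cell; not an actual Cassandra class."""
    counter_id: str  # id of the node that owns this shard
    clock: int       # logical clock, kept where a regular cell keeps its timestamp
    count: int       # running total recorded by that node


def reconcile(a: ShardCell, b: ShardCell) -> ShardCell:
    """Regular-cell style reconcile: the cell with the higher clock wins.

    Assumption: on equal clocks the first argument is kept.
    """
    if a.counter_id != b.counter_id:
        raise ValueError("only cells of the same shard can be reconciled")
    return a if a.clock >= b.clock else b


def merge(replicas: Iterable[Iterable[ShardCell]]) -> Dict[str, ShardCell]:
    """Merge several versions of a counter shard by shard.

    Each shard is reconciled independently, so no combined context blob
    has to be rebuilt for every pairwise merge.
    """
    merged: Dict[str, ShardCell] = {}
    for cells in replicas:
        for cell in cells:
            current = merged.get(cell.counter_id)
            merged[cell.counter_id] = cell if current is None else reconcile(current, cell)
    return merged


def total(merged: Dict[str, ShardCell]) -> int:
    """The counter's value is the sum of the surviving shard counts."""
    return sum(cell.count for cell in merged.values())
```

For example, merging the versions `[ShardCell("A", 2, 10), ShardCell("B", 1, 3)]` and `[ShardCell("A", 3, 12), ShardCell("C", 1, 5)]` keeps the newer "A" shard and sums the three surviving counts.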