Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Wed, 2 Oct 2013 16:06:42 +0000 (UTC)
From: "Oleg Anastasyev (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12671959.1380727699174.707.1380730002175@arcas>
In-Reply-To: <JIRA.12671959.1380727699174@arcas>
References: <JIRA.12671959.1380727699174@arcas>
Subject: [jira] [Commented] (CASSANDRA-6134) More efficient BatchlogManager
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784111#comment-13784111 ] 

Oleg Anastasyev commented on CASSANDRA-6134:
--------------------------------------------

Well, how to migrate old batchlog records is open to discussion and still TBD. The easiest way is to have a batchlog2 CF with the new definition alongside batchlog with the old one, but I find that somewhat ugly.

> More efficient BatchlogManager
> ------------------------------
>
>                 Key: CASSANDRA-6134
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6134
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Oleg Anastasyev
>            Priority: Minor
>         Attachments: BatchlogManager.txt
>
>
> As we discussed earlier in CASSANDRA-6079, this is the new BatchlogManager.
> It stores batch records in:
> {code}
> CREATE TABLE batchlog (
>   id_partition int,
>   id timeuuid,
>   data blob,
>   PRIMARY KEY (id_partition, id)
> ) WITH COMPACT STORAGE AND
>   CLUSTERING ORDER BY (id DESC)
> {code}
> where id_partition is the minute-since-epoch of the id uuid.
> So when it scans for batches to replay, it scans within a single partition for a slice of ids from the last processed date until now minus the write timeout.
> So a normal cycle needs neither a full batchlog CF scan nor a lot of random reads.
> Other improvements:
> 1. It runs every 1/2 of the write timeout and replays all batches written within 0.9 * write timeout from now. This way we ensure that batched updates will be replayed by the moment the client times out from the coordinator.
> 2. It submits all mutations from a single batch in parallel (as StorageProxy does). The old implementation played them one by one, so a client could see half-applied batches in the CF for a long time (depending on the size of the batch).
> 3. It fixes a subtle race bug with incorrect hint TTL calculation.
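
For illustration, the partitioning scheme described above can be sketched in Java roughly as follows. This is only a sketch under stated assumptions, not the code from the attached BatchlogManager.txt: the class and method names (BatchlogPartitionSketch, idPartition, replayWindow, timeUuidAt) are hypothetical, and it assumes id is a version 1 (time-based) UUID, so that java.util.UUID.timestamp() can be used.

```java
import java.util.UUID;

// Hypothetical helper class; names are illustrative, not from the patch.
class BatchlogPartitionSketch {
    // Offset between the UUID epoch (1582-10-15) and the Unix epoch,
    // expressed in 100-nanosecond units.
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    // Unix time in milliseconds encoded in a version 1 UUID.
    static long unixMillis(UUID timeUuid) {
        return (timeUuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000L;
    }

    // id_partition: minutes since the Unix epoch of the batch id.
    static int idPartition(UUID timeUuid) {
        return (int) (unixMillis(timeUuid) / 60_000L);
    }

    // Slice bounds for one replay cycle, in Unix milliseconds:
    // from the last processed time until now minus the write timeout.
    static long[] replayWindow(long lastProcessedMillis, long nowMillis, long writeTimeoutMillis) {
        return new long[] { lastProcessedMillis, nowMillis - writeTimeoutMillis };
    }

    // Builds a version 1 UUID for a given Unix time (assumption: clock
    // sequence and node are zero; only useful for demonstration).
    static UUID timeUuidAt(long unixMillis) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET;
        long msb = ((ts & 0xFFFFFFFFL) << 32)        // time_low
                 | (((ts >>> 32) & 0xFFFFL) << 16)   // time_mid
                 | 0x1000L                           // version 1
                 | ((ts >>> 48) & 0x0FFFL);          // time_hi
        long lsb = 0x8000000000000000L;              // IETF variant
        return new UUID(msb, lsb);
    }
}
```

With this layout, all batches written during the same minute share one id_partition value, so a replay cycle can read a slice of ids from one partition instead of scanning the whole CF.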

