cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6134) More efficient BatchlogManager
Date Wed, 19 Feb 2014 14:18:21 GMT


Jonathan Ellis commented on CASSANDRA-6134:

Are you planning to pick this back up, [~m0nstermind]?

> More efficient BatchlogManager
> ------------------------------
>                 Key: CASSANDRA-6134
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Oleg Anastasyev
>            Assignee: Oleg Anastasyev
>            Priority: Minor
>         Attachments: BatchlogManager.txt
> As discussed earlier in CASSANDRA-6079, this is the new BatchlogManager.
> It stores batch records in:
> {code}
> CREATE TABLE batchlog (
>   id_partition int,
>   id timeuuid,
>   data blob,
>   PRIMARY KEY (id_partition, id)
> );
> {code}
> where id_partition is the minutes-since-epoch value of the id timeuuid.
> So when it scans for batches to replay, it scans within a single partition for a slice of ids from the last processed date until now minus the write timeout.
> So neither a full batchlog CF scan nor a lot of random reads happen on a normal cycle.
> Other improvements:
> 1. It runs every 1/2 of the write timeout and replays all batches written within 0.9 * write timeout from now. This way we ensure that batched updates will be replayed by the moment the client times out from the coordinator.
> 2. It submits all mutations from a single batch in parallel (as StorageProxy does). The old implementation played them one by one, so a client could see half-applied batches in the CF for a long time (depending on the size of the batch).
> 3. It fixes a subtle race bug with incorrect hint TTL calculation.
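
For illustration only, the bucketing described in the quoted issue could look roughly like the Java sketch below. It is not taken from the attached patch: the class name BatchlogBuckets and all of its methods are hypothetical, and it assumes that id is a standard version 1 (time-based) UUID whose timestamp counts 100-nanosecond intervals since 1582-10-15. It derives id_partition as minutes since the Unix epoch and computes the upper bound of the replay slice as "now minus write timeout".

```java
import java.util.UUID;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Cassandra: shows how id_partition could be
// derived from a timeuuid and how the replay slice bound could be computed.
final class BatchlogBuckets {
    // Number of 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    private BatchlogBuckets() {}

    // Unix time in milliseconds encoded in a version 1 UUID.
    // UUID.timestamp() throws UnsupportedOperationException for other versions.
    static long unixMillis(UUID timeUuid) {
        return (timeUuid.timestamp() - UUID_EPOCH_OFFSET) / 10_000L;
    }

    // id_partition as described in the issue: minutes since the Unix epoch of the id.
    static int idPartition(UUID timeUuid) {
        return (int) TimeUnit.MILLISECONDS.toMinutes(unixMillis(timeUuid));
    }

    // Upper bound of the slice to replay: now minus the write timeout.
    static long replayUpperBoundMillis(long nowMillis, long writeTimeoutMillis) {
        return nowMillis - writeTimeoutMillis;
    }

    // Demo-only constructor of a version 1 UUID for a given Unix time.
    // Clock sequence and node are zero, so it is not suitable for generating real ids.
    static UUID timeUuidAt(long unixMillis) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET;
        long timeLow = (ts & 0xFFFFFFFFL) << 32;
        long timeMid = ((ts >>> 32) & 0xFFFFL) << 16;
        long versionAndTimeHi = 0x1000L | ((ts >>> 48) & 0x0FFFL);
        long msb = timeLow | timeMid | versionAndTimeHi;
        long lsb = 0x8000000000000000L; // RFC 4122 variant bits, everything else zero
        return new UUID(msb, lsb);
    }
}
```

Under these assumptions, a replay cycle would read the slice of ids in partition idPartition(...) between the last processed time and replayUpperBoundMillis(now, writeTimeout), instead of scanning the whole batchlog CF.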

