cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9673) Improve batchlog write path
Date Fri, 28 Aug 2015 03:08:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717983#comment-14717983 ]

Stefania commented on CASSANDRA-9673:
-------------------------------------

bq. LegacyBatchlogMigrator should log at INFO, conditional on legacy batchlog table being
empty

Do you mean only during migration or all the time? Also, when would we check that the table
is empty: before migrating, after migrating or every time?

bq. LegacyBatchlogMigrator should try and calculate page size from sstable stats (like LegacyHintsMigrator)

I've applied the logic already available in {{BatchlogManager}}, which is not the same as
{{LegacyHintsMigrator}}. Let me know if the latter is preferable.

bq. Also, there is a bug in LBM::apply: we are calling Batch::createLocal using the current
timestampMicros, whereas we should be using the original create time. Otherwise we risk
resurrecting expired batches here, because of an overly fresh creationTime.

Is it sufficient to multiply the timestamp by 1000 or do we need to dig it out from the partition
update? Not sure how to do this via {{QueryProcessor.executeInternalWithPaging}}.
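
To make the concern concrete, here is a minimal sketch of why the creation time matters. It is not the actual {{LegacyBatchlogMigrator}} or {{Batch}} code: the class, the method names and the expiry rule are hypothetical, and it assumes the legacy timestamp is in milliseconds while the new one is in microseconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class BatchCreationTimeSketch
{
    // Assumption: the legacy batchlog timestamp is in milliseconds,
    // so converting to microseconds is a multiplication by 1000.
    static long millisToMicros(long millis)
    {
        return TimeUnit.MILLISECONDS.toMicros(millis);
    }

    // Assumed expiry rule: a batch counts as expired once its age
    // exceeds some window. The real rule may differ.
    static boolean isExpired(long creationTimeMicros, long nowMicros, long windowMicros)
    {
        return nowMicros - creationTimeMicros > windowMicros;
    }

    public static void main(String[] args)
    {
        long nowMicros = millisToMicros(System.currentTimeMillis());
        long windowMicros = TimeUnit.SECONDS.toMicros(10);

        // A legacy batch written one minute ago.
        long originalMicros = nowMicros - TimeUnit.MINUTES.toMicros(1);

        // Migrating with the original creation time keeps the batch expired.
        System.out.println(isExpired(originalMicros, nowMicros, windowMicros));

        // Migrating with the current time makes it look fresh again.
        System.out.println(isExpired(nowMicros, nowMicros, windowMicros));
    }
}
```

If the original timestamp can be recovered in milliseconds, the conversion itself is trivial; the open question is only where to read it from.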

Remaining points are done.

> Improve batchlog write path
> ---------------------------
>
>                 Key: CASSANDRA-9673
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Stefania
>              Labels: performance
>             Fix For: 3.0 beta 2
>
>         Attachments: 9673_001.tar.gz, 9673_004.tar.gz, gc_times_first_node_patched_004.png, gc_times_first_node_trunk_004.png
>
>
> Currently we allocate an on-heap {{ByteBuffer}} to serialize the batched mutations into, before sending it to a distant node, generating unnecessary garbage (potentially a lot of it).
> With materialized views using the batchlog, it would be nice to optimise the write path:
> - introduce a new verb ({{Batch}})
> - introduce a new message ({{BatchMessage}}) that would encapsulate the mutations, expiration, and creation time (similar to {{HintMessage}} in CASSANDRA-6230)
> - have MS serialize it directly instead of relying on an intermediate buffer
> To avoid merely shifting the temp buffer to the receiving side(s) we should change the structure of the batchlog table to use a list or a map of individual mutations.



