cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9673) Improve batchlog write path
Date Wed, 26 Aug 2015 06:53:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712610#comment-14712610
] 

Stefania commented on CASSANDRA-9673:
-------------------------------------

Thanks for all your work, it's in much better shape now.

bq. As for migration, my preference would be to make it happen at the borders only. So I'd
ask you again to move this live upgrade batchlog mutation conversion to MutationVerbHandler,
and away from Keyspace::apply. The latter is just too deep.

Done. I was trying to limit the impact so I pushed it down until I found the loop on the partition
updates. I've added a check on the message version though, so we should be OK.

bq. A standalone test for LegacyBatchMigrator::converBatchEntries would be nice.

{{BatchLogManager.testReplay}} tests this (now that we moved the conversion to {{MutationVerbHandler}})

bq. I would also prefer to move the whole package to o.a.c.batchlog instead of o.a.c.db.batch.
o.a.c.db is overpopulated, and, frankly, meaningless. I've been depopulating it slowly - with
schema and hints being separate top-level packages already.

Done.

--

Hinted handoff dtests were broken so I reverted vints in {{HintsMessage}}. I think the hint
size must match what's stored on disk? If I am correct then I believe {{HintsBuffer.ENTRY_OVERHEAD_SIZE}}
also must change, since it wasn't totally trivial to fix this I reverted {{HintsMessage}}.
I did not revert {{Hint}} however, up to you if you want to fix everything here or move to
another ticket.

Regarding {{MaterializedViewLongTest}}, I ran it in a loop for 10 times and it failed once
out of 10 times on both unpatched 3.0 and patched 3.0. There is one obvious problem in that
the test is still looking at _system.batchlog_ rather than _system.batches_, I fixed that.
I also think we need to wait for the ordinary mutations to be applied in the test as well,
so that means the ordinary MUTATION stage. Finally shouldn't the mutations with only a local
endpoint in SP.mutateMV be applied in MATERIALIZED_VIEW_MUTATION stage? See tentative fix
[here|https://github.com/stef1927/cassandra/commit/7d772ebc731524a3cdf08225451e8022bc98b7be].
With the fix applied the test ran 20 times successfully but this doesn't necessarily mean
it is OK now. Also, the change in SP may not be needed, provided we wait for the MUTATION
stage as well in the test (I did not test without the changes to SP).

> Improve batchlog write path
> ---------------------------
>
>                 Key: CASSANDRA-9673
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9673
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Stefania
>              Labels: performance
>             Fix For: 3.0 beta 2
>
>         Attachments: 9673_001.tar.gz, 9673_004.tar.gz, gc_times_first_node_patched_004.png,
gc_times_first_node_trunk_004.png
>
>
> Currently we allocate an on-heap {{ByteBuffer}} to serialize the batched mutations into,
before sending it to a distant node, generating unnecessary garbage (potentially a lot of
it).
> With materialized views using the batchlog, it would be nice to optimise the write path:
> - introduce a new verb ({{Batch}})
> - introduce a new message ({{BatchMessage}}) that would encapsulate the mutations, expiration,
and creation time (similar to {{HintMessage}} in CASSANDRA-6230)
> - have MS serialize it directly instead of relying on an intermediate buffer
> To avoid merely shifting the temp buffer to the receiving side(s) we should change the
structure of the batchlog table to use a list or a map of individual mutations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message