cassandra-commits mailing list archives

From "Branimir Lambov (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-12148) Improve determinism of CDC data availability
Date Tue, 26 Jul 2016 12:35:20 GMT


Branimir Lambov commented on CASSANDRA-12148:

bq. parse CDC data that's never actually persisted to disk

Thanks, that's the answer I was looking for. Can you state this explicitly somewhere in the
documentation, and perhaps also in the {{writeCDCIndexFile}} JavaDoc?

> Improve determinism of CDC data availability
> --------------------------------------------
>                 Key: CASSANDRA-12148
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Joshua McKenzie
>            Assignee: Joshua McKenzie
> The latency with which CDC data becomes available has a known limitation: we rely on
CommitLogSegments being discarded before the data is available in cdc_raw. If a slowly written
table shares a CommitLogSegment with CDC data, the CommitLogSegment won't be flushed
until we hit either memory pressure on memtables or CommitLog limit pressure. Ultimately,
this leaves a non-deterministic element to when data becomes available for CDC consumption
unless a consumer parses live CommitLogSegments.
> To work around this limitation and make semi-realtime CDC consumption more friendly to
end-users, I propose we extend CDC as follows:
> h6. High level:
> * Consumers parse hard links of active CommitLogSegments in cdc_raw instead of waiting
for flush/discard and file move
> * C* stores an offset of the highest seen CDC mutation in a separate idx file per commit
log segment in cdc_raw. Clients tail this index file, delta their local last parsed offset
on change, and parse the corresponding commit log segment using their last parsed offset as
> * C* flags that index file with an offset and DONE when the file is flushed so clients
know when they can clean up
> h6. Details:
> * On creation of a CommitLogSegment, also hard-link the file in cdc_raw
> * On first write of a CDC-enabled mutation to a segment, we:
> ** Flag it as {{CDCState.CONTAINS}}
> ** Set a long tracking the {{CommitLogPosition}} of the 1st CDC-enabled mutation in the
> ** Set a long in the CommitLogSegment tracking the offset of the end of the last written
CDC mutation in the segment if higher than the previously known highest CDC offset
> * On subsequent writes to the segment, we update the offset of the highest known CDC
> * On CommitLogSegment fsync, we write a file in cdc_raw as <segment_name>_cdc.idx
containing the min offset and end offset fsynced to disk per file
> * On segment discard, if CDCState == {{CDCState.PERMITTED}}, delete both the segment
in commitlog and in cdc_raw
> * On segment discard, if CDCState == {{CDCState.CONTAINS}}, delete the segment in commitlog
and update the <segment_name>_cdc.idx file w/end offset and a DONE marker
> * On segment replay, store the highest end offset of seen CDC-enabled mutations from
a segment and write that to <segment_name>_cdc.idx on completion of segment replay.
This should bridge the potential correctness gap of a node writing to a segment and then dying
before it can write the <segment_name>_cdc.idx file.
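
A minimal Java sketch of the per-segment bookkeeping described in the list above may help make the state transitions concrete. The class name {{CdcSegmentTracker}} and all of its methods are hypothetical and are not taken from the Cassandra code base; only the state names {{PERMITTED}} and {{CONTAINS}} come from the proposal, and any other states are left out.

```java
/** Hypothetical sketch of per-segment CDC tracking; not Cassandra's actual code. */
public class CdcSegmentTracker {
    /** Only the two states named in the proposal are modeled here. */
    public enum CDCState { PERMITTED, CONTAINS }

    private CDCState state = CDCState.PERMITTED;
    private long firstCdcOffset = -1;      // -1 means no CDC mutation written yet
    private long highestCdcEndOffset = -1; // end of the last written CDC mutation

    /** Record a CDC-enabled mutation that starts at startOffset and ends at endOffset. */
    public synchronized void onCdcWrite(long startOffset, long endOffset) {
        if (state == CDCState.PERMITTED) {
            // First CDC write to this segment: flag it and remember where CDC data begins.
            state = CDCState.CONTAINS;
            firstCdcOffset = startOffset;
        }
        // Only move the end offset forward, never backward.
        highestCdcEndOffset = Math.max(highestCdcEndOffset, endOffset);
    }

    /** On discard, a segment that never saw CDC data can lose its cdc_raw hard link too. */
    public synchronized boolean shouldDeleteCdcLinkOnDiscard() {
        return state == CDCState.PERMITTED;
    }

    public synchronized CDCState state() { return state; }
    public synchronized long firstCdcOffset() { return firstCdcOffset; }
    public synchronized long highestCdcEndOffset() { return highestCdcEndOffset; }
}
```

In this sketch, a segment in {{CONTAINS}} would keep its cdc_raw link on discard, and the caller would then update the idx file with the end offset and the DONE marker.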
> This should allow clients to skip the beginning of a file to the 1st CDC mutation, track
an offset of how far they've parsed, delta against the _cdc.idx file end offset, and use that
as a determinant on when to parse new CDC data. Any existing clients written to the initial
implementation of CDC need only add the <segment_name>_cdc.idx logic and checking for
DONE marker to their code, so the burden on users to update to support this should be quite
small for the benefit of having data available as soon as it's fsynced instead of at a non-deterministic
time when potentially unrelated tables are flushed.
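
As a rough illustration of the client-side logic described above, the following Java sketch parses the content of a <segment_name>_cdc.idx file and compares it with the client's last parsed offset. The on-disk layout assumed here (end offset on the first line, an optional DONE on the second) is an assumption made for the example, as are the class and method names; the proposal does not fix the exact format.

```java
/** Hypothetical client-side helper; the idx layout assumed here is not specified by the proposal. */
public class CdcIndexSketch {
    /** Parsed view of an idx file: the fsynced end offset and whether the segment is finished. */
    public static final class IndexState {
        public final long endOffset;
        public final boolean done;

        public IndexState(long endOffset, boolean done) {
            this.endOffset = endOffset;
            this.done = done;
        }
    }

    /**
     * Parse idx content, ASSUMING line 1 holds the end offset and an optional line 2 holds DONE.
     * Throws NumberFormatException if the first line is empty or not a number.
     */
    public static IndexState parse(String content) {
        String[] lines = content.trim().split("\\R");
        long endOffset = Long.parseLong(lines[0].trim());
        boolean done = lines.length > 1 && "DONE".equals(lines[1].trim());
        return new IndexState(endOffset, done);
    }

    /** Bytes of commit log that became available since the client's last parsed offset. */
    public static long newBytes(IndexState idx, long lastParsedOffset) {
        return Math.max(0L, idx.endOffset - lastParsedOffset);
    }

    /** The client can clean up once the segment is DONE and everything up to the end is parsed. */
    public static boolean canCleanUp(IndexState idx, long lastParsedOffset) {
        return idx.done && lastParsedOffset >= idx.endOffset;
    }
}
```

A consumer would re-read the idx file whenever it changes, parse the segment from its last parsed offset when {{newBytes}} is positive, and remove its local state for the segment once {{canCleanUp}} returns true.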
> Finally, we should look into extending the interface on CommitLogReader to be more friendly
for realtime parsing, perhaps supporting taking a CommitLogDescriptor and RandomAccessReader
and resuming readSection calls, assuming the reader is at the start of a SyncSegment. It would
probably also need to rewind to the start of the segment before returning so that subsequent calls
respect this contract. This would avoid having to deserialize the descriptor and all
completed SyncSegments to get to the root of the desired segment for parsing.
> One alternative we discussed offline - instead of just storing the highest seen CDC offset,
we could instead store an offset per CDC mutation (potentially delta encoded) in the idx file
to allow clients to seek and only parse the mutations with CDC enabled. My hunch is that the
performance delta from doing so wouldn't justify the complexity given the SyncSegment deserialization
and seeking restrictions in the compressed and encrypted cases as mentioned above.
> The only complication I can think of with the above design is uncompressed mmapped CommitLogSegments
on Windows being undeletable, but it'd be pretty simple to disallow configuration of CDC w/uncompressed
CommitLog on that environment.
> And as a final note: while the above might sound involved, it really shouldn't be a big
change from where we are with v1 of CDC, either from a C* complexity or code perspective or from
a client implementation perspective.

This message was sent by Atlassian JIRA
