cassandra-commits mailing list archives

From "Joseph Lynch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13924) Continuous/Infectious Repair
Date Thu, 09 Nov 2017 21:53:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16246622#comment-16246622 ]

Joseph Lynch commented on CASSANDRA-13924:
------------------------------------------

Sure, I can certainly start working on the part that tracks mutations that are fully acked
first since that's useful whether we do the back half (read repair) or not. If we can get
the repair service re-written for 4.0 so that incrementals are automatically scheduled by
Cassandra itself, that would fulfill much of the second half of this idea (continuous, always
on repair). My only concern with keeping the markers in memory is that we can't re-create
them on a crash / shutdown (without drain), so you'd end up with a bunch of unrepaired tables,
but that's probably worth the performance tradeoff.
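
To make the "fully acked" tracking concrete, here is a minimal Python sketch of
in-memory markers. It is only an illustration under stated assumptions: the
AckTracker class, its method names, and the replica labels are hypothetical and
do not correspond to any existing Cassandra code.

```python
class AckTracker:
    """Hypothetical in-memory tracker of which partition keys still need repair.

    A partition key is considered clean only when every write to it has been
    acknowledged by every replica. Nothing here is persisted.
    """

    def __init__(self, replicas):
        self.replicas = frozenset(replicas)
        self._next_id = 0
        self._pending = {}   # write_id -> (pk, set of replicas that acked)
        self._unacked = {}   # pk -> number of writes not yet fully acked

    def record_write(self, pk):
        """Register a new mutation and return an id used to match its acks."""
        write_id = self._next_id
        self._next_id += 1
        self._pending[write_id] = (pk, set())
        self._unacked[pk] = self._unacked.get(pk, 0) + 1
        return write_id

    def record_ack(self, write_id, replica):
        """Record one replica's ack; clear the marker once all have acked."""
        entry = self._pending.get(write_id)
        if entry is None:
            return  # unknown or already fully acked write
        pk, acks = entry
        acks.add(replica)
        if acks >= self.replicas:
            del self._pending[write_id]
            self._unacked[pk] -= 1
            if self._unacked[pk] == 0:
                del self._unacked[pk]

    def needs_repair(self):
        """Partition keys with at least one write that is not fully acked."""
        return set(self._unacked)
```

Because the state lives only in memory, a freshly constructed tracker after a
crash knows nothing about earlier writes, which is the tradeoff described above.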

If we do decide that read-repairing is faster/safer than a run of incremental repair, then
I can just make something which reads the partitions out of the non repaired tables, reads
@ ALL, and marks the table as repaired (or does an anticompaction), etc., but that can be later
work if we like that. My main reason for advocating the read @ ALL technique is that
reading the data to calculate a Merkle tree seems potentially harder to manage from a
consistent-performance perspective than just reading the data.
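
As a rough illustration of the read @ ALL idea, the Python sketch below models
each replica as a dict mapping a partition key to a (timestamp, value) pair. The
function names and the data model are assumptions made for this example, not
Cassandra APIs, and timestamp ties are resolved arbitrarily here for brevity.

```python
def read_repair_at_all(pk, replicas):
    """Read pk from every replica and push the newest version to stale ones.

    Returns False if any replica is unavailable (modelled as None), since a
    read at ALL cannot succeed in that case.
    """
    if any(r is None for r in replicas):
        return False
    present = [r[pk] for r in replicas if pk in r]
    if not present:
        return True  # no replica has data for pk, nothing to reconcile
    newest = max(present, key=lambda version: version[0])
    for r in replicas:
        if r.get(pk) != newest:
            r[pk] = newest  # write the winning version back
    return True


def repair_unrepaired(unrepaired_pks, replicas):
    """Repair each unrepaired key; return the keys that remain unrepaired."""
    return {pk for pk in unrepaired_pks
            if not read_repair_at_all(pk, replicas)}
```

Only the keys that were actually mutated are read, so the cost scales with the
write rate rather than with the size of the dataset.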

I'll start working on the mutation tracking bit and the flushing part.

> Continuous/Infectious Repair
> ----------------------------
>
>                 Key: CASSANDRA-13924
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13924
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Repair
>            Reporter: Joseph Lynch
>            Priority: Minor
>              Labels: CommunityFeedbackRequested
>
> I've been working on a way to keep data consistent without scheduled/external/manual
repair, because for large datasets repair is extremely expensive. The basic gist is to introduce
a new kind of hint that keeps just the primary key of the mutation (indicating that PK needs
repair) and is recorded on replicas instead of coordinators during write time. Then a periodic
background task can issue read repairs to just the PKs that were mutated. The initial performance
degradation of this approach is non trivial, but I believe that I can optimize it so that
we are doing very little additional work (see below in the design doc for some proposed optimizations).
> My extremely rough proof of concept (uses a local table instead of HintStorage, etc)
so far is [in a branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair]
and has a rough [design document|https://github.com/jolynch/cassandra/blob/continuous_repair/doc/source/architecture/continuous_repair.rst].
I'm working on getting benchmarks of the various optimizations, but I figured I should start
this ticket before I got too deep into it.
> I believe this approach is particularly good for high read rate clusters requiring consistent
low latency, and for clusters that mutate a relatively small proportion of their data (since
you never have to read the whole dataset, just what's being mutated). I view this as something
that works _with_ incremental repair to reduce work required because with this technique we
could potentially flush repaired + unrepaired sstables directly from the memtable. I also
see this as something that would be enabled or disabled per table since it is so use case
specific (e.g. some tables don't need repair at all). I think this is somewhat of a hybrid
approach based on incremental repair, ticklers (read all partitions @ ALL), mutation based
repair (CASSANDRA-8911), and hinted handoff. There are lots of tradeoffs, but I think it's
worth talking about.
> If anyone has feedback on the idea, I'd love to chat about it. [~bdeggleston], [~aweisberg]
I chatted with you guys a bit about this at NGCC; if you have time I'd love to continue that
conversation here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

