cassandra-commits mailing list archives

From "sankalp kohli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13924) Continuous/Infectious Repair
Date Thu, 09 Nov 2017 20:32:02 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16246470#comment-16246470
] 

sankalp kohli commented on CASSANDRA-13924:
-------------------------------------------

I like this idea but want to propose the following changes to it. 

We track in the memtable, at the partition level, which data has been replicated to all
replicas. This will require the coordinator to notify the replicas once a write has been
acked by all replicas.

We flush the memtable as separate sstables, one containing repaired data and one containing
unrepaired data. Incremental repair will take care of the unrepaired data.
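
To make the bookkeeping concrete, here is a rough sketch of per-partition tracking and a
split flush. It is only an illustration: TrackedMemtable, FlushResult, write,
markReplicatedToAll and the use of write ids are all made-up names and assumptions, not
existing Cassandra code, and partitions are reduced to plain string keys.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch only; none of these names are Cassandra APIs.
class TrackedMemtable {
    // Result of a flush: partition keys destined for each sstable.
    static final class FlushResult {
        final List<String> repaired = new ArrayList<>();
        final List<String> unrepaired = new ArrayList<>();
    }

    // Partition key -> ids of writes not yet acked by all replicas.
    private final Map<String, Set<Long>> unackedWrites = new TreeMap<>();
    private long nextWriteId = 1;

    // Records a write and returns an id the coordinator can ack later.
    long write(String partitionKey) {
        long id = nextWriteId++;
        unackedWrites.computeIfAbsent(partitionKey, k -> new HashSet<>()).add(id);
        return id;
    }

    // Coordinator callback: every replica has acked this write.
    void markReplicatedToAll(String partitionKey, long writeId) {
        Set<Long> pending = unackedWrites.get(partitionKey);
        if (pending != null) {
            pending.remove(writeId);
        }
    }

    // Splits the memtable contents by replication status and empties it.
    FlushResult flush() {
        FlushResult result = new FlushResult();
        for (Map.Entry<String, Set<Long>> e : unackedWrites.entrySet()) {
            if (e.getValue().isEmpty()) {
                result.repaired.add(e.getKey());
            } else {
                result.unrepaired.add(e.getKey());
            }
        }
        unackedWrites.clear();
        return result;
    }
}
```

A partition counts as repaired here only when every write to it has been acked, so a later
unacked write moves the whole partition back to the unrepaired side.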

Another optimization we can build on top of this is to flush only repaired data when we need
to flush, and keep unrepaired data in the memtable a little longer. This gives it more time
to get ACKed by the coordinator. The coordinator can also ack back to the replicas if hints
were successfully delivered.
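
A rough sketch of that deferred flush follows. Everything in it is an assumption made for
illustration (the DeferredFlushMemtable name, the maxHoldMillis bound, passing the clock in
explicitly); the comment above does not say how long unrepaired data should be held, so the
bound is only there to keep the memtable from growing forever.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these names are Cassandra APIs.
class DeferredFlushMemtable {
    private static final class Slot {
        final long writtenAtMillis;
        boolean replicatedToAll;
        Slot(long writtenAtMillis) { this.writtenAtMillis = writtenAtMillis; }
    }

    static final class Flush {
        final List<String> repaired = new ArrayList<>();
        final List<String> unrepaired = new ArrayList<>();
    }

    private final Map<String, Slot> slots = new LinkedHashMap<>();
    private final long maxHoldMillis; // assumed upper bound on holding unrepaired data

    DeferredFlushMemtable(long maxHoldMillis) {
        this.maxHoldMillis = maxHoldMillis;
    }

    // A new write resets the partition to "not yet replicated to all".
    void write(String partitionKey, long nowMillis) {
        slots.put(partitionKey, new Slot(nowMillis));
    }

    // Coordinator ack, sent after all replicas acked or after hints were delivered.
    void onCoordinatorAck(String partitionKey) {
        Slot slot = slots.get(partitionKey);
        if (slot != null) {
            slot.replicatedToAll = true;
        }
    }

    // Flushes repaired partitions now; unrepaired ones only once they are too old.
    Flush flush(long nowMillis) {
        Flush out = new Flush();
        Iterator<Map.Entry<String, Slot>> it = slots.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Slot> e = it.next();
            Slot slot = e.getValue();
            if (slot.replicatedToAll) {
                out.repaired.add(e.getKey());
                it.remove();
            } else if (nowMillis - slot.writtenAtMillis >= maxHoldMillis) {
                out.unrepaired.add(e.getKey());
                it.remove();
            }
        }
        return out;
    }

    int size() {
        return slots.size();
    }
}
```

With a one second hold, a partition acked before the first flush goes out as repaired, while
an unacked one stays in memory and is flushed as unrepaired only after the hold expires.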

> Continuous/Infectious Repair
> ----------------------------
>
>                 Key: CASSANDRA-13924
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13924
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Repair
>            Reporter: Joseph Lynch
>            Priority: Minor
>              Labels: CommunityFeedbackRequested
>
> I've been working on a way to keep data consistent without scheduled/external/manual
repair, because for large datasets repair is extremely expensive. The basic gist is to introduce
a new kind of hint that keeps just the primary key of the mutation (indicating that PK needs
repair) and is recorded on replicas instead of coordinators during write time. Then a periodic
background task can issue read repairs to just the PKs that were mutated. The initial performance
degradation of this approach is non trivial, but I believe that I can optimize it so that
we are doing very little additional work (see below in the design doc for some proposed optimizations).
> My extremely rough proof of concept (uses a local table instead of HintStorage, etc)
so far is [in a branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair]
and has a rough [design document|https://github.com/jolynch/cassandra/blob/continuous_repair/doc/source/architecture/continuous_repair.rst].
I'm working on getting benchmarks of the various optimizations, but I figured I should start
this ticket before I got too deep into it.
> I believe this approach is particularly good for high read rate clusters requiring consistent
low latency, and for clusters that mutate a relatively small proportion of their data (since
you never have to read the whole dataset, just what's being mutated). I view this as something
that works _with_ incremental repair to reduce work required because with this technique we
could potentially flush repaired + unrepaired sstables directly from the memtable. I also
see this as something that would be enabled or disabled per table since it is so use case
specific (e.g. some tables don't need repair at all). I think this is somewhat of a hybrid
approach based on incremental repair, ticklers (read all partitions @ ALL), mutation based
repair (CASSANDRA-8911), and hinted handoff. There are lots of tradeoffs, but I think it's
worth talking about.
> If anyone has feedback on the idea, I'd love to chat about it. [~bdeggleston], [~aweisberg]
I chatted with you guys a bit about this at NGCC; if you have time I'd love to continue that
conversation here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

