incubator-cassandra-dev mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Notes from committers meeting: streaming and repair
Date Mon, 25 Feb 2013 23:09:19 GMT
There was broad agreement that repair is a weak point for us.  One
committer administers a cluster that takes about a month to complete a
full repair.  This is too long.  Pavel went further: "repair is
unusable."  It is worth noting that these clusters are on 1.1.x, and
[1] makes a big difference in 1.2.  Still, there is clearly plenty of
room for improvement, both in streaming (such as streaming ranges
intelligently to avoid creating many small sstables, which in turn
cause a compaction spike on the recipient [2]) and in repair itself.

Some improvements to repair could include:

* Making the repair coordinator smart enough to avoid duplicate
streaming.  E.g., if replicas A and B have row X, but C does
not, currently both A and B will stream to C.
* Dynamic adjustment of Merkle tree precision [3]
* Track "known-to-be-in-sync data" and avoid re-validating that part.
We've had a couple of proposals for this; implementation complexity
aside, though, I'm not sure that it's worth "hiding" the worst case
(when you really do need to rebuild a node) and giving operators a
false sense of security (i.e., during the rebuild performance will
degrade and cause surprise and chagrin).
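
To make the duplicate-streaming point concrete, here is a toy sketch of
the difference between the current behavior and a coordinator that picks
one source per out-of-sync replica.  It is not Cassandra code: the
function names (naive_streams, planned_streams) and the row-level
"holders" map are hypothetical, and real repair compares Merkle trees
over token ranges rather than individual rows.

```python
from typing import Dict, List, Set, Tuple

# (source replica, target replica, row) -- hypothetical transfer record
Transfer = Tuple[str, str, str]


def naive_streams(holders: Dict[str, Set[str]], replicas: List[str]) -> List[Transfer]:
    """Every replica that has a row streams it to every replica that lacks it."""
    plan = []
    for row in sorted(holders):
        have = holders[row]
        for source in replicas:
            if source not in have:
                continue
            for target in replicas:
                if target not in have:
                    plan.append((source, target, row))
    return plan


def planned_streams(holders: Dict[str, Set[str]], replicas: List[str]) -> List[Transfer]:
    """Pick exactly one source per (target, row), rotating among the holders."""
    plan = []
    for row in sorted(holders):
        have = holders[row]
        sources = [r for r in replicas if r in have]
        if not sources:
            continue  # nobody has the row, nothing to stream
        targets = [r for r in replicas if r not in have]
        for i, target in enumerate(targets):
            # Rotate so one holder does not serve every target.
            plan.append((sources[i % len(sources)], target, row))
    return plan


if __name__ == "__main__":
    replicas = ["A", "B", "C"]
    holders = {"X": {"A", "B"}}  # A and B have row X, C does not
    print(naive_streams(holders, replicas))
    print(planned_streams(holders, replicas))
```

With A and B holding row X and C missing it, the naive plan has two
transfers to C and the planned one has a single transfer.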

[1] https://issues.apache.org/jira/browse/CASSANDRA-4297
[2] https://issues.apache.org/jira/browse/CASSANDRA-5286
[3] https://issues.apache.org/jira/browse/CASSANDRA-5263

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced
