cassandra-user mailing list archives

From Alain Vandendorpe <al...@tapstream.com>
Subject Deduplicating data on a node (RF=1)
Date Mon, 17 Nov 2014 20:04:35 GMT
Hey all,

For legacy reasons we're running Cassandra 2.0.10 in an RF=1 setup, which
we are moving away from ASAP. In the meantime, we recently added a node and
it hit a Stream Failed error (http://pastie.org/9725846). Cassandra
restarted and seemingly restarted streaming from zero, without removing
the data left by the failed stream.

Now that bootstrapping and the initial compactions have finished, that node
has what seems to be duplicate data, with almost exactly 2x the expected
disk usage. CQL returns correct results, but we depend on being able to
read the SSTable files directly (which is also why we run RF=1).

Would anyone have suggestions on a good way to resolve this?

Thanks,
Alain
