cassandra-user mailing list archives

From Ross Black
Subject data dropped when using sstableloader?
Date Wed, 27 Nov 2013 09:12:12 GMT

Using Cassandra 1.2.10, I am trying to load sstable data into a cluster of
6 machines.
The machines use vnodes. The keyspace is configured with
NetworkTopologyStrategy and a replication factor of 3, and the tables
being loaded use LeveledCompactionStrategy.
The sstable data was generated using SSTableSimpleUnsortedWriter.
The small dataset for one table is ~100GB, the large dataset for another
table is ~500GB.

The data was loaded using:
    sstableloader --nodes ihz58,ihz59,ihz60,ihz61,ihz62,ihz63 --verbose
and was run on a machine that was not part of the cluster.

After loading the data using sstableloader, I discovered that some rows
were missing from Cassandra.  I dumped the sstables using sstable2json and
could see the missing rows in the generated data.
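
A minimal sketch of that comparison, for anyone who wants to reproduce it.
It assumes the sstable2json output is a JSON array of objects that each
carry a "key" field, and that the keys actually readable from the cluster
have already been collected by some other means. The function names and
the sample data are hypothetical.

```python
import json

def keys_in_dump(dump_text):
    """Return the set of row keys found in one sstable2json dump.

    Assumes the dump is a JSON array of objects, each with a "key" field.
    """
    rows = json.loads(dump_text)
    return {row["key"] for row in rows}

def missing_keys(dump_texts, keys_in_cluster):
    """Return, sorted, the keys present in the dumps but absent from the cluster."""
    expected = set()
    for text in dump_texts:
        expected |= keys_in_dump(text)
    return sorted(expected - set(keys_in_cluster))

# Hypothetical sample: two rows in the generated data, one readable from the cluster.
sample = '[{"key": "6b31", "columns": [["c", "76", 1]]}, {"key": "6b32", "columns": []}]'
print(missing_keys([sample], ["6b31"]))  # prints ['6b32']
```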

Over time the list of missing rows shrank, but it has not changed for
several days now.  It is more than a week since I first loaded the data.
I have tried flushing all the nodes, restarting all machines, and running a
repair, but nothing changes the set of missing rows.

Is there anything I have done wrong here that could result in lost data?

