incubator-cassandra-user mailing list archives

From Jeremiah Jordan <jeremiah.jor...@morningstar.com>
Subject Re: gracefully recover from data file corruptions
Date Fri, 16 Dec 2011 17:16:57 GMT
You need to run repair on the node once it is back up, to get back the
data you just deleted. If this is happening on more than one node, you
could have data loss.

-Jeremiah

On 12/16/2011 07:46 AM, Ramesh Natarajan wrote:
> We are running a 30-node Cassandra 1.0.5 cluster on RHEL 5.6 x86_64,
> virtualized on ESXi 5.0. We are seeing DecoratedKey assertion errors
> during compactions, and at this point we suspect anything from the OS
> to ESXi, the HBA, or the iSCSI RAID. Please correct me if I am wrong:
> once a node gets into this state, I don't see any way to recover
> unless I remove the corrupted data file and restart Cassandra. I am
> running tests with replication factor 3, and all reads and writes are
> done with QUORUM, so I believe there will be no data loss if I do this.
>
> If this is the correct way to recover, I would like to know how to
> do it gracefully in a production environment:
>
> - Disable Thrift
> - Disable gossip
> - Drain the node
> - Kill the Cassandra Java process (send a SIGTERM and/or SIGKILL)
> - Do a filesystem sync
> - Remove the corrupted file from the /var/lib/cassandra/data directory
> - Start Cassandra
> - Enable gossip so that all pending hinted handoff occurs
> - Enable Thrift
>
> Thanks
> Ramesh
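
For reference, the procedure discussed in this thread can be written down as
an ordered plan. The sketch below is only an illustration, not a tested
runbook: it builds the list of commands without running any of them. The
function name recovery_plan, the host argument, and the file path placeholder
are assumptions made for this example. How the Cassandra process is stopped
and started depends on the installation, so those two steps are left as
manual steps. The final repair step comes from the reply above, not from the
original list.

```python
def recovery_plan(host, corrupt_file):
    """Return the recovery steps as (description, argv) pairs.

    Nothing is executed here. argv is None for steps that depend on how
    Cassandra was installed (stopping and starting the process).
    """
    nodetool = ["nodetool", "-h", host]
    return [
        ("Disable Thrift so clients stop sending requests", nodetool + ["disablethrift"]),
        ("Disable gossip", nodetool + ["disablegossip"]),
        ("Drain the node (flush memtables)", nodetool + ["drain"]),
        ("Stop the Cassandra process (site-specific, e.g. SIGTERM)", None),
        ("Filesystem sync", ["sync"]),
        ("Remove the corrupted data file", ["rm", "--", corrupt_file]),
        ("Start Cassandra (site-specific)", None),
        ("Enable gossip", nodetool + ["enablegossip"]),
        ("Enable Thrift", nodetool + ["enablethrift"]),
        # Added per the reply: repair restores the data removed with the file.
        ("Run repair on the node", nodetool + ["repair"]),
    ]


def format_plan(plan):
    """Render the plan as numbered lines for review before running by hand."""
    lines = []
    for number, (description, argv) in enumerate(plan, start=1):
        command = " ".join(argv) if argv is not None else "(manual step)"
        lines.append(f"{number}. {description}: {command}")
    return lines


if __name__ == "__main__":
    # Hypothetical host and file name, used only to show the output shape.
    for line in format_plan(recovery_plan("localhost", "/path/to/corrupt-Data.db")):
        print(line)
```

Keeping the plan as data rather than running it directly makes it possible to
review each command before touching a production node.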
