cassandra-user mailing list archives

From Benjamin Coverston <ben.covers...@datastax.com>
Subject Re: Cluster not starting up
Date Fri, 04 Mar 2011 18:10:19 GMT
The EOF exception looks like CASSANDRA-1992. If that is the problem, 
the scrub tool in 0.7.3 will resolve it.

That release is being voted on right now.

HTH,
Ben

On 3/4/11 10:32 AM, Matt Kennedy wrote:
> I'm currently the proud owner of an 8-node cluster that won't start up.
>
> Yesterday a developer was doing very high-volume writes to our 
> cluster via a Hadoop job that read an HDFS file, ran six concurrent 
> mappers on each of the 8 nodes, and used Hector to do the load. It 
> sort of killed Cassandra. The cluster was running 0.7.0, and the job 
> actually killed three of the nodes with OutOfMemory errors before he 
> realized something was awry and killed the job. He then tried to get 
> rid of the keyspace by dropping it in the CLI and got the following 
> error:
>
> javax.management.InstanceAlreadyExistsException: org.apache.cassandra.db:type=ColumnFamilies,keyspace=devks,columnfamily=OriginCF
>
> So he punted to me, and I decided to just try restarting the cluster 
> in the hopes that it would sort itself out.  The nodes that were still 
> up died gracefully with the stop-server command, no kill -9s 
> required.  But when I tried to start the nodes again, they all failed 
> with stack traces.
>
> My googling led me to this: 
> https://issues.apache.org/jira/browse/CASSANDRA-2197
>
> So I upgraded to 0.7.2 and tried restarting. Once again all the nodes 
> fail, each with one of two different stack traces, and both types 
> occur immediately after an INFO message of the form:
>
> INFO 12:06:26,979 Finished reading /path/to/commitlog/etc/CommitLog-NNNNNNNN.log
>
> The stack traces are one of:
>
> Exception encountered during startup.
> java.io.IOError: java.io.EOFException
>     at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:246)
> ...
>
> or
>
> Exception encountered during startup.
> java.lang.NullPointerException
>     at org.apache.cassandra.db.Table.createReplicationStrategy(Table.java:318)
> ...
>
> Fortunately, I have the luxury of clearing out the data in the 
> cluster, but I'd like a more elegant option than that.  Anybody have 
> any suggestions?
>
> Thanks,
> Matt

-- 
Ben Coverston
DataStax -- The Apache Cassandra Company
http://www.datastax.com/

