cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Piavlo <lolitus...@gmail.com>
Subject Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ
Date Fri, 18 May 2012 18:34:57 GMT
  Hi,

I had a schema disagreement problem in cassandra 1.0.9 cluster, where 
one node had different schema version.
So I followed the faq at 
http://wiki.apache.org/cassandra/FAQ#schema_disagreement
disabled gossip, disabled thrift, drained  and finally stopped the 
cassandra process, on startup
noticed
INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) 
Couldn't detect any schema definitions in local storage.
in the log, and after
INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
Bootstrap/Replace/Move completed! Now serving reads.
it started throwing Fatal exceptions for all read/write operations 
endlessly.

I had to stop cassandra process again(no draining was done)

On second start it did came up ok immediately loading the correct 
cluster schema version
INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) 
Loading schema version 9db34ef0-a0be-11e1-0000-f9687e034cf7

But now this node appears to have started with no data from keyspace 
which had schema disagreement.
The original keyspace sstables now appear under snapshots dir.

# nodetool -h localhost ring
Address         DC          Rack        Status State   Load            
Owns    Token
                                                                                141784319550391026443072753096570088106
10.49.127.4     eu-west     1a          Up     Normal  8.19 GB         
16.67%  0
10.241.29.65    eu-west     1b          Up     Normal  8.18 GB         
16.67%  28356863910078205288614550619314017621
10.59.46.236    eu-west     1c          Up     Normal  8.22 GB         
16.67%  56713727820156410577229101238628035242
10.50.33.232    eu-west     1a          Up     Normal  8.2 GB          
16.67%  85070591730234615865843651857942052864
10.234.71.33    eu-west     1b          Up     Normal  8.15 GB         
16.67%  113427455640312821154458202477256070485
10.58.249.118   eu-west     1c          Up     Normal  660.98 MB       
16.67%  141784319550391026443072753096570088106
#

The node is the one with 660.98 MB data( which is opscenter keyspace 
data which was not invalidated)

So i have some questions:

1) What did I wrong? - why cassandra was throwing exceptions on first 
startup?
2) Why the keyspace data was invalidated ? Is it expected?
3) If answer to #2 is  "yes it's expected" then  that's the point in 
doing http://wiki.apache.org/cassandra/FAQ#schema_disagreement
then all keyspace data is lost anyway? It makes more sense to just do 
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
4) afaiu i could also stop cassandra again move old sstables from 
snapshot back to keyspace data dir and run repair for all keyspace CFs? 
So that it finishes faster
and makes less load than running a repair which has no previous keyspace 
data at all?

The first startup log is below:

  INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 
105) Logging initialized
  INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 
126) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
  INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
127) Heap size: 2600468480/2600468480
  INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 
128) Classpath: 
/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/java/mx4j-tools.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.0.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra//lib/jamm-0.2.5.jar
  INFO [main] 2012-05-18 16:23:10,661 CLibrary.java (line 109) JNA 
mlockall successful
  INFO [main] 2012-05-18 16:23:10,692 DatabaseDescriptor.java (line 114) 
Loading settings from file:/etc/cassandra/ssa/cassandra.yaml
  INFO [main] 2012-05-18 16:23:10,868 DatabaseDescriptor.java (line 168) 
DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
  INFO [main] 2012-05-18 16:23:10,911 DatabaseDescriptor.java (line 233) 
Global memtable threshold is enabled at 826MB
  INFO [main] 2012-05-18 16:23:10,970 Ec2Snitch.java (line 63) EC2Snitch 
using region: eu-west, zone: 1c.
  INFO [SSTableBatchOpen:1] 2012-05-18 16:23:11,652 SSTableReader.java 
(line 153) Opening /var/lib/cassandra/data/system/Versions-hc-5 (247 bytes)
  INFO [SSTableBatchOpen:1] 2012-05-18 16:23:11,767 SSTableReader.java 
(line 153) Opening /var/lib/cassandra/data/system/IndexInfo-hc-2 (128 bytes)
  INFO [SSTableBatchOpen:2] 2012-05-18 16:23:11,768 SSTableReader.java 
(line 153) Opening /var/lib/cassandra/data/system/IndexInfo-hc-3 (124 bytes)
  INFO [SSTableBatchOpen:1] 2012-05-18 16:23:11,802 SSTableReader.java 
(line 153) Opening /var/lib/cassandra/data/system/IndexInfo-hc-1 (124 bytes)
  INFO [SSTableBatchOpen:1] 2012-05-18 16:23:11,825 SSTableReader.java 
(line 153) Opening /var/lib/cassandra/data/system/LocationInfo-hc-35 (89 
bytes)
  INFO [SSTableBatchOpen:2] 2012-05-18 16:23:11,825 SSTableReader.java 
(line 153) Opening /var/lib/cassandra/data/system/LocationInfo-hc-34 
(558 bytes)
  INFO [SSTableBatchOpen:1] 2012-05-18 16:23:11,835 SSTableReader.java 
(line 153) Opening 
/var/lib/cassandra/data/system/HintsColumnFamily-hc-56 (235748 bytes)
  INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) 
Couldn't detect any schema definitions in local storage.
  INFO [main] 2012-05-18 16:23:11,880 DatabaseDescriptor.java (line 493) 
Found table data in data directories. Consider using the CLI to define 
your schema.
  INFO [main] 2012-05-18 16:23:11,914 CommitLogSegment.java (line 60) 
Creating new commitlog segment 
/var/lib/cassandra/commitlog/CommitLog-1337358191914.log
  INFO [main] 2012-05-18 16:23:11,928 CommitLog.java (line 172) 
Replaying /var/lib/cassandra/commitlog/CommitLog-1337356780099.log
  INFO [main] 2012-05-18 16:23:15,237 CommitLog.java (line 355) Finished 
reading /var/lib/cassandra/commitlog/CommitLog-1337356780099.log
  INFO [main] 2012-05-18 16:23:15,238 CommitLog.java (line 360) Skipped 
77886 mutations from unknown (probably removed) CF with id 1016
  INFO [main] 2012-05-18 16:23:15,238 CommitLog.java (line 360) Skipped 
37804 mutations from unknown (probably removed) CF with id 1008
  INFO [main] 2012-05-18 16:23:15,238 CommitLog.java (line 360) Skipped 
3 mutations from unknown (probably removed) CF with id 1024
  INFO [main] 2012-05-18 16:23:15,239 CommitLog.java (line 360) Skipped 
67759 mutations from unknown (probably removed) CF with id 1009
  INFO [main] 2012-05-18 16:23:15,239 CommitLog.java (line 360) Skipped 
51268 mutations from unknown (probably removed) CF with id 1010
  INFO [main] 2012-05-18 16:23:15,239 CommitLog.java (line 360) Skipped 
59319 mutations from unknown (probably removed) CF with id 1011
  INFO [main] 2012-05-18 16:23:15,239 CommitLog.java (line 360) Skipped 
61110 mutations from unknown (probably removed) CF with id 1012
  INFO [main] 2012-05-18 16:23:15,239 CommitLog.java (line 360) Skipped 
182908 mutations from unknown (probably removed) CF with id 1013
  INFO [main] 2012-05-18 16:23:15,240 CommitLog.java (line 360) Skipped 
10038 mutations from unknown (probably removed) CF with id 1014
  INFO [main] 2012-05-18 16:23:15,240 CommitLog.java (line 360) Skipped 
14783 mutations from unknown (probably removed) CF with id 1001
  INFO [main] 2012-05-18 16:23:15,240 CommitLog.java (line 360) Skipped 
4256 mutations from unknown (probably removed) CF with id 1000
  INFO [main] 2012-05-18 16:23:15,240 CommitLog.java (line 360) Skipped 
746 mutations from unknown (probably removed) CF with id 1003
  INFO [main] 2012-05-18 16:23:15,241 CommitLog.java (line 360) Skipped 
3013 mutations from unknown (probably removed) CF with id 1002
  INFO [main] 2012-05-18 16:23:15,241 CommitLog.java (line 179) Log 
replay complete, 0 replayed mutations
  INFO [main] 2012-05-18 16:23:15,259 StorageService.java (line 400) 
Cassandra version: 1.0.9
  INFO [main] 2012-05-18 16:23:15,260 StorageService.java (line 401) 
Thrift API version: 19.20.0
  INFO [main] 2012-05-18 16:23:15,260 StorageService.java (line 414) 
Loading persisted ring state
  INFO [main] 2012-05-18 16:23:15,288 StorageService.java (line 494) 
Starting up server gossip
  INFO [main] 2012-05-18 16:23:15,292 ColumnFamilyStore.java (line 705) 
Enqueuing flush of Memtable-LocationInfo@189237805(29/36 serialized/live 
bytes, 1 ops)
  INFO [FlushWriter:1] 2012-05-18 16:23:15,293 Memtable.java (line 246) 
Writing Memtable-LocationInfo@189237805(29/36 serialized/live bytes, 1 ops)
  INFO [FlushWriter:1] 2012-05-18 16:23:15,350 Memtable.java (line 283) 
Completed flushing 
/var/lib/cassandra/data/system/LocationInfo-hc-36-Data.db (80 bytes)
  INFO [main] 2012-05-18 16:23:15,365 Ec2Snitch.java (line 113) 
Ec2Snitch adding ApplicationState ec2region=eu-west ec2zone=1c
  INFO [main] 2012-05-18 16:23:15,379 MessagingService.java (line 269) 
Starting Messaging Service on port 7000
  INFO [main] 2012-05-18 16:23:15,399 StorageService.java (line 609) 
Using saved token 141784319550391026443072753096570088106
  INFO [main] 2012-05-18 16:23:15,403 ColumnFamilyStore.java (line 705) 
Enqueuing flush of Memtable-LocationInfo@559001765(53/66 serialized/live 
bytes, 2 ops)
  INFO [FlushWriter:1] 2012-05-18 16:23:15,404 Memtable.java (line 246) 
Writing Memtable-LocationInfo@559001765(53/66 serialized/live bytes, 2 ops)
  INFO [FlushWriter:1] 2012-05-18 16:23:15,451 Memtable.java (line 283) 
Completed flushing 
/var/lib/cassandra/data/system/LocationInfo-hc-37-Data.db (163 bytes)
  INFO [main] 2012-05-18 16:23:15,454 StorageService.java (line 989) 
Node dsc2c.internal/10.58.249.118 state jump to normal
  INFO [CompactionExecutor:1] 2012-05-18 16:23:15,461 
CompactionTask.java (line 115) Compacting 
[SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-hc-37-Data.db'), 
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-hc-35-Data.db'), 
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-hc-34-Data.db'), 
SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo-hc-36-Data.db')]
  INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
Bootstrap/Replace/Move completed! Now serving reads.
  INFO [CompactionExecutor:1] 2012-05-18 16:23:15,507 
CompactionTask.java (line 223) Compacted to 
[/var/lib/cassandra/data/system/LocationInfo-hc-38-Data.db,].  890 to 
558 (~62% of original) bytes for 4 keys at 0.015651MB/s.  Time: 34ms.
ERROR [MutationStage:1] 2012-05-18 16:23:15,664 
RowMutationVerbHandler.java (line 61) Error in row mutation
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
cfId=1012
         at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
         at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:401)
         at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:409)
         at 
org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:357)
         at 
org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
         at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
ERROR [MutationStage:2] 2012-05-18 16:23:15,666 
RowMutationVerbHandler.java (line 61) Error in row mutation
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
cfId=1010
         at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
         at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:401)
         at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:409)
         at 
org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:357)
         at 
org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
         at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
ERROR [ReadStage:1] 2012-05-18 16:23:15,668 AbstractCassandraDaemon.java 
(line 139) Fatal exception in thread Thread[ReadStage:1,5,main]
java.lang.IllegalArgumentException: Unknown ColumnFamily UserData in 
keyspace PRODUCTION
         at 
org.apache.cassandra.config.Schema.getComparator(Schema.java:223)
         at 
org.apache.cassandra.db.ColumnFamily.getComparatorFor(ColumnFamily.java:320)
         at 
org.apache.cassandra.db.ReadCommand.getComparator(ReadCommand.java:92)
         at 
org.apache.cassandra.db.SliceByNamesReadCommand.<init>(SliceByNamesReadCommand.java:44)
         at 
org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:106)
         at 
org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:74)
         at 
org.apache.cassandra.db.ReadCommandSerializer.deserialize(ReadCommand.java:132)
         at 
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:51)
         at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
ERROR [ReadStage:3] 2012-05-18 16:23:15,669 AbstractCassandraDaemon.java 
(line 139) Fatal exception in thread Thread[ReadStage:3,5,main]
java.lang.IllegalArgumentException: Unknown ColumnFamily 
UserFraudParameters in keyspace PRODUCTION
         at 
org.apache.cassandra.config.Schema.getComparator(Schema.java:223)
         at 
org.apache.cassandra.db.ColumnFamily.getComparatorFor(ColumnFamily.java:320)
         at 
org.apache.cassandra.db.ReadCommand.getComparator(ReadCommand.java:92)
         at 
org.apache.cassandra.db.SliceByNamesReadCommand.<init>(SliceByNamesReadCommand.java:44)
         at 
org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:106)
         at 
org.apache.cassandra.db.SliceByNamesReadCommandSerializer.deserialize(SliceByNamesReadCommand.java:74)
         at 
org.apache.cassandra.db.ReadCommandSerializer.deserialize(ReadCommand.java:132)
         at 
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:51)
         at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
         at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:662)
...
and so endlessly on


Thanks
Alex




Mime
View raw message