incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Can't replace dead node
Date Mon, 18 Mar 2013 17:27:23 GMT
If a node is a seed node, it will not bootstrap data from other nodes the first time it starts.

You can always run a nodetool repair when you think data is not where it should be. 
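
One way to check the first point is to compare the node's own address with the seed list configured in cassandra.yaml. The sketch below is only an illustration: `is_seed` is a hypothetical helper, not part of Cassandra, and it assumes bash and a comma-separated seed string like the one under `seed_provider`. The seed list in the example is made up; only the node address comes from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cassandra): succeeds if ADDR appears in a
# comma-separated seed list such as the "seeds" value in cassandra.yaml.
is_seed() {
  local addr="$1"
  local seeds="${2// /}"        # drop spaces around the commas
  case ",${seeds}," in
    *",${addr},"*) return 0 ;;  # exact match on a whole list entry
    *)             return 1 ;;
  esac
}

# Node address from this thread; the seed list is an assumption for the example.
if is_seed "10.147.174.27" "10.28.241.14, 10.240.119.230"; then
  echo "node is a seed: it will not bootstrap on first start"
else
  echo "node is not a seed: bootstrap or replace can stream data"
fi
```

If the check reports a seed, running nodetool repair afterwards is the fallback mentioned above.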

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/03/2013, at 1:16 PM, Andrey Ilinykh <ailinykh@gmail.com> wrote:

> More info.
> I see this problem only if the rest of the cluster runs 1.1.5 and I try to replace the dead node with version 1.1.7 or higher. If I upgrade the rest of the cluster to the same version (I tried 1.1.10), everything is OK. I guess there is some incompatibility between 1.1.5 and 1.1.7 and higher.
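
Since the symptom depends on mixed versions, it may help to confirm that every node reports the same release before replacing one. A minimal sketch, assuming bash and that the version strings have already been collected by hand (for example from `nodetool version` on each host, whose exact output format may vary by release); `all_same_version` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if every argument is the same version string.
all_same_version() {
  local first="$1" v
  for v in "$@"; do
    [ "$v" = "$first" ] || return 1
  done
  return 0
}

# Versions mentioned in this thread: old nodes on 1.1.5, replacement on 1.1.10.
if all_same_version "1.1.5" "1.1.5" "1.1.10"; then
  echo "cluster versions match"
else
  echo "mixed versions: replacing a node may not stream data"
fi
```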
> 
> Thank you,
>   Andrey
> 
> 
> On Fri, Mar 15, 2013 at 11:39 AM, Andrey Ilinykh <ailinykh@gmail.com> wrote:
> I removed Priam and got the same result.
> 
> 
> What I did: I added two lines to cassandra-env.sh and started Cassandra.
> 
> JVM_OPTS="$JVM_OPTS -Dcassandra.initial_token=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaba"
> JVM_OPTS="$JVM_OPTS -Dcassandra.replace_token=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaba"
> 
> Then I can successfully run the ring command:
> 
> 
> Note: Ownership information does not include topology, please specify a keyspace.
> Address         DC       Rack  Status  State   Load       Owns    Token
>                                                                   Token(bytes[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaba])
> 10.28.241.14    us-east  1a    Up      Normal  251.96 GB  33.33%  Token(bytes[00000000000000000000000000000010])
> 10.240.119.230  us-east  1b    Up      Normal  252.48 GB  33.33%  Token(bytes[55555555555555555555555555555565])
> 10.147.174.27   us-east  1c    Up      Normal  11.26 KB   33.33%  Token(bytes[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaba])
> 
> It shows the current node as part of the ring, but it is empty. In the data directory I can see only the system keyspace.
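
The empty node stands out in the ring output because its load is in KB while its peers hold about 250 GB. A rough sketch of spotting that automatically, assuming bash plus awk and ring rows laid out as above (address, DC, rack, status, state, load value, load unit, ...); `find_empty_nodes` is a hypothetical helper and the 1% threshold is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "nodetool ring"-style rows on stdin and prints the
# address of every node whose load is under 1% of the largest load seen.
find_empty_nodes() {
  awk '
    BEGIN {
      # Assumed load units; adjust if the release prints others.
      mult["bytes"] = 1; mult["KB"] = 1024; mult["MB"] = 1024^2
      mult["GB"] = 1024^3; mult["TB"] = 1024^4
    }
    # Data rows start with an IPv4 address; fields 6 and 7 are load value and unit.
    $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && ($7 in mult) {
      load[$1] = $6 * mult[$7]
      if (load[$1] > max) max = load[$1]
    }
    END { for (a in load) if (load[a] < max / 100) print a }
  '
}

# Rows taken from the ring output in this thread (token column omitted).
find_empty_nodes <<'EOF'
10.28.241.14    us-east  1a  Up  Normal  251.96 GB  33.33%
10.240.119.230  us-east  1b  Up  Normal  252.48 GB  33.33%
10.147.174.27   us-east  1c  Up  Normal  11.26 KB   33.33%
EOF
```

For the rows above this prints 10.147.174.27, the replacement node.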
> 
> There are no errors in the log file. It just doesn't stream data from other nodes.
> I can launch 1.1.6 but not 1.1.7 or higher. Any idea what changed in 1.1.7?
> 
> Thank you,
>   Andrey
> 
> 
> 
> INFO [main] 2013-03-15 18:20:45,303 AbstractCassandraDaemon.java (line 101) Logging initialized
>  INFO [main] 2013-03-15 18:20:45,309 AbstractCassandraDaemon.java (line 122) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_35
>  INFO [main] 2013-03-15 18:20:45,310 AbstractCassandraDaemon.java (line 123) Heap size: 1931476992/1931476992
>  INFO [main] 2013-03-15 18:20:45,311 AbstractCassandraDaemon.java (line 124) Classpath: /opt/apache-cassandra-1.1.10/bin/../conf:/opt/apache-cassandra-1.1.10/bin/../build/classes/main:/opt/apache-cassandra-1.1.10/bin/../build/classes/thrift:/opt/apache-cassandra-1.1.10/bin/../lib/antlr-3.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/apache-cassandra-1.1.10.jar:/opt/apache-cassandra-1.1.10/bin/../lib/apache-cassandra-clientutil-1.1.10.jar:/opt/apache-cassandra-1.1.10/bin/../lib/apache-cassandra-thrift-1.1.10.jar:/opt/apache-cassandra-1.1.10/bin/../lib/avro-1.4.0-fixes.jar:/opt/apache-cassandra-1.1.10/bin/../lib/avro-1.4.0-sources-fixes.jar:/opt/apache-cassandra-1.1.10/bin/../lib/commons-cli-1.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/commons-codec-1.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/commons-lang-2.4.jar:/opt/apache-cassandra-1.1.10/bin/../lib/compress-lzf-0.8.4.jar:/opt/apache-cassandra-1.1.10/bin/../lib/concurrentlinkedhashmap-lru-1.3.jar:/opt/apache-cassandra-1.1.10/bin/../lib/guava-r08.jar:/opt/apache-cassandra-1.1.10/bin/../lib/high-scale-lib-1.1.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jackson-core-asl-1.9.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jackson-mapper-asl-1.9.2.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jamm-0.2.5.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jline-0.9.94.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jna.jar:/opt/apache-cassandra-1.1.10/bin/../lib/json-simple-1.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/libthrift-0.7.0.jar:/opt/apache-cassandra-1.1.10/bin/../lib/log4j-1.2.16.jar:/opt/apache-cassandra-1.1.10/bin/../lib/metrics-core-2.0.3.jar:/opt/apache-cassandra-1.1.10/bin/../lib/priam.jar:/opt/apache-cassandra-1.1.10/bin/../lib/servlet-api-2.5-20081211.jar:/opt/apache-cassandra-1.1.10/bin/../lib/slf4j-api-1.6.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/snakeyaml-1.6.jar:/opt/apache-cassandra-1.1.10/bin/../lib/snappy-java-1.0.4.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/snaptree-0.1.jar:/opt/apache-cassandra-1.1.10/bin/../lib/jamm-0.2.5.jar
>  INFO [main] 2013-03-15 18:20:47,406 CLibrary.java (line 111) JNA mlockall successful
>  INFO [main] 2013-03-15 18:20:47,419 DatabaseDescriptor.java (line 123) Loading settings from file:/opt/apache-cassandra-1.1.10/conf/cassandra.yaml
>  INFO [main] 2013-03-15 18:20:47,840 DatabaseDescriptor.java (line 182) DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
>  INFO [main] 2013-03-15 18:20:47,853 DatabaseDescriptor.java (line 246) Global memtable threshold is enabled at 614MB
>  INFO [main] 2013-03-15 18:20:47,879 Ec2Snitch.java (line 66) EC2Snitch using region: us-east, zone: 1c.
>  INFO [main] 2013-03-15 18:20:48,359 CacheService.java (line 96) Initializing key cache with capacity of 92 MBs.
>  INFO [main] 2013-03-15 18:20:48,376 CacheService.java (line 107) Scheduling key cache save to each 14400 seconds (going to save all keys).
>  INFO [main] 2013-03-15 18:20:48,377 CacheService.java (line 121) Initializing row cache with capacity of 0 MBs and provider org.apache.cassandra.cache.SerializingCacheProvider
>  INFO [main] 2013-03-15 18:20:48,384 CacheService.java (line 133) Scheduling row cache save to each 0 seconds (going to save all keys).
>  INFO [main] 2013-03-15 18:20:48,661 DatabaseDescriptor.java (line 509) Couldn't detect any schema definitions in local storage.
>  INFO [main] 2013-03-15 18:20:48,676 DatabaseDescriptor.java (line 512) Found table data in data directories. Consider using the CLI to define your schema.
>  INFO [main] 2013-03-15 18:20:48,716 CommitLog.java (line 124) No commitlog files found; skipping replay
>  INFO [main] 2013-03-15 18:20:48,740 StorageService.java (line 433) Cassandra version: 1.1.10
>  INFO [main] 2013-03-15 18:20:48,740 StorageService.java (line 434) Thrift API version: 19.33.0
>  INFO [main] 2013-03-15 18:20:48,743 StorageService.java (line 435) CQL supported versions: 2.0.0,3.0.0-beta1 (default: 2.0.0)
>  INFO [main] 2013-03-15 18:20:48,793 StorageService.java (line 465) Loading persisted ring state
>  INFO [main] 2013-03-15 18:20:48,796 StorageService.java (line 546) Starting up server gossip
>  INFO [main] 2013-03-15 18:20:48,809 ColumnFamilyStore.java (line 679) Enqueuing flush of Memtable-LocationInfo@523908271(124/155 serialized/live bytes, 3 ops)
>  INFO [FlushWriter:1] 2013-03-15 18:20:48,810 Memtable.java (line 264) Writing Memtable-LocationInfo@523908271(124/155 serialized/live bytes, 3 ops)
>  INFO [FlushWriter:1] 2013-03-15 18:20:48,858 Memtable.java (line 305) Completed flushing /var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-1-Data.db (232 bytes) for commitlog position ReplayPosition(segmentId=1363371648611, position=585)
>  INFO [main] 2013-03-15 18:20:48,873 Ec2Snitch.java (line 116) Ec2Snitch adding ApplicationState ec2region=us-east ec2zone=1c
>  INFO [main] 2013-03-15 18:20:48,882 MessagingService.java (line 284) Starting Messaging Service on port 7000
>  INFO [main] 2013-03-15 18:20:48,890 StorageService.java (line 788) JOINING: waiting for ring information
>  INFO [GossipStage:1] 2013-03-15 18:20:49,001 Gossiper.java (line 851) Node /10.240.119.230 is now part of the cluster
>  INFO [GossipStage:1] 2013-03-15 18:20:49,002 Gossiper.java (line 817) InetAddress /10.240.119.230 is now UP
>  INFO [GossipStage:1] 2013-03-15 18:20:49,004 ColumnFamilyStore.java (line 679) Enqueuing flush of Memtable-LocationInfo@64338076(52/65 serialized/live bytes, 2 ops)
>  INFO [FlushWriter:1] 2013-03-15 18:20:49,005 Memtable.java (line 264) Writing Memtable-LocationInfo@64338076(52/65 serialized/live bytes, 2 ops)
>  INFO [FlushWriter:1] 2013-03-15 18:20:49,015 Memtable.java (line 305) Completed flushing /var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-2-Data.db (165 bytes) for commitlog position ReplayPosition(segmentId=1363371648611, position=768)
>  INFO [GossipStage:1] 2013-03-15 18:20:49,019 Gossiper.java (line 851) Node /10.28.241.14 is now part of the cluster
>  INFO [GossipStage:1] 2013-03-15 18:20:49,019 Gossiper.java (line 817) InetAddress /10.28.241.14 is now UP
>  INFO [GossipStage:1] 2013-03-15 18:20:49,021 ColumnFamilyStore.java (line 679) Enqueuing flush of Memtable-LocationInfo@160194151(35/43 serialized/live bytes, 1 ops)
>  INFO [FlushWriter:1] 2013-03-15 18:20:49,022 Memtable.java (line 264) Writing Memtable-LocationInfo@160194151(35/43 serialized/live bytes, 1 ops)
>  INFO [FlushWriter:1] 2013-03-15 18:20:49,031 Memtable.java (line 305) Completed flushing /var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-3-Data.db (89 bytes) for commitlog position ReplayPosition(segmentId=1363371648611, position=866)
>  INFO [GossipStage:1] 2013-03-15 18:20:49,034 Gossiper.java (line 831) InetAddress /10.194.211.64 is now dead.
>  INFO [main] 2013-03-15 18:21:18,893 StorageService.java (line 788) JOINING: schema complete
> INFO [main] 2013-03-15 18:21:18,894 StorageService.java (line 788) JOINING: waiting for pending range calculation
>  INFO [main] 2013-03-15 18:21:18,894 StorageService.java (line 788) JOINING: calculation complete, ready to bootstrap
>  INFO [main] 2013-03-15 18:22:18,895 StorageService.java (line 788) JOINING: Replacing a node with token: Token(bytes[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaba])
>  INFO [main] 2013-03-15 18:22:18,896 ColumnFamilyStore.java (line 679) Enqueuing flush of Memtable-LocationInfo@662811604(36/45 serialized/live bytes, 1 ops)
>  INFO [FlushWriter:2] 2013-03-15 18:22:18,897 Memtable.java (line 264) Writing Memtable-LocationInfo@662811604(36/45 serialized/live bytes, 1 ops)
>  INFO [FlushWriter:2] 2013-03-15 18:22:18,909 Memtable.java (line 305) Completed flushing /var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-4-Data.db (87 bytes) for commitlog position ReplayPosition(segmentId=1363371648611, position=962)
>  INFO [main] 2013-03-15 18:22:18,911 StorageService.java (line 788) JOINING: Starting to bootstrap...
>  INFO [CompactionExecutor:3] 2013-03-15 18:22:18,929 CompactionTask.java (line 107) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-3-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-4-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-2-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-1-Data.db')]
>  INFO [main] 2013-03-15 18:22:18,930 ColumnFamilyStore.java (line 679) Enqueuing flush of Memtable-LocationInfo@744919514(53/66 serialized/live bytes, 2 ops)
>  INFO [FlushWriter:2] 2013-03-15 18:22:18,931 Memtable.java (line 264) Writing Memtable-LocationInfo@744919514(53/66 serialized/live bytes, 2 ops)
>  INFO [FlushWriter:2] 2013-03-15 18:22:18,944 Memtable.java (line 305) Completed flushing /var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-5-Data.db (163 bytes) for commitlog position ReplayPosition(segmentId=1363371648611, position=1143)
>  INFO [main] 2013-03-15 18:22:18,969 StorageService.java (line 1133) Node ip-10-147-174-27.ec2.internal/10.147.174.27 state jump to normal
>  INFO [main] 2013-03-15 18:22:18,969 StorageService.java (line 701) Bootstrap/Replace/Move completed! Now serving reads.
>  INFO [main] 2013-03-15 18:22:19,011 CassandraDaemon.java (line 125) Binding thrift service to ip-10-147-174-27.ec2.internal/10.147.174.27:9160
>  INFO [main] 2013-03-15 18:22:19,015 CassandraDaemon.java (line 134) Using TFastFramedTransport with a max frame size of 15728640 bytes.
>  INFO [main] 2013-03-15 18:22:19,019 CassandraDaemon.java (line 161) Using synchronous/threadpool thrift server on ip-10-147-174-27.ec2.internal/10.147.174.27 : 9160
>  INFO [Thread-6] 2013-03-15 18:22:19,020 CassandraDaemon.java (line 213) Listening for thrift clients...
>  INFO [CompactionExecutor:3] 2013-03-15 18:22:19,031 CompactionTask.java (line 230) Compacted to [/var/lib/cassandra/data/system/LocationInfo/system-LocationInfo-hf-6-Data.db,].  573 to 468 (~81% of original) bytes for 4 keys at 0.004649MB/s.  Time: 96ms.
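
The timestamps in this log already point at the problem: "Starting to bootstrap" is logged at 18:22:18,911 and "Bootstrap/Replace/Move completed" at 18:22:18,969, which is 58 ms apart and far too short to stream roughly 250 GB. A small sketch of measuring that gap, assuming bash plus awk and unwrapped log lines in the layout shown above (level, thread, date, time, ...); `bootstrap_millis` is a hypothetical helper and it ignores runs that cross midnight.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a Cassandra system log on stdin and prints the
# milliseconds between the start of bootstrap and its completion message.
bootstrap_millis() {
  awk '
    # Convert "HH:MM:SS,mmm" to milliseconds since midnight.
    function to_ms(t,   p) {
      split(t, p, /[:,]/)
      return ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4]
    }
    /JOINING: Starting to bootstrap/             { start = to_ms($4); seen = 1 }
    /Bootstrap\/Replace\/Move completed/ && seen { print to_ms($4) - start; exit }
  '
}

# The two relevant lines from the log in this thread.
bootstrap_millis <<'EOF'
 INFO [main] 2013-03-15 18:22:18,911 StorageService.java (line 788) JOINING: Starting to bootstrap...
 INFO [main] 2013-03-15 18:22:18,969 StorageService.java (line 701) Bootstrap/Replace/Move completed! Now serving reads.
EOF
```

A result of a few milliseconds on a node that should own a third of the data suggests that nothing was streamed.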
> 
> 
> 
> On Fri, Mar 8, 2013 at 8:22 AM, aaron morton <aaron@thelastpickle.com> wrote:
> If it does not have the schema, check the logs for errors and ensure it is actually part of the cluster.
> 
> You may have better luck with Priam specific questions on https://github.com/Netflix/Priam
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 7/03/2013, at 11:11 AM, Andrey Ilinykh <ailinykh@gmail.com> wrote:
> 
>> Hello everybody!
>> 
>> I used to run Cassandra 1.1.5 with Priam. To replace a dead node, Priam launches Cassandra with the cassandra.replace_token property. It worked smoothly with 1.1.5. A couple of days ago I moved to 1.1.10 and now have a problem. The new Cassandra node starts successfully and joins the ring, but it doesn't see my keyspaces. It doesn't try to stream data from other nodes. I see only the system keyspace. Any idea what the difference is between 1.1.5 and 1.1.10? How am I supposed to replace a dead node?
>> 
>> Thank you,
>>    Andrey 
> 
> 
> 

