incubator-cassandra-user mailing list archives

From adeel.ak...@panasiangroup.com
Subject Cassandra services down frequently [Version 1.1.4]
Date Thu, 04 Apr 2013 08:27:42 GMT
Hi,

We are running a 4-node Cassandra cluster (1.1.4) with replication
factor 2 (DC1) and replication factor 1 (DC2) across two different data
centers, using NetworkTopologyStrategy. Each machine has 16 GB RAM,
8 cores, and two hard drives.

# /opt/apache-cassandra-1.1.4/bin/nodetool -h localhost ring
Address      DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                       169417178424467235000914166253263322299
10.0.0.3     DC1  RAC1  Up      Normal  91.93 GB  66.67%               0
10.0.0.4     DC1  RAC1  Up      Normal  84.88 GB  66.67%               56713727820156410577229101238628035242
10.0.0.15    DC1  RAC1  Up      Normal  82.51 GB  66.67%               113427455640312821154458202477256070484
10.40.1.103  DC2  RAC1  Up      Normal  303.2 MB  100.00%              169417178424467235000914166253263322299
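
For reference, the three DC1 tokens in the ring output above match an even
split of the RandomPartitioner token space (0 to 2**127) across three nodes.
The following is a minimal Python sketch of that arithmetic; the function
name balanced_tokens is only illustrative and not part of any Cassandra
tool. The DC2 token is not derived here.

```python
# Evenly spaced RandomPartitioner tokens (illustrative helper, not a Cassandra API).
# RandomPartitioner tokens range from 0 to 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced initial_token values for node_count nodes."""
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]


# Three nodes in DC1, as in the ring output.
dc1_tokens = balanced_tokens(3)
for token in dc1_tokens:
    print(token)
```

The printed values are 0, 56713727820156410577229101238628035242 and
113427455640312821154458202477256070484, the same tokens shown for
10.0.0.3, 10.0.0.4 and 10.0.0.15.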

# java -version
java version "1.6.0_43"
Java(TM) SE Runtime Environment (build 1.6.0_43-b01)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)

After some time (1 to 2 hours), Cassandra shuts down on one or two
nodes with the following errors:

============================================================
  INFO 11:01:25,527 GC for ConcurrentMarkSweep: 1968 ms for 2 collections, 3817667464 used; max is 4093640704
  INFO 11:01:42,838 GC for ConcurrentMarkSweep: 1828 ms for 2 collections, 3850830504 used; max is 4093640704
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid27363.hprof ...
Heap dump file created [4664912349 bytes in 44.731 secs]
ERROR 11:02:41,156 Exception in thread Thread[CompactionExecutor:87,1,main]
java.lang.OutOfMemoryError: Java heap space
         at org.apache.cassandra.io.util.FastByteArrayOutputStream.expand(FastByteArrayOutputStream.java:104)
         at org.apache.cassandra.io.util.FastByteArrayOutputStream.write(FastByteArrayOutputStream.java:220)
         at java.io.DataOutputStream.write(DataOutputStream.java:90)
         at org.apache.cassandra.io.util.DataOutputBuffer.write(DataOutputBuffer.java:61)
         at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
         at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
         at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:62)
         at org.apache.cassandra.db.SuperColumnSerializer.serialize(SuperColumn.java:366)
         at org.apache.cassandra.db.SuperColumnSerializer.serialize(SuperColumn.java:339)
         at org.apache.cassandra.db.ColumnFamilySerializer.serializeForSSTable(ColumnFamilySerializer.java:89)
         at org.apache.cassandra.db.compaction.PrecompactedRow.write(PrecompactedRow.java:138)
         at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:156)
         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
         at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:154)
         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
         at java.lang.Thread.run(Thread.java:662)
  INFO 11:02:41,373 Stop listening to thrift clients
  INFO 11:02:41,376 InetAddress /10.0.0.15 is now dead.
  INFO 11:02:41,376 InetAddress /10.0.0.3 is now dead.
  INFO 11:02:41,377 InetAddress /10.40.1.103 is now dead.
  INFO 11:02:41,397 InetAddress /10.0.0.3 is now UP
  INFO 11:02:41,397 InetAddress /10.0.0.15 is now UP
  INFO 11:02:41,398 InetAddress /10.40.1.103 is now UP
  INFO 11:02:41,398 Started hinted handoff for token: 0 with IP: /10.0.0.3
  INFO 11:02:41,450 Announcing shutdown
  INFO 11:02:48,184 GC for ConcurrentMarkSweep: 1887 ms for 2 collections, 2234362128 used; max is 4093640704
  INFO 11:02:48,206 Waiting for messaging service to quiesce
  INFO 11:02:48,207 MessagingService shutting down server thread.
============================================================
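
To put the GC lines above in perspective, here is a minimal Python sketch
that pulls the "used" and "max" byte counts out of each line and prints the
heap fill ratio. The regular expression and the helper name heap_usage are
assumptions made for illustration, based only on the log format shown in
this message.

```python
import re

# GC log lines copied from the output above (wrapped lines joined).
GC_LINES = [
    "INFO 11:01:25,527 GC for ConcurrentMarkSweep: 1968 ms for 2 collections, "
    "3817667464 used; max is 4093640704",
    "INFO 11:01:42,838 GC for ConcurrentMarkSweep: 1828 ms for 2 collections, "
    "3850830504 used; max is 4093640704",
    "INFO 11:02:48,184 GC for ConcurrentMarkSweep: 1887 ms for 2 collections, "
    "2234362128 used; max is 4093640704",
]

# Matches the "<used> used; max is <max>" tail of a GC log line.
GC_PATTERN = re.compile(r"(\d+) used; max is (\d+)")


def heap_usage(line):
    """Return used/max heap as a fraction, or None if the line does not match."""
    match = GC_PATTERN.search(line)
    if match is None:
        return None
    used, maximum = (int(group) for group in match.groups())
    return used / maximum


usages = [heap_usage(line) for line in GC_LINES]
for line, usage in zip(GC_LINES, usages):
    # The second whitespace-separated field is the timestamp.
    print(f"{line.split()[1]}  heap {usage:.1%} full")
```

On these three lines the heap is roughly 93% and 94% full just before the
OutOfMemoryError, and about 55% full at the last GC line, after shutdown
has started.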

Our cassandra.yaml configuration is as follows:

============================================================
cluster_name: 'ABC Cluster'
initial_token: 0
hinted_handoff_enabled: true
max_hint_window_in_ms: 2147483647 # one hour
hinted_handoff_throttle_delay_in_ms: 0
authenticator: org.apache.cassandra.auth.AllowAllAuthenticator
authority: org.apache.cassandra.auth.AllowAllAuthority
partitioner: org.apache.cassandra.dht.RandomPartitioner

data_file_directories:
     - /u/cassandra/data

commitlog_directory: /var/log/cassandra/commitlog
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
row_cache_provider: SerializingCacheProvider
saved_caches_directory: /var/log/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32

seed_provider:
           # Ex: "<ip1>,<ip2>,<ip3>"
           - seeds: "10.0.0.3,10.0.0.4"

flush_largest_memtables_at: 1.0
reduce_cache_sizes_at: 1.0
reduce_cache_capacity_to: 0.6
concurrent_reads: 8
concurrent_writes: 32
memtable_flush_queue_size: 4
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
listen_address: 10.0.0.3
rpc_address: 10.0.0.3
rpc_port: 9160
rpc_keepalive: true
rpc_server_type: sync
rpc_min_threads: 16
rpc_max_threads: 2147483647
thrift_framed_transport_size_in_mb: 15
thrift_max_message_length_in_mb: 16
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
in_memory_compaction_limit_in_mb: 256
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
compaction_preheat_key_cache: true
rpc_timeout_in_ms: 15000
phi_convict_threshold: 8
endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.0
request_scheduler: org.apache.cassandra.scheduler.NoScheduler
index_interval: 128
encryption_options:
     internode_encryption: none
     keystore: conf/.keystore
     keystore_password: cassandra
     truststore: conf/.truststore
     truststore_password: cassandra
============================================================
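
One detail in the configuration above: the inline comment on
max_hint_window_in_ms says "one hour", but the value is 2147483647. A small
Python sketch of the unit conversion, with variable names chosen only for
illustration:

```python
# Convert the configured max_hint_window_in_ms value to hours and days.
MS_PER_HOUR = 60 * 60 * 1000  # 3,600,000 ms in one hour

max_hint_window_in_ms = 2147483647  # value from the cassandra.yaml above

hint_window_hours = max_hint_window_in_ms / MS_PER_HOUR
hint_window_days = hint_window_hours / 24

print(f"{hint_window_hours:.1f} hours, about {hint_window_days:.1f} days")
print(f"one hour would be {MS_PER_HOUR} ms")
```

So the configured window is about 596.5 hours (roughly 24.9 days), while a
one-hour window would be 3600000 ms.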

Please help me fix this issue permanently so that the Cassandra nodes
run smoothly.

Regards,

Adeel Akbar
