Reduce these settings for the CF
row_cache (disable it)
key_cache (disable it)

Increase these settings for the CF
bloom_filter_fp_chance

Reduce these settings in cassandra.yaml

flush_largest_memtables_at
memtable_flush_queue_size
sliced_buffer_size_in_kb
in_memory_compaction_limit_in_mb
concurrent_compactors


Increase these settings in cassandra.yaml
index_interval


While it obviously depends on load, I would not be surprised if you had a lot of trouble running Cassandra with that setup.

Cheers
A


-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 6/03/2012, at 11:02 PM, Tamar Fraenkel wrote:

Aaron, thanks for your response. I was afraid this was the issue.
Can you give me some direction regarding the fine tuning of my VMs? I would like to explore that option some more.
Thanks!

Tamar Fraenkel 
Senior Software Engineer, TOK Media 


On Tue, Mar 6, 2012 at 11:58 AM, aaron morton <aaron@thelastpickle.com> wrote:
You do not have enough memory allocated to the JVM and are suffering from excessive GC as a result.

There are some tuning things you can try, but 480MB is not enough. 1GB would be a better start, 2 better than that. 
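
To put rough numbers on this, here is a small sketch that pulls the pause time and heap figures out of a GCInspector line like the ones quoted further down. The regular expression is only a guess at the format based on those log lines, and parse_gc_line is a hypothetical helper, not part of Cassandra.

```python
import re

# Assumed format, inferred from the GCInspector lines quoted in this thread.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

def parse_gc_line(line):
    """Return collector, pause and heap figures from one GCInspector line, or None."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    return {
        "collector": m.group("collector"),
        "pause_ms": int(m.group("pause_ms")),
        "used_mb": int(m.group("used")) / 2**20,  # bytes to MB
        "max_mb": int(m.group("max")) / 2**20,    # bytes to MB
    }

sample = (
    "INFO [ScheduledTasks:1] 2012-03-06 10:28:50,595 GCInspector.java (line 122) "
    "GC for Copy: 3927 ms for 1 collections, 156572576 used; max is 513802240"
)
info = parse_gc_line(sample)
print(info)
```

For that line the pause is 3927 ms and the maximum heap works out to 490 MB, which is roughly the figure mentioned above.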

Consider using https://github.com/pcmanus/ccm for testing multiple instances on a single server rather than a VM.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 6/03/2012, at 10:21 PM, Tamar Fraenkel wrote:

I have some more info: after a couple of hours running, the problematic node went to 100% CPU again and I had to reboot it. The last lines from the log show it was doing GC:

 INFO [ScheduledTasks:1] 2012-03-06 10:28:00,880 GCInspector.java (line 122) GC for Copy: 203 ms for 1 collections, 185983456 used; max is 513802240
 INFO [ScheduledTasks:1] 2012-03-06 10:28:50,595 GCInspector.java (line 122) GC for Copy: 3927 ms for 1 collections, 156572576 used; max is 513802240
 INFO [ScheduledTasks:1] 2012-03-06 10:28:55,434 StatusLogger.java (line 50) Pool Name                    Active   Pending   Blocked
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,298 StatusLogger.java (line 65) ReadStage                         2         2         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,499 StatusLogger.java (line 65) RequestResponseStage              0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReadRepairStage                   0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) MutationStage                     0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReplicateOnWriteStage             0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) GossipStage                       0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) AntiEntropyStage                  0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MigrationStage                    0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) StreamStage                       0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MemtablePostFlusher               0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) FlushWriter                       0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) MiscStage                         0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) InternalResponseStage             0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) HintedHandoff                     0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,553 StatusLogger.java (line 69) CompactionManager               n/a         0

Thanks,

Tamar Fraenkel 
Senior Software Engineer, TOK Media 


On Tue, Mar 6, 2012 at 9:12 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:
Works..

But during the night my setup encountered a problem.
I have two VMs in my cluster (running on VMware ESXi).
Each VM has 1 GB memory and two virtual disks of 16 GB.
They are running on a small server with 4 CPUs (2.66 GHz) and 4 GB memory (together with two other VMs).
I put the Cassandra data on the second disk of each machine.
The VMs are running Ubuntu 11.10 and Cassandra 1.0.7.

I left them running overnight, and this morning when I came in:
On one node Cassandra was down, and the last thing in the system.log is:

 INFO [CompactionExecutor:150] 2012-03-06 00:55:04,821 CompactionTask.java (line 113) Compacting [SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1243-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1245-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1242-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db')]
 INFO [CompactionExecutor:150] 2012-03-06 00:55:07,919 CompactionTask.java (line 221) Compacted to [/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1246-Data.db,].  32,424,771 to 26,447,685 (~81% of original) bytes for 58,938 keys at 8.144165MB/s.  Time: 3,097ms.


The other node was using all its CPU and I had to restart it.
After that, I can see that the last lines in its system.log say that the other node is down...

 INFO [FlushWriter:142] 2012-03-06 00:55:02,418 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@1365852701(1122169/25154556 serialized/live bytes, 21173 ops)
 INFO [FlushWriter:142] 2012-03-06 00:55:02,742 Memtable.java (line 283) Completed flushing /opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db (2075930 bytes)
 INFO [GossipTasks:1] 2012-03-06 08:02:18,584 Gossiper.java (line 818) InetAddress /10.0.0.31 is now dead.

How can I trace why that happened?
Also, I brought Cassandra up on both nodes. They both spent a long time reading commit logs, but now they seem to run.
Any idea how to debug or improve my setup?
Thanks,
Tamar



Tamar Fraenkel 
Senior Software Engineer, TOK Media 


On Mon, Mar 5, 2012 at 7:30 PM, aaron morton <aaron@thelastpickle.com> wrote:
Create nodes that do not share seeds, and give the clusters different names as a safety measure. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:

I want two separate clusters.
Tamar Fraenkel 
Senior Software Engineer, TOK Media 


On Mon, Mar 5, 2012 at 12:48 PM, aaron morton <aaron@thelastpickle.com> wrote:
Do you want to create two separate clusters or a single cluster with two data centres?

If it's the latter, token selection is discussed here http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
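
As a rough sketch of the arithmetic behind evenly spaced tokens, assuming the RandomPartitioner with a token space of 0 to 2**127 (even_tokens is a hypothetical helper, not the generator tool itself):

```python
RING_SIZE = 2**127  # assumed RandomPartitioner token space

def even_tokens(node_count):
    """Evenly spaced initial tokens for a ring with node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]

print(even_tokens(2))
# [0, 85070591730234615865843651857942052864]
```

The result depends only on the number of nodes, which is why any two-node ring ends up with the same pair of tokens.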
 
> Moreover all tokens must be unique (even across datacenters), although, out of pure curiosity, I wonder what the rationale behind this is.
Otherwise data is not evenly distributed.

> By the way, can someone enlighten me about the first line in the output of nodetool? Obviously it contains a token, but nothing else. It seems like a formatting glitch, but maybe it has a role.
It's the exclusive lower bound token for the first node in the ring. This also happens to be the token for the last node in the ring.

In your setup 
10.0.0.19 "owns" (85070591730234615865843651857942052864 + 1) to 0
10.0.0.28 "owns" (0 + 1) to 85070591730234615865843651857942052864

(does not imply primary replica, just used to map keys to nodes.)
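
The mapping above can be sketched roughly as follows, assuming each node owns the range from just after the previous node's token up to and including its own, wrapping around at the end of the ring. The owner function is a hypothetical illustration, not Cassandra code.

```python
import bisect

def owner(token, ring):
    """Return the address owning `token`.

    ring is a list of (token, address) pairs; each node owns
    (previous node's token, its own token], with wrap-around.
    """
    ring = sorted(ring)
    tokens = [t for t, _ in ring]
    # First node whose token is >= the given token; past the end wraps to the first node.
    i = bisect.bisect_left(tokens, token)
    return ring[i % len(ring)][1]

ring = [
    (0, "10.0.0.19"),
    (85070591730234615865843651857942052864, "10.0.0.28"),
]
print(owner(1, ring))                                           # 10.0.0.28
print(owner(85070591730234615865843651857942052864 + 1, ring))  # 10.0.0.19
```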
 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:

You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a multi-datacenter setup with two circles. You can start reading from this page:
http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy

Moreover all tokens must be unique (even across datacenters), although, out of pure curiosity, I wonder what the rationale behind this is.

By the way, can someone enlighten me about the first line in the output of nodetool? Obviously it contains a token, but nothing else. It seems like a formatting glitch, but maybe it has a role.

On 2012.03.05. 11:06, Tamar Fraenkel wrote:
Hi!
I have a Cassandra cluster with two nodes:

nodetool ring -h localhost
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               85070591730234615865843651857942052864
10.0.0.19       datacenter1 rack1       Up     Normal  488.74 KB       50.00%  0
10.0.0.28       datacenter1 rack1       Up     Normal  504.63 KB       50.00%  85070591730234615865843651857942052864

I want to create a second ring with the same name but two different nodes.
Using tokengentool I get the same tokens, since they are determined by the number of nodes in a ring.

My question is this:
Let's say I create two new VMs, with IPs 10.0.0.31 and 10.0.0.11.
In the cassandra.yaml on 10.0.0.31 I will set:
initial_token: 0
seeds: "10.0.0.31"
listen_address: 10.0.0.31
rpc_address: 0.0.0.0

In the cassandra.yaml on 10.0.0.11 I will set:
initial_token: 85070591730234615865843651857942052864
seeds: "10.0.0.31"
listen_address: 10.0.0.11
rpc_address: 0.0.0.0 

Would the rings be separate?

Thanks,

Tamar Fraenkel 
Senior Software Engineer, TOK Media 
