It is in the DEB package, and I guess it is also in the RPM.

On 2012.03.15. 12:02, Tamar Fraenkel wrote:
I don't see it in dsc-cassandra-1.0.7-bin.tar.gz.
Thanks
Tamar Fraenkel 
Senior Software Engineer, TOK Media 






2012/3/15 Hontvári József Levente <hontvari@flyordie.com>
You can copy the init.d script from the DataStax package.

On 2012.03.15. 11:06, Tamar Fraenkel wrote:
Yes, I am using the Cassandra community edition.
Re-installing will be a hassle...
Any idea how to just fix the daemon issue?
Thanks
Tamar Fraenkel 
Senior Software Engineer, TOK Media 






On Thu, Mar 15, 2012 at 11:56 AM, aaron morton <aaron@thelastpickle.com> wrote:
> I have a problem though. I installed cassandra following DataStax, and cassandra is not a daemon
Are you using Cassandra from DataStax Community?

This will give you a nice install: http://wiki.apache.org/cassandra/DebianPackaging

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 15/03/2012, at 10:38 PM, Tamar Fraenkel wrote:

Thanks for your prompt response.
I have a problem though. I installed Cassandra following the DataStax instructions, and Cassandra is not set up as a daemon, i.e. I have to start it manually, and I don't have an /etc/init.d/cassandra script.

For this reason, I need to restart it manually after rebooting my VM, and it is not part of the rc...

Any good pointers on how to set up Cassandra as a daemon?

Thanks,

Tamar Fraenkel 
Senior Software Engineer, TOK Media 

On Thu, Mar 15, 2012 at 11:34 AM, aaron morton <aaron@thelastpickle.com> wrote:
> 1. How can I prevent this? I guess my setup is limited, and this may happen, but is there a way to improve things.
Not really, you need more memory on the box.

> 2. Assuming that I will run out of memory from time to time, how do I setup a monit \ god task to restart cassandra in case it does.
Here is a super simple config, to the point of not being very good, for /etc/monit/conf.d/cassandra.monitrc. The monit docs are pretty good.

check process cassandra with pidfile /var/run/cassandra.pid
  start program = "/etc/init.d/cassandra start"
  stop program = "/etc/init.d/cassandra stop"

You will also need to prevent the init.d script from starting Cassandra on its own; I used update-rc.d for that.

I'm not an ops guy; google is your friend; the monit docs are good.
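
To illustrate the kind of test a pid-file check such as the one above relies on, here is a minimal Python sketch. It is not how monit is implemented, only an illustration for POSIX systems; the function name pid_is_running is made up for this example, and the pid file path in the usage comment is the one from the config above.

```python
import os


def pid_is_running(pidfile):
    """Return True if the pid stored in `pidfile` belongs to a live process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable or malformed pid file: treat as not running.
        return False
    if pid <= 0:
        # Not a valid pid for a single process.
        return False
    try:
        # Signal 0 only performs the existence and permission checks;
        # no signal is actually delivered to the process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


# Hypothetical usage on a node:
#   pid_is_running("/var/run/cassandra.pid")
```

A supervisor would call the start command only when this returns False; monit adds polling intervals, timeouts and alerting on top of that basic idea.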

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 15/03/2012, at 8:44 PM, Tamar Fraenkel wrote:

I added a third node to the cluster. Sure enough, when I came in this morning only one node was up; on the other two the Cassandra process was not running.

In the cassandra log there is nothing, but in /var/log/syslog I see
In one node:
Mar 15 07:50:51 Cassandra3 kernel: [58566.666906] Out of memory: Kill process 2840 (java) score 383 or sacrifice child
Mar 15 07:50:51 Cassandra3 kernel: [58566.667066] Killed process 2840 (java) total-vm:956792kB, anon-rss:689752kB, file-rss:21680kB
And in the other:
Mar 14 18:36:02 Cassandra2 kernel: [16262.267300] Out of memory: Kill process 2611 (java) score 409 or sacrifice child
Mar 14 18:36:02 Cassandra2 kernel: [16262.267325] Killed process 2611 (java) total-vm:968040kB, anon-rss:748644kB, file-rss:18436kB
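
A small Python sketch for pulling such kills out of a syslog file, in case it helps to track how often this happens. The regular expression is written against the exact line format shown above and may need adjusting for other kernels; the names OOM_KILL and find_oom_kills are made up for this example.

```python
import re

# Matches the "Killed process ..." line the kernel logs after an OOM kill,
# in the format shown in the syslog excerpt above.
OOM_KILL = re.compile(
    r"^(?P<when>\w{3}\s+\d+ [\d:]{8}) (?P<host>\S+) kernel: \[\s*[\d.]+\] "
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def find_oom_kills(lines):
    """Return one dict per OOM kill found in an iterable of syslog lines."""
    kills = []
    for line in lines:
        match = OOM_KILL.match(line)
        if match:
            kills.append({
                "when": match.group("when"),
                "host": match.group("host"),
                "pid": int(match.group("pid")),
                "name": match.group("name"),
                "total_vm_kb": int(match.group("total_vm")),
                "anon_rss_kb": int(match.group("anon_rss")),
            })
    return kills


# Hypothetical usage:
#   with open("/var/log/syslog") as f:
#       for kill in find_oom_kills(f):
#           print(kill)
```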

Two questions:
1. How can I prevent this? I guess my setup is limited, and this may happen, but is there a way to improve things?
2. Assuming that I will run out of memory from time to time, how do I set up a monit / god task to restart Cassandra when that happens?

Thanks,

Tamar Fraenkel 
Senior Software Engineer, TOK Media 

On Tue, Mar 13, 2012 at 11:12 AM, aaron morton <aaron@thelastpickle.com> wrote:
If you are on Ubuntu, it may be this: http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs

Otherwise I would look for GC problems.

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 13/03/2012, at 7:53 PM, Tamar Fraenkel wrote:

Done. Now it generally runs OK, until one of the nodes gets stuck at 100% CPU and I need to reboot it.

Last lines in the system.log just before are:
 INFO [OptionalTasks:1] 2012-03-13 07:36:43,850 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='tok', ColumnFamily='tk_vertical_tag_story_indx') (estimated 35417890 bytes)
 INFO [OptionalTasks:1] 2012-03-13 07:36:43,869 ColumnFamilyStore.java (line 704) Enqueuing flush of Memtable-tk_vertical_tag_story_indx@2002820169(1620316/35417890 serialized/live bytes, 30572 ops)
 INFO [FlushWriter:76] 2012-03-13 07:36:43,869 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@2002820169(1620316/35417890 serialized/live bytes, 30572 ops)
 INFO [FlushWriter:76] 2012-03-13 07:36:44,015 Memtable.java (line 283) Completed flushing /opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-191-Data.db (2134123 bytes)
 INFO [OptionalTasks:1] 2012-03-13 07:37:37,886 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='tok', ColumnFamily='tk_vertical_tag_story_indx') (estimated 34389135 bytes)
 INFO [OptionalTasks:1] 2012-03-13 07:37:37,887 ColumnFamilyStore.java (line 704) Enqueuing flush of Memtable-tk_vertical_tag_story_indx@1869953681(1573252/34389135 serialized/live bytes, 29684 ops)
 INFO [FlushWriter:76] 2012-03-13 07:37:37,887 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@1869953681(1573252/34389135 serialized/live bytes, 29684 ops)
 INFO [FlushWrit

Any idea?
I am considering adding a third node, so that a replication factor of 2 won't stall my system when one node goes down. Does that make sense?

Thanks


Tamar Fraenkel 
Senior Software Engineer, TOK Media 

On Tue, Mar 6, 2012 at 7:51 PM, aaron morton <aaron@thelastpickle.com> wrote:
Reduce these settings for the CF
row_cache (disable it)
key_cache (disable it)

Increase these settings for the CF
bloom_filter_fp_chance

Reduce these settings in cassandra.yaml

flush_largest_memtables_at
memtable_flush_queue_size
sliced_buffer_size_in_kb
in_memory_compaction_limit_in_mb
concurrent_compactors


Increase these settings 
index_interval


While it obviously depends on load, I would not be surprised if you had a lot of trouble running cassandra with that setup. 

Cheers
A


-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 6/03/2012, at 11:02 PM, Tamar Fraenkel wrote:

Aaron, thanks for your response. I was afraid this was the issue.
Can you give me some direction on fine-tuning my VMs? I would like to explore that option some more.
Thanks!

Tamar Fraenkel 
Senior Software Engineer, TOK Media 

On Tue, Mar 6, 2012 at 11:58 AM, aaron morton <aaron@thelastpickle.com> wrote:
You do not have enough memory allocated to the JVM and are suffering from excessive GC as a result.

There are some tuning things you can try, but 480MB is not enough. 1GB would be a better start, 2 better than that. 
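
For reference, the used and max figures in the GCInspector lines quoted further down are byte counts. A minimal Python sketch for turning one of those lines into something easier to read; the regular expression is written against the 1.0.7 log format shown in this thread, and the names GC_LINE and parse_gc_line are made up for this example.

```python
import re

# Matches the payload of a GCInspector log line as quoted in this thread.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

MIB = 2 ** 20


def parse_gc_line(line):
    """Return pause time and heap figures from a GCInspector line, or None."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    used = int(match.group("used"))
    heap_max = int(match.group("max"))
    return {
        "collector": match.group("collector"),
        "pause_ms": int(match.group("ms")),
        "collections": int(match.group("count")),
        "used_mib": used / MIB,
        "max_mib": heap_max / MIB,
        "used_fraction": used / heap_max,
    }
```

Applied to the 10:28:50 line, this reports a 3927 ms pause and a maximum heap of 490 MiB.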

Consider using https://github.com/pcmanus/ccm for testing multiple instances on a single server rather than a VM.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 6/03/2012, at 10:21 PM, Tamar Fraenkel wrote:

I have some more info. After a couple of hours of running, the problematic node went to 100% CPU again and I had to reboot it. The last lines from the log show it did GC:

 INFO [ScheduledTasks:1] 2012-03-06 10:28:00,880 GCInspector.java (line 122) GC for Copy: 203 ms for 1 collections, 185983456 used; max is 513802240
 INFO [ScheduledTasks:1] 2012-03-06 10:28:50,595 GCInspector.java (line 122) GC for Copy: 3927 ms for 1 collections, 156572576 used; max is 513802240
 INFO [ScheduledTasks:1] 2012-03-06 10:28:55,434 StatusLogger.java (line 50) Pool Name                    Active   Pending   Blocked
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,298 StatusLogger.java (line 65) ReadStage                         2         2         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,499 StatusLogger.java (line 65) RequestResponseStage              0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReadRepairStage                   0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) MutationStage                     0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReplicateOnWriteStage             0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) GossipStage                       0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) AntiEntropyStage                  0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MigrationStage                    0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) StreamStage                       0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MemtablePostFlusher               0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) FlushWriter                       0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) MiscStage                         0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) InternalResponseStage             0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) HintedHandoff                     0         0         0
 INFO [ScheduledTasks:1] 2012-03-06 10:29:03,553 StatusLogger.java (line 69) CompactionManager               n/a         0

Thanks,

Tamar Fraenkel 
Senior Software Engineer, TOK Media 

On Tue, Mar 6, 2012 at 9:12 AM, Tamar Fraenkel <tamar@tok-media.com> wrote:
Works..

But during the night my setup encountered a problem.
I have two VMs in my cluster (running on VMware ESXi).
Each VM has 1 GB of memory and two virtual disks of 16 GB.
They are running on a small server with 4 CPUs (2.66 GHz) and 4 GB of memory (together with two other VMs).
I put cassandra data on the second disk of each machine.
VMs are running Ubuntu 11.10 and cassandra 1.0.7.

I left them running overnight and this morning when I came:
In one node cassandra was down, and the last thing in the system.log is:

 INFO [CompactionExecutor:150] 2012-03-06 00:55:04,821 CompactionTask.java (line 113) Compacting [SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1243-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1245-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1242-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db')]
 INFO [CompactionExecutor:150] 2012-03-06 00:55:07,919 CompactionTask.java (line 221) Compacted to [/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1246-Data.db,].  32,424,771 to 26,447,685 (~81% of original) bytes for 58,938 keys at 8.144165MB/s.  Time: 3,097ms.


The other node was using all its CPU and I had to restart it.
After that, I can see that the last lines in its system.log say that the other node is down...

 INFO [FlushWriter:142] 2012-03-06 00:55:02,418 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@1365852701(1122169/25154556 serialized/live bytes, 21173 ops)
 INFO [FlushWriter:142] 2012-03-06 00:55:02,742 Memtable.java (line 283) Completed flushing /opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db (2075930 bytes)
 INFO [GossipTasks:1] 2012-03-06 08:02:18,584 Gossiper.java (line 818) InetAddress /10.0.0.31 is now dead.

How can I trace why that happened?
Also, I brought Cassandra up on both nodes. They both spent a long time reading commit logs, but now they seem to run.
Any idea how to debug or improve my setup?
Thanks,
Tamar



Tamar Fraenkel 
Senior Software Engineer, TOK Media 

On Mon, Mar 5, 2012 at 7:30 PM, aaron morton <aaron@thelastpickle.com> wrote:
Create nodes that do not share seeds, and give the clusters different names as a safety measure. 
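
As a rough sanity check of that advice, here is a Python sketch that compares the relevant settings of two nodes. The dictionaries mirror the simplified cassandra.yaml keys used elsewhere in this thread (a real 1.0 file nests seeds under seed_provider); the function names and the cluster names in the usage comment are made up for this example.

```python
def parse_seeds(seeds):
    """Split a comma-separated seeds string into a set of addresses."""
    return {seed.strip() for seed in seeds.split(",") if seed.strip()}


def isolation_problems(config_a, config_b):
    """List the reasons two nodes might end up joining the same ring."""
    problems = []
    if config_a["cluster_name"] == config_b["cluster_name"]:
        problems.append("cluster_name is identical")
    shared = parse_seeds(config_a["seeds"]) & parse_seeds(config_b["seeds"])
    if shared:
        problems.append("shared seeds: " + ", ".join(sorted(shared)))
    return problems


# Hypothetical usage; the cluster names are placeholders:
#   old = {"cluster_name": "Cluster A", "seeds": "10.0.0.19"}
#   new = {"cluster_name": "Cluster B", "seeds": "10.0.0.31"}
#   isolation_problems(old, new)
```

An empty result means neither of the two conditions named above is violated; it does not prove anything else about the setup.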

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:

I want two separate clusters.
Tamar Fraenkel 
Senior Software Engineer, TOK Media 

On Mon, Mar 5, 2012 at 12:48 PM, aaron morton <aaron@thelastpickle.com> wrote:
Do you want to create two separate clusters or a single cluster with two data centres ? 

If it's the latter, token selection is discussed here: http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra

> Moreover all tokens must be unique (even across datacenters), although - from pure curiosity - I wonder what is the rationale behind this.
Otherwise data is not evenly distributed.

> By the way, can someone enlighten me about the first line in the output of the nodetool. Obviously it contains a token, but nothing else. It seems like a formatting glitch, but maybe it has a role.
It's the exclusive lower bound token for the first node in the ring. This also happens to be the token for the last node in the ring.

In your setup 
10.0.0.19 "owns" (85070591730234615865843651857942052864+1) to 0
10.0.0.28 "owns"  (0 + 1) to 85070591730234615865843651857942052864

(does not imply primary replica, just used to map keys to nodes.)
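
The two tokens in this ring match evenly spaced values on a ring of size 2**127, which is what the usual formula for RandomPartitioner gives. A minimal Python sketch, assuming RandomPartitioner is in use; the function names are made up for this example.

```python
RING_SIZE = 2 ** 127  # token space assumed for RandomPartitioner


def initial_tokens(node_count):
    """Evenly spaced initial_token values for a ring of `node_count` nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def owned_ranges(tokens):
    """Map each token to the (exclusive start, inclusive end) range it owns.

    Each node owns everything after the previous node's token up to and
    including its own; the first node wraps around from the last token.
    """
    ordered = sorted(tokens)
    return {token: (ordered[i - 1], token) for i, token in enumerate(ordered)}


tokens = initial_tokens(2)
# tokens == [0, 85070591730234615865843651857942052864]
for token, (start, end) in owned_ranges(tokens).items():
    print(f"token {token} owns ({start} + 1) to {end}")
```

The wrap-around entry for token 0 is why nodetool prints the last node's token on a line of its own at the top of the ring listing.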
 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton

On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:

You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a multi-datacenter setup with two rings. You can start reading from this page:
http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy

Moreover, all tokens must be unique (even across datacenters), although, out of pure curiosity, I wonder what the rationale behind this is.

By the way, can someone enlighten me about the first line in the nodetool output? Obviously it contains a token, but nothing else. It looks like a formatting glitch, but maybe it has a role.

On 2012.03.05. 11:06, Tamar Fraenkel wrote:
Hi!
I have a Cassandra cluster with two nodes:

nodetool ring -h localhost
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               85070591730234615865843651857942052864
10.0.0.19       datacenter1 rack1       Up     Normal  488.74 KB       50.00%  0
10.0.0.28       datacenter1 rack1       Up     Normal  504.63 KB       50.00%  85070591730234615865843651857942052864

I want to create a second ring with the same name but two different nodes.
Using tokengentool I get the same tokens, since they are determined by the number of nodes in a ring.

My question is this:
Let's say I create two new VMs, with IPs: 10.0.0.31 and 10.0.0.11
In 10.0.0.31 cassandra.yaml I will set
initial_token: 0
seeds: "10.0.0.31"
listen_address: 10.0.0.31
rpc_address: 0.0.0.0

In 10.0.0.11 cassandra.yaml I will set
initial_token: 85070591730234615865843651857942052864
seeds: "10.0.0.31"
listen_address: 10.0.0.11
rpc_address: 0.0.0.0 

Would the rings be separate?

Thanks,

Tamar Fraenkel 
Senior Software Engineer, TOK Media 
