From: aaron morton <aaron@thelastpickle.com>
Subject: Re: Cassandra services down frequently [Version 1.1.4]
Date: Tue, 9 Apr 2013 19:52:36 +1200
To: user@cassandra.apache.org
Message-Id: <9DDC14C3-0D7A-4273-8FC6-41F687A0012C@thelastpickle.com>

> MAX_HEAP_SIZE="6G"
> HEAP_NEWSIZE="500M"
The new heap feels a little low; I often see 800M as a good number. It depends on the number of cores, but if that's working, stick with it.

> key_cache_size_in_mb: 512
Have you run this at the default and checked the cache hit rate using nodetool info? The default size would be about 300M.

> row_cache_size_in_mb: 14336
This is way too high.
You've told the JVM to lock in 6GB and then told the row cache it can use 14GB, but you only have 16GB on the node. At some point things are going to go crash, bang, wallop.

Set it to 1GB and check the cache hit rate using nodetool info.

The remaining memory will be used by the OS to cache disk access.

> I have a query: if Cassandra is using the JVM for all operations, then why do we need to change the above parameters separately in cassandra.yaml?
The JVM params are passed to the JVM before the server starts and have to be formatted in a specific way. The yaml file is much easier for humans to read.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 8/04/2013, at 1:16 PM, 金剑 <jinjian.1@gmail.com> wrote:

> It also uses off-heap memory outside the JVM. SerializingCacheProvider should be one such case.
>
> Best Regards!
>
> Jian Jin
>
>
> 2013/4/6 <adeel.akbar@panasiangroup.com>
> Thank you Aaron and Bryan for your advice.
>
> I have changed the following parameters and now Cassandra is running absolutely fine. Please review the settings below and advise whether I am heading in the right direction.
>
> cassandra-env.sh
> #JVM_OPTS="$JVM_OPTS -ea"
> MAX_HEAP_SIZE="6G"
> HEAP_NEWSIZE="500M"
>
> cassandra.yaml
> # do not persist caches to disk
> key_cache_save_period: 0
> row_cache_save_period: 0
>
> key_cache_size_in_mb: 512
> row_cache_size_in_mb: 14336
> row_cache_provider: SerializingCacheProvider
>
> I have a query: if Cassandra is using the JVM for all operations, then why do we need to change the above parameters separately in cassandra.yaml?
>
>
> Thanks & Regards
>
> Adeel Akbar
>
>
> Quoting aaron morton <aaron@thelastpickle.com>:
>
> We can see from below that you've tweaked and disabled many of the memory "safety valve" and other memory related settings.
> Agree.
> Also you are running with a JVM heap size of 3.81GB, which is non default. For a 16GB node I would expect 8GB.
>
> Try restoring the yaml values to the defaults and allowing the cassandra-env.sh file to determine the memory size.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/04/2013, at 12:36 PM, Bryan Talbot <btalbot@aeriagames.com> wrote:
>
> On Thu, Apr 4, 2013 at 1:27 AM, <adeel.akbar@panasiangroup.com> wrote:
>
> After some time (1 to 2 hours) Cassandra shuts down services on one or two nodes with the following errors:
>
>
> Wonder what the workload and schema is like ...
>
> We can see from below that you've tweaked and disabled many of the memory "safety valve" and other memory related settings. Those could be causing issues too.
>
>
> hinted_handoff_throttle_delay_in_ms: 0
> flush_largest_memtables_at: 1.0
> reduce_cache_sizes_at: 1.0
> reduce_cache_capacity_to: 0.6
> rpc_keepalive: true
> rpc_server_type: sync
> rpc_min_threads: 16
> rpc_max_threads: 2147483647
> in_memory_compaction_limit_in_mb: 256
> compaction_throughput_mb_per_sec: 16
> rpc_timeout_in_ms: 15000
> dynamic_snitch_badness_threshold: 0.0
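
The memory arithmetic behind the "way too high" comment in the reply can be checked with a few lines. This is only a rough sketch: the function name `headroom_mb` is made up for illustration, and it assumes the key cache is counted inside the JVM heap, the row cache is held off-heap (as noted in the thread for SerializingCacheProvider), and any other JVM or native overhead is ignored.

```python
GB_IN_MB = 1024


def headroom_mb(total_ram_mb, max_heap_mb, row_cache_mb):
    """Return the RAM (in MB) left for the OS page cache.

    Assumption: the heap and the off-heap row cache are the only large
    consumers; a negative result means the node is over-committed.
    """
    return total_ram_mb - max_heap_mb - row_cache_mb


# Settings quoted in the thread: 16GB node, MAX_HEAP_SIZE="6G",
# row_cache_size_in_mb: 14336
original = headroom_mb(16 * GB_IN_MB, 6 * GB_IN_MB, 14336)

# Suggested change from the reply: row cache reduced to 1GB
suggested = headroom_mb(16 * GB_IN_MB, 6 * GB_IN_MB, 1 * GB_IN_MB)

print(original)   # -4096
print(suggested)  # 9216
```

With the posted settings the heap and row cache together ask for about 4GB more than the node has; with a 1GB row cache roughly 9GB is left for the OS to cache disk access.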