From: Brian Frank Cooper
To: "cassandra-user@incubator.apache.org"
Date: Tue, 18 Aug 2009 15:36:16 -0700
Subject: Anybody experience one Cassandra server locking up?

Hi folks,

I have been loading a 6-server Cassandra cluster with 1KB records. After a few million inserts, the insert rate drops dramatically. After investigation, one of the Cassandra servers seems to be in a bad state, using 100% of one core on an 8-core machine, and 0% on the other cores. Inserts to this box have completely stopped, and the inserts to the other boxes have slowed way down (more than a factor of 10 slower.) A "kill" or "kill -3" to the bad java process does nothing; I have to use "kill -9" to stop it. Has anybody experienced anything like this?

Additional info:

The servers are 8 core, 8GB servers. I am running 64 bit java 1.6, and here are the JVM options:

# Arguments to pass to the JVM
JVM_OPTS=" \
        -ea \
        -Xdebug \
        -Xrunjdwp:transport=dt_socket,server=y,address=8888,suspend=n \
        -Xms128M \
        -Xmx6G \
        -XX:SurvivorRatio=8 \
        -XX:TargetSurvivorRatio=90 \
        -XX:+AggressiveOpts \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:CMSInitiatingOccupancyFraction=1 \
        -XX:+CMSParallelRemarkEnabled \
        -XX:+HeapDumpOnOutOfMemoryError \
        -Dcom.sun.management.jmxremote.port=8080 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false"

(standard options from the Cassandra distribution, except for the 6GB of heap space.)

Replication factor is 1 (this is just a test, not a production setup) and memtable size is set to 1GB.

Thanks...

brian
