Date: Wed, 19 Aug 2009 13:29:05 -0700
Subject: Re: Anybody experience one Cassandra server locking up?
From: Sandeep Tata
To: cassandra-user@incubator.apache.org

Brian,

Are you guys planning to run workloads at Yahoo to compare Cassandra and
PNUTS? We'd be curious to see what you learn with the 0.4/trunk code.

Sandeep

On Wed, Aug 19, 2009 at 10:20 AM, Brian Frank Cooper wrote:
> Probably you are right; after Jun's response I looked in the log and saw
> an out of memory exception. I'll try the 0.4 beta...
>
> Thanks!
>
> brian
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:jbellis@gmail.com]
> Sent: Wednesday, August 19, 2009 9:12 AM
> To: cassandra-user@incubator.apache.org
> Subject: Re: Anybody experience one Cassandra server locking up?
>
> sounds like you are exhausting the memory on that instance and it is
> going into "GC swap" trying to free enough to continue. this is very
> easy to do on 0.3 -- try upgrading to the 0.4 beta if you are using
> 0.3.
>
> On Tue, Aug 18, 2009 at 3:36 PM, Brian Frank Cooper wrote:
>> Hi folks,
>>
>> I have been loading a 6-server Cassandra cluster with 1KB records. After a
>> few million inserts, the insert rate drops dramatically. After
>> investigation, one of the Cassandra servers seems to be in a bad state,
>> using 100% of one core on an 8-core machine, and 0% on the other cores.
>> Inserts to this box have completely stopped, and the inserts to the other
>> boxes have slowed way down (more than a factor of 10 slower.) A "kill" or
>> "kill -3" to the bad java process does nothing; I have to use "kill -9" to
>> stop it. Has anybody experienced anything like this?
>>
>> Additional info:
>>
>> The servers are 8 core, 8GB servers. I am running 64 bit java 1.6, and here
>> are the JVM options:
>>
>> # Arguments to pass to the JVM
>> JVM_OPTS=" \
>>         -ea \
>>         -Xdebug \
>>         -Xrunjdwp:transport=dt_socket,server=y,address=8888,suspend=n \
>>         -Xms128M \
>>         -Xmx6G \
>>         -XX:SurvivorRatio=8 \
>>         -XX:TargetSurvivorRatio=90 \
>>         -XX:+AggressiveOpts \
>>         -XX:+UseParNewGC \
>>         -XX:+UseConcMarkSweepGC \
>>         -XX:CMSInitiatingOccupancyFraction=1 \
>>         -XX:+CMSParallelRemarkEnabled \
>>         -XX:+HeapDumpOnOutOfMemoryError \
>>         -Dcom.sun.management.jmxremote.port=8080 \
>>         -Dcom.sun.management.jmxremote.ssl=false \
>>         -Dcom.sun.management.jmxremote.authenticate=false"
>>
>> (standard options from the Cassandra distribution, except for the 6GB of
>> heap space.)
>>
>> Replication factor is 1 (this is just a test, not a production setup) and
>> memtable size is set to 1GB.
>>
>> Thanks.
>>
>> brian
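
The symptom described in this thread (a JVM that is short on heap and spends nearly all of its time collecting) can be watched for with the standard java.lang.management beans. What follows is only a minimal sketch, not anything from the Cassandra distribution: the class name GcPressure, the one-second interval, and the sample count are assumptions made for illustration, and the code inspects the JVM it runs in. Attaching to a remote process over the JMX port from the options above is left out.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples heap usage and the share of time spent in GC.
public class GcPressure {

    /** Fraction of an interval spent in GC, clamped to [0, 1]. */
    public static double gcFraction(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) gcMillis / elapsedMillis;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    /** Accumulated collection time over all collectors in this JVM, in ms. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        final long intervalMillis = 1000; // assumed sampling interval
        final int samples = 3;            // assumed number of samples
        long lastGc = totalGcMillis();
        long lastTime = System.nanoTime();
        for (int i = 0; i < samples; i++) {
            Thread.sleep(intervalMillis);
            long nowGc = totalGcMillis();
            long nowTime = System.nanoTime();
            long elapsedMillis = (nowTime - lastTime) / 1000000L;
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            // getMax() is -1 when the maximum heap size is undefined.
            System.out.printf("heap used: %d MB, max: %d MB, GC share of interval: %.2f%n",
                    heap.getUsed() >> 20, heap.getMax() >> 20,
                    gcFraction(nowGc - lastGc, elapsedMillis));
            lastGc = nowGc;
            lastTime = nowTime;
        }
    }
}
```

A GC share that stays close to 1.0 while used heap sits near the maximum would be consistent with the out of memory diagnosis above; what threshold counts as "too high" is a judgment call and is not taken from this thread.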