From: Jonathan Ellis
Date: Wed, 4 Aug 2010 08:23:12 -0400
Subject: Re: bad behavior of my Cassandra cluster
To: user@cassandra.apache.org

Put your commitlog on a different device than the data files.

On Tue, Aug 3, 2010 at 10:46 PM, Mingfan Lu wrote:
> Hi,
>  I have a 4-node cassandra cluster. And I find when the 4 nodes are
> flushing memtable and gc at the very similar moment, the throughput
> will drop and latency will increase rapidly and the nodes are dead and
> up frequently ....
>  You could download the IOPS variance of data disk (sda here) and
> system logs of these nodes from
>  http://docs.google.com/leaf?id=0ByKuS81H5x1VYThjOWQxMTQtMzEzMC00NDJiLTlhYWEtNzBjYzFmYTI3ZTk2&sort=name&layout=list&num=50
>  (if you can't download it, just tell me.)
>  What happened to the cluster?
>  How could I avoid such a scenario?
>  *  Storage configuration
>    All of nodes act as seed node
>    Random partitioner is used, so that the data is evenly located in
> the 4 nodes
>    memtable thresholds:
>        DiskAccessMode: auto (in fact is mmap)
>        MemtableThroughputInMB: 1024
>        MemtableOperationsInMillions: 7
>        MemtableFlushAfterMinutes: 1440
>    DiskAccess mode: Auto (mmap in fact)
>  *  While JVM options are:
>   JVM_OPTS="-ea \
>             -Xms8G \
>             -Xmx8G \
>             -XX:+UseParNewGC \
>             -XX:+UseConcMarkSweepGC \
>             -XX:+CMSParallelRemarkEnabled \
>             -XX:SurvivorRatio=8 \
>             -XX:+UseLargePages \
>             -XX:LargePageSizeInBytes=2m \
>             -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -Xloggc:/tmp/cloudstress/jvm.gc.log \
>             -XX:MaxTenuringThreshold=1 \
>             -XX:+HeapDumpOnOutOfMemoryError \
>             -Dcom.sun.management.jmxremote.port=8080 \
>             -Dcom.sun.management.jmxremote.ssl=false \
>             -Dcom.sun.management.jmxremote.authenticate=false"
>

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
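
Editorial note: one rough way to check whether a node already follows the advice at the top of this message (commitlog and data files on different devices) is to compare the device IDs of the two directories. The sketch below is not from the thread; the helper name on_same_device is hypothetical, and the directory paths in the usage note are placeholders, not the poster's actual layout. It only compares st_dev as reported by the operating system, so two partitions on the same physical disk will still look like different devices.

```python
import os


def on_same_device(path_a: str, path_b: str) -> bool:
    """Return True if both paths report the same filesystem device ID.

    Hypothetical helper for illustration only. st_dev identifies the
    mounted filesystem, not the physical spindle, so a False result does
    not by itself prove the two paths sit on separate disks.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

For example, calling on_same_device("/var/lib/cassandra/commitlog", "/var/lib/cassandra/data") (placeholder paths) and getting True would suggest the commitlog shares a filesystem with the data files, which is the situation the reply recommends avoiding.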