From: "Christopher Wirt" <chris.wirt@struq.com>
To: user@cassandra.apache.org
Subject: RE: High performance disk io
Date: Wed, 22 May 2013 15:47:07 +0100

Hi Dean,

Adding nodes is the easy way out. We can get three smaller SSDs for the same price as our current setup. How do we optimise performance for this? Is it worth the effort?

To RAID or not to RAID, that is one of my questions. Currently I'm thinking it must be faster and, given the same price tag, easily worth the effort.

Cheers,
Chris

-----Original Message-----
From: Hiller, Dean [mailto:Dean.Hiller@nrel.gov]
Sent: 22 May 2013 15:33
To: user@cassandra.apache.org
Subject: Re: High performance disk io

Well, if you just want to lower your I/O util %, you could always just add more nodes to the cluster ;).

Dean

From: Igor
Date: Wednesday, May 22, 2013 8:06 AM
To: user@cassandra.apache.org
Subject: Re: High performance disk io

Hello,

What level of read performance do you expect? We have a limit of 15 ms for the 99th percentile, with average read latency near 0.9 ms.
For some CFs the 99th percentile is actually around 2 ms, for others around 10 ms, depending on the data volume read by each query. Tuning read performance involved cleaning up the data model, tuning cassandra.yaml, switching from Hector to Astyanax, and tuning OS parameters.

On 05/22/2013 04:40 PM, Christopher Wirt wrote:

Hello,

We're looking at deploying a new ring where we want the best possible read performance.

We've set up a cluster with 6 nodes at replication factor 3, each with 32GB of memory, an 8GB heap, and an 800MB key cache, each holding 40-50GB of data on a 200GB SSD, with a 500GB SATA drive for the OS and commitlog.

Three column families:
- ColFamily1: 50% of the load and data
- ColFamily2: 35% of the load and data
- ColFamily3: 15% of the load and data

At the moment we are still seeing around 20% disk utilisation, and occasionally as high as 40-50% on some nodes at peak time (we are conducting some semi-live testing). CPU looks fine, memory is fine, and the key cache hit rate is about 80% (could be better, so maybe we should increase the key cache size?).

Anyway, we're looking into what we can do to improve this. One conversation we are having at the moment is around the SSD disk setup. We are considering moving to three smaller SSD drives and spreading the data across those. The possibilities are:

- We build a RAID0 array from the smaller SSDs and hope that improves performance. Will this actually yield better throughput?
- We mount the SSDs at different directories and define multiple data directories in cassandra.yaml. Will removing the layer of RAID controller improve throughput?
- We mount the SSDs at the individual column family directories and keep a single data directory declared in cassandra.yaml. I think this is quite an attractive idea. What are the drawbacks? Would the system column families end up on the main SATA drive?
- We don't change anything and just keep upping our key cache.
- Anything else you guys can think of.

Ideas and thoughts welcome. Thanks for your time and expertise.

Chris
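[For reference, the multiple-data-directories option can be expressed directly in cassandra.yaml. This is only a sketch: the /mnt/ssd1..3 mount points are hypothetical, and the key cache value simply mirrors the 800MB figure mentioned above. Cassandra distributes new SSTables across all listed data directories.]

```yaml
# Sketch of the "multiple data directories" option, assuming the three
# smaller SSDs are mounted at /mnt/ssd1..3 (hypothetical paths).
data_file_directories:
    - /mnt/ssd1/cassandra/data
    - /mnt/ssd2/cassandra/data
    - /mnt/ssd3/cassandra/data

# Commitlog stays on the separate SATA drive, as in the current setup.
commitlog_directory: /var/lib/cassandra/commitlog

# Key cache size in MB; 800 matches the 800MB cache described above.
key_cache_size_in_mb: 800
```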
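[The "one SSD per column family" option can be sketched with symlinks: Cassandra still sees a single data directory, but each column family's subdirectory physically lives on its own SSD. The keyspace/CF names and mount roots below are illustrative assumptions, and the defaults use a local ./demo tree so the sketch is safe to run as-is; override DATA_ROOT/SSD_ROOT with real paths.]

```shell
#!/bin/sh
# Sketch: single data directory whose per-CF subdirectories are
# symlinks onto separate SSD mount points. All names are hypothetical.
set -e
DATA_ROOT="${DATA_ROOT:-$(pwd)/demo/cassandra/data/MyKeyspace}"
SSD_ROOT="${SSD_ROOT:-$(pwd)/demo/mnt}"

mkdir -p "$DATA_ROOT"
for i in 1 2 3; do
    mkdir -p "$SSD_ROOT/ssd$i/ColFamily$i"           # directory on SSD i
    ln -sfn "$SSD_ROOT/ssd$i/ColFamily$i" "$DATA_ROOT/ColFamily$i"
done

ls -l "$DATA_ROOT"   # each CF directory now points at its own disk
```

One drawback, as the post anticipates, is that anything not explicitly symlinked (e.g. the system keyspace) stays on whatever device holds the real data directory, here the SATA drive.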