Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 05FD91841F for ; Thu, 11 Feb 2016 08:36:41 +0000 (UTC) Received: (qmail 57997 invoked by uid 500); 11 Feb 2016 08:36:22 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 57955 invoked by uid 500); 11 Feb 2016 08:36:22 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 57945 invoked by uid 99); 11 Feb 2016 08:36:22 -0000 Received: from Unknown (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 11 Feb 2016 08:36:22 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 1A04FC0D3A for ; Thu, 11 Feb 2016 08:36:22 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -0.821 X-Spam-Level: X-Spam-Status: No, score=-0.821 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-us-west.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id pv8EgQI7rvJT for ; Thu, 11 Feb 2016 08:36:18 +0000 (UTC) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com [209.85.217.182]) by mx1-us-west.apache.org (ASF Mail Server at mx1-us-west.apache.org) with ESMTPS id C99D620428 for ; Thu, 11 Feb 2016 08:36:17 +0000 (UTC) Received: by mail-lb0-f182.google.com with SMTP id cw1so23769549lbb.1 for ; Thu, 11 Feb 2016 00:36:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ppqTe5BDbCaMTUnG+V13dTV/ccsbGHHFI9uVYlf/Ezs=; b=uZK3G2057Z1IbdIqCV7AbY0BiL68zuZ60VQCjfujLQCzxdIgGf+eh8xTYQAOe3uLp+ B9cbi4va9SCoT3EfPooe3yoGJjS3UukuBhO9yDf0MKnQEdao8efA/rTj8oCaio7nBaqz zQ4g7BbI1dMGBrXBIEYUUfbDAdpWmavLQxvjDXjd0v4MvoCcw8zqOmxmV9Q9JAQxZwCp pAAee86teBU/ehNHhIJHGTBb/8RLFU0QQFpbErl/L61S1Uha9e/Zzf7+vIGu734Q68dv /lxsvkgSsXTjFbUsG0iulbGfAtNkHEQ+XM6HhV5Ewt++EGi0u3qao87MIjK5i55o0zR4 k+hA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=ppqTe5BDbCaMTUnG+V13dTV/ccsbGHHFI9uVYlf/Ezs=; b=gY76dRFlpQA/OJbkja8EQVbEsA4+97SPat5uTxvKus6wbg7XJ73+smw3y4OnA2Txm1 UNdZQPw0NW1R/2Mc1pgn1LeAHeEO7BX8ClPGL2aTPCAvhSgjffyQFkzNIISQaBMzJmqC U22K+yGH8ytiVKolmsd+Z5n/L6RXTIx7Q60Q3cgEZy9cdD6zdb50LXr5RzgwjaoPUfI/ XBhaD737P7AmEuPIoxmXLZF/ji3ZKDm6IR82HXstKzYhSc6lcOaki9FnXM+eApufG0Am xJEZ2UyJDKRpkykMsTcJELQhtx3+gzUB559thOFeSW7kU8RuiD7sB5iabZTTUNzvFrLs g9Vw== X-Gm-Message-State: AG10YOQ9wy3Yk833LzHIF//QUTzfXGgJKUVJEyLN4K/x3uUJjD7K7xlQSGJmjCwiTb14QxkNqDTKitChUYneew== MIME-Version: 1.0 X-Received: by 10.112.136.103 with SMTP id pz7mr17972035lbb.3.1455179776231; Thu, 11 Feb 2016 00:36:16 -0800 (PST) Received: by 10.25.76.66 with HTTP; Thu, 11 Feb 2016 00:36:16 -0800 (PST) In-Reply-To: References: <8DBDEFD8-8348-4C37-9B1E-921D4F9151D2@crowdstrike.com> Date: Thu, 11 Feb 2016 09:36:16 +0100 Message-ID: Subject: Re: Debugging write timeouts on Cassandra 2.2.5 From: Fabrice Facorat To: user@cassandra.apache.org Cc: Peter Norton Content-Type: text/plain; charset=UTF-8 Are your commitlog and data on the same disk ? If yes, you should put commitlogs on a separate disk which don't have a lot of IO. Others IO may have great impact impact on your commitlog writing and it may even block. An example of impact IO may have, even for Async writes: https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic 2016-02-11 0:31 GMT+01:00 Mike Heffner : > Jeff, > > We have both commitlog and data on a 4TB EBS with 10k IOPS. > > Mike > > On Wed, Feb 10, 2016 at 5:28 PM, Jeff Jirsa > wrote: >> >> What disk size are you using? >> >> >> >> From: Mike Heffner >> Reply-To: "user@cassandra.apache.org" >> Date: Wednesday, February 10, 2016 at 2:24 PM >> To: "user@cassandra.apache.org" >> Cc: Peter Norton >> Subject: Re: Debugging write timeouts on Cassandra 2.2.5 >> >> Paulo, >> >> Thanks for the suggestion, we ran some tests against CMS and saw the same >> timeouts. On that note though, we are going to try doubling the instance >> sizes and testing with double the heap (even though current usage is low). >> >> Mike >> >> On Wed, Feb 10, 2016 at 3:40 PM, Paulo Motta >> wrote: >>> >>> Are you using the same GC settings as the staging 2.0 cluster? If not, >>> could you try using the default GC settings (CMS) and see if that changes >>> anything? This is just a wild guess, but there were reports before of >>> G1-caused instabilities with small heap sizes (< 16GB - see CASSANDRA-10403 >>> for more context). Please ignore if you already tried reverting back to CMS. >>> >>> 2016-02-10 16:51 GMT-03:00 Mike Heffner : >>>> >>>> Hi all, >>>> >>>> We've recently embarked on a project to update our Cassandra >>>> infrastructure running on EC2. We are long time users of 2.0.x and are >>>> testing out a move to version 2.2.5 running on VPC with EBS. Our test setup >>>> is a 3 node, RF=3 cluster supporting a small write load (mirror of our >>>> staging load). >>>> >>>> We are writing at QUORUM and while p95's look good compared to our >>>> staging 2.0.x cluster, we are seeing frequent write operations that time out >>>> at the max write_request_timeout_in_ms (10 seconds). CPU across the cluster >>>> is < 10% and EBS write load is < 100 IOPS. Cassandra is running with the >>>> Oracle JDK 8u60 and we're using G1GC and any GC pauses are less than 500ms. >>>> >>>> We run on c4.2xl instances with GP2 EBS attached storage for data and >>>> commitlog directories. The nodes are using EC2 enhanced networking and have >>>> the latest Intel network driver module. We are running on HVM instances >>>> using Ubuntu 14.04.2. >>>> >>>> Our schema is 5 tables, all with COMPACT STORAGE. Each table is similar >>>> to the definition here: >>>> https://gist.github.com/mheffner/4d80f6b53ccaa24cc20a >>>> >>>> This is our cassandra.yaml: >>>> https://gist.github.com/mheffner/fea80e6e939dd483f94f#file-cassandra-yaml >>>> >>>> Like I mentioned we use 8u60 with G1GC and have used many of the GC >>>> settings in Al Tobey's tuning guide. This is our upstart config with JVM and >>>> other CPU settings: https://gist.github.com/mheffner/dc44613620b25c4fa46d >>>> >>>> We've used several of the sysctl settings from Al's guide as well: >>>> https://gist.github.com/mheffner/ea40d58f58a517028152 >>>> >>>> Our client application is able to write using either Thrift batches >>>> using Asytanax driver or CQL async INSERT's using the Datastax Java driver. >>>> >>>> For testing against Thrift (our legacy infra uses this) we write batches >>>> of anywhere from 6 to 1500 rows at a time. Our p99 for batch execution is >>>> around 45ms but our maximum (p100) sits less than 150ms except when it >>>> periodically spikes to the full 10seconds. >>>> >>>> Testing the same write path using CQL writes instead demonstrates >>>> similar behavior. Low p99s except for periodic full timeouts. We enabled >>>> tracing for several operations but were unable to get a trace that completed >>>> successfully -- Cassandra started logging many messages as: >>>> >>>> INFO [ScheduledTasks:1] - MessagingService.java:946 - _TRACE messages >>>> were dropped in last 5000 ms: 52499 for internal timeout and 0 for cross >>>> node timeout >>>> >>>> And all the traces contained rows with a "null" source_elapsed row: >>>> https://gist.githubusercontent.com/mheffner/1d68a70449bd6688a010/raw/0327d7d3d94c3a93af02b64212e3b7e7d8f2911b/trace.out >>>> >>>> >>>> We've exhausted as many configuration option permutations that we can >>>> think of. This cluster does not appear to be under any significant load and >>>> latencies seem to largely fall in two bands: low normal or max timeout. This >>>> seems to imply that something is getting stuck and timing out at the max >>>> write timeout. >>>> >>>> Any suggestions on what to look for? We had debug enabled for awhile but >>>> we didn't see any msg that pointed to something obvious. Happy to provide >>>> any more information that may help. >>>> >>>> We are pretty much at the point of sprinkling debug around the code to >>>> track down what could be blocking. >>>> >>>> >>>> Thanks, >>>> >>>> Mike >>>> >>>> -- >>>> >>>> Mike Heffner >>>> Librato, Inc. >>>> >>> >> >> >> >> -- >> >> Mike Heffner >> Librato, Inc. >> > > > > -- > > Mike Heffner > Librato, Inc. > -- Close the World, Open the Net http://www.linux-wizard.net