Message-ID: <564C6484.2040502@stickyads.tv>
Date: Wed, 18 Nov 2015 12:44:04 +0100
From: Antoine Bonavita
To: user@cassandra.apache.org, Robert Coli, sebastian.estevez@datastax.com
Subject: Re: Help diagnosing performance issue
References: <56499C10.8030605@stickyads.tv> <564B7312.3030307@stickyads.tv>

Sebastian, Robert,

First, a big thank you to both of you for your help.

It looks like you were right. I used pcstat (awesome tool, thanks for that as well) and it appears that some files I would not expect to be in cache actually are. Here is a sample of my output (edited for convenience, adding the file timestamp from the OS):

* /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5951-big-Data.db - 000.619 % - Nov 16 12:25
* /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5954-big-Data.db - 000.681 % - Nov 16 13:44
* /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5955-big-Data.db - 000.610 % - Nov 16 14:11
* /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5956-big-Data.db - 015.621 % - Nov 16 14:26
* /var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/la-5957-big-Data.db - 015.558 % - Nov 16 14:50

The SSTables that come before these are all at about 0%, and the ones that come after are all at about 15%. As you can see, the first SSTable at 15% dates back 24h. Given my application, I'm pretty sure those are not from reads (reads of data older than 1h are definitely under 0.1% of all reads).

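The point where the cached percentage jumps can also be located mechanically. Below is a minimal sketch, assuming the hand-edited line format of the sample above (path, cached percentage, file timestamp, separated by " - "); this is not pcstat's native output format, and the names `parse` and `first_above` are hypothetical helpers introduced only for illustration.

```python
import re

# Sample lines copied from the message, in the hand-edited format
# "<path> - <cached percent> % - <file timestamp>" (not raw pcstat output).
PREFIX = "/var/lib/cassandra/data/views/views-451e4d8061ef11e5896f091196a360a0/"
SAMPLE = "\n".join([
    "* " + PREFIX + "la-5951-big-Data.db - 000.619 % - Nov 16 12:25",
    "* " + PREFIX + "la-5954-big-Data.db - 000.681 % - Nov 16 13:44",
    "* " + PREFIX + "la-5955-big-Data.db - 000.610 % - Nov 16 14:11",
    "* " + PREFIX + "la-5956-big-Data.db - 015.621 % - Nov 16 14:26",
    "* " + PREFIX + "la-5957-big-Data.db - 015.558 % - Nov 16 14:50",
])

# Optional leading bullet, then path, percentage, and timestamp.
LINE_RE = re.compile(r"^\*?\s*(?P<path>\S+) - (?P<pct>\d+\.\d+) % - (?P<stamp>.+)$")


def parse(text):
    """Return a list of (path, cached_percent, timestamp) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((match["path"], float(match["pct"]), match["stamp"].strip()))
    return rows


def first_above(rows, threshold_pct):
    """Return the first row whose cached percentage reaches the threshold."""
    for row in rows:
        if row[1] >= threshold_pct:
            return row
    return None


if __name__ == "__main__":
    hit = first_above(parse(SAMPLE), 5.0)
    if hit is not None:
        print("first SSTable above 5% cached:", hit[0], "timestamp:", hit[2])
```

The 5% threshold is arbitrary; any value between the two observed plateaus (about 0.6% and about 15%) separates them.
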
Could it be that compaction is constantly putting those in cache? If so, then I'm probably confused about the meaning/effect of max_sstable_age_days (set to 10 in my case) and base_time_seconds (not set in my case, so the default of 3600 applies). I would not expect any compaction to happen beyond the first hour, and the 10 days are there to make sure data still gets expired and SSTables removed (thus releasing disk space). I don't see where the 24h comes from.

If you guys can shed some light on this, it would be awesome. I'm sure I got something wrong.

Regarding the heap configuration, both are very similar:
* 32G machine: -Xms8049M -Xmx8049M -Xmn800M
* 64G machine: -Xms8192M -Xmx8192M -Xmn1200M
I think we can rule that out.

Thanks again for your help, I truly appreciate it.

A.

On 11/17/2015 08:48 PM, Robert Coli wrote:
> On Tue, Nov 17, 2015 at 11:08 AM, Sebastian Estevez wrote:
>
>     You're sstables are probably falling out of page cache on the
>     smaller nodes and your slow disks are killing your latencies.
>
> +1 most likely.
>
> Are the heaps the same size on both machines?
>
> =Rob

-- 
Antoine Bonavita (antoine@stickyads.tv) - CTO StickyADS.tv
Tel: +33 6 34 33 47 36/+33 9 50 68 21 32
NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID
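
One possible reading of the two options discussed above, offered as an assumption rather than a confirmed diagnosis: in DateTieredCompactionStrategy, base_time_seconds appears to set only the size of the first time window, with older windows growing by a factor of min_threshold, and SSTables remaining candidates for compaction until they are older than max_sstable_age_days. The sketch below lists the window sizes under that simplified geometric model. The value min_threshold=4 is an assumed default (it is not mentioned in the message), the function name is hypothetical, and the real strategy aligns windows to the epoch, which this sketch ignores.

```python
def dtcs_window_sizes(base_time_seconds=3600, min_threshold=4,
                      max_sstable_age_days=10):
    """List window sizes in seconds under a simplified geometric model.

    Assumption: each window is min_threshold times larger than the
    previous one, and windows larger than the maximum SSTable age are
    not considered. This is not the actual Cassandra implementation.
    """
    max_age_seconds = max_sstable_age_days * 86400
    sizes = []
    size = base_time_seconds
    while size <= max_age_seconds:
        sizes.append(size)
        size *= min_threshold
    return sizes


if __name__ == "__main__":
    for size in dtcs_window_sizes():
        print("window of", size // 3600, "hour(s)")
```

Under these assumptions, compaction would not stop after the first hour: SSTables could still be merged in progressively larger windows, which would be worth checking against the compaction log before drawing conclusions about the cached files.
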