Message-ID: <263750.76427.qm@web53006.mail.re2.yahoo.com>
Date: Wed, 7 Jul 2010 11:07:25 -0700 (PDT)
From: Rishi Bhardwaj
Subject: Re: Minimizing the impact of compaction on latency and throughput
To: dev@cassandra.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0-1378489883-1278526045=:76427"

--0-1378489883-1278526045=:76427
Content-Type: text/plain; charset=us-ascii

I have done some bulk write performance tests, and I saw background compaction have a large detrimental impact on write performance. I was also wondering whether there is a tunable to limit the frequency of compaction on the sstables. If not, adding such a configuration option would also help control the performance impact of compaction.

-Rishi

________________________________
From: Peter Schuller
To: dev@cassandra.apache.org
Sent: Wed, July 7, 2010 10:09:25 AM
Subject: Minimizing the impact of compaction on latency and throughput

Hello,

I have repeatedly seen users report that background compaction is overly detrimental to the behavior of the node with respect to latency.
While I have not yet deployed Cassandra in a production situation where latencies are closely monitored, these reports do not sound surprising to me given the nature of compaction, and unless developers here on the list say otherwise, I tend to believe that it is a real issue.

Ignoring implementation difficulties for a moment, a few things that could improve the situation, and that seem sensible to me, are:

* Utilize posix_fadvise() on both reads and writes to avoid obliterating the operating system's caching of the sstables.
* Add the ability to rate limit disk I/O (in particular writes).
* Add the ability to perform direct I/O.
* Add the ability to fsync() regularly on writes to force the operating system to not decide to flush hundreds of megabytes of data out in a single burst.
* (Not an improvement but a general observation: it seems useless for writes to the commit log to remain in cache after an fsync(), and so they are a good candidate for posix_fadvise().)

None of these would be silver bullets, and the importance and appropriate settings for each would be very dependent on operating system, hardware, etc. But having the ability to control some or all of these should, I suspect, allow significantly lessening the impact of compaction under a variety of circumstances.

With respect to cache eviction, this is one area where the impact can probably be expected to be higher the more you rely on the operating system's caching, and the less you rely on in-JVM caching done by Cassandra.

The most obvious problem points to me include:

* posix_fadvise() and direct I/O cause portability and building issues, necessitating native code.
* rate limiting is very indirect due to read-ahead, caching, etc.
  In particular for writes, rate limiting would likely be almost useless without also having fsync() or direct I/O, unless the limit is set extremely low and the cluster is taking very few writes (such that the typical background flushing done by most OSes happens often enough to not involve huge amounts of data).

Any thoughts? Has this already been considered and rejected? Do you think compaction is in fact not a problem already? Are there other, easier, better ways to accomplish the goal?

--
/ Peter Schuller
--0-1378489883-1278526045=:76427--
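
As a rough illustration of the point that rate limiting writes only helps when combined with regular fsync(), here is a minimal Java sketch (Java because Cassandra runs on the JVM). It is not Cassandra code: the class ThrottledSyncWriter and its parameters bytesPerSecond and syncEveryBytes are hypothetical names made up for this example. It caps the average write rate since the writer was opened and calls FileChannel.force(false), the standard Java counterpart to fsync(), after every syncEveryBytes bytes, so that unflushed data from this writer should stay bounded by roughly that amount.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of Cassandra.
final class ThrottledSyncWriter implements AutoCloseable {
    private final FileChannel channel;
    private final long bytesPerSecond;  // assumed cap on the average write rate
    private final long syncEveryBytes;  // assumed amount of data between forced flushes
    private final long startNanos = System.nanoTime();
    private long totalBytes;
    private long unsyncedBytes;
    private int syncCount;

    ThrottledSyncWriter(Path path, long bytesPerSecond, long syncEveryBytes) throws IOException {
        if (bytesPerSecond <= 0 || syncEveryBytes <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
        this.bytesPerSecond = bytesPerSecond;
        this.syncEveryBytes = syncEveryBytes;
    }

    void write(byte[] data) throws IOException, InterruptedException {
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        totalBytes += data.length;
        unsyncedBytes += data.length;

        // Flush in small, regular steps so the OS does not build up one huge burst.
        if (unsyncedBytes >= syncEveryBytes) {
            channel.force(false);
            unsyncedBytes = 0;
            syncCount++;
        }

        // Throttle: sleep until the elapsed time matches what the bytes written
        // so far should have taken at the configured average rate.
        long expectedNanos = (long) (totalBytes * (1e9 / bytesPerSecond));
        long aheadNanos = expectedNanos - (System.nanoTime() - startNanos);
        if (aheadNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(aheadNanos);
        }
    }

    int syncCount() {
        return syncCount;
    }

    @Override
    public void close() throws IOException {
        channel.force(false);  // flush whatever is still pending
        channel.close();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("throttled", ".dat");
        try (ThrottledSyncWriter w = new ThrottledSyncWriter(file, 4L << 20, 256 * 1024)) {
            byte[] chunk = new byte[64 * 1024];
            for (int i = 0; i < 16; i++) {
                w.write(chunk);  // 1 MiB in total, capped at about 4 MiB/s
            }
            System.out.println("forced flushes: " + w.syncCount());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Because the limit is on the average rate since start, a writer that has been idle can burst afterwards; a token bucket would tighten that. The sketch also does nothing about page-cache eviction, which would still need posix_fadvise() or direct I/O through native code.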