From: Pedro Gordo
Date: Tue, 13 Jun 2017 15:43:31 +0100
Subject: Re: New contribution - Burst Hour Compaction Strategy
To: "J. D. Jordan"
Cc: Stefan Podkowinski, Cassandra DEV

Hi all

Although a couple of people have contacted me directly to talk about BHCS, I would also like to get the community's opinion, so I thought I would start the discussion by describing the advantages of BHCS and the types of tables where it would do a good job. Please keep in mind that all my assumptions are made without any real-world experience with Cassandra, so this is where I expect input from the C* veterans to help me steer the BHCS implementation in the right direction if needed.

This is a long email, so there's a TL;DR if you don't want to read everything. This is intended for high-level discussion. For code-level discussion, please refer to the document in JIRA. I'm aware that some might not like that no compaction occurs outside of the burst hour, but I have thought of solutions for that, so please read the planned improvements below.

*TL;DR*

BHCS tries to address these issues with the current compaction strategies:

- Necessity of allocating large storage during big compactions in STCS -> Through the sstable_max_size property of BHCS, we can keep SSTables below a certain size, so compactions would not need large amounts of extra disk space.
- We might get to a point where, to return the results of a query, we need to read from a large number of SSTables -> BHCS addresses this by making sure that the number of SSTables in which a key exists is brought back to a low level after every compaction. That number is configurable, so in the limit, you could set it to 1 for optimal read performance.
- Continuous high I/O of LCS -> addressed by the scheduling feature of BHCS.
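To make the TL;DR points a bit more concrete, here is a minimal Python sketch of the two ideas: only compacting inside the burst hour, and picking SSTables whose keys exist in more copies than the configured limit. This is not the BHCS code (that is in the JIRA document). All names here (`in_burst_hour`, `keys_over_limit`, `compaction_candidates`, `max_key_copies`) are made up for illustration, and SSTables are modelled simply as sets of keys.

```python
from collections import Counter
from datetime import time


def in_burst_hour(now, start, end):
    """True if `now` is inside the burst window (the window may cross midnight)."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def keys_over_limit(sstables, max_key_copies):
    """Keys that exist in more than `max_key_copies` SSTables."""
    copies = Counter(key for keys in sstables.values() for key in set(keys))
    return {key for key, count in copies.items() if count > max_key_copies}


def compaction_candidates(sstables, max_key_copies, now, start, end):
    """SSTable names to compact now; always empty outside the burst hour."""
    if not in_burst_hour(now, start, end):
        return set()
    hot_keys = keys_over_limit(sstables, max_key_copies)
    return {name for name, keys in sstables.items() if hot_keys & set(keys)}


# Hypothetical data: key "k1" lives in two SSTables, the limit is one copy.
sstables = {"a": {"k1", "k2"}, "b": {"k1"}, "c": {"k3"}}
burst_start, burst_end = time(2, 0), time(3, 0)

print(sorted(compaction_candidates(sstables, 1, time(2, 30), burst_start, burst_end)))
# prints ['a', 'b']
print(sorted(compaction_candidates(sstables, 1, time(12, 0), burst_start, burst_end)))
# prints []
```

The sketch leaves out sstable_max_size and the actual merge; it only shows how a key-copy limit and a schedule could together decide what gets compacted and when.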

*Longer explanation:*

*Where would it be advantageous to use BHCS?*

- Read-bound tables: because BHCS keeps the number of key copies low, read speed would be consistently fast. Since there are not many writes in this type of table, even if new SSTables containing a given key are produced, the number of SSTables containing that key would be brought back to 1 after the burst hour (BH).
- Write-bound tables: in this scenario, a lot of SSTables are created outside of the BH, but there are few reads, so the issue with existing strategies would be the continuous high I/O dedicated to compaction. With BHCS, disk usage would grow during these active hours, but I assume that this growth outside the BH would be tolerable since a lot of space would be released during the burst. Still, if that's a big issue, I plan to address it with improvement (1).

*Where is BHCS NOT recommended and what improvements can be made to make it viable?*

- Read- and write-heavy tables: because the number of SSTables would keep increasing outside the BH until the burst kicks in, read latency and disk usage can both increase. This could be solved with improvement (1), (3) or (5).

*Planned Improvements:*

(1) - The user could indicate that they want continuous compaction. This would change the strategy so that, outside of the Burst Hour, STCS would be used to maintain acceptable read speed and disk usage. Then, when the BH kicks in, it would bring key copies and disk usage back to optimal levels.

(2) - During table creation, the user might not be aware of the details of the compaction configuration, so a user-friendly configuration would be provided. If the user marks the table as a write-and-read-heavy table, then improvement (1) would be activated. Otherwise, the strategy would default to its current configuration to save resources outside the BH.

(3) - Instead of just one burst hour, we could set several periods for BHCS to run during the day (for instance, every 3 hours or on another schedule).

*Ideas:*

(4) - Continuously evaluate how many pending compactions we have and the I/O status, and based on that, decide whether or not to start the compaction.

(5) - If, outside the BH, the total size of all the SSTables in a family set reaches a certain threshold, then background compaction can occur anyway. This threshold should be set high because of the high CPU usage of BHCS.

Please let me know your thoughts on this. Thanks!

Best regards
Pedro Gordo

On 10 June 2017 at 22:22, J. D. Jordan wrote:

> GitHub has some good guides on how to use git and make a pull request for
> a project.
>
> https://guides.github.com/introduction/flow/
> https://guides.github.com/activities/forking/
>
> On Jun 10, 2017, at 3:17 PM, Pedro Gordo wrote:
>
> Hi all
>
> I've added to JIRA, a document explaining how BHCS works with code
> snippets, and the motivation behind it. Because I'm not sure we can send
> attachments to the mailing list, please get the document from JIRA:
> https://issues.apache.org/jira/browse/CASSANDRA-12201
>
> I'll check how to address the Git history in the next days. Can you please
> point me to a repo that you merged into C*, with a good history, so I can
> check it out and replicate the format in mine?
>
> Best regards
> Pedro Gordo