From: Alexander Ilyin
Date: Fri, 5 May 2017 16:56:11 +0200
Subject: Re: Compaction monitoring
To: user@hbase.apache.org

Kevin,

Thanks for your answer.

We're using Ambari to manage our cluster. I see an increase in CPU usage and
I/O, but it's not a big one. The increase tends to come at the beginning of
the off-peak window, although it's difficult to tell for sure: our workload
comes in bursts, so the picture is not clear. That's why I was asking whether
there are any metrics related specifically to compaction. But I can probably
shorten the window.

As for region sizes, I will experiment, as you suggest.

On Fri, May 5, 2017 at 4:07 PM, Kevin O'Dell wrote:

> Alexander,
>
> That is a great series of questions. What are you using for
> instrumentation of your HBase cluster? Cloudera Manager, Ambari, Ganglia,
> Cacti, etc.? You are really asking a lot of performance-based metric
> questions. I don't think you will be able to answer your questions without
> first being able to answer these questions:
>
> Do you see the major compaction I/O, CPU, and memory spikes throughout the
> whole "off-peak" window?
>
> If so, do you have the host resource overhead to add additional compaction
> threads to shorten it?
>
> What do your response times look like during your "off-peak hours"? Are you
> still within your SLAs?
>
> Answering these questions should quickly allow you to answer your first two
> questions. Your last question is very interesting:
>
> * how much does my performance degrade if the region size becomes too large?
>   <-- This 100% depends: it depends on your environment, I/O usage, SLAs,
>   etc. I am not sure if anyone has documented compaction times based on
>   region sizes. You may have to do some trial and error here.
>
> I hope this helps!
>
> On Fri, May 5, 2017 at 8:47 AM, Alexander Ilyin wrote:
>
> > Hi,
> >
> > While tuning HBase performance I've found a lot of settings which affect
> > the compaction process (off-peak hours, time between compactions,
> > compaction ratio, region sizes, etc.). They all seem to be useful, and
> > there are recommendations in the doc saying which values to set. But I
> > found no way to assess how they actually affect my cluster's performance,
> > i.e. how many resources compaction takes and when. I would like to figure
> > out which settings work best for my dataset and my specific workload, but
> > with only general recommendations in hand that seems difficult to do.
> >
> > For example, I have difficulty answering the following questions:
> > * can I shorten my off-peak hours range?
> > * can I afford to do compactions more often? or more aggressively?
> > * how much does my performance degrade if the region size becomes too
> >   large?
> >
> > The HBase version I'm using is 1.1.2.
> >
> > Alexander
>
> --
> Kevin O'Dell
> Field Engineer
> 850-496-1298 | Kevin@rocana.com
> @kevinrodell
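
On the question of compaction-specific metrics raised in this thread, one possible approach is to sample each RegionServer's JMX JSON endpoint at the start, middle, and end of the off-peak window and compare the samples. The sketch below is not something proposed by either poster. The bean name, the metric keys, and the default info port 16030 are assumptions that should be checked against the /jmx output of your own RegionServers, since they can differ between HBase versions.

```python
import json
import urllib.request

# Assumed names: verify them against your own RegionServer's /jmx output.
SERVER_BEAN = "Hadoop:service=HBase,name=RegionServer,sub=Server"
COMPACTION_KEYS = (
    "compactionQueueLength",
    "compactedCellsSize",
    "majorCompactedCellsSize",
)


def extract_metrics(jmx_doc, bean_name=SERVER_BEAN, keys=COMPACTION_KEYS):
    """Return {key: value} for the requested keys found in the named bean."""
    for bean in jmx_doc.get("beans", []):
        if bean.get("name") == bean_name:
            return {k: bean[k] for k in keys if k in bean}
    return {}


def sample_deltas(previous, current):
    """Difference between two samples, for keys present in both."""
    return {k: current[k] - previous[k] for k in current if k in previous}


def fetch_jmx(host, port=16030, timeout=10):
    """Fetch the JMX JSON document from a RegionServer info server."""
    url = "http://%s:%d/jmx" % (host, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Offline usage example with hand-written sample documents.
before = extract_metrics({"beans": [
    {"name": SERVER_BEAN, "compactionQueueLength": 3, "compactedCellsSize": 1000},
]})
after = extract_metrics({"beans": [
    {"name": SERVER_BEAN, "compactionQueueLength": 1, "compactedCellsSize": 4500},
]})
print(sample_deltas(before, after))
```

Against a live cluster, `extract_metrics(fetch_jmx("rs-host.example"))` would produce one sample (the hostname is hypothetical). For cumulative counters the delta approximates the work done between two samples, while for the queue length the raw value at each sample is usually the more useful number.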