From: Mark Bonetti
Date: Fri, 6 Apr 2018 12:13:54 +0200
Subject: Which monitoring metrics to alert on?
To: user@hbase.apache.org

Hi,

I'm building a monitoring system for HBase and want to set up default alerts (threshold or anomaly) on the 2-3 key metrics that everyone who uses HBase typically wants to alert on, but I don't yet have production-grade experience with HBase. Importantly, the alert rules have to be generally useful, so they can't be based on metrics whose values vary wildly with the size of the deployment.

In other words, which metrics would be the most significant indicators that something has gone wrong with your HBase?

I thought the best place to find experienced HBase users, who would find answering this question trivial, would be here.

Thanks very much,
Mark