From: Amandeep Khurana
Date: Wed, 16 Jul 2014 13:41:59 -0700
Subject: Re: Cluster sizing guidelines
To: "user@hbase.apache.org", lars hofhansl

Thanks Lars. I'm curious how we'd answer questions like:

1. How many nodes do I need to sustain a write throughput of N reqs/sec
   with payload of size M KB?
2. How many nodes do I need to sustain a read throughput of N reqs/sec
   with payload of size M KB with a latency of X ms per read?
3. How many nodes do I need to store N TB of total data with one of the
   above constraints?

This goes into looking at the bottlenecks that need to be taken into
account during write and read times and also the max number of regions and
region size that a single region server can host.

What are your thoughts on this?

-Amandeep

On Wed, Jul 16, 2014 at 9:06 AM, lars hofhansl wrote:

> This is a somewhat fuzzy art.
>
> Some points to consider:
> 1. All data is replicated three ways. Or in other words, if you run three
> RegionServer/Datanodes each machine will get 100% of the writes. If you run
> 6, each gets 50% of the writes. From that aspect HBase clusters with fewer
> than 9 RegionServers are not really useful.
> 2. As for the machines themselves. Just go with any reasonable machine,
> and pick the cheapest you can find. At least 8 cores, at least 32GB of RAM,
> at least 6 disks, no RAID needed. (we have machines with 12 cores in 2
> sockets, 96GB of RAM, 6 4TB drives, no HW RAID).
> HBase is not yet well tuned for SSDs.
>
> You also need to consider your network topology carefully. With HBase
> you'll see quite a lot of east-west traffic (i.e. between racks). 10ge is
> good if you have it. We have 1ge everywhere so far, and we found this is
> the single biggest bottleneck for write performance.
>
> Also see this blog post about HBase memory sizing (shameless plug):
> http://hadoop-hbase.blogspot.de/2013/01/hbase-region-server-memory-sizing.html
>
> I'm planning a blog post about this topic with more details.
>
> -- Lars
>
> ________________________________
> From: Amandeep Khurana
> To: "user@hbase.apache.org"
> Sent: Tuesday, July 15, 2014 10:48 PM
> Subject: Cluster sizing guidelines
>
> Hi
>
> How do users usually go about sizing HBase clusters? What are the factors
> you take into account? What are typical hardware profiles you run with? Any
> data points you can share would help.
>
> Thanks
> Amandeep
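
A rough sketch of the replication arithmetic from point 1 of the quoted
reply (three nodes each see 100% of the writes, six nodes each see 50%),
plus a storage-only estimate for question 3. The function names are
hypothetical and chosen for illustration. The sketch assumes writes are
spread evenly across nodes and ignores compression, compaction headroom,
and anything else stored on the disks. The 24 TB per node figure is simply
the 6 x 4TB drives mentioned in the thread.

```python
import math


def write_share(num_nodes: int, replication: int = 3) -> float:
    """Fraction of the cluster's total write volume that each node receives,
    assuming replicas are spread evenly over all nodes (an assumption)."""
    if num_nodes < 1 or replication < 1:
        raise ValueError("num_nodes and replication must be >= 1")
    # A node cannot receive more than 100% of the writes.
    return min(1.0, replication / num_nodes)


def nodes_for_storage(data_tb: float, raw_tb_per_node: float,
                      replication: int = 3) -> int:
    """Minimum node count needed to hold data_tb of data on disk once it is
    replicated. Storage only: says nothing about throughput or latency."""
    if data_tb < 0 or raw_tb_per_node <= 0 or replication < 1:
        raise ValueError("invalid sizing inputs")
    return math.ceil(data_tb * replication / raw_tb_per_node)


if __name__ == "__main__":
    for n in (3, 6, 9):
        print(f"{n} nodes: each sees {write_share(n):.0%} of the writes")
    # 100 TB of data on nodes with 6 x 4 TB drives (24 TB raw each)
    print(nodes_for_storage(100, 24), "nodes for 100 TB")
```

This only covers the disk side of the questions above. Write and read
throughput per node still has to be measured on the actual hardware and
network, as the reply points out.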