From: Mikhail Antonov
Date: Sat, 20 Aug 2016 23:26:00 -0700
Subject: Re: Hbase federated cluster for messages
To: "user@hbase.apache.org"

Just out of curiosity, is there anything particular about your deployment or
use case that raised this specific concern about Namenode performance?

An HDFS cluster with 80 datanodes would be considered medium-sized; there are
plenty of (much) bigger clusters out there in the field, and HBase clusters
with 80 nodes aren't very uncommon either. Fine-tuning a cluster of this size
for a specific workload would certainly require some planning and work, and
setting a bunch of params related to heap/memstore/block cache sizing, GC
settings, RPC scheduler settings, replication settings and a number of other
things; but why the Namenode?

-Mikhail
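For a rough sanity check of the sizing figures quoted further down in this
thread (80 data nodes with 8 TB each, roughly 320 TB of logical data per year,
2x replication, 130,000 combined read/write ops/sec), here is a small
back-of-the-envelope sketch. It assumes a region server colocated on every
datanode and a perfectly even spread, and it ignores compaction, WAL and HDFS
overhead; it is only an illustration of the arithmetic, not a sizing
recommendation.

```java
// Back-of-the-envelope check of the numbers quoted below in this thread.
// Idealized: even distribution, a region server on every datanode, and no
// allowance for compactions, WALs or HDFS overhead.
public class CapacityCheck {
    public static void main(String[] args) {
        int dataNodes = 80;
        double tbPerNode = 8.0;                     // raw disk per datanode
        double logicalTbPerYear = 320.0;            // figure from the thread
        int replicationFactor = 2;                  // 2x HDFS replication
        long readWriteOpsPerSec = 130_000L;

        double rawCapacityTb = dataNodes * tbPerNode;                   // 640 TB
        double storedTbPerYear = logicalTbPerYear * replicationFactor;  // 640 TB
        double opsPerServer = (double) readWriteOpsPerSec / dataNodes;  // ~1625

        System.out.printf("Raw capacity:          %.0f TB%n", rawCapacityTb);
        System.out.printf("Stored after one year: %.0f TB%n", storedTbPerYear);
        System.out.printf("Headroom after a year: %.0f TB%n", rawCapacityTb - storedTbPerYear);
        System.out.printf("Ops per server:        %.0f ops/sec%n", opsPerServer);
    }
}
```

On these figures, roughly 1,600 ops/sec per server is a modest request rate;
storage headroom after the first year looks like a bigger concern than
Namenode load.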
On Sat, Aug 20, 2016 at 6:39 AM, Alexandr Porunov <alexandr.porunov@gmail.com> wrote:

> Thank you Dima
>
> Best regards,
> Alexandr
>
> On Sat, Aug 20, 2016 at 4:17 PM, Dima Spivak wrote:
>
> > Yup.
> >
> > On Saturday, August 20, 2016, Alexandr Porunov <alexandr.porunov@gmail.com> wrote:
> >
> > > So, will it be OK if we have 80 data nodes (8 TB on each node) and
> > > only one namenode? Will it work for the messaging system? We will
> > > have 2x replication, so there are 320 TB of data per year (640 TB
> > > with replication), 130,000 R+W ops/sec, and each message is 100
> > > bytes or 1024 bytes. Is it possible to handle such a load with HBase?
> > >
> > > Sincerely,
> > > Alexandr
> > >
> > > On Sat, Aug 20, 2016 at 8:44 AM, Dima Spivak wrote:
> > >
> > > > You can easily store that much data as long as you don't have
> > > > small files, which is typically why people turn to federation.
> > > >
> > > > -Dima
> > > >
> > > > On Friday, August 19, 2016, Alexandr Porunov <alexandr.porunov@gmail.com> wrote:
> > > >
> > > > > We are talking about Facebook. So, there are 25 TB per month: 15
> > > > > billion messages of 1024 bytes and 120 billion messages of 100
> > > > > bytes per month.
> > > > >
> > > > > I thought that they used only HBase to handle such a huge amount
> > > > > of data. If they used their own implementation of HBase then I
> > > > > have no questions.
> > > > >
> > > > > Sincerely,
> > > > > Alexandr
> > > > >
> > > > > On Sat, Aug 20, 2016 at 1:39 AM, Dima Spivak wrote:
> > > > >
> > > > > > I'd +1 what Vladimir says. How much data (in TBs/PBs) and how
> > > > > > many files are we talking about here? I'd say that use cases
> > > > > > that benefit from HBase don't tend to hit the kind of HDFS
> > > > > > file limits that federation seeks to address.
> > > > > >
> > > > > > -Dima
> > > > > >
> > > > > > On Fri, Aug 19, 2016 at 2:19 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> > > > > >
> > > > > > > FB has its own "federation". It is proprietary code, I presume.
> > > > > > >
> > > > > > > -Vladimir
> > > > > > >
> > > > > > > On Fri, Aug 19, 2016 at 1:22 PM, Alexandr Porunov <alexandr.porunov@gmail.com> wrote:
> > > > > > >
> > > > > > > > No, there isn't. But I want to figure out how to configure
> > > > > > > > that type of cluster in case there is a particular reason.
> > > > > > > > How can Facebook handle such a huge number of ops without
> > > > > > > > federation? I don't think that they just have one namenode
> > > > > > > > server and one standby namenode server. It isn't possible.
> > > > > > > > I am sure that they use federation.
> > > > > > > >
> > > > > > > > On Fri, Aug 19, 2016 at 10:08 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > >> I am not sure how to do it but I have to configure a
> > > > > > > > > >> federated cluster with HBase to store a huge amount of
> > > > > > > > > >> messages (client to client) (40% writes, 60% reads).
> > > > > > > > >
> > > > > > > > > Any particular reason for a federated cluster? How huge
> > > > > > > > > is "huge amount" and what is the message size?
> > > > > > > > >
> > > > > > > > > -Vladimir
> > > > > > > > >
> > > > > > > > > On Fri, Aug 19, 2016 at 11:57 AM, Dima Spivak <dspivak@cloudera.com> wrote:
> > > > > > > > >
> > > > > > > > > > As far as I know, HBase doesn't support spreading
> > > > > > > > > > tables across namespaces; you'd have to point it at
> > > > > > > > > > one namenode at a time. I've heard of people trying to
> > > > > > > > > > run multiple HBase instances in order to get access to
> > > > > > > > > > all their HDFS data, but it doesn't tend to be much fun.
> > > > > > > > > >
> > > > > > > > > > -Dima
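To make the "point it at one namenode at a time" remark above concrete, here
is a minimal sketch. The property name hbase.rootdir is real, but the
"nameservice1" value shown is only a placeholder for whatever single NameNode
or HA nameservice a given cluster uses.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Minimal sketch: an HBase cluster keeps all tables, WALs and HFiles under a
// single hbase.rootdir, so it is bound to exactly one NameNode (or one HA
// nameservice). "nameservice1" below is only a placeholder value.
public class RootDirCheck {
    public static void main(String[] args) {
        // Loads hbase-site.xml / hbase-default.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        String rootDir = conf.get("hbase.rootdir", "hdfs://nameservice1/hbase");
        System.out.println("All HBase data lives under: " + rootDir);
        // There is no per-table mapping to different HDFS federation
        // namespaces; using a second namespace means running a second,
        // independent HBase cluster, as described above.
    }
}
```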
> > > > > > > > > > On Fri, Aug 19, 2016 at 11:51 AM, Alexandr Porunov <alexandr.porunov@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hello,
> > > > > > > > > > >
> > > > > > > > > > > I am not sure how to do it, but I have to configure a
> > > > > > > > > > > federated cluster with HBase to store a huge amount
> > > > > > > > > > > of messages (client to client) (40% writes, 60%
> > > > > > > > > > > reads). Does somebody have any idea or examples of
> > > > > > > > > > > how to configure it?
> > > > > > > > > > >
> > > > > > > > > > > Of course we can configure HDFS in federated mode,
> > > > > > > > > > > but as far as I can tell it isn't suitable for HBase.
> > > > > > > > > > > If we want to save a message from client 1 to client
> > > > > > > > > > > 2 in the HBase cluster, then how does HBase know in
> > > > > > > > > > > which namespace it has to save it? Which namenode
> > > > > > > > > > > will be responsible for that message? How can we
> > > > > > > > > > > read client messages?
> > > > > > > > > > >
> > > > > > > > > > > Give me any ideas, please
> > > > > > > > > > >
> > > > > > > > > > > Sincerely,
> > > > > > > > > > > Alexandr
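On the "which namenode will be responsible for that message" question from
the original post above: in HBase the client never picks a namenode at all;
the row key determines which region (and region server) owns the write, and
every region lives under the cluster's single hbase.rootdir. A minimal sketch
follows, assuming a hypothetical "messages" table with one column family "m"
and a conversation-id-plus-timestamp row key; the schema is illustrative, not
taken from the thread.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical "messages" table with a single column family "m".
public class StoreMessage {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("messages"))) {
            // Row key: conversation id plus a reversed timestamp, so one
            // conversation's messages cluster together and newest sort first.
            long reversedTs = Long.MAX_VALUE - System.currentTimeMillis();
            byte[] rowKey = Bytes.toBytes("client1-client2:" + reversedTs);
            Put put = new Put(rowKey);
            put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("body"),
                    Bytes.toBytes("hello"));
            table.put(put);
            // HBase routes the Put by row key to the region that owns that
            // key; no namenode choice is involved, because all regions share
            // the cluster's single hbase.rootdir.
        }
    }
}
```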