From: Christopher
To: Accumulo Dev List
Cc: dev@hbase.apache.org
Date: Wed, 19 Aug 2015 15:05:55 -0400
Subject: Re: HBase and Accumulo

Forgive my ignorance about HBase, but wouldn't size of records count,
also? Your response seems to imply that number of records is what matters
for how many regions are needed. For what it's worth, Accumulo's tablets
are split based on storage size, not number of records.
I assumed the same was true for HBase. Am I wrong?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Wed, Aug 19, 2015 at 2:28 PM, Ted Malaska wrote:

> I've been doing HBase for a long time and never had an issue with region
> count limits and I have clusters with 10s of billions of records. Maybe
> there would be issues around a couple trillion records, but never got that
> high yet.
>
> Ted Malaska
>
> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser wrote:
>
>> Oh, one other thing that I should mention (was prompted off-list).
>>
>> (definition time since cross-list now: HBase regions == Accumulo tablets)
>>
>> Accumulo will handle many more regions than HBase does now due to a
>> splittable metadata table. While I was told this was a very long and
>> arduous journey to implement correctly (WRT splitting, merges and bulk
>> loading), users with "too many regions" problems are extremely few and far
>> between for Accumulo.
>>
>> I was very happy to see effort/design being put into this in HBase. And,
>> just to be fair in criticism/praises, HBase does appear to me to do
>> assignments of regions much faster than Accumulo does on a small cluster
>> (~5-10 nodes). Accumulo may take a few seconds to notice and reassign
>> tablets. I have yet to notice this with HBase (which also could be due to
>> lack of personal testing).
>>
>>
>> Jerry He wrote:
>>
>>> Hi, folks
>>>
>>> We have people that are evaluating HBase vs Accumulo.
>>> Security is an important factor.
>>>
>>> But I think after the Cell security was added in HBase, there is no more
>>> real gap compared to Accumulo.
>>>
>>> I know we have both HBase and Accumulo experts on this list.
>>> Could someone shed more light?
>>> I am looking for real gaps comparing HBase to Accumulo, if there are any,
>>> so that I can be prepared to address them. This is not limited to the
>>> security area.
>>>
>>> There are differences in some features and implementations. But they don't
>>> seem like real 'gaps'.
>>>
>>> Any comments and feedback are welcome.
>>>
>>> Thanks,
>>>
>>> Jerry