Date: Wed, 19 Aug 2015 18:54:47 -0400
Subject: Re: HBase and Accumulo
From: Ted Malaska
To: dev@hbase.apache.org
Cc: dev, kepner@ll.mit.edu

I'm on the side of benchmarking for the use case and with an expert.

There are so many ways to cheat a benchmark. And the benchmark may not be
anything like your use case.

On Aug 19, 2015 5:43 PM, "Andrew Purtell" wrote:

> I think someone who uses third party benchmarks to assess a system like
> HBase or Accumulo (or Cassandra...) is taking a foolish shortcut, so
> perhaps we must agree to disagree.
>
>
> On Wed, Aug 19, 2015 at 2:34 PM, Jeremy Kepner wrote:
>
> > I agree that performance on real apps is the most important for
> > any particular organization, but as technologists how do we measure
> > ourselves?
> > Hence imperfect benchmarking remains our only recourse.
> >
> > On Wed, Aug 19, 2015 at 12:34:44PM -0700, Andrew Purtell wrote:
> > > I can't speak for anyone other than myself in the HBase community, but
> > > I'm much more interested and focused on performance analysis and
> > > developing/deploying for the use cases of my employer than
> > > participating in generic bench-marketing to make weapons for happy OSS
> > > warriors. Perhaps this does a disservice to the HBase project overall
> > > and if so then I apologize to others on the project for that.
> > >
> > > That said, from long and bitter experience let me state the only
> > > benchmarks that ever really matter are the comparative benchmarks you
> > > make for your own use cases in your own environments, preferably
> > > exercising those candidates with real data and operating conditions.
> > > See: https://pbs.twimg.com/media/CMnTyKVUEAA1tOm.jpg (smile)
> > >
> > >
> > > On Wed, Aug 19, 2015 at 12:27 PM, Josh Elser wrote:
> > >
> > > > Alright, I have to ask... are you referring to the paper that cites
> > > > Accumulo performance without write-ahead logs enabled? I have some
> > > > serious reservations about the relevance of that paper to this
> > > > conversation and just want to make sure people aren't led astray by
> > > > what the actual takeaway should be.
> > > >
> > > > Jeremy Kepner wrote:
> > > >
> > > >> A big difference between Accumulo and HBase is the published
> > > >> performance numbers.
> > > >> The Accumulo community has done a good job of continuing to publish
> > > >> up-to-date performance numbers in peer-reviewed venues which allow
> > > >> Accumulo to claim best in the world performance.
> > > >>
> > > >> The HBase community hasn't been doing that so much. It would be
> > > >> great if they did because the HBase points on the graphs are old and
> > > >> it would be good to get new ones.
> > > >>
> > > >>
> > > >> On Wed, Aug 19, 2015 at 02:30:58PM -0400, Josh Elser wrote:
> > > >>
> > > >>> Like I've said many times now, it's relative to your actual problem.
> > > >>> If you don't have that much data (or intend to grow into that much
> > > >>> data), it's not an issue. Obviously, this is the case for you.
> > > >>>
> > > >>> However, it is an architectural difference between the two projects
> > > >>> with known limitations for a single metadata region. It's a
> > > >>> difference, which is what Jerry asked for.
> > > >>>
> > > >>> Ted Malaska wrote:
> > > >>>
> > > >>>> I've been doing HBase for a long time and never had an issue with
> > > >>>> region count limits and I have clusters with 10s of billions of
> > > >>>> records. Maybe there would be issues around a couple trillion
> > > >>>> records, but never got that high yet.
> > > >>>>
> > > >>>> Ted Malaska
> > > >>>>
> > > >>>> On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser wrote:
> > > >>>>
> > > >>>>> Oh, one other thing that I should mention (was prompted off-list).
> > > >>>>>
> > > >>>>> (definition time since cross-list now: HBase regions == Accumulo
> > > >>>>> tablets)
> > > >>>>>
> > > >>>>> Accumulo will handle many more regions than HBase does now due to a
> > > >>>>> splittable metadata table. While I was told this was a very long and
> > > >>>>> arduous journey to implement correctly (WRT splitting, merges and
> > > >>>>> bulk loading), users with "too many regions" problems are extremely
> > > >>>>> few and far between for Accumulo.
> > > >>>>>
> > > >>>>> I was very happy to see effort/design being put into this in HBase.
> > > >>>>> And, just to be fair in criticism/praises, HBase does appear to me
> > > >>>>> to do assignments of regions much faster than Accumulo does on a
> > > >>>>> small cluster (~5-10 nodes). Accumulo may take a few seconds to
> > > >>>>> notice and reassign tablets. I have yet to notice this with HBase
> > > >>>>> (which also could be due to lack of personal testing).
> > > >>>>>
> > > >>>>>
> > > >>>>> Jerry He wrote:
> > > >>>>>
> > > >>>>>> Hi, folks
> > > >>>>>>
> > > >>>>>> We have people that are evaluating HBase vs Accumulo.
> > > >>>>>> Security is an important factor.
> > > >>>>>>
> > > >>>>>> But I think after the Cell security was added in HBase, there is no
> > > >>>>>> more real gap compared to Accumulo.
> > > >>>>>>
> > > >>>>>> I know we have both HBase and Accumulo experts on this list.
> > > >>>>>> Could someone shed more light?
> > > >>>>>> I am looking for real gaps comparing HBase to Accumulo, if there
> > > >>>>>> are any, so that I can be prepared to address them. This is not
> > > >>>>>> limited to the security area.
> > > >>>>>>
> > > >>>>>> There are differences in some features and implementations. But
> > > >>>>>> they don't seem like real 'gaps'.
> > > >>>>>>
> > > >>>>>> Any comments and feedback are welcome.
> > > >>>>>>
> > > >>>>>> Thanks,
> > > >>>>>>
> > > >>>>>> Jerry
> > > >>>>>>
> > > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > >    - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > > (via Tom White)
> >
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)